Workshop List

Monday Workshops (7th September 2015)

Room: D. Maria Half Day
14:00 - 15:30; 16:00 - 17:45

Description:

This workshop will provide a platform for discussing recent developments in algorithm selection and configuration, which arise in many diverse domains, such as machine learning, data mining, optimization and satisfiability solving. Algorithm selection and configuration are increasingly relevant today. Researchers and practitioners from all branches of science and technology face a large choice of parameterized machine learning algorithms, with little guidance as to which techniques to use. Moreover, data mining challenges frequently remind us that algorithm selection and configuration are crucial in order to achieve the best performance, and drive industrial applications. Meta-learning leverages knowledge of past algorithm applications to select the best techniques for future applications, and offers effective techniques that are superior to humans both in terms of the end result and especially in the time required to achieve it. In this workshop we will discuss different ways of exploiting meta-learning techniques to identify the potentially best algorithm(s) for a new task, based on meta-level information and prior experiments. We also discuss the prerequisites for effective meta-learning systems, for example infrastructure such as OpenML.org.

Many contemporary problems also require that solutions be elaborated in the form of complex systems or workflows which include many different processes or operations. Constructing such complex systems or workflows requires extensive expertise, and could be greatly facilitated by leveraging planning, meta-learning and intelligent system design. This task is inherently interdisciplinary, as it builds on expertise in various areas of AI.

Website:
http://metasel2015.inesctec.pt/

Workshop Organizers:
Pavel Brazdil
Joaquin Vanschoren
Lars Kotthoff
Christophe Giraud-Carrier
Room: Porto Full Day
09:00 - 10:30; 11:00 - 12:45; 14:00 - 15:30; 16:00 - 17:45

Description:

The number of very large data repositories (big data) is increasing at a rapid pace. Analysing such repositories with the "traditional" sequential implementations of ML and statistical algorithms requires expensive computational resources and long running times. Parallel or distributed computing is one possible approach that can make the analysis of very large repositories feasible. By taking advantage of parallel or distributed execution, an ML/statistical system may: i) increase its speed; ii) search a larger space and reach a better solution; or iii) increase the range of applications where it can be used (because it can process more data, for example). Parallel and distributed computing is therefore of high importance for Knowledge Discovery in Databases (KDD) practitioners.

The workshop will be concerned with the exchange of experience among researchers who use parallel or distributed computing within KDD. Researchers will present recently developed algorithms/systems, ongoing work and applications taking advantage of such parallel or distributed environments.

Website:
http://pdckdd.fe.up.pt/

Workshop Organizers:
Rui Camacho
André Carvalho
Nuno Fonseca
Room: Arrábida Half Day
14:00 - 15:30; 16:00 - 17:45

Description:

DMNLP'15 will be the second edition of the Data Mining and Natural Language Processing (DMNLP) workshop and will be held in conjunction with ECML-PKDD 2015 in Porto, Portugal. The previous edition, DMNLP'14, was held in conjunction with ECML-PKDD 2014 in Nancy, France. On the one hand, in the field of Natural Language Processing (NLP), numerical Machine Learning methods (e.g., SVM, CRF) have been intensively explored and applied. Despite the good results obtained by the numerical methods, one major drawback is that they do not provide a human-readable model. A promising direction is the integration of symbolic knowledge. On the other hand, research in Data Mining has progressed significantly in the last decades, through the development of advanced algorithms and techniques to extract knowledge from data in different forms. In particular, for two decades Pattern Mining has been one of the most active fields in Knowledge Discovery. Recently, a new field has emerged that draws on both domains: Data Mining and NLP. The objective of DMNLP is thus to provide a forum to discuss how Data Mining can be useful for NLP tasks by providing symbolic knowledge, but also how NLP can enhance data mining approaches by providing richer and/or more complex information to mine and by integrating linguistic knowledge directly into the mining process.

The workshop aims to bring together researchers from both communities in order to stimulate discussions about the cross-fertilization of those two research fields. The idea of this workshop is to discuss future directions and new challenges emerging from the cross-fertilization of Data Mining and NLP and, at the same time, to initiate collaborations between researchers of both communities.

Website:
http://dmnlp.loria.fr/

Workshop Organizers:
Peggy Cellier
Thierry Charnois
Andreas Hotho
Stan Matwin
Marie-Francine Moens
Yannick Toussaint
Room: São João Full Day
09:00 - 10:30; 11:00 - 12:45; 14:00 - 15:30; 16:00 - 17:45

Description:

Modern automatic systems are able to collect huge volumes of data, often with a complex structure (e.g. multi-table data, XML data, web data, time series and sequences, graphs and trees). This fact poses new challenges for current information systems with respect to storing, managing and mining these big sets of complex data. The purpose of this workshop is to bring together researchers and practitioners of data mining who are interested in the advances and latest developments in the area of extracting patterns from complex data sources like blogs, event or log data, medical data, spatio-temporal data, social networks, mobility data, sensor data and streams, and so on. The workshop aims at integrating recent results from existing fields such as data mining, statistics, machine learning and relational databases to discuss and introduce new algorithmic foundations and representation formalisms in pattern discovery. We are interested in advanced techniques which preserve the informative richness of data and allow us to efficiently and effectively identify complex information units present in such data.

Website:
http://www.di.uniba.it/~loglisci/nfMCP15/

Workshop Organizers:
Michelangelo Ceci
Corrado Loglisci
Giuseppe Manco
Elio Masciari
Zbigniew Ras
Room: Miragaia Full Day
09:00 - 10:30; 11:00 - 12:45; 14:00 - 15:30; 16:00 - 17:45

Description:

The emergence of ubiquitous computing has started to create new environments consisting of small, heterogeneous, and distributed devices that foster the social interaction of users in several dimensions. Similarly, the upcoming social web also integrates the user interactions in social networking environments. In typical ubiquitous settings, the mining system can be implemented inside the small devices and sometimes on central servers, for real-time applications, similar to common mining approaches. However, the characteristics of ubiquitous and social mining in general are quite different from the current mainstream data mining and machine learning. Unlike in traditional data mining scenarios, data does not emerge from a small number of (heterogeneous) data sources, but potentially from hundreds to millions of different sources. Often there is only minimal coordination and thus these sources can overlap or diverge in many possible ways. Steps into this new and exciting application area are the analysis of this new data, the adaptation of well-known data mining and machine learning algorithms and finally the development of new algorithms. Mining big data in ubiquitous and social environments is an emerging area of research focusing on advanced systems for data mining in such distributed and network-organized systems. Therefore, for this workshop, we aim to attract researchers from all over the world working in the field of data mining and machine learning with a special focus on analyzing big data in ubiquitous and social environments. The goal of this workshop is to promote an interdisciplinary forum for researchers working in the fields of ubiquitous computing, mobile sensing, social web, Web 2.0, and social networks who are interested in utilizing data mining in a ubiquitous setting. The workshop seeks contributions adopting state-of-the-art mining algorithms on ubiquitous social data. Papers combining aspects of the two fields are especially welcome.
In short, we want to accelerate the process of identifying the power of advanced data mining operating on data collected in ubiquitous and social environments, as well as the process of advancing data mining through lessons learned in analyzing these new data.

Website:
http://www.kde.cs.uni-kassel.de/ws/muse2015

Workshop Organizers:
Martin Atzmueller
Florian Lemmerich

Friday Workshops (11th September 2015)

Room: Porto Half Day
14:30 - 16:00; 16:15 - 18:00

Description:

Temporal data are frequently encountered in a wide range of domains such as bio-informatics, medicine, finance and engineering, among many others. They are naturally present in applications covering language, motion and vision analysis, and in more recent ones such as energy-efficient buildings, smart cities, dynamic social media or sensor networks. Unlike static data, temporal data are complex in nature: they are generally noisy and high-dimensional, they may be non-stationary (i.e. first-order statistics vary with time) and irregular (involving several time granularities), and they may have several invariant domain-dependent factors such as time delay, translation, scale or tendency effects. These temporal peculiarities limit the applicability of most standard statistical models and machine learning approaches, which mainly assume i.i.d. data, homoscedasticity, normality of residuals, etc. Tackling such challenging temporal data calls for new advanced approaches at the intersection of statistics, time series analysis, signal processing and machine learning. Defining new approaches that transcend boundaries between several domains to extract valuable information from temporal data is undeniably a topic of growing importance, and it has already been the subject of active research over the last decade. The aim of this workshop is to bring together researchers and experts in machine learning, data mining, pattern analysis and statistics to share their challenging issues and advance research on temporal data analysis. Analysis and learning from temporal data cover a wide scope of tasks including learning metrics, learning representations, unsupervised feature extraction, clustering and classification.

Website:
http://ama.liglab.fr/aaltd_ecml2015/

Workshop Organizers:
Ahlame Douzal-Chouakria
José Vilar Fernández
Pierre-François Marteau
Ann Maharaj
Andrés Alonso
Edoardo Otranto
Room: São João Half Day
14:30 - 16:00; 16:15 - 18:00

Description:

Adaptive reuse of learnt knowledge is of critical importance in the majority of knowledge-intensive application areas, particularly when the context in which the learnt model operates can be expected to vary from training to deployment. In machine learning this has been studied, for example, in relation to variations in class and cost skew in (binary) classification, leading to the development of tools such as ROC analysis to adjust decision thresholds to operating conditions concerning class and cost skew. More recently, considerable effort has been devoted to research on transfer learning, domain adaptation, and related approaches.

Given that the main business of predictive machine learning is to generalise from training to deployment, there is clearly scope for developing a general notion of operating context. Without such a notion, a model predicting sales in Prague for this week may perform poorly in Nancy for next Wednesday. The operating context has changed in terms of location as well as resolution. While a given predictive model may be sufficient and highly specialised for one particular operating context, it may not perform well in other contexts. If sufficient training data for the new context is available, it might be feasible to retrain a new model; however, this is generally not a good use of resources, and one would expect it to be more cost-effective to learn one general, versatile model that effectively generalises over multiple and possibly previously unseen contexts.

The aim of this workshop is to bring together people working in areas related to versatile models and model reuse over multiple contexts. Given the advances made in recent years on specific approaches such as transfer learning, an attempt to start developing an overarching theory is now feasible and timely, and can be expected to generate considerable interest from the machine learning community.

LMCE 2015 follows our previous workshop at ECML-PKDD 2014 (http://users.dsic.upv.es/~flip/LMCE2014/). Finally, we would like to point out that we also organise a challenge (http://reframe-d2k.org/index.php/Challenge) associated with this workshop.

Website:
http://users.dsic.upv.es/~flip/LMCE2015/

Workshop Organizers:
Nicolas Lachiche
Adolfo Martínez-Usó
Meelis Kull
Room: Arrábida Half Day
10:00 - 11:15; 11:30 - 13:00

Description:

Linked Data have attracted a lot of attention in recent years, as the underlying technologies and principles provide new ways, following the Semantic Web standards, to overcome typical data management and consumption issues such as reliability, heterogeneity, provenance or completeness.

Many different areas of research, from social media analysis to biomedical research, have adopted these principles both for the management and dissemination of their own data and for the combined reuse of external data sources. However, the way in which Linked Data can be applicable and beneficial to the Knowledge Discovery (KDD) process is still not completely understood. It is therefore worth exploring the question of the benefit of Linked Data principles and technologies for knowledge discovery, together with addressing the new challenges that will emerge from joining the two fields, beyond the traditional data management and consumption issues in KDD.

The LD4KD2015 workshop is meant to be an opportunity to improve your own knowledge of this new interdisciplinary topic, raise your own questions, and present new challenges and issues you have experienced in your research. We intend to create communication and collaboration channels, in order to be able to share experiences and reduce the gap between these overlapping, but still isolated, communities.

Website:
http://events.kmi.open.ac.uk/ld4kd2015/

Workshop Organizers:
Ilaria Tiddi
Mathieu d'Aquin
Claudia d'Amato
Room: D. Luís Full Day
10:00 - 11:15; 11:30 - 13:00; 14:30 - 16:00; 16:15 - 18:00

Description:

Sports Analytics has been a steadily growing and rapidly evolving area over the last decade, especially in the context of US professional sports leagues but also in connection with European football leagues. While there has been some interest in the Machine Learning and Data Mining community, the majority of techniques used in the field so far are statistical. We believe that this workshop offers a great opportunity to bring people from outside of the Machine Learning community into contact with typical ECML/PKDD contributors as well as to highlight what the community has done and can do in the field of Sports Analytics.

Website:
https://dtai.cs.kuleuven.be/events/MLSA15/

Workshop Organizers:
Jan Van Haaren
Albrecht Zimmermann
Jesse Davis
Room: Miragaia Full Day
10:00 - 11:15; 11:30 - 13:00; 14:30 - 16:00; 16:15 - 18:00

Description:

Multi-target prediction (MTP) is concerned with the simultaneous prediction of multiple target variables of diverse types, such as binary, nominal, ordinal, real-valued or even mixed. Often, these multiple target variables are related either explicitly (for example, they could represent a ranking, be nodes of a graph or have a spatial, temporal or spatiotemporal relationship) or implicitly (for example, via hidden mutual exclusion or parent-child relationships).

While some progress in developing efficient and effective MTP methods has been achieved, we are still far from being able to successfully apply them to big data, an area which is expected to bring new and smart growth opportunities for Europe. Big data are often described via a number of "Vs", the most important ones for research being Volume, Velocity, Variety and Veracity. Complementary to existing big data efforts that address the "Vs" of input variables, the BigTargets workshop will focus on the "Vs" of the target (output) variables.

Website:
http://www.kermit.ugent.be/big-multi-target-prediction/index.php

Workshop Organizers:
Willem Waegeman
Grigorios Tsoumakas
Krzysztof Dembczynski
Tapio Pahikkala
Antti Airola
Giorgio Valentini
Massih-Reza Amini
Room: Arrábida Half Day
14:30 - 16:00; 16:15 - 18:00

Description:

Climate change, the depletion of natural resources and rising energy costs have led to an increasing focus on renewable sources of energy. A lot of research has been devoted to the technologies used to extract energy from these sources; however, equally important is the storage and distribution of this energy in a way that is efficient and cost-effective. Achieving this would generally require integration with existing energy infrastructure. The challenge of renewable energy integration is inherently multidisciplinary and is particularly dependent on the use of techniques from the domains of data analytics, pattern recognition and machine learning. Examples of relevant research topics include the forecasting of electricity supply and demand, the detection of faults, demand response applications and many others. This workshop will provide a forum where interested researchers from the various related domains will be able to present and discuss their findings.

Website:
http://dare2015.dnagroup.org/

Workshop Organizers:
Wei Lee Woon
Zeyar Aung
Stuart Madnick
Room: Porto Half Day
10:00 - 11:15; 11:30 - 13:00

Description:

Life sciences, ranging from medicine, biology and genetics to biochemistry and pharmacology, have developed rapidly in recent years. Computerization of those domains has made it possible to gather and store enormous collections of data. Analysing such vast amounts of information without any support is impossible for a human being. As a result, machine learning and pattern recognition methods have recently attracted the attention of a broad spectrum of experts from the life sciences.

The aim of this Workshop is to stress the importance of interdisciplinary collaboration between life and computer sciences and to provide an international forum for both practitioners seeking new cutting-edge tools for solving their domain problems and theoreticians seeking interesting and real-life applications for their novel algorithms. We are interested in novel machine learning technologies designed to tackle complex medical, biological, chemical or environmental data that take into consideration the specific background knowledge and interactions between the considered problems. We look for novel applications of machine learning and pattern recognition tools to contemporary life sciences problems that will shed light on their strengths and weaknesses. We are interested in new methods for data visualization and methods for accessible presentation of results of machine learning analysis to life scientists. We welcome new findings in the intelligent processing of non-stationary medical, biological and chemical data and proposals for efficient fusion of information coming from multiple sources. Papers on efficient analysis and classification of big data (understood as both massive volumes and high-dimensionality problems) will be of special interest to this Workshop.

Website:
http://sisk.kssk.pwr.wroc.pl/mlls/

Workshop Organizers:
Bartosz Krawczyk
Michal Wozniak