After the high-dimensional data symposium, Katrijn van Deun of Tilburg University will give an interactive talk on ‘Finding joint and specific sources of variation in linked high-dimensional data’ for the MSDSlab members.
Attention: This presentation is on a different day and at a different time than usual.
In this meeting, Emmeke will introduce us to Hidden Markov Models.
The HMM is a very flexible model and as such is applicable to a wide variety of longitudinally collected data. For example, one can extract student behaviour states from MOOC data and investigate the composition of the different learning states, and the transitions between them. Or one can extract sleep states based on EEG measurements, and subsequently compare the duration of, and transitions between, different sleep states for patients who do and do not suffer from insomnia.
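As a rough illustration of what "extracting states" means, the sketch below decodes the most likely hidden-state sequence with the Viterbi algorithm, a standard HMM decoding technique. It is not taken from the talk: the two states, the "fast"/"slow" observation labels, and all probabilities are made-up assumptions, loosely inspired by the sleep example above.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations."""
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
             for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + math.log(emit_p[s][observations[t]])
            back[t][s] = prev
    # Trace the best path backwards from the most likely final state
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path

# Hypothetical toy parameters (assumptions for illustration only)
states = ["awake", "asleep"]
start_p = {"awake": 0.6, "asleep": 0.4}
trans_p = {"awake": {"awake": 0.7, "asleep": 0.3},
           "asleep": {"awake": 0.2, "asleep": 0.8}}
emit_p = {"awake": {"fast": 0.8, "slow": 0.2},
          "asleep": {"fast": 0.1, "slow": 0.9}}

observed = ["fast", "fast", "slow", "slow", "slow"]
path = viterbi(observed, states, start_p, trans_p, emit_p)
print(path)  # ['awake', 'awake', 'asleep', 'asleep', 'asleep']
```

In practice the start, transition, and emission probabilities are estimated from data rather than fixed by hand; the decoded path is what allows durations of, and transitions between, states to be compared across groups.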
You can find the slides of the presentation Georg gave here (.pdf, 3MB).
Thursday 08/03/2018 at 15:00 in room B1.09
The speaker for this meeting will be Georg Krempl, who will talk about an approach for learning from data with limited supervision. Here is a shortened abstract:
Machine learning has become widely used throughout commerce, science, and technology. However, the ever-increasing volumes of data are contrasted by various constraints, such as limited supervision, processing, or storage capacities. This requires techniques to optimise the allocation of these capacities.
Active machine learning aims to provide techniques for selecting the most insightful information (like label annotations of data instances) to be queried from oracles (like human supervisors).
In this talk, I will present our recently developed probabilistic active learning approach PAL. This decision-theoretic approach combines the fast asymptotic runtime of popular heuristics like uncertainty sampling with a direct optimisation of the expected gain in classification performance.
I will conclude this presentation by demonstrating the use of PAL in different active learning scenarios, ranging from label selection in large data pools and evolving data streams to broader settings such as active class selection.
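For readers new to active learning, the uncertainty sampling heuristic mentioned in the abstract can be sketched in a few lines. This is not PAL itself, only the simpler baseline it is compared with, and the posterior values below are invented for illustration.

```python
def uncertainty_sampling(posteriors):
    """Return the index of the pool instance whose predicted positive-class
    probability is closest to 0.5, i.e. the most uncertain binary prediction."""
    return min(range(len(posteriors)), key=lambda i: abs(posteriors[i] - 0.5))

# Hypothetical classifier outputs for four unlabelled instances
pool_posteriors = [0.95, 0.10, 0.55, 0.80]
query_index = uncertainty_sampling(pool_posteriors)
print(query_index)  # 2: the instance with posterior 0.55 is sent to the oracle
```

The heuristic is cheap because it only ranks the pool by the current posterior; it does not estimate how much a new label would actually improve classification performance, which is the gap the abstract says PAL addresses.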
The meeting was a success! Click here (.pdf, 2MB) to download the updated slides with a worked-out example of how the PC algorithm works, and contact Oisín Ryan (mailto:email@example.com) if you’d like to know more about the Causality Reading Group.
Thursday 22/02/2018 at 15:00 in room B1.09
For this meeting, MSDSlab is teaming up with its sister Causality Reading Group, organized by Oisín Ryan. Oisín will explain the background and implementation of the PC algorithm and show how it can discover causal structure in a network of variables through smart use of conditional independence rules.
Activity: We will also be trying out the PC algorithm and its interpretation on a real dataset. If you’d like to join this activity, bring your laptop!
Preparation: install R, RStudio, and the pcalg package
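The conditional independence reasoning that the PC algorithm builds on can be illustrated with a small simulation. The sketch below is written in Python rather than the R used in the activity, and it is not the PC algorithm itself: it only shows, on simulated data from an assumed chain X → Z → Y, that X and Y are correlated marginally but nearly uncorrelated once Z is conditioned on. Partial correlation is used here as one common independence measure for Gaussian data; the variable names and the data-generating model are assumptions.

```python
import math
import random

def pearson(a, b):
    """Sample Pearson correlation between two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def partial_corr(x, y, z):
    """First-order partial correlation of x and y given z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Simulate an assumed causal chain X -> Z -> Y with Gaussian noise
random.seed(1)
n = 5000
x = [random.gauss(0, 1) for _ in range(n)]
z = [xi + random.gauss(0, 1) for xi in x]
y = [zi + random.gauss(0, 1) for zi in z]

marginal = pearson(x, y)             # clearly non-zero: X and Y are dependent
conditional = partial_corr(x, y, z)  # close to zero: X is independent of Y given Z
print(round(marginal, 2), round(conditional, 2))
```

A result like this is what would lead a constraint-based method to drop the direct edge between X and Y while keeping the edges through Z.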
In this meeting, Ayoub Bagheri (M&S, UMCU) gave an introduction to text mining in general and some of his work in particular. Here’s a quote:
Text mining is the process of analyzing natural language text looking for useful and unknown patterns. In other words, text mining is the art of turning free text into numerical variables and then mining them with statistical techniques and learning algorithms.
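To make "turning free text into numerical variables" concrete, here is a minimal bag-of-words sketch. It is one simple way of doing this, not necessarily the approach used in the talk, and the two example sentences are invented.

```python
import re
from collections import Counter

def bag_of_words(documents):
    """Turn free-text documents into a vocabulary and a document-term count matrix."""
    # Lower-case each document and keep only alphabetic tokens
    tokenised = [re.findall(r"[a-z]+", doc.lower()) for doc in documents]
    vocabulary = sorted({token for tokens in tokenised for token in tokens})
    matrix = []
    for tokens in tokenised:
        counts = Counter(tokens)
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix

# Hypothetical example documents
docs = ["Patient reports chest pain", "No chest pain, patient stable"]
vocabulary, matrix = bag_of_words(docs)
print(vocabulary)  # ['chest', 'no', 'pain', 'patient', 'reports', 'stable']
print(matrix)      # [[1, 0, 1, 1, 1, 0], [1, 1, 1, 1, 0, 1]]
```

Each row of the matrix is a numeric representation of one document, which can then be passed to ordinary statistical techniques and learning algorithms.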
Data science is not only attending talks and discussing methods, but also analysing datasets and practical problem-solving. A true data scientist knows which methods to use as well as how to use them in an efficient way. This is why we are introducing MSDSlab Sessions.
In the MSDSlab Sessions, a session leader works on their own data-science-related project, such as web scraping, analysis building, or text mining, and others can simply join in. The session is entirely up to the participants: they may all want to work on the same thing together, or they may divide a project into different parts. The sessions last one hour, with some extra time reserved for wrapping up the results.
Thursdays at 11:00
Sjoerd Groenmangebouw B1.01