Programs | Number of hours | ECTS | Lecturers |
---|---|---|---|
Research methodology and case study - Professional integration: research and development processes in companies - Goals and organisation of the scientific research community - Writing a scientific bibliography - Writing and presenting for research - Designing and interpreting experimental work - Ethics of research. The student will also carry out a project (miniature internship) in the field of data science (various topics will be proposed by the faculty teaching in the master), with emphasis on problem formalisation, bibliography, and light experimentation | 17.5 | 2 | Yannick Prié, Hoel Le Capitaine |
Data economics, law and ethics - economics of data - open data - models and techniques for recommender systems - models and techniques for crowdsourcing - anonymization techniques for privacy-preserving data publishing - detecting and preventing discrimination | 24 | 3 | Guillaume Raschia, Marc Gelgon |
Data dependencies and data integration Part I: - a short review of the relational data model: SQL, RA, RC, CQ, FO - functional dependencies and inclusion dependencies: Armstrong's axioms, the implication problem - database design: BCNF, 3NF, decomposition, chase test Part II: - FD discovery: TANE, FD_Mine, Dep-Miner, CORDS, FastFDs, with an extension to approximate FD discovery - conditional FDs and 33 other relaxations Part III: - data integration: egds, tgds, G/L-AV - schema mapping: GLAV - data exchange: universal instance, certain answers - query answering using views | 21.5 | 3 | Guillaume Raschia |
Visual analytics Part I: - introduction to data visualization - human factors, marks and visual channels, mappings, common errors, classical data visualisations, tools, etc. Part II: - data representation techniques - clustering and dimensionality reduction, tree and network representations, time series representations, 3D representations, etc. Part III: - designing and evaluating interactive visual analytics systems - analytics loop, types of interactions in visual analytics, design methods, evaluation methods, etc. Part IV: Projects | 24 | 3 | Yannick Prié, Fabien Picarougne |
Pattern mining and social network analysis: Pattern mining covers the data mining models that are unsupervised and deal with categorical variables. The course introduces the general principles and goals in the context of the whole data mining process, then describes the different kinds of patterns depending on the nature of the data (transactional, sequential, spatio-temporal, graph-based, text). Social network analysis is treated as a specialization of graph mining. The course focuses on algorithm principles, model evaluation, practice, and usage in two paradigms: relational and big data | 24 | 3 | Fabrice Guillet, Pascale Kuntz |
Text and sequential pattern mining: This teaching unit explores techniques for discovering and assessing patterns and structures in sequential data (temporal event sequences, texts, biological sequences, etc.). It focuses on applications to text mining and process mining. Outline: - sequential pattern mining (episode mining, sequential pattern mining, constraint-based mining, pattern assessment) - text mining (preprocessing methods, similarities, emerging patterns) - process mining (process model, process discovery, conformance checking) | 24 | 3 | Julien Blanchard, Solen Quiniou, Antoine Pigeau |
Clustering Analysis and Indexing - advanced clustering, co-clustering, semi-supervised classification - probabilistic mixture models, topic models - matrix factorization - typical applications and experiments | 24 | 3 | José Martinez, Marc Gelgon |
Classification, representation learning and dimensionality reduction - introduction (motivation, definitions, terminology, review of linear algebra, probability and optimization, regression) - subspace learning (principal component analysis (PCA), statistical and geometrical viewpoints, independent component analysis (ICA)) - manifold learning (MDS, ISOMAP, t-SNE and other unsupervised manifold methods) - deep learning (restricted Boltzmann machines, autoencoders, deep belief networks, convolutional neural networks, recurrent neural networks) - metric learning ((non-)linear, global/local, constraint setting, structured data). Project: students should form groups of 2-4 members. A list of candidate papers will be posted, and each group should pick one from the list. Each group is required to give an oral presentation on the content of the paper during the last two weeks, and to submit a report at the end. The report should include, at a minimum, a summary of the method/framework and the experimental results obtained by running the code published along with the paper. The division of work should be determined by the group members | 24 | 3 | Hoel Le Capitaine |
Probabilistic Graphical Models and statistical relational learning: Probabilistic graphical models (PGMs) are a framework for encoding probability distributions over complex domains. These representations sit at the intersection of statistics and computer science, relying on concepts from probability theory, graph algorithms, machine learning, and more. This course describes two basic PGM representations: Bayesian networks, which rely on a directed graph, and Markov networks, which use an undirected graph. The last part of the course is dedicated to various extensions of these models (dynamic Bayesian networks, probabilistic relational models, Markov logic networks) | 24 | 3 | Philippe Leray, Hoel Le Capitaine |
Semantic knowledge representation: In the context of Web 3.0, semantic knowledge representation is concerned with the logical modeling of knowledge with ontologies (vocabulary for concepts and properties), model instantiation on entities (data annotation), and goal-driven logical computation and inference (reasoning). The course introduces the principles, models and tools used to model, annotate and reason. These concepts are applied to linked data in order to process the data and knowledge stored on the web | 24 | 3 | Mounira Harzallah, Hala Skaff-Moli, Fabrice Guillet |
Conferences and invited courses (DS): This series of presentations and discussions will open students' minds to new topics, applications and speakers, and help them choose their path in the field of data science | 24 | 1 | |
You can only apply to courses within one field of study up to 30 ECTS.
Autumn courses start in September and end in January.