Data science (master 2)

Programs | Number of hours | ECTS | Lecturers
Research methodology and case study
- Professional integration: research and development processes in companies; goals and organisation of the scientific research community
- Writing a scientific bibliography
- Writing and presenting for research
- Designing and interpreting experimental work
- Ethics of research

The student will also carry out a project (a miniature internship) in the field of data science, with emphasis on problem formalisation, bibliography, and light experimentation. Various topics will be proposed by the faculty teaching in the master's programme.

Number of hours: 17.5; ECTS: 2; Lecturers: Yannick Prié, Hoel Le Capitaine
Data economics, law and ethics
- economics of data
- open data
- models and techniques for recommender systems
- models and techniques for crowdsourcing
- anonymization techniques for privacy-preserving data publishing
- detecting and preventing discrimination

Number of hours: 24; ECTS: 3; Lecturers: Guillaume Raschia, Marc Gelgon
Data dependencies and data integration
Part I:
- a short review of the relational data model: SQL, RA, RC, CQ, FO
- functional dependencies and inclusion dependencies: Armstrong's axioms, the implication problem
- database design: BCNF, 3NF, decomposition, chase test
Part II:
- FD discovery: TANE, FD_Mine, Dep-Miner, CORDS, FastFDs
- extension to approximate FD discovery
- conditional FDs and 33 other relaxations!
Part III:
- data integration: egds, tgds, G/L-AV
- schema mapping: GLAV
- data exchange: universal instance, certain answers
- query answering using views

Number of hours: 21.5; ECTS: 3; Lecturers: Guillaume Raschia
Visual analytics
Part I:
- introduction to data visualization
- human factors, marks and visual channels, mappings, common errors, classical data visualisations, tools, etc.
Part II:
- data representation techniques
- clustering and dimensionality reduction, trees and network representations, time series representations, 3D representations, etc.
Part III:
- designing and evaluating interactive visual analytics systems
- analytics loop, types of interactions in visual analytics, design methods, evaluation methods, etc.
Part IV: Projects

Number of hours: 24; ECTS: 3; Lecturers: Yannick Prié, Fabien Picarougne
Pattern mining and social network analysis
Pattern mining covers the unsupervised data mining models that deal with categorical variables. The course introduces the general principles and goals of pattern mining in the context of the whole data mining process, then describes the different kinds of patterns according to the nature of the data (transactional, sequential, spatio-temporal, graph-based, text). Social network analysis is treated as a specialization of graph mining. The course focuses on algorithm principles, model evaluation, practice, and usage in two paradigms: relational and big data.

Number of hours: 24; ECTS: 3; Lecturers: Fabrice Guillet, Pascale Kuntz
Text and sequential pattern mining
This teaching unit explores techniques for discovering and assessing patterns and structures from sequential data (temporal event sequences, texts, biological sequences, etc.). It focuses on applications for text mining and process mining.

Outline:
Sequential pattern mining
- episode mining
- sequential pattern mining
- constraint-based mining
- pattern assessment
Text mining
- preprocessing methods
- similarities
- emerging patterns
Process mining
- process model
- process discovery
- conformance checking

Number of hours: 24; ECTS: 3; Lecturers: Julien Blanchard, Solen Quiniou, Antoine Pigeau
Clustering Analysis and Indexing
- advanced clustering, co-clustering, semi-supervised classification
- probabilistic mixture models, topic models
- matrix factorization
- typical applications and experiments

Number of hours: 24; ECTS: 3; Lecturers: José Martinez, Marc Gelgon
Classification, representation learning and dimensionality reduction
- introduction (motivation, definitions, terminology, review of linear algebra, probability and optimization, regression)
- subspace learning (principal component analysis (PCA) from the statistical and geometrical viewpoints, independent component analysis (ICA))
- manifold learning (MDS, Isomap, t-SNE and other unsupervised manifold methods)
- deep learning (restricted Boltzmann machines, autoencoders, deep belief networks, convolutional neural networks, recurrent neural networks)
- metric learning ((non)-linear, global/local, constraints setting, structured data)

Project: students form groups of 2-4 members. A list of candidate papers will be posted, and each group picks one paper from the list. Each group gives an oral presentation on the content of its paper during the last two weeks and submits a report at the end. The report should include, at a minimum, a summary of the method or framework and the experimental results obtained by running the code published with the paper. The division of work is decided by the group members.


Number of hours: 24; ECTS: 3; Lecturers: Hoel Le Capitaine
Probabilistic Graphical Models and statistical relational learning
Probabilistic graphical models (PGMs) are a framework for encoding probability distributions over complex domains. These representations sit at the intersection of statistics and computer science, relying on concepts from probability theory, graph algorithms, machine learning, and more.

This course describes two basic PGM representations: Bayesian networks, which rely on a directed graph, and Markov networks, which use an undirected graph. The last part of the course is dedicated to various extensions of these models (dynamic Bayesian networks, probabilistic relational models, Markov logic networks).

Number of hours: 24; ECTS: 3; Lecturers: Philippe Leray, Hoel Le Capitaine
Semantic knowledge representation
In the context of Web 3.0, semantic knowledge representation is concerned with the logical modeling of knowledge using ontologies (vocabularies of concepts and properties), the instantiation of models on entities (data annotation), and logical computation and inference (reasoning) towards a given goal.

The course introduces the principles, models and tools used to model, annotate and reason. These concepts are applied to linked data in order to process the data and knowledge stored on the web.


Number of hours: 24; ECTS: 3; Lecturers: Mounira Harzallah, Hala Skaff-Moli, Fabrice Guillet
Conferences and invited courses (DS)
This series of presentations and discussions opens students' minds to new topics, applications and speakers, and helps them choose their path in the field of data science.

Number of hours: 24; ECTS: 1

You can only apply to courses within one field of study up to 30 ECTS.
Autumn courses start in September and end in January.

 

Contact
master-datascience@univ-nantes.fr