T6 – Data Science

Data is the new oil, as it powers an increasing number of key aspects of our society. Be it for online recommender systems, environmental monitoring, simulation and modeling, or smart-city infrastructures, data has become a key ingredient in deploying and optimizing large-scale services. While data production is booming – driven by online applications, mobile devices and the Internet of Things – legacy infrastructures such as relational databases and numerical computing frameworks are reaching their limits. They are rapidly being replaced by a new ecosystem of software and methods for storing, manipulating and analyzing steadily growing amounts of data, often referred to as Big Data.

This track covers both the theoretical foundations and the practical aspects of dealing with large quantities of potentially heterogeneous and noisy data. Core courses belonging to this track cover systems and techniques to store, process, and make sense of Big Data. Several courses focus on conceptual and architectural issues related to the design and deployment of modern data management infrastructures, with an emphasis on recent systems developed to solve large-scale problems using clusters of commodity machines. Further courses address data analysis and knowledge discovery from a number of different perspectives, including pattern recognition, online recommendation, and machine learning using both unsupervised and supervised models. A wide set of applications, ranging from targeted advertising to social network analysis and financial stream modeling, is covered throughout the courses.

Involved research groups