Big Data Infrastructures
This course focuses on conceptual and architectural issues related to the design and deployment of modern data management infrastructures in a Big Data context. It starts with a review of distributed transaction processing techniques, classical parallel database systems and ACID-style semantics in shared-nothing architectures. The course then delves into modern wide-area data processing, with an emphasis on recent systems developed to solve large-scale problems using clusters of commodity machines. In this second part, the course covers distributed storage systems (such as Google’s Bigtable), wide-area hash tables (like Cassandra), data-intensive computing platforms (Hadoop) and NoSQL systems. Hands-on programming exercises using those platforms will be an important part of this course.
T6 – Data Science
Students will learn about classical distributed transaction processing and parallel database systems. They will be exposed to the modern data management infrastructures that Web giants like Google or Yahoo! deploy to power a wide range of Web services. Finally, they will understand the fundamental tradeoffs between consistency, availability and fault tolerance for wide-area data processing on the Internet.
Philippe Cudré-Mauroux
The course page in ILIAS can be found at https://ilias.unibe.ch/goto_ilias3_unibe_crs_2793343.html.
Schedules and Rooms
| Schedule | Wednesday, 14:15 - 17:00 |