Big Data Infrastructures

This course focuses on conceptual and architectural issues related to the design and deployment of modern data management infrastructures in a Big Data context. It starts with a review of distributed transaction processing techniques, classical parallel database systems and ACID-style semantics in shared-nothing architectures. The course then delves into modern wide-area data processing, with an emphasis on recent systems developed to solve large-scale problems using clusters of commodity machines. This second part covers distributed storage systems (such as Google’s Bigtable), wide-area hash tables (like Cassandra), data-intensive computing platforms (Hadoop) and NoSQL systems. Hands-on programming exercises using those platforms will be an important part of this course.

Details

Code 63021
Type Course
ECTS 5
Site Fribourg
Track(s) T6 – Data Science
Semester A2024

Teaching

Learning Outcomes

Students will learn about classical distributed transaction processing and parallel database systems. They will be exposed to modern data management infrastructures deployed by current Web giants like Google or Yahoo! to power a wide range of Web services. Finally, they will understand the fundamental tradeoffs between consistency, availability and fault tolerance for wide-area data processing on the Internet.

Lecturer(s) Philippe Cudré-Mauroux
Alberto Lerner
Language English
Course Page

The course page in ILIAS can be found at https://ilias.unibe.ch/goto_ilias3_unibe_crs_3102194.html.

Schedules and Rooms

Period Weekly
Schedule Wednesday, 14:15 - 17:00
Location UniFR, PER21

Additional information

First Lecture
The date of the first lecture will be announced later.