Distributed Deep Learning Systems
Machine learning (ML) systems are conventionally designed for centralized processing: they first collect data from distributed sources and then execute algorithms on a single server. Because this approach scales poorly to large amounts of data and incurs long latency, there is strong demand for a paradigm shift to distributed or decentralized ML systems, which execute ML algorithms on multiple and, in some cases, geographically dispersed nodes.
The aim of this course is for students to learn how to design and build distributed ML systems through paper reading, presentation, and discussion. We provide a broad overview of the design of state-of-the-art distributed ML systems, with a strong focus on the scalability, resource efficiency, data requirements, and robustness of the solutions. We will present an array of methodologies and techniques that can efficiently scale ML analysis to a large number of distributed nodes under a range of operating conditions, including system failures and malicious attacks. The specific course topics are listed below.
The course materials will be based on a mixture of classic and recently published papers.
Details
Code | 62122 |
Type | Course |
ECTS | 5 |
Site | Neuchâtel |
Track(s) | T6 – Data Science |
Semester | S2025 |
Teaching
Learning Outcomes |
|
Lecturer(s) | Lydia Chen |
Language | English |
Course Page | The ILIAS course page is available at https://ilias.unibe.ch/goto_ilias3_unibe_crs_3102287.html. |
Schedules and Rooms
Period | Weekly |
Schedule | Monday, 08:15 - 12:00 |
Location | UniNE, Unimail |
Room | E213 |
Additional information
Comment | First Lecture |