Seminar Distributed Learning Systems

Machine learning systems are conventionally designed for centralized processing: data is first collected from distributed sources and then algorithms are executed on a single server. Due to the limited scalability of processing large amounts of data and the long latency this incurs, there is a strong demand for a paradigm shift towards distributed or decentralized ML systems, which execute ML algorithms on multiple and, in some cases, even geographically dispersed nodes. The aim of this seminar course is to let students learn how to design and build distributed ML systems through paper reading, presentations, discussion, and project prototyping. We provide a broad overview of the design of state-of-the-art distributed ML systems, with a strong focus on the scalability, resource efficiency, data requirements, and robustness of the solutions. We will present an array of methodologies and techniques that can efficiently scale ML analysis to a large number of distributed nodes under adverse operating conditions, e.g., system failures and malicious attacks. The course materials are based on a mixture of classic and recently published papers. For each topic, the basic concepts and technology landscape are first introduced, and then two state-of-the-art papers are presented and discussed by students. Course topics include:

  • Distributed machine learning systems
  • Federated machine learning systems
  • Performance and scalability of state-of-the-art systems

Details

Code 12614 / 62614
Type Seminar
ECTS 5
Site Neuchâtel
Track(s) T1 – Distributed Software Systems
T6 – Data Science
Semester A2024

Teaching

Learning Outcomes
  • Students are able to argue and reason about distributed ML from a systems perspective.
  • Students understand the behavior and tradeoffs of distributed ML in terms of performance and scalability.
  • Students can estimate the importance of data inputs for distributed ML systems via different techniques, e.g., coreset and decomposition methods.
  • Students understand data poisoning attacks and can design defense strategies for distributed ML systems.
  • Students can analyze state-of-the-art federated machine learning systems and design failure-resilient communication protocols.
  • Students are able to design and implement methods and techniques for making distributed ML systems more efficient.
Lecturer(s) Lydia Chen
Language English
Course Page

The course page in ILIAS can be found at https://ilias.unibe.ch/goto_ilias3_unibe_crs_3136588.html.

Schedules and Rooms

Period Weekly
Schedule Tuesday, 08:45 - 10:00
Location UniNE, Unimail
Room A017

Evaluation

Evaluation type Continuous evaluation

Additional information

Comment

First Lecture
The first lecture will take place on Tuesday, 17.09.2024 at 08:45 in UniNE, Unimail, room A017.