Seminar Distributed Learning Systems

Machine learning systems are conventionally designed for centralized processing: data is first collected from distributed sources and then algorithms are executed on a single server. Due to the limited scalability of processing large amounts of data and the long latency this incurs, there is a strong demand for a paradigm shift towards distributed or decentralized ML systems, which execute ML algorithms on multiple and, in some cases, even geographically dispersed nodes. The aim of this seminar course is to let students learn how to design and build distributed ML systems through paper reading, presentations, discussion, and project prototyping. We provide a broad overview of the design of state-of-the-art distributed ML systems, with a strong focus on the scalability, resource efficiency, data requirements, and robustness of the solutions. We will present an array of methodologies and techniques that can efficiently scale ML analysis to a large number of distributed nodes under adverse operating conditions, e.g., system failures and malicious attacks. The course materials are based on a mixture of classic and recently published papers. For each topic, the basic concepts and technology landscape are first introduced, and then two state-of-the-art papers are presented and discussed by students. Course topics include:

  • Distributed machine learning systems
  • Federated machine learning systems
  • Performance and scalability of state-of-the-art systems

Details

Code 12614 / 62614
Type Seminar
ECTS 5
Site Neuchâtel
Track(s) T1 – Distributed Software Systems
T6 – Data Science
Semester A2024

Teaching

Learning Outcomes
  • Students are able to argue and reason about distributed ML from a systems perspective.
  • Students understand the behavior and tradeoffs of distributed ML in terms of performance and scalability.
  • Students can estimate the importance of data inputs for distributed ML systems via different techniques, e.g., coreset and decomposition methods.
  • Students understand data poisoning attacks and can design defense strategies for distributed ML systems.
  • Students can analyze state-of-the-art federated machine learning systems and design failure-resilient communication protocols.
  • Students are able to design and implement methods and techniques for making distributed ML systems more efficient.
Lecturer(s) Lydia Chen
Language English
Course Page

The course page in ILIAS can be found at https://ilias.unibe.ch/goto_ilias3_unibe_crs_3136588.html.

Schedules and Rooms

Period Weekly
Schedule Tuesday, 08:45 - 10:00
Location UniNE, Unimail
Room A017

Evaluation

Evaluation type Continuous evaluation

Additional information

Comment

First Lecture
The first lecture will take place on Tuesday, 17.09.2024 at 08:45 in UniNE, Unimail, room A017.