04-801-A1   Deep Learning Systems: Hardware, Compilers, and Algorithms

Location: Africa

Units: 6

Semester Offered: Fall

Course description

This course examines the algorithms, compilers, and processor components used to train and deploy deep learning (DL) models efficiently for commercial applications. The course covers the advancement and adoption of DL models in industry, explains the training and deployment process, describes the hardware architectural features that today’s and future models require, and details advances in DL compilers that execute algorithms efficiently across various hardware targets.

Learning objectives

The course's primary goal is for students to gain a solid understanding of:

  • The design, training, and applications of DL algorithms in industry
  • The compiler techniques that map DL code to hardware targets
  • The critical hardware features that accelerate DL systems

Content details

  • DL primitive functions, models, and commercial applications
  • Designing, debugging, and training a model; distributed training
  • Numerical formats and model compression
  • Hardware (HW) architectural features: transistors, processors, physical networks
  • Commercial DL HW platforms
  • DL compiler techniques and existing commercial DL compilers
  • Using ML to improve DL systems
  • Challenges and open areas of research


At the end of this course, students will be ready to engage with industry engineers working across the DL system stack.


Andres Rodriguez