Carnegie Mellon University Africa

04-801-A1 Deep Learning Systems: Hardware, Compilers, and Algorithms

Location: Africa

Units: 6

Semester Offered: Fall

Course description

This course examines the algorithms, compilers, and processor components needed to train and deploy deep learning (DL) models efficiently for commercial applications. It surveys advances in DL models and their adoption in industry, explains the training and deployment process, describes the hardware architectural features essential for today’s and future models, and details advances in DL compilers that execute algorithms efficiently across various hardware targets.

Learning objectives

The course's primary goal is for students to gain a solid understanding of:

The design, training, and applications of DL algorithms in industry
The compiler techniques used to map DL code to hardware targets
The critical hardware features that accelerate DL systems

Content details

DL primitive functions, models, and commercial applications
Designing, debugging, and training a model; distributed training
Numerical formats and model compression
Hardware (HW) architectural features: transistors, processors, physical networks
Commercial DL HW platforms
DL compiler techniques and existing commercial DL compilers
Using ML to improve DL systems
Challenges and open areas of research

Outcomes

At the end of this course, students will be ready to engage with industry engineers working across the DL system stack.