04-650   Mathematical Foundations of Machine Learning

Location: Africa

Units: 12

Semester Offered: Fall

Course description

This course offers a comprehensive mathematical foundation for machine learning, covering essential topics from linear algebra, calculus, probability theory, and optimization to advanced concepts including information theory, statistical inference, regularization, and kernel methods. It aims to equip students with the mathematical tools needed to understand, analyze, and implement machine-learning algorithms and models at a deeper level.

Learning objectives

In this course, students will:

  • Learn the foundational concepts and techniques of linear algebra, including vector and matrix operations, eigenvectors, and eigenvalues, with a focus on their application in machine learning
  • Learn calculus concepts, such as derivatives and optimization techniques, and apply them to solve machine-learning problems
  • Gain a comprehensive understanding of probability theory and statistics, including multivariate random variables and maximum likelihood estimation, and their role in machine learning
  • Learn various optimization methods, including gradient descent and convex optimization, and their application in machine learning
  • Learn information theory and its relevance to machine learning

Outcomes

Upon the completion of this course, students will be able to:

  • Use linear algebra concepts such as matrices, vectors, and eigenvalues to represent and manipulate data
  • Use calculus concepts such as differentiation and gradients to optimize machine learning models
  • Use probability and statistical concepts to model and infer from data
  • Use optimization techniques such as gradient descent and convex optimization to optimize machine learning models
  • Explain the role of information entropy in assessing model accuracy

Content details

This course includes:

  • Linear algebra: vectors and matrices, vector spaces, systems of linear equations, eigenvalue decomposition, singular value decomposition, least squares
  • Calculus: the chain rule and Jacobians, gradients
  • Probability: probability axioms, Bayes' rule, random variables, probability distributions
  • Statistics: descriptive statistics, inferential statistics, sampling and Markov chain Monte Carlo (MCMC) methods, statistical tests
  • Optimization: convex functions and convex optimization problems, duality, and Lagrange multipliers (a brief gradient-descent example follows this list)
  • Information theory: entropy and mutual information, KL divergence, and cross-entropy
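
As a quick illustration of how several of these topics fit together, below is a minimal sketch of gradient descent applied to a least-squares problem. It is not part of the official course materials; the data, step size, and iteration count are arbitrary choices made for illustration, and NumPy is assumed to be available.

    import numpy as np

    # Synthetic least-squares problem: find w minimizing f(w) = ||Xw - y||^2 / (2n).
    rng = np.random.default_rng(0)
    n, d = 100, 3
    X = rng.normal(size=(n, d))          # design matrix (linear algebra: vectors and matrices)
    w_true = np.array([2.0, -1.0, 0.5])  # ground-truth weights, chosen for illustration
    y = X @ w_true + 0.1 * rng.normal(size=n)

    def loss(w):
        r = X @ w - y
        return (r @ r) / (2 * n)

    def grad(w):
        # Gradient of the objective (calculus): grad f(w) = X^T (Xw - y) / n.
        return X.T @ (X @ w - y) / n

    # Gradient descent (optimization): w_{k+1} = w_k - step * grad f(w_k).
    w = np.zeros(d)
    step = 0.1
    for k in range(500):
        w = w - step * grad(w)

    # Compare with the closed-form least-squares solution (normal equations).
    w_closed = np.linalg.solve(X.T @ X, X.T @ y)
    print("gradient descent:", np.round(w, 3))
    print("closed form:     ", np.round(w_closed, 3))
    print("final loss:      ", round(loss(w), 6))

Because the least-squares objective is a convex quadratic, gradient descent with a small enough step size converges to the same answer as the normal equations, which is one way the calculus, linear algebra, and optimization topics listed above connect in practice.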

Prerequisites

None

Faculty

Moise Busogi