Prerequisites: MSO201A/equivalent, ESO207A, and familiarity with programming in MATLAB/Octave, Python, or R; alternatively, instructor's consent (no formal course prerequisites).

Machine Learning is the discipline of designing algorithms that allow machines (e.g., computers) to learn patterns and concepts from data without being explicitly programmed. This course will be an introduction to the design (and some analysis) of machine learning algorithms, with a modern outlook focusing on recent advances and on real-world applications of machine learning.

- Preliminaries
- Multivariate calculus: gradient, Hessian, Jacobian, chain rule
- Linear algebra: determinants, eigenvalues/vectors, SVD
- Probability theory: conditional probability, marginal probability, Bayes rule
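
To make the probability preliminaries concrete, here is a small worked example of Bayes' rule. The numbers (a diagnostic-test scenario) are made up purely for illustration and are not course material:

```python
# Bayes' rule: P(D|+) = P(+|D) P(D) / P(+), with illustrative numbers.
p_disease = 0.01          # prior P(D)
p_pos_given_d = 0.95      # likelihood P(+|D)
p_pos_given_not_d = 0.05  # false-positive rate P(+|not D)

# Marginal P(+) via the law of total probability
p_pos = p_pos_given_d * p_disease + p_pos_given_not_d * (1 - p_disease)

# Posterior P(D|+)
p_d_given_pos = p_pos_given_d * p_disease / p_pos
print(round(p_d_given_pos, 4))
```

Note how a rare condition keeps the posterior low even with an accurate test; this interplay of prior, likelihood, and marginal recurs throughout the probabilistic-modeling parts of the course.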

- Supervised Learning
- Local/proximity-based methods: nearest-neighbors, decision trees
- Learning by function approximation
- Linear models: (multiclass) support vector machines, ridge regression
- Non-linear models: kernel methods, neural networks (feedforward)
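
As a taste of the linear-models topic, the following is a minimal sketch of ridge regression solved in closed form on synthetic data. The data, regularization value, and variable names are invented for this example and are not course code:

```python
import numpy as np

# Ridge regression: w = (X^T X + lam I)^{-1} X^T y, on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([1.0, -2.0, 0.5])        # ground-truth weights (made up)
y = X @ true_w + 0.01 * rng.normal(size=100)

lam = 0.1                                   # regularization strength
w = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)
print(np.round(w, 2))
```

The closed form exists only for squared loss; the SVM and logistic-regression objectives covered later require iterative optimization instead.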

- Learning by probabilistic modeling
- Discriminative methods: (multiclass) logistic regression, generalized linear models
- Generative methods: naive Bayes
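
For the generative-methods topic, here is a toy Bernoulli naive Bayes sketch. The tiny data set and the helper functions `fit`/`predict` are hypothetical, written only to illustrate the independence assumption:

```python
import numpy as np

# Toy binary data: 4 examples, 3 binary features, 2 classes (made up).
X = np.array([[1, 1, 0],
              [1, 0, 0],
              [0, 0, 1],
              [0, 1, 1]])
y = np.array([0, 0, 1, 1])

def fit(X, y, alpha=1.0):
    """Estimate class priors and Laplace-smoothed P(x_j = 1 | class)."""
    classes = np.unique(y)
    priors = np.array([(y == c).mean() for c in classes])
    theta = np.array([(X[y == c].sum(0) + alpha) / ((y == c).sum() + 2 * alpha)
                      for c in classes])
    return classes, priors, theta

def predict(x, classes, priors, theta):
    """Pick the class with the largest log-posterior, assuming
    conditionally independent Bernoulli features."""
    log_post = (np.log(priors)
                + (x * np.log(theta) + (1 - x) * np.log(1 - theta)).sum(1))
    return classes[np.argmax(log_post)]

classes, priors, theta = fit(X, y)
print(predict(np.array([1, 1, 0]), classes, priors, theta))
```

Because the model factorizes over features, training reduces to counting, which is why naive Bayes serves as the entry point to generative modeling in the course.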

- Unsupervised Learning
- Discriminative Models: k-means (clustering), PCA (dimensionality reduction)
- Generative Models
- Latent variable models: expectation-maximization for learning latent variable models
- Applications: Gaussian mixture models, probabilistic PCA
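
As an illustration of the clustering topic, here is a bare-bones sketch of k-means (Lloyd's algorithm) on two synthetic, well-separated blobs. The data and the deterministic initialization are contrived for the example:

```python
import numpy as np

# Two well-separated 2-D blobs of 50 points each (synthetic data).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, size=(50, 2)),
               rng.normal(5, 0.3, size=(50, 2))])

# Deterministic init: one seed point from each blob (contrived for the demo).
centers = X[[0, 50]].copy()
for _ in range(10):
    # Assignment step: each point goes to its nearest center
    labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
    # Update step: each center becomes the mean of its assigned points
    centers = np.array([X[labels == k].mean(axis=0) for k in range(2)])
print(np.round(centers, 1))
```

The same alternate-assign-and-update structure reappears as expectation-maximization for Gaussian mixture models, where hard assignments become soft posteriors.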

- Practical Aspects
- Concepts of over-fitting and generalization, bias-variance tradeoffs
- Model and feature selection using the above concepts
- Optimization for machine learning: (stochastic/mini-batch) gradient descent
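
To give a flavor of the optimization topic, here is a minimal mini-batch stochastic gradient descent sketch for least-squares regression. The synthetic data, step size, and batch size are chosen arbitrarily for the example:

```python
import numpy as np

# Synthetic regression problem (made up for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
true_w = np.array([3.0, -1.0])
y = X @ true_w + 0.05 * rng.normal(size=200)

w = np.zeros(2)
lr, batch = 0.1, 20
for epoch in range(30):
    idx = rng.permutation(200)          # reshuffle each epoch
    for start in range(0, 200, batch):
        b = idx[start:start + batch]
        # Gradient of mean squared error on the mini-batch only
        grad = 2 * X[b].T @ (X[b] @ w - y[b]) / batch
        w -= lr * grad
print(np.round(w, 2))
```

Each update touches only a small batch, which is exactly why this family of methods scales to data sets where a full-gradient step is too expensive.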

- Additional Topics (a subset to be covered depending on interest)
- Deep learning: CNN, RNN, LSTM, autoencoders
- Structured output prediction: multi-label classification, sequence tagging, ranking
- Ensemble methods: boosting, bagging, random forests
- Recommendation systems: ranking methods, collaborative filtering via matrix completion
- Reinforcement learning and applications
- Kernel extensions for PCA, clustering, spectral clustering, manifold learning
- Probability density estimation and anomaly detection
- Time-series analysis and modeling sequence data
- Sparse modeling and estimation
- Online learning algorithms: perceptron, Widrow-Hoff, explore-exploit
- Statistical learning theory: PAC learning, VC dimension, generalization bounds
- A selection from some other advanced topics such as semi-supervised learning, active learning, inference in graphical models, Bayesian learning and inference

There will not be a dedicated textbook for this course. Instead, lecture slides/notes, along with monographs, tutorials, and papers, will be provided for the topics covered. Some recommended (although not required) books are:

- Christopher Bishop, Pattern Recognition and Machine Learning, Springer, 2007
- Hal Daume III, A Course in Machine Learning, 2015 (freely available online)
- Trevor Hastie, Robert Tibshirani, Jerome Friedman, The Elements of Statistical Learning, Springer, 2009
- John Hopcroft, Ravindran Kannan, Foundations of Data Science, 2014 (freely available online)
- Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar. Foundations of Machine Learning, The MIT Press, 2012
- Kevin Murphy, Machine Learning: A Probabilistic Perspective, The MIT Press, 2012