Probabilistic models of data are ubiquitous across science and engineering, in domains such as visual and language understanding, finance, healthcare, biology, and climate informatics. This course is an advanced introduction to probabilistic models of data (often through case studies from these domains) and a deep dive into the advanced inference and optimization methods used to learn such models. It is best suited to students who are doing, or are interested in doing, research in this area.

Prerequisites: Instructor’s consent. The course expects students to have a strong background in machine learning and probabilistic machine learning (ideally through formal coursework), probability and statistics, linear algebra, and optimization. Students must also be proficient in programming in MATLAB, Python, or R.

A tentative list of topics to be covered in this course includes:

- Fundamentals of probabilistic modeling
- Basics of probability distributions and their properties
- Basics of probabilistic inference: MLE/MAP/Bayesian inference
- Hierarchical modeling, multi-parameter models
- Bayesian vs frequentist statistics
- Probabilistic graphical models (directed and undirected models)

- Probabilistic approaches for linear modeling, Sparse Bayesian Learning
- Latent variable models
- Mixture models and latent factor models
- Latent variable models for dynamic/sequential data
- Latent variable models for networks and relational data
- Latent variable models with covariates

- Approximate Inference
- Inference in probabilistic graphical models
- MCMC methods
- Variational methods
- Scalable inference with stochastic optimization
- Other methods: Likelihood-free methods, spectral methods, etc.

- Nonparametric Bayesian methods
- Gaussian processes for function approximation
- Dirichlet process and beta processes
- Other stochastic processes (gamma/point processes, etc., and their applications)

- Bayesian Optimization
- Theory of Bayesian statistics
- Probabilistic programming
- Other topics based on students’ interests

The above topics will be treated through several case studies and running examples, including generalized linear models, finite/infinite mixture models, finite/infinite latent factor models, matrix factorization of real/discrete/count data, sparse linear models, linear Gaussian models, linear dynamical systems and time-series models, and topic models for text data.

We will primarily use lecture notes and slides from this class. In addition, we will refer to monographs and research papers (from top machine learning conferences and journals) for some of the topics. Some recommended, although not required, books are:

- Christopher Bishop. Pattern Recognition and Machine Learning. Springer, 2007.
- Kevin Murphy. Machine Learning: A Probabilistic Perspective. MIT Press, 2012.
- Carl Rasmussen and Chris Williams. Gaussian Processes for Machine Learning. MIT Press, 2006.
- David MacKay. Information Theory, Inference, and Learning Algorithms. Cambridge Univ. Press, 2003.
- David Barber. Bayesian Reasoning and Machine Learning. Cambridge Univ. Press, 2012.
- Andrew Gelman, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin. Bayesian Data Analysis. Chapman & Hall/CRC, 2013.
- Papers from conferences and journals in machine learning and Bayesian statistics (e.g., ICML, NIPS, AISTATS, Journal of Machine Learning Research, Machine Learning Journal, Bayesian Analysis, Biometrika, Annals of Statistics)