STA414 / STA2104 Winter 2022
Statistical Methods for Machine Learning II
The language of probability allows us to coherently and automatically account for uncertainty. This course will teach you how to build, fit, and do inference in probabilistic models. These models let us generate novel images and text, find meaningful latent representations of data, take advantage of large unlabeled datasets, and even let us do analogical reasoning automatically. This course will teach the basic building blocks of these models and the computational tools needed to use them.
What you will learn:
- Standard statistical learning algorithms, when to use them, and their limitations.
- The main elements of probabilistic models (distributions, expectations, latent variables, neural networks) and how to combine them.
- Standard computational tools (Monte Carlo, stochastic optimization, regularization, automatic differentiation).
Missed Assessment Form
Online for now. The Zoom link will be sent via Quercus.
No required textbooks. Suggested references:
- (PRML) Christopher M. Bishop (2006) Pattern Recognition and Machine Learning
- (DL) Ian Goodfellow, Yoshua Bengio and Aaron Courville (2016), Deep Learning
- (MLPP) Kevin P. Murphy (2012), Machine Learning: A Probabilistic Perspective
- (ESL) Trevor Hastie, Robert Tibshirani, Jerome Friedman (2009) The Elements of Statistical Learning
- (ISL) Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani (2017) An Introduction to Statistical Learning
- (ITIL) David MacKay (2003) Information Theory, Inference, and Learning Algorithms
Week 1 - Jan 10th & 11th - Course Overview and Graphical Model Notation
- Class Intro
- Topics covered
- Quick review of probabilistic models
- Graphical model notation: going from graphs to factorized joint probabilities and back
Week 2 - Jan 17th & 18th - Decision Theory and Parametrizing Probabilistic Models
- Basic decision theory
- Understand basics of Directed Graphical Models
- Become comfortable working with conditional probabilities
- Decision Theory
- Conditional Probability Tables
- Number of parameters in different tables
- Plate notation.
- Examples of meaningful graphical models
- D-Separation, conditional independence in directed models
- Bayes Ball algorithm for determining conditional independence
- Worked examples of decision theory
- Worked examples of Directed Graphical Models
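The factorization implied by a directed graphical model can be sketched directly in code. Below is a minimal, hypothetical example (the chain A → B → C and its conditional probability tables are made up for illustration): the joint is just the product of each node's CPT entry given its parents.

```python
# Hypothetical 3-variable chain A -> B -> C with binary variables.
# The DGM factorization is P(A, B, C) = P(A) P(B|A) P(C|B).
# Free parameters: 1 for P(A), 2 for P(B|A), 2 for P(C|B) = 5 total,
# versus 2^3 - 1 = 7 for an unrestricted joint table.

p_a = {0: 0.6, 1: 0.4}                       # P(A)
p_b_given_a = {0: {0: 0.7, 1: 0.3},          # P(B|A)
               1: {0: 0.2, 1: 0.8}}
p_c_given_b = {0: {0: 0.9, 1: 0.1},          # P(C|B)
               1: {0: 0.5, 1: 0.5}}

def joint(a, b, c):
    """P(A=a, B=b, C=c), read off the factorization."""
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]

# Sanity check: the joint sums to 1 over all 8 configurations.
total = sum(joint(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1))
print(total)
```

The savings in parameter count grow quickly with more variables, which is the main practical payoff of the factorized representation.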
Week 3 - Jan 24th & 25th - Latent variables and Exact Inference
- How to write the joint factorization implied by UGMs
- How to reason about conditional independencies in UGMs
- How to do exact inference in joint distributions over discrete variables
- The time complexity of exact inference
- Worked example of MLE for binary models
- More examples of UGMs
- Worked examples of variable elimination in chains and trees
- Simple code for neural networks
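Variable elimination on a chain can be shown in a few lines. This is a hypothetical worked example (the CPT values are invented): eliminating variables in chain order keeps every intermediate factor over a single variable, rather than building the full joint table.

```python
# Variable elimination on the chain A -> B -> C (binary variables),
# computing the marginal P(C) = sum_b P(C|b) sum_a P(a) P(b|a).
# Hypothetical CPTs; rows are indexed by the conditioning variable.

p_a = [0.6, 0.4]
p_b_given_a = [[0.7, 0.3], [0.2, 0.8]]
p_c_given_b = [[0.9, 0.1], [0.5, 0.5]]

# Eliminate A: intermediate factor m(b) = sum_a P(a) P(b|a)
m = [sum(p_a[a] * p_b_given_a[a][b] for a in range(2)) for b in range(2)]

# Eliminate B: P(c) = sum_b m(b) P(c|b)
p_c = [sum(m[b] * p_c_given_b[b][c] for b in range(2)) for c in range(2)]

print(p_c)  # marginal distribution over C
```

On a chain of length T with k states per variable, this costs O(T k^2) instead of the O(k^T) cost of summing the full joint, which is the time-complexity point above.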
Week 4 - Jan 31st & Feb 1st - Message Passing + Sampling
- Assignment 1 due Feb 6th.
- Understand Trueskill model
- Understand the Message Passing algorithm on trees
- Understand Loopy Belief Propagation
- Understand Sampling methods and why we need them (Simple MC, Importance, Rejection, Ancestral)
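The sampling methods above can be illustrated with a small importance-sampling sketch (the target, proposal, and test function are chosen for illustration): when we cannot sample from the target p but can evaluate its density, we sample from a proposal q and reweight by p/q.

```python
import math, random

# Importance sampling sketch: estimate E_p[x^2] for p = N(0, 1)
# using samples from a wider proposal q = N(0, 2).
# The weight w(x) = p(x) / q(x) corrects for the mismatch.

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

random.seed(0)
n = 200_000
estimate = 0.0
for _ in range(n):
    x = random.gauss(0.0, 2.0)                              # sample from q
    w = normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.0, 2.0)   # importance weight
    estimate += w * x ** 2 / n

print(estimate)  # ≈ 1.0, the true second moment of N(0, 1)
```

Simple Monte Carlo is the special case q = p (all weights equal 1); rejection sampling instead uses the ratio to accept or reject draws.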
Week 5 - Feb 7th & 8th - MCMC
- Assignment 2 released Feb 7th.
- Metropolis Hastings
- Hamiltonian Monte Carlo
- Gibbs Sampling
- MCMC Diagnostics
Strongly suggested reading: A Conceptual Introduction to Hamiltonian Monte Carlo
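A random-walk Metropolis-Hastings sampler fits in a dozen lines; here is a minimal sketch (target, proposal width, and chain length are illustrative choices) targeting a standard normal. Note that only the unnormalized log density is needed.

```python
import math, random

# Random-walk Metropolis-Hastings targeting N(0, 1).
# With a symmetric proposal, the acceptance ratio reduces to
# min(1, p(proposal) / p(current)), computed in log space.

def log_target(x):
    return -0.5 * x * x          # log N(0,1) up to a constant

random.seed(1)
x = 0.0
samples = []
for step in range(50_000):
    proposal = x + random.gauss(0.0, 1.0)       # symmetric random walk
    log_alpha = log_target(proposal) - log_target(x)
    if math.log(random.random()) < log_alpha:   # accept w.p. min(1, alpha)
        x = proposal
    samples.append(x)

burned = samples[10_000:]                       # discard burn-in
mean = sum(burned) / len(burned)
var = sum((s - mean) ** 2 for s in burned) / len(burned)
print(mean, var)  # ≈ 0 and ≈ 1
```

Diagnostics like trace plots and effective sample size (covered this week) would be applied to `samples`; HMC and Gibbs replace the random-walk proposal with more informed moves.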
Week 6 - Feb 14th & 15th - Variational Inference
- Assignment 2 due Feb 20th.
- Optimizing distributions
- Optimizing expectations with simple Monte Carlo
- The reparameterization trick
- The evidence lower bound
- David Blei’s lecture notes on VI
- Video of NIPS 2016 Tutorial on Variational Inference
- Advances in Variational Inference; Zhang, C., Butepage, J., Kjellstrom, H., & Mandt, S. (2017)
- Automatic Differentiation Variational Inference; Kucukelbir, A., Tran, D., Ranganath, R., Gelman, A., & Blei, D. M. (2016)
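The reparameterization trick above can be demonstrated on a toy objective (the objective and parameter value are invented for illustration): writing z = mu + eps with eps ~ N(0, 1) makes the expectation's distribution fixed, so the gradient can move inside and be estimated with simple Monte Carlo.

```python
import random

# Reparameterization sketch: differentiate E_{z ~ N(mu, 1)}[z^2] w.r.t. mu.
# With z = mu + eps, eps ~ N(0, 1):
#   d/dmu E[z^2] = E[d/dmu (mu + eps)^2] = E[2 (mu + eps)] = 2 mu.

random.seed(2)
mu = 1.5
n = 100_000
grad_estimate = sum(2.0 * (mu + random.gauss(0.0, 1.0)) for _ in range(n)) / n

print(grad_estimate)  # ≈ 2 * mu = 3.0
```

The same estimator, applied to the evidence lower bound instead of z^2, is what makes stochastic optimization of variational objectives practical.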
Week 7 - Feb 21st & 22nd - No classes - Reading week
Week 8 - Feb 28th & Mar 1st - Midterm
Week 9 - March 7th & 8th - Language Models and attention
- NLP Basics
- Embeddings from BoW to Skip-Gram
- Tasks in modern Natural Language Processing
- Autoregressive models
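An autoregressive language model in miniature: the toy corpus and bigram factorization below are purely illustrative. The sequence probability factorizes as P(x_1..x_T) = prod_t P(x_t | x_{t-1}), with each conditional estimated from counts.

```python
from collections import Counter, defaultdict

# Character bigram model: an autoregressive model where each token is
# conditioned only on the previous one. Toy corpus for illustration.
corpus = "abab abba abab"

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def cond_prob(nxt, prev):
    """Maximum-likelihood estimate of P(next char | previous char)."""
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total

def sequence_prob(seq):
    """P(seq) = product of bigram conditionals (first char taken as given)."""
    p = 1.0
    for prev, nxt in zip(seq, seq[1:]):
        p *= cond_prob(nxt, prev)
    return p

print(cond_prob("b", "a"), sequence_prob("aba"))
```

Modern autoregressive models replace the count table with a neural network conditioning on the whole prefix, but the factorization of the joint is the same.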
Week 10 - March 14th & 15th - Amortized inference and Variational Autoencoders
Week 11 - March 21st & 22nd - Kernel methods and Gaussian Processes
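Gaussian process regression reduces to a few linear-algebra lines. This is a minimal sketch with an RBF kernel on invented toy data (lengthscale and jitter values are illustrative choices).

```python
import numpy as np

# GP regression sketch: posterior mean and variance at a test input,
# RBF kernel, noise-free observations with a small jitter for stability.

def rbf(x1, x2, lengthscale=1.0):
    return np.exp(-0.5 * (x1[:, None] - x2[None, :]) ** 2 / lengthscale ** 2)

x_train = np.array([-1.0, 0.0, 1.0])
y_train = np.sin(x_train)          # toy targets
x_test = np.array([0.5])

K = rbf(x_train, x_train) + 1e-8 * np.eye(3)   # jitter on the diagonal
k_star = rbf(x_test, x_train)

# Posterior mean: k_* K^{-1} y ; posterior variance: k(x*,x*) - k_* K^{-1} k_*^T
alpha = np.linalg.solve(K, y_train)
mean = k_star @ alpha
var = rbf(x_test, x_test) - k_star @ np.linalg.solve(K, k_star.T)

print(mean, var)
```

The prediction interpolates the training points exactly (up to jitter), and the posterior variance shrinks near observed inputs, which is the hallmark of GP uncertainty estimates.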
Week 12 - March 28th & 29th - Neural Networks
- Assignment 4 due April 3rd
- Neural Networks intro
- Building Blocks of NNs
- Common Architectures
- Attention and Transformers
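The attention mechanism in the last bullet can be written in a few lines. This is a single-head, toy-sized sketch of scaled dot-product attention, Attention(Q, K, V) = softmax(Q Kᵀ / √d) V; the matrix sizes and random inputs are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)       # (n_queries, n_keys) similarities
    weights = softmax(scores, axis=-1)  # each row is a distribution over keys
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4))   # 2 queries, dimension 4
K = rng.normal(size=(3, 4))   # 3 keys
V = rng.normal(size=(3, 4))   # 3 values

out, weights = attention(Q, K, V)
print(out.shape)  # (2, 4): one output vector per query
```

Each output is a convex combination of the value vectors, weighted by query-key similarity; transformers stack many such heads with learned projections.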
Week 13 - April 4th & 5th - Final Exam Review, Contrastive Learning, Interpretability