STA414 / STA2104 Winter 2022
Statistical Methods for Machine Learning II

The language of probability allows us to coherently and automatically account for uncertainty. This course will teach you how to build, fit, and do inference in probabilistic models. These models let us generate novel images and text, find meaningful latent representations of data, take advantage of large unlabeled datasets, and even let us do analogical reasoning automatically. This course will teach the basic building blocks of these models and the computational tools needed to use them.
What you will learn:
- Standard statistical learning algorithms, when to use them, and their limitations.
- The main elements of probabilistic models (distributions, expectations, latent variables, neural networks) and how to combine them.
- Standard computational tools (Monte Carlo, stochastic optimization, regularization, automatic differentiation).
Instructors:
Syllabus
Missed Assessment Form
Piazza
Teaching Assistants:
Location:
Online for now. The Zoom link will be sent via Quercus.
Reading
No required textbooks.
- (PRML) Christopher M. Bishop (2006) Pattern Recognition and Machine Learning
- (DL) Ian Goodfellow, Yoshua Bengio, and Aaron Courville (2016) Deep Learning
- (MLPP) Kevin P. Murphy (2013) Machine Learning: A Probabilistic Perspective
- (ESL) Trevor Hastie, Robert Tibshirani, and Jerome Friedman (2009) The Elements of Statistical Learning
- (ISL) Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani (2017) Introduction to Statistical Learning
- (ITIL) David MacKay (2003) Information Theory, Inference, and Learning Algorithms
Tentative Schedule
Week 1 - Jan 10th & 11th - Course Overview and Graphical Model Notation
Coverage:
- Class Intro
- Topics covered
- Quick review of probabilistic models
- Graphical model notation: going from graphs to factorized joint probs and back
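As a small illustration of going from a graph to a factorized joint: in a directed graphical model, each variable is conditioned on its parents. The notation \mathrm{pa}(x_i) for the parents of x_i and the three-node chain below are assumptions chosen for illustration, not taken from the lecture materials.

```latex
% General factorization implied by a directed acyclic graph
p(x_1, \dots, x_N) = \prod_{i=1}^{N} p\bigl(x_i \mid \mathrm{pa}(x_i)\bigr)

% Hypothetical chain A -> B -> C
p(a, b, c) = p(a)\, p(b \mid a)\, p(c \mid b)
```

Going back from the factorization to the graph amounts to drawing an edge into each variable from every variable it is conditioned on.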
Materials:
Week 2 - Jan 17th & 18th - Decision Theory and Parametrizing Probabilistic Models
Learning Outcomes:
- Basic decision theory
- Understand basics of Directed Graphical Models
- Become comfortable working with conditional probabilities
Coverage:
- Decision Theory
- Conditional Probability Tables
- Numbers of parameters in different tables
- Plate notation
- Examples of meaningful graphical models
- D-Separation, conditional independence in directed models
- Bayes Ball algorithm for determining conditional independence
Materials:
Lecture recordings:
Lecture notes:
Tutorial:
- Worked examples of decision theory
- Worked examples of Directed Graphical Models
Helpful materials:
Week 3 - Jan 24th & 25th - Latent variables and Exact Inference
Learning Outcomes:
- How to write the joint factorization implied by UGMs
- How to reason about conditional independencies in UGMs
- How to do exact inference in joint distributions over discrete variables
- The time complexity of exact inference
Materials:
Lecture:
Tutorial:
- Worked example of MLE for binary models
- More examples of UGMs
- Worked examples of variable elimination in chains and trees
- Simple code for neural networks
Week 4 - Jan 31st & Feb 1st - Message Passing + Sampling
- Assignment 1 due Feb 6th.
Learning Outcomes:
- Understand the TrueSkill model
- Understand the Message Passing algorithm on trees
- Understand Loopy Belief Propagation
- Understand Sampling methods and why we need them (Simple MC, Importance, Rejection, Ancestral)
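A minimal sketch of two of the sampling methods listed above, simple Monte Carlo and importance sampling, using only the Python standard library. The target (a standard normal), the test function x^2 (whose true expectation is 1), the proposal width, and all function names are assumptions chosen for illustration; they are not taken from the course materials.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a univariate normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def simple_mc(f, n, rng):
    """Estimate E[f(x)] under N(0, 1) by averaging f over direct samples."""
    return sum(f(rng.gauss(0.0, 1.0)) for _ in range(n)) / n


def importance_sampling(f, n, rng, proposal_std=2.0):
    """Estimate E[f(x)] under N(0, 1) using samples from N(0, proposal_std^2).

    Each sample is reweighted by target density / proposal density.
    """
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, proposal_std)
        weight = normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.0, proposal_std)
        total += weight * f(x)
    return total / n


def square(x):
    return x * x


rng = random.Random(0)  # fixed seed so runs are repeatable
mc_estimate = simple_mc(square, 50_000, rng)
is_estimate = importance_sampling(square, 50_000, rng)
print(mc_estimate, is_estimate)  # both should be close to 1
```

Importance sampling is useful when sampling from the target directly is hard but evaluating its density is easy; here the target is easy to sample, so the comparison is only illustrative.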
Materials:
Week 5 - Feb 7th & 8th - MCMC
- Assignment 2 released Feb 7th.
Coverage:
- Metropolis Hastings
- Hamiltonian Monte Carlo
- Gibbs Sampling
- MCMC Diagnostics
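A minimal sketch of random-walk Metropolis Hastings with a symmetric Gaussian proposal, together with one very basic diagnostic (the acceptance rate). The standard normal target, step size, burn-in length, and function names are assumptions for illustration only, not the course's reference implementation.

```python
import math
import random


def log_target(x):
    """Unnormalized log density of a standard normal target."""
    return -0.5 * x * x


def metropolis_hastings(log_p, n_samples, step_size, x0, rng):
    """Random-walk Metropolis Hastings with a symmetric Gaussian proposal.

    Returns the chain and the fraction of accepted proposals.
    """
    samples = []
    x = x0
    accepted = 0
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step_size)
        # The proposal is symmetric, so the acceptance ratio reduces to
        # the ratio of target densities.
        log_accept = log_p(proposal) - log_p(x)
        if rng.random() < math.exp(min(0.0, log_accept)):
            x = proposal
            accepted += 1
        samples.append(x)  # a rejected proposal repeats the current state
    return samples, accepted / n_samples


rng = random.Random(0)  # fixed seed so runs are repeatable
samples, accept_rate = metropolis_hastings(log_target, 20_000, 1.0, 0.0, rng)

kept = samples[2_000:]  # discard an (arbitrary) burn-in period
mean = sum(kept) / len(kept)
var = sum((s - mean) ** 2 for s in kept) / len(kept)
print(accept_rate, mean, var)  # mean near 0 and variance near 1
```

Only an unnormalized log density is needed, which is why the method applies to posteriors whose normalizing constant is unknown.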
Materials:
Suggested Reading:
Strongly suggested reading: A Conceptual Introduction to Hamiltonian Monte Carlo
Week 6 - Feb 14th & 15th - Variational Inference
- Assignment 2 due Feb 20th.
Materials:
Learning Outcomes:
- Optimizing distributions
- Optimizing expectations with simple Monte Carlo
- The reparameterization trick
- The evidence lower bound
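For orientation, the last two outcomes can be written compactly. The notation below (latent variable z, observed x, approximate posterior q_\phi with parameters \phi, a Gaussian q with mean \mu and standard deviation \sigma) is an assumption for illustration and may differ from the notation used in lecture.

```latex
% Evidence lower bound (ELBO) on the log marginal likelihood
\log p(x) \;\ge\; \mathcal{L}(\phi)
  = \mathbb{E}_{q_\phi(z)}\bigl[\log p(x, z) - \log q_\phi(z)\bigr]

% Reparameterization trick for a Gaussian q_\phi with \phi = (\mu, \sigma)
z = \mu + \sigma \odot \epsilon, \qquad \epsilon \sim \mathcal{N}(0, I)

% The gradient moves inside the expectation over \epsilon,
% so it can be estimated with simple Monte Carlo
\nabla_\phi \mathcal{L}(\phi)
  = \mathbb{E}_{\epsilon \sim \mathcal{N}(0, I)}
    \bigl[\nabla_\phi \bigl(\log p(x, z) - \log q_\phi(z)\bigr)\bigr],
  \quad z = \mu + \sigma \odot \epsilon
```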
Tutorial:
- David Blei’s lecture notes on VI
- Video of NIPS 2016 Tutorial on Variational Inference
- Advances in Variational Inference; Zhang, C., Butepage, J., Kjellstrom, H., & Mandt, S. (2017)
- Automatic Differentiation Variational Inference; Kucukelbir, A., Tran, D., Ranganath, R., Gelman, A., & Blei, D. M. (2016)
Week 7 - Feb 21st & 22nd - No classes - Reading week
Enjoy!
Week 8 - Feb 28th & March 1st - Midterm
Week 9 - March 7th & 8th - Language Models and attention
Coverage:
- NLP Basics
- Embeddings from BoW to Skip-Gram
- Tasks in modern Natural Language Processing
- Autoregressive models
Materials:
Readings:
Week 10 - March 14th & 15th - Amortized inference and Variational Autoencoders
Coverage:
- Amortized inference and Variational Autoencoders
Week 11 - March 21st & 22nd - Kernel methods and Gaussian Processes
Week 12 - March 28th & 29th - Neural Networks
- Assignment 4 due April 3rd.
Content:
- Neural Networks intro
- Building Blocks of NNs
- Common Architectures
- Attention and Transformers
Materials:
Week 13 - April 4th & 5th - Final Exam Review, Contrastive Learning, Interpretability
Other resources: