STA414 / STA2104 Winter 2019
Statistical Methods for Machine Learning II
This course introduces machine learning to students with a statistical background. Besides teaching standard methods such as logistic and ridge regression, kernel density estimation, and random forests, this course will try to offer a broader view of model-building and optimization using probabilistic building blocks.
What you will learn:
- Standard statistical learning algorithms, when to use them, and their limitations.
- The main elements of probabilistic models (distributions, expectations, latent variables, neural networks) and how to combine them.
- Standard computational tools (Monte Carlo, stochastic optimization, regularization, automatic differentiation).
Instructors:
- Murat A. Erdogdu, Office: 5016b Sidney Smith
- Email: erdogdu@cs.toronto.edu (put “STA414” in the subject)
- Office hours: Mondays 11am-12pm at Sidney Smith Room 5016b
- David Duvenaud, Office: 384 Pratt
- Email: duvenaud@cs.toronto.edu (put “STA414” in the subject)
- Office hours: Wednesdays 10-11am in D.L. Pratt room 384.
- The two instructors will not lecture exclusively in their own sections.
Syllabus
Piazza
Teaching Assistants:
- Cedric Beaulac
- Office hours: Tuesday 11am-12pm at Pratt 286b (tentative)
- Hannes Bretschneider
- Office hours: Thursday 2pm-3pm at Pratt 286b (tentative)
- Tianle Chen
Location:
This course has two identical sections each week:
- Monday 2pm-5pm at Wilson Hall -New College Room 1017,
- Tuesday 7pm-10pm at Sidney Smith Room 2118.
Reading
No required textbooks. The following are useful references:
- (PRML) Christopher M. Bishop (2006) Pattern Recognition and Machine Learning
- (DL) Ian Goodfellow, Yoshua Bengio and Aaron Courville (2016), Deep Learning
- (MLPP) Kevin P. Murphy (2013), Machine Learning: A Probabilistic Perspective
- (ESL) Trevor Hastie, Robert Tibshirani, Jerome Friedman (2009) The Elements of Statistical Learning
- (ISL) Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani (2017) Introduction to Statistical Learning
- (ITIL) David MacKay (2003) Information Theory, Inference, and Learning Algorithms
Tentative Schedule
Week 1 - Jan 7th & 8th - Overview
- Supervised vs. unsupervised learning (ISL Sec. 2.1.4)
- Least squares (ESL Sec. 2.3.1)
- Polynomial curve fitting (ISL Sec. 7.1)
- Overfitting and generalization (ISL Sec. 2.1.2)
- Effect of regularization (ISL Sec. 6.2.1)
- Cross validation (ISL Sec. 5.1)
- Slides here. Suggested reading is next to each topic.
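The week's themes of curve fitting, overfitting, and cross validation can be illustrated with a minimal sketch (not course code; the function names are my own). It fits a straight line by closed-form least squares and estimates test error by k-fold cross validation:

```python
import random

def fit_line(xs, ys):
    # Ordinary least squares for y = a + b*x, via the closed-form solution.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

def cv_mse(xs, ys, k=5):
    # k-fold cross validation: hold out each fold in turn, fit on the rest,
    # and average the squared errors on the held-out points.
    idx = list(range(len(xs)))
    random.Random(0).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    errs = []
    for fold in folds:
        train = [i for i in idx if i not in fold]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errs += [(ys[i] - (a + b * xs[i])) ** 2 for i in fold]
    return sum(errs) / len(errs)
```

The same recipe applies to polynomial features: richer models drive training error down while the cross-validated error reveals overfitting.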
Week 2 - Jan 14th & 15th - Probabilistic Models
- Maximum likelihood estimation (Sec. 2.6.3 ESL)
- Important distributions (Sec. 2 MLPP)
- Exponential families (Sec. 3.1 and 3.3.1 of this)
- Geometry of linear regression (Sec. 3.2 ESL)
- Slides here. Suggested reading is next to each topic.
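As a concrete instance of maximum likelihood estimation, here is a hedged sketch (illustrative, not from the course materials) of the closed-form MLE for a univariate Gaussian: the sample mean, and the average squared deviation with a 1/n factor rather than the unbiased 1/(n-1):

```python
def gaussian_mle(xs):
    # Maximum likelihood estimates for N(mu, sigma^2) from i.i.d. data.
    n = len(xs)
    mu = sum(xs) / n
    # Note the 1/n factor: the MLE of the variance is biased downward.
    sigma2 = sum((x - mu) ** 2 for x in xs) / n
    return mu, sigma2
```

The Gaussian is an exponential-family distribution, so its MLE reduces to matching sufficient statistics, which is why both estimates are simple averages.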
Week 3 - Jan 21st & 22nd - Regularization and Bayesian Methods
- Basis function models (Sec. 3.1 PRML)
- Geometry of least squares (Sec. 3.1.2 PRML)
- Lasso and ridge regression (Sec. 3.1.4 PRML)
- Bayesian linear regression (Sec. 3.3 PRML)
- Generative and discriminative models
- Assignment 1 and data released. Due Jan 29 10pm.
- Assignment 1 solutions here.
- Slides here. Suggested reading is next to each topic.
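A minimal sketch of ridge regression's shrinkage effect (my own illustration, assuming a single feature and no intercept): the ridge estimate is the least-squares numerator divided by the usual denominator plus the penalty λ, so the weight shrinks smoothly toward zero as λ grows.

```python
def ridge_1d(xs, ys, lam):
    # Ridge estimate for y ~ w*x with penalty lam * w^2:
    # w = (sum x_i y_i) / (sum x_i^2 + lam).
    return (sum(x * y for x, y in zip(xs, ys))
            / (sum(x * x for x in xs) + lam))
```

With λ = 0 this recovers ordinary least squares; in the multivariate case the same formula becomes w = (XᵀX + λI)⁻¹Xᵀy.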
Week 4 - Jan 28th & 29th - Classification
- Generative and discriminative models (Sec. 4.1 PRML)
- Logistic regression (Sec. 4.3 PRML)
- Multinomial regression (Sec. 4.3 PRML)
- Linear/quadratic discriminant analysis (Sec. 4.1 PRML)
- Perceptron (Sec. 4.1.7 PRML)
- Naive Bayes (Sec. 6.6.3 ESL)
- Slides here. Suggested reading is next to each topic.
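Logistic regression has no closed-form MLE, so it is usually fit by iterative optimization. A minimal one-dimensional sketch (illustrative only; names and step sizes are my own) using gradient ascent on the log-likelihood:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.5, steps=1000):
    # p(y=1 | x) = sigmoid(w*x + b), fit by gradient ascent.
    # The gradient of the log-likelihood has the simple form
    # sum over data of (y - p) times the input.
    w = b = 0.0
    n = len(xs)
    for _ in range(steps):
        gw = sum((y - sigmoid(w * x + b)) * x for x, y in zip(xs, ys)) / n
        gb = sum((y - sigmoid(w * x + b)) for x, y in zip(xs, ys)) / n
        w, b = w + lr * gw, b + lr * gb
    return w, b
```

On linearly separable data the weights grow without bound, which is one motivation for the regularization discussed in Week 3.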
Week 5 - Feb 4th & 5th - Decision Theory and Optimization
- Statistical decision theory (Sec. 1.5 PRML)
- Bias-variance tradeoff (Sec. 3.2 PRML)
- Matrix calculus (Appendix C PRML, Wikipedia)
- Gradient-based optimization (Sec. 4.3.3 and 5.2.4 PRML)
- Stochastic optimization methods (Sec. 3.1.3 PRML)
- Reading assignment: Curse of dimensionality (Sec. 1.4 PRML)
- Slides here. Suggested reading is next to each topic.
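The core loop of gradient-based optimization is short enough to sketch in full (a generic illustration, not course code): repeatedly step against the gradient with a fixed learning rate. Stochastic variants replace `grad` with a noisy estimate computed on a minibatch.

```python
def gradient_descent(grad, x0, lr=0.1, steps=200):
    # Minimize a differentiable function by following -grad(x).
    # grad: function returning the derivative at x.
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x
```

For example, minimizing f(x) = (x - 3)² with grad(x) = 2(x - 3) converges geometrically to the minimizer x = 3 whenever lr < 1.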
Week 6 - Feb 11th & 12th - Unsupervised learning, Latent variable models
- Clustering (Sec. 9.1 & 9.2 PRML)
- Mixture models (Sec. 9.1 & 9.2 PRML)
- K-means (Sec. 9.1 PRML)
- EM algorithm (Sec. 9.2, 9.3 & 9.4 PRML)
- Principal component analysis (PCA) (Sec. 12.1 PRML)
- Probabilistic PCA (Sec. 12.2 PRML)
- Slides here. Suggested reading is next to each topic.
- Assignment 2 is released (data loaders: python, R). Due March 12 in class.
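K-means alternates the same two steps that EM generalizes: assign each point to its nearest centre, then move each centre to the mean of its assigned points. A minimal one-dimensional sketch (my own illustration, not assignment code):

```python
def kmeans_1d(xs, centers, iters=20):
    # Lloyd's algorithm in one dimension.
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centre.
        clusters = [[] for _ in centers]
        for x in xs:
            j = min(range(len(centers)), key=lambda k: (x - centers[k]) ** 2)
            clusters[j].append(x)
        # Update step: each centre moves to the mean of its cluster
        # (empty clusters keep their old centre).
        centers = [sum(c) / len(c) if c else m
                   for c, m in zip(clusters, centers)]
    return centers
```

EM for a Gaussian mixture softens the hard assignment into responsibilities, weighting each point by its posterior probability of belonging to each component.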
Week 7 - Feb 18th & 19th - No class - Family Day
Week 8 - Feb 25th & March 5th - Graphical models and Neural networks
- Slides for Graphical Models and Basic Neural Nets
- Practice midterm here. Midterm covers first five lectures.
- Midterm date March 1, 7pm-9pm; Monday section in MS 3153 & Tuesday section in MS 3154 (MS: Medical Sciences)
- Graphical Models notation
- Hidden Markov models
- Exact inference
- Numerical stability and parameterization
- Neural networks
Readings:
Week 9 - March 4th & 7th - Autodiff and Markov chain Monte Carlo
For the evening session, this lecture will be held on Thursday March 7th from 7pm to 10pm, in Lash-Miller room 161.
Readings:
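The simplest Markov chain Monte Carlo method covered this week, random-walk Metropolis, fits in a few lines. This is a generic sketch (not from the lecture materials): propose a Gaussian perturbation of the current state and accept it with probability min(1, p(x')/p(x)), computed in log space for numerical stability.

```python
import math
import random

def metropolis(logp, x0, n, step=1.0, seed=0):
    # Random-walk Metropolis sampler.
    # logp: unnormalized log-density of the target distribution.
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n):
        xp = x + rng.gauss(0.0, step)
        # Accept with probability min(1, p(xp)/p(x)).
        if math.log(rng.random()) < logp(xp) - logp(x):
            x = xp
        samples.append(x)
    return samples
```

Because only the ratio p(x')/p(x) is needed, the normalizing constant of the target cancels, which is what makes MCMC practical for Bayesian posteriors.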
Week 10 - March 11th & 12th - Variational inference
Demos:
Readings and videos:
- David Blei’s lecture notes on VI
- Video of NIPS 2016 Tutorial on Variational Inference
- Advances in Variational Inference; Zhang, C., Butepage, J., Kjellstrom, H., & Mandt, S. (2017)
- Automatic Differentiation Variational Inference; Kucukelbir, A., Tran, D., Ranganath, R., Gelman, A., & Blei, D. M. (2016)
Week 11 - March 18th & 19th - Reinforcement learning and gradient estimation
Week 12 - March 25th & 26th - Variational autoencoders and time series models
Related reading:
Week 13 - April 1st & 2nd - Generative models II
Related reading: