CSC 2547 Fall 2019: Learning to Search


In planning, search, active learning, programming, and approximate inference, we usually face a series of similar tasks. We can often generalize from the problems solved so far, or even combine parts of previous solutions to solve a new one. This course will survey foundational ideas, recent work, and applications in this area. Specifically, it will cover self-improving tree-search methods such as alphazero, meta-learning, hypernetworks, self-tuning gradient estimators, amortized inference, self-improving theorem provers, and planning in POMDPs. Evaluation will be based mainly on a project involving original research by the students. Students should already be familiar with the basics of machine learning such as linear algebra, optimization, and probability.

The class will have a major project component, and will be run in a similar manner to Learning Discrete Latent Structure


This course is designed to bring students to the current state of the art, so that ideally, their course projects can make a novel contribution. A previous course in machine learning such as CSC321, CSC411, CSC412, STA414, or ECE521 is strongly recommended. However, the only hard requirements are linear algebra, basic multivariate calculus, the basics of probability, and programming skills.

To check if you have the background for this course, try taking this Quiz. If more than half the questions are too difficult, you might want to put some extra work into preparation.

Where and When

Course Structure

Aside from the first two and last two lectures, each week a different group of students will present on a set of related papers covering an aspect of these methods. I’ll provide guidance to each group about the content of these presentations.

In-class discussion will center around understanding the strengths and weaknesses of these methods, their relationships, possible extensions, and experiments that might better illuminate their properties.

The hope is that these discussions will lead to actual research papers, or resources that will help others understand these approaches.

Grades will (tentatively) be based on:


Students can work on projects individually, or in groups of up to four. The grade will depend on the ideas, how well you present them in the report, how clearly you position your work relative to existing literature, how illuminating your experiments are, and well-supported your conclusions are. Full marks will require a novel contribution.

Each group of students will write a short (around 2 pages) research project proposal, which ideally will be structured similarly to a standard paper. It should include a description of a minimum viable project, some nice-to-haves if time allows, and a short review of related work. You don’t have to do what your project proposal says - the point of the proposal is mainly to have a plan and to make it easy for me to give you feedback.

Towards the end of the course everyone will present their project in a short, roughly 5 minute, presentation.

At the end of the class you’ll hand in a project report (around 4 to 8 pages), ideally in the format of a machine learning conference paper such as NIPS.

Tentative Schedule

This lecture will set the scope of the course, the different settings where amortized search can be useful, and the main existing approaches.

Week 2 - SAT Solving

Week 3 - Deep Reinforcement learning

Recommended reading:

Week 4 - Differentiable Computation

Attempts learn programs using gradient-based methods, and program induction in general.

Recommended reading:

Week 5 - Active Learning

Week 6 - Learning to program

Week 7 - Meta-reasoning

Week 8 - Asypmtotically Optimal Algorithms

Week 9 - Search trees

Weeks 10 and 11 - Project presentations

The last two weeks will be a series of 5-minute project presentations.

Discrete variables makes gradient estimation hard, but there has been a lot of recent progress on developing unbiased gradient estimators.