Reinforcement learning PhD course (5+3 hp)
Spring 2020, Period 4
Description
Reinforcement learning is a family of modern machine learning techniques that has
achieved unprecedented successes on artificial intelligence benchmarks; see, for
instance, the victories of Google’s AlphaGo over human players. Using reinforcement
learning techniques, computers can autonomously learn to make decisions based on
feedback from real or simulated environments and data. This course gives a
PhD-level introduction to these techniques.
The general course objective is the following:
- Evaluate the applicability and limitations of reinforcement learning (RL) approaches to a given problem. Choose and implement basic forms of a suitable RL method.
Pre-course assignment
The pre-course assignment will be available in February, and we recommend solving it as soon as possible. The course assignments require Python programming, and the purpose of the pre-course assignment is to ensure that you have started using Python. If you are new to Python, the following introductory crash course is also well suited to the content of this course.
Tentative schedule
The dates may change.
Week | Date | Lecture |
---|---|---|
15 | April 6 | Deadline: Homework 0 |
16 | April 14 | Introduction |
16 | April 15 | Markov Decision Processes, Dynamic Programming |
16 | April 17 | Reinforcement learning algorithms I |
17 | April 22 | Reinforcement learning algorithms II |
17 | April 24 | Planning and learning |
17 | | Deadline: Homework 1 |
19 | May 5 | Approximation methods |
19 | May 7 | Learning with approximations |
19 | | Deadline: Homework 2 |
20 | May 13 | Model-free vs model-based |
20 | May 15 | Additional methods |
21 | | Deadline: Homework 3 |
Content
Markov Decision Processes, Dynamic Programming (Policy Evaluation, Policy
Iteration, Value Iteration), Model-free RL (Monte Carlo Learning, Temporal-Difference
Methods, On-Policy and Off-Policy Methods), Model-based RL, Deep Learning,
Approximation Methods for RL, Policy Gradient Methods.
Prerequisites
Programming experience, basic courses in linear algebra, probability and optimization.
Registration
Fill out the form here.
Examination
Three hand-in assignments and one pre-course assignment (5 credits).
It is also possible to do a project in RL that is relevant to your research for 3 extra credits.
Lecturers
Ayca Özcelikkale, André Teixeira, Per Mattsson
Contact Person
Per Mattsson, email: per.mattsson_at_it.uu.se