Course Schedule

The course will follow the tentative schedule below. The schedule is flexible and may change depending on the pace of progress, student feedback, and unforeseen issues.

Event | Week | Description | Required reading
Lecture | 1 | Introduction to Reinforcement Learning | SB (Sutton and Barto) Chp 1
Lecture | 1 | Multi-armed bandits | SB Chp 2
Lecture | 2 | Markov decision processes | SB Chp 3
Assignment | 3 | Value Iteration |
Lecture | 3 | Dynamic programming | SB Chp 4
Assignment | 4 | Policy Iteration |
Lecture | 4 | Monte Carlo methods | SB Chp 5
Assignment | 4 | Monte Carlo methods |
Lecture | 5 | Temporal-difference learning | SB Chp 6
Assignment | 5 | Q-learning |
Lecture | 6 | n-step bootstrapping | SB Chp 7
Lecture | 6 | Tabular methods | SB Chp 8
Project | 6 | Project proposal due |
Lecture | 7 | Function approximation | SB Chp 9
Lecture | 7 | Deep neural network approximation | DNN tutorial
Lecture | 8 | On-policy control with approximation | SB Chp 10
Lecture | 9 | Eligibility traces | SB Chp 12
Lecture | 9 | Deep Q-learning | DQN, DDQN
Assignment | 9 | DQN |
Lecture | 10 | Policy gradient methods | SB Chp 13
Assignment | 10 | REINFORCE |
Lecture | 10 | Actor-critic methods | SB Chp 13.5
Lecture | 11 | Trust regions | TRPO, PPO
Assignment | 11 | A2C |
Lecture | 11 | Soft Actor-Critic | SAC
Project | 11 | Literature review due |
Lecture | 12 | Imitation learning | DAgger
Lecture | 12 | Inverse Reinforcement Learning | Apprenticeship Learning, GAIL
Lecture | 13 | Transfer and Multi-task Learning | Progressive Neural Networks
Lecture | 13 | Derivative-free optimization | CMA-ES
Lecture | 14 | Curriculum learning |
Lecture | 14 | Advanced topics TBD |
Project | 14 | Final paper due |