Course Description

Reinforcement learning (RL) is a subfield of machine learning that focuses on training an agent to make sequential decisions in an environment so as to maximize cumulative reward. It is inspired by how humans and animals learn from their interactions with the world. An RL agent interacts with its environment over a series of discrete time steps: at each step, the agent observes the current state of the environment and takes an action according to its learned policy; the environment then transitions to a new state, and the agent receives a reward signal indicating the desirability of its action. The agent's goal is to learn an optimal policy that maximizes the long-term expected cumulative reward.

RL agents learn through trial and error. Initially, the agent explores the environment, taking random or exploratory actions to gather experience. As it receives feedback in the form of rewards, it updates its policy and value function to gradually improve its decision-making. RL has been successfully applied to a variety of domains, including robotics, game playing, recommendation systems, and autonomous vehicles.

In this course, we will learn how to formalize RL problems as Markov Decision Processes (MDPs), as well as various techniques and approaches for optimizing the agent's behavior in such problems. Assignments will cover the basics of reinforcement learning as well as deep reinforcement learning -- a promising area that combines deep learning techniques with reinforcement learning.
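The interaction loop described above can be sketched in a few lines of Python. The example below is an illustration only, not course material: it runs tabular Q-learning, one of the classic trial-and-error algorithms covered in class, on a hypothetical 5-state "chain" MDP (states 0 through 4, actions left/right, with a reward of +1 for reaching the rightmost state). All names and the environment itself are invented for this sketch.

```python
import random

# Hypothetical chain MDP: states 0..4, actions 0 (left) and 1 (right).
# Reaching state 4 yields reward +1 and ends the episode.
N_STATES, ACTIONS = 5, (0, 1)

def step(state, action):
    """Environment dynamics: returns (next_state, reward, done)."""
    next_state = max(state - 1, 0) if action == 0 else min(state + 1, N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy exploration: mostly exploit, sometimes explore.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: Q[state][a])
            next_state, reward, done = step(state, action)
            # Temporal-difference update toward the bootstrapped target.
            target = reward + (0.0 if done else gamma * max(Q[next_state]))
            Q[state][action] += alpha * (target - Q[state][action])
            state = next_state
    return Q

Q = q_learning()
policy = [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_STATES)]
print(policy)  # the greedy policy should move right in states 0..3
```

Note how the loop mirrors the prose: observe a state, act, receive a reward and next state, and update the value estimates so the policy gradually improves.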

Learning Outcomes

Following this class, participating students will be able to:

  • Identify whether a given domain can be effectively addressed with RL and, if so, formulate it as an MDP and suggest and justify appropriate RL techniques.
  • Characterize different classes of RL algorithms according to their advantages and drawbacks with respect to various domain characteristics.
  • Efficiently implement common RL and deep RL algorithms.
  • Describe common evaluation metrics for RL algorithms, e.g., regret, sample complexity, computational complexity, expected sum of rewards, discounted sum of rewards, and convergence.
  • Identify which metric is most relevant to a given application.
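As a small illustration of two of the metrics listed above, the sketch below (an assumption-laden example, not course material) computes the undiscounted and discounted sum of rewards for a hypothetical reward sequence, where the discounted return is G = r_0 + gamma*r_1 + gamma^2*r_2 + ...

```python
def expected_sum(rewards):
    """Undiscounted sum of rewards along a trajectory."""
    return sum(rewards)

def discounted_sum(rewards, gamma):
    """Discounted return: G = r_0 + gamma*r_1 + gamma^2*r_2 + ..."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

rewards = [1.0, 0.0, 2.0]  # hypothetical rewards from one episode
print(expected_sum(rewards))         # 3.0
print(discounted_sum(rewards, 0.5))  # 1.0 + 0.5*0.0 + 0.25*2.0 = 1.5
```

The discount factor gamma trades off immediate against future reward; which of these metrics is most relevant depends on the application, as the outcome above emphasizes.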

Expected Background Knowledge

Participating students should be familiar with the following topics.

  • Proficiency in Python. Class assignments will require coding in Python. For those less familiar with Python, please follow this tutorial. Go over all topics under "Python Tutorial" and "Python NumPy". It will also be beneficial to go over all topics under "Machine Learning".
  • College Level Calculus and Linear Algebra. You should be comfortable taking derivatives and understanding matrix operations and notation. You are encouraged to go over the "Essence of linear algebra" and "Essence of calculus" playlists by "3Blue1Brown".
  • Basic Probability and Statistics. You should be familiar with the basics of probability, including Gaussian, Beta, and Binomial distributions, mean, standard deviation, etc. You are encouraged to go over the "Probabilities of probabilities" playlist by "3Blue1Brown".
  • Foundations of Machine Learning. We will be formulating loss functions, taking derivatives, and performing optimization with gradient descent. Some optimization tricks will be more intuitive with a background in convex optimization and machine learning.
  • Recommended Background Courses. CSCE 110 Programming, MATH 304 Linear Algebra, MATH 308 Differential Equations, MATH 411 Mathematical Probability, CSCE 625/425 Artificial Intelligence, CSCE 633/421 Machine Learning, CSCE 636 Deep Learning.
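To gauge whether your background matches the expectations above, here is a minimal sketch of the kind of optimization the course assumes familiarity with: gradient descent minimizing the quadratic loss L(w) = (w - 3)^2 by repeatedly stepping against its derivative dL/dw = 2*(w - 3). The loss and learning rate are invented for this example.

```python
def grad(w):
    """Derivative of the hypothetical loss L(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)

w, lr = 0.0, 0.1  # initial parameter and learning rate
for _ in range(100):
    w -= lr * grad(w)  # gradient-descent update

print(round(w, 4))  # converges toward the minimizer w = 3
```

If taking this derivative and following the update rule feel routine, the mathematical prerequisites should pose no difficulty.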

Staff

Instructor Guni Sharon
E-Mail guni@tamu.edu
Office Hours Tuesday, 16:00 -- 17:00
Office PETR 316
TAs Sheel Dey, James Ault
E-Mail sheelabhadra@tamu.edu jault@tamu.edu
Office Hours Wednesday, 12:00 -- 13:00
Office EAB-C 107B