Course Project

Your course project can take one of two forms.

  1. Practice (preferred): An implementation of RL in some domain of your choice - ideally one that you are using for research or in some other class. In this case, please describe the domain and your initial plans on how you intend to implement learning. What will the states and actions be? What algorithm(s) do you expect will be most effective?
  2. Theory: A proposal, implementation and testing of an algorithmic modification to an RL algorithm presented in class. In this case, please describe the modification you propose to investigate and on what type of domain (possibly a toy domain) it is likely to show an improvement over things considered in the book.

You may try to build on some of the chapters or research papers you have read (or will read) in the class; you may try to reimplement something you've found interesting that others have done; you can try to do something that has never been done before; you may write new code from scratch; you may modify existing code. It's up to you!

Our lab (Pi-Star) also has a few high-impact project ideas that you may consider for your course project. If you are interested, please review the project descriptions and attend the TA's office hours.

You are required to work in teams of 2 and are strongly encouraged to work on all aspects together (i.e. pair programming rather than divide and conquer). Each team should turn in only one submission (only one team member should upload each file). However, each person must also turn in an independently written summary of each team member's contribution to the final product.

You may build on existing work for this project and utilize existing code (your own or code found on the web), but you must give proper attribution to all existing work that you build on and make clear what your new contribution is. Any unattributed or uncited work that you use will be considered a breach of academic honesty and dealt with according to the course policy in the syllabus. Furthermore, you may not claim your own existing work as a new contribution. You may build on your own work, but it must be clearly cited as existing work and you must do new work for the class project.

Submission

Submit your reports through Canvas, on or before the specified deadline. Include the full name and UIN of each team member. Submit only a single copy of each report (uploaded by one of the two team members).

The project timeline is as follows.

Project Proposal

The project proposal is due on Monday, September 30 at 11:59pm. The proposal should be structured as follows:

  • Team members
  • Proposed starting point. That is, what code base will you be starting with? Is there anything you need us to provide you with that you don't already have? (No guarantees you'll get what you ask for, but we'll try)
  • What you plan to do and how you plan to do it.

The proposal should be written with the goal of convincing us that what you are proposing to do is interesting and non-trivial (though not necessarily completely original - see below).

It is completely legitimate to propose to do something based on something you read about provided that you are going to do the coding yourself. Just make sure to acknowledge any ideas (and code) that you borrow and be sure to clearly identify what you are going to do.

We encourage you to look ahead to topics that will be covered later in the course that may interest you, or to focus on a topic of interest that will not be covered in this course.

Be as specific as you can at this point. The more specific you are, the more detailed feedback you will get. For example, if you are doing an "applications" (practice) project:

  • In what sense is your problem sequential?
  • What is your problem's state space?
  • What is your problem's action space?
  • What is your problem's transition function?
  • What reward function will you use?
  • What is the simplest possible first result that you will try to get? What RL algorithm will you use? What will be the baseline you compare against?
  • What will be the stretch goal for your project?
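One way to check that a proposal answers the questions above is to write the problem down as code before choosing an algorithm. The following is only a minimal sketch under stated assumptions: the corridor domain, the class name ChainMDP, and the helper rollout are all hypothetical and are not part of the course materials. It spells out a state space, an action space, a transition function, and a reward function, and then evaluates a random policy as a simple baseline.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ChainMDP:
    """Hypothetical 1-D corridor: states 0..n_states-1, goal at the right end."""
    n_states: int = 5
    actions: tuple = (-1, +1)  # action space: step left or step right

    def transition(self, state: int, action: int) -> int:
        # Deterministic transition function, clipped at the corridor walls.
        return min(max(state + action, 0), self.n_states - 1)

    def reward(self, state: int, action: int, next_state: int) -> float:
        # Reward function: -1 for every step that does not reach the goal.
        return 0.0 if next_state == self.n_states - 1 else -1.0

    def is_terminal(self, state: int) -> bool:
        return state == self.n_states - 1


def rollout(mdp, policy, start=0, max_steps=100):
    """Run one episode and return the undiscounted sum of rewards."""
    state, total = start, 0.0
    for _ in range(max_steps):
        if mdp.is_terminal(state):
            break
        action = policy(state)
        next_state = mdp.transition(state, action)
        total += mdp.reward(state, action, next_state)
        state = next_state
    return total


mdp = ChainMDP()
rng = random.Random(0)  # fixed seed so the baseline run is reproducible
random_return = rollout(mdp, lambda s: rng.choice(mdp.actions))  # baseline
always_right_return = rollout(mdp, lambda s: +1)  # hand-coded reference policy
print("random baseline:", random_return)
print("always right:", always_right_return)
```

The problem is sequential because the reward collected depends on the whole sequence of moves, not on any single one. In a real proposal, the learned policy would be compared against baselines such as these two.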

Even if you are doing more of an algorithm-based or theory-based project, try to be similarly specific about what you intend to study. What is your main question? And of course if you are reimplementing an existing technique or replicating a prior experiment, say exactly which.

Literature Survey

The literature survey is due on Monday, November 4 at 11:59pm. Submit a literature survey of the work most closely related to your project. It should begin with a summary of your current plans for the final project. If nothing has changed since the proposal, you can use the same text. If you have changed your plans in some way and would like further feedback, please make the changes very clear by placing them in bold or in a clearly marked separate section. The survey should include at least 10 references, some of which can be from the class readings. For each reference, you should discuss how it differs from or is similar to the work you plan to turn in for your final project. A good survey discusses each of the references at a technical level - not just what is done, but also how. Please put the references at the end, with full reference information (authors, title, date, publication venue, etc.). We suggest writing this as if it were a section of a research paper, so that you can then use it directly in your final report.

Final Report

Your final report is due on Monday, December 2 at 11:59pm. Your final submission should include:

  • Source code, executable and README. We recommend using a GitHub repository to hold the source code, the executable and a README file that provides a brief guide to running your code. In this case, you just need to provide the GitHub link in your final report. If you want to keep your repository private for any reason, please zip your project folder, including the source code, executable and README file, and attach it to the final submission.
  • A 5-minute mp4 YouTube video summarizing your project and the main results (make sure to set the video's visibility to Unlisted so that only people with the URL can access it). The URL should be included at the beginning of your final report PDF, along with your GitHub link.
  • A detailed written report describing your project, including its merits, and its deficiencies. As much as possible, you should relate your approach to the readings from throughout the course. View this report as a term paper. It is in place of a final exam and will be a large factor in your final grade for the project and for the course. The report should be roughly in the style of a conference paper, including introduction, motivation, related work, etc. All writing should be your own -- all quotes must be clearly attributed.
  • Recall the points from the proposal and literature survey above. In particular, for applications projects be very clear about how you model your problem, and in what sense it's sequential.
  • Include at least 10 citations with full bibliographic references to acknowledge where your ideas came from.
  • Be very clear about what code you've used from other sources, if any. Clear citations are essential. Failure to credit ideas and code from external sources is cheating.
  • Make sure you evaluate both the good and bad points of your approach.
  • Show results of at least one experiment evaluating either some aspect of your approach or the approach as a whole, preferably with error bars or some other statistical measure of significance. Even if you didn't accomplish your goal, evaluate what you did do.
  • A single well-analyzed experiment in a simple domain that compares clearly against a baseline is preferable to a shallow set of experiments across many domains.
  • If any parameters are mentioned in the report, be sure to mention how you arrived at their values. Was it the first thing you tried? Trial and error? Roughly how many trials? etc.
  • Remember to proofread and spell-check!
  • Each team member should individually (and privately) describe their own role in the overall project and their partner's role. If everything was done together, a short statement to that effect is sufficient. If you feel that your partner has not contributed adequately, this is the opportunity to let us know. Please submit this statement as a single PDF on Canvas. Unlike the other submissions, each team member should upload this file individually. Do not forget to include the project title, your name and your UIN in this file.
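For the experiment item in the list above that asks for error bars, one simple option is to repeat each run with several random seeds and report the mean with its standard error. The sketch below uses only the Python standard library. The helper name mean_and_stderr and the numbers are made-up placeholders for illustration, not results from any real experiment.

```python
import math
import statistics


def mean_and_stderr(values):
    """Return (mean, standard error of the mean) for a list of per-run scores."""
    mean = statistics.mean(values)
    # Sample standard deviation (n - 1 denominator) divided by sqrt(n).
    stderr = statistics.stdev(values) / math.sqrt(len(values))
    return mean, stderr


# Placeholder data: one final return per random seed (not real results).
agent_returns = [10.0, 12.0, 11.0, 13.0, 9.0]
baseline_returns = [7.0, 8.0, 6.0, 9.0, 5.0]

for name, returns in [("agent", agent_returns), ("baseline", baseline_returns)]:
    mean, stderr = mean_and_stderr(returns)
    print(f"{name}: {mean:.2f} +/- {stderr:.2f} (n={len(returns)})")
```

Reporting the number of seeds alongside the mean and standard error lets a reader judge how much weight the comparison against the baseline can bear.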



The project instructions are based on text by Dr. Peter Stone from UT-Austin (with his permission).