Topics and Objectives (Apr 22 – Apr 26)
- Fitted Q-Iteration
- Function approximation
- Linear Bellman completeness
- Linear MDP
- Reproducing kernel Hilbert space
- Policy gradient theorem
- REINFORCE algorithm
- Actor-critic and soft actor-critic
- Regularization and pessimism
Lecture Notes
Homework