LEC # | TOPICS | KEY DATES |
---|---|---|
1 | Markov Decision Processes; Finite-Horizon Problems: Backwards Induction; Discounted-Cost Problems: Cost-to-Go Function, Bellman's Equation | |
2 | Value Iteration; Existence and Uniqueness of Bellman's Equation Solution; Gauss-Seidel Value Iteration | |
3 | Optimality of Policies Derived from the Cost-to-Go Function; Policy Iteration; Asynchronous Policy Iteration | Problem set 1 out |
4 | Average-Cost Problems; Relationship with Discounted-Cost Problems; Bellman's Equation; Blackwell Optimality | Problem set 1 due |
5 | Average-Cost Problems; Computational Methods | |
6 | Application of Value Iteration to Optimization of Multiclass Queueing Networks; Introduction to Simulation-based Methods; Real-Time Value Iteration | Problem set 2 out |
7 | Q-Learning; Stochastic Approximations | |
8 | Stochastic Approximations: Lyapunov Function Analysis; The ODE Method; Convergence of Q-Learning | |
9 | Exploration versus Exploitation: The Complexity of Reinforcement Learning | |
10 | Introduction to Value Function Approximation; Curse of Dimensionality; Approximation Architectures | |
11 | Model Selection and Complexity | Problem set 3 out |
12 | Introduction to Value Function Approximation Algorithms; Performance Bounds | |
13 | Temporal-Difference Learning with Value Function Approximation | |
14 | Temporal-Difference Learning with Value Function Approximation (cont.) | |
15 | Temporal-Difference Learning with Value Function Approximation (cont.); Optimal Stopping Problems; General Control Problems | |
16 | Approximate Linear Programming | Problem set 4 out |
17 | Approximate Linear Programming (cont.) | |
18 | Efficient Solutions for Approximate Linear Programming | |
19 | Efficient Solutions for Approximate Linear Programming: Factored MDPs | |
20 | Policy Search Methods | Problem set 5 out |
21 | Policy Search Methods (cont.) | |
22 | Policy Search Methods for POMDPs; Application: Call Admission Control; Actor-Critic Methods | |
23 | Guest Lecture: Prof. Nick Roy; Approximate POMDP Compression | |
24 | Policy Search Methods: PEGASUS; Application: Helicopter Control | |