# EE807 Mathematical Foundations of Reinforcement Learning


## Course Description

This course covers sequential decision making under uncertainty in systems whose evolution is influenced by decisions. The decision made at any given time depends on the state of the system, and the objective is to select a decision-making rule that optimizes a given performance criterion. Such problems can be solved, in principle, using the classical methods of dynamic programming. In practice, however, the applicability of dynamic programming to many important problems is limited by the enormous size of the underlying state spaces. "Neuro-dynamic programming", or "reinforcement learning" as it is called in the artificial intelligence literature, uses neural networks and other approximation architectures to overcome this bottleneck.
The methodology allows systems to learn about their behavior through simulation and to improve their performance through iterative reinforcement. The course focuses on the mathematical foundations of this methodology, in particular the convergence and degree of suboptimality of different algorithms.
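
As a rough illustration of the classical dynamic programming approach mentioned above, the following minimal sketch runs value iteration on a tiny two-state, two-action Markov decision process. The function name `value_iteration`, the transition table `P`, the reward table `R`, and the discount factor `gamma = 0.9` are all assumptions made for this example and are not taken from the course materials.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Tabular value iteration (illustrative sketch).

    P[a][s][t]: probability of moving from state s to state t under action a.
    R[a][s]:    expected one-step reward for taking action a in state s.
    """
    num_actions = len(P)
    num_states = len(P[0])
    V = [0.0] * num_states

    def q_value(a, s, values):
        # One-step lookahead: immediate reward plus discounted expected value.
        expected_next = sum(P[a][s][t] * values[t] for t in range(num_states))
        return R[a][s] + gamma * expected_next

    for _ in range(max_iter):
        # Bellman optimality backup applied to every state.
        V_new = [
            max(q_value(a, s, V) for a in range(num_actions))
            for s in range(num_states)
        ]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break

    # Greedy policy with respect to the final value estimate.
    policy = [
        max(range(num_actions), key=lambda a: q_value(a, s, V))
        for s in range(num_states)
    ]
    return V, policy


# Hypothetical toy problem: action 0 stays in place, action 1 switches state.
# Being in state 1 pays a reward of 1 per step; state 0 pays nothing.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
]
R = [
    [0.0, 1.0],  # rewards for action 0 in states 0 and 1
    [0.0, 1.0],  # rewards for action 1 in states 0 and 1
]

V, policy = value_iteration(P, R)
print(V, policy)
```

In this toy problem the values approach 9 for state 0 and 10 for state 1, and the greedy policy switches out of state 0 and stays in state 1. Each sweep touches every state and every successor, so the cost grows with the square of the number of states; this is the bottleneck that motivates the approximation architectures studied in the course.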

## Announcements

- 2017-03-16: HW #1 has been uploaded. Note that the due date is postponed to 2017-03-27.

## General Information

- Syllabus

- Lecture: Mon/Wed 14:30-16:00 at Kim Beang-Ho & Kim Sam-Youl ITC Building (N1) #102.

- Professor:

  | Name | Office Hours | Office | Tel | Email |
  | --- | --- | --- | --- | --- |
  | Song Chong | TBA | IT Center (N1)-913 | 042-350-3473 | songchong@kaist.edu |

- Teaching Assistants:

  | Name | Office Hours | Office | Tel | Email |
  | --- | --- | --- | --- | --- |
  | Seyeon Kim | TBA | IT Center (N1)-918 | 042-350-5473 | sy.kim@netsys.kaist.ac.kr |
  | Donghoon Lee | TBA | IT Center (N1)-918 | 042-350-5473 | dhlee@netsys.kaist.ac.kr |

- Textbooks:
  - Neuro-Dynamic Programming, Dimitri P. Bertsekas and John N. Tsitsiklis, Athena Scientific, 1996
  - Dynamic Programming and Optimal Control, Vol. II: Approximate Dynamic Programming, Dimitri P. Bertsekas, Athena Scientific, 2012
  - Approximate Dynamic Programming: Solving the Curse of Dimensionality, Warren B. Powell, Wiley, 2011
  - Lecture notes

## Grading

- Midterm 35%
- Final 35%
- Homework 20%
- Attendance 10%
