A Model-Based Approach for Continuous-Time Policy Evaluation with Unknown Lévy Process Dynamics
Research Discussion Seminar Series
Title: A Model-Based Approach for Continuous-Time Policy Evaluation with Unknown Lévy Process Dynamics
Speaker: Xiaochuan Tian (田小川), University of California, San Diego
Place: Room 1124, Rear Main Building (后主楼)
Time: May 31, 2024, 2:00-3:00 PM
Inviter: Yunfeng Xiong (熊云丰)
Abstract
Reinforcement learning (RL) is an active branch of machine learning focused on learning optimal policies to maximize cumulative rewards through interaction with the environment. While traditional RL research primarily deals with Markov decision processes in discrete time and space, we explore RL in a continuous-time framework, essential for high-frequency interactions such as stock trading and autonomous driving.
Our research introduces a PDE-based framework for policy evaluation in continuous-time environments, where dynamics are modeled by Lévy processes. We also formulate the Hamilton-Jacobi-Bellman (HJB) equation for the corresponding stochastic optimal control problem governed by Lévy dynamics. Our approach includes two primary components: 1) Estimating parameters of Lévy processes from observed data, and 2) Evaluating policies by solving the associated integro-PDEs. In the first step, we use a fast solver for the fractional Fokker-Planck equation to accurately approximate transition probabilities. We demonstrate that combining this method with importance sampling techniques is vital for parameter recovery in heavy-tailed data distributions. In the second step, we offer a theoretical guarantee on the accuracy of policy evaluation considering modeling error. Our work establishes a foundation for continuous-time RL in environments characterized by complex, heavy-tailed dynamics.
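For orientation, the policy-evaluation problem described above can be written in a generic textbook form. This is a sketch under assumed notation, not necessarily the formulation used in the talk: the discounted infinite-horizon setting, the discount rate $\beta$, running reward $r$, drift $b$, diffusion coefficient $\sigma$, and Lévy measure $\nu$ are all introduced here for illustration only.

```latex
% Value function of a fixed policy (assumed discounted, infinite-horizon setting)
V(x) = \mathbb{E}\left[ \int_0^\infty e^{-\beta t}\, r(X_t)\, dt \,\middle|\, X_0 = x \right]

% Integro-PDE satisfied by V, where \mathcal{L} is the generator of the Lévy-driven state process
\beta V(x) = r(x) + \mathcal{L} V(x)

% Generic Lévy generator: drift, Gaussian part, and jump integral (all coefficients assumed)
\mathcal{L} V(x) = b \cdot \nabla V(x)
  + \tfrac{1}{2} \operatorname{tr}\!\left( \sigma \sigma^{\top} \nabla^2 V(x) \right)
  + \int_{\mathbb{R}^d \setminus \{0\}} \left[ V(x+z) - V(x) - z \cdot \nabla V(x)\, \mathbf{1}_{\{|z| < 1\}} \right] \nu(dz)
```

In the special case of a symmetric $\alpha$-stable process with $0 < \alpha < 2$ and no drift or Gaussian part, the jump integral reduces to a constant multiple of the fractional Laplacian $-(-\Delta)^{\alpha/2}$, which is how a fractional Fokker-Planck equation enters when approximating transition probabilities.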
About the Speaker
Xiaochuan Tian is an Assistant Professor of Mathematics at the University of California, San Diego. She received her Ph.D. from Columbia University and was a Bing Instructor in the Department of Mathematics at the University of Texas at Austin. Her research interests include numerical PDEs, nonlocal integral models, multiscale and stochastic modeling, and, most recently, the intersection of PDEs and data analysis. Her research is partially funded by a National Science Foundation CAREER grant and an Alfred P. Sloan Fellowship.