Experience Replay for Least Squares Policy Iteration

Uploaded: 2021-04-08 15:12:25 · PDF file · 1.17 MB · Popularity: 5
Policy iteration, which evaluates and improves the control policy iteratively, is a reinforcement learning method. Policy evaluation with the least-squares method can draw more useful information from the empirical data and therefore improve the data validity. However, most existing online least-squ
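
The evaluate-then-improve loop with least-squares policy evaluation described above can be sketched as follows. This is a minimal illustration of plain batch least-squares policy iteration, not the paper's own algorithm: the two-state toy problem, the one-hot features, and the names `solve`, `lstdq`, `lspi`, and `SAMPLES` are all assumptions made for the example. The same stored samples are reused in every evaluation pass, which is the simplest form of replaying experience.

```python
# Toy problem (an assumption for illustration): 2 states, 2 actions.
# Taking action a moves deterministically to state a; landing in state 1 pays 1.
N_STATES, N_ACTIONS, GAMMA = 2, 2, 0.9


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def idx(s, a):
    """Index of the one-hot feature for the state-action pair (s, a)."""
    return s * N_ACTIONS + a


def lstdq(samples, policy):
    """Least-squares evaluation of `policy` from stored (s, a, r, s') samples.

    Builds A = sum phi (phi - gamma * phi')^T and b = sum phi * r,
    then solves A w = b for the Q-function weights w.
    """
    k = N_STATES * N_ACTIONS
    A = [[0.0] * k for _ in range(k)]
    b = [0.0] * k
    for s, a, r, s_next in samples:
        i = idx(s, a)
        j = idx(s_next, policy[s_next])  # feature of the successor pair
        A[i][i] += 1.0
        A[i][j] -= GAMMA
        b[i] += r
    return solve(A, b)


def lspi(samples, max_iters=20):
    """Alternate evaluation and greedy improvement until the policy is stable."""
    policy = [0] * N_STATES
    w = [0.0] * (N_STATES * N_ACTIONS)
    for _ in range(max_iters):
        w = lstdq(samples, policy)  # evaluation reuses all samples
        new_policy = [
            max(range(N_ACTIONS), key=lambda a: w[idx(s, a)])
            for s in range(N_STATES)
        ]
        if new_policy == policy:
            break
        policy = new_policy
    return policy, w


# One stored transition per state-action pair.
SAMPLES = [(s, a, float(a == 1), a) for s in range(N_STATES) for a in range(N_ACTIONS)]
policy, weights = lspi(SAMPLES)
```

On this toy problem the loop settles on taking action 1 in both states. With a discount of 0.9, the learned Q-values are close to 10 for action 1 and 9 for action 0, since a reward of 1 per step sums to 1 / (1 - 0.9) = 10.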