Experience Replay for Least Squares Policy Iteration
Policy iteration, which evaluates and improves the control policy iteratively, is a reinforcement learning method. Policy evaluation with the least-squares method can draw more useful information from the empirical data and therefore improve the data validity. However, most existing online least-squ
用户评论