Reinforcement Learning for Stochastic Control Problems

  • Speaker: Li Shuang 李爽 (SUSTech)

  • Time: 2024-01-18, 19:40-20:20

  • Venue: College of Science Building, Room M5024

Abstract: In reinforcement learning, methods for problem approximation include enforced decomposition and probabilistic approximation. Enforced decomposition breaks a complex problem into simpler subproblems that are easier to solve, while probabilistic approximation replaces the original probability model with a simpler one, improving learning efficiency. Certainty equivalent control replaces the stochastic problem with a deterministic one, reducing computational complexity and accelerating learning. The rollout method simulates action trajectories in advance to guide policy improvement and help avoid local optima. The policy improvement principle repeatedly refines the current policy, so that the agent gradually learns better decision-making strategies. Together, these methods and principles support the broad application of reinforcement learning.
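
As a rough illustration of how rollout and certainty equivalence fit together, the following minimal Python sketch may help. It is not taken from the talk: the toy dynamics, the cost function, the action set, the horizon, and every name in it (`step`, `base_policy`, `rollout_q`, `rollout_policy`) are assumptions chosen only for illustration.

```python
import random

ACTIONS = (-1, 0, 1)   # hypothetical control set
HORIZON = 5            # hypothetical lookahead length


def step(x, u, w):
    """Toy dynamics x' = x + u + w with stage cost x^2 + |u| (illustrative only)."""
    return x + u + w, x * x + abs(u)


def base_policy(x):
    """A deliberately naive base policy: never act."""
    return 0


def rollout_q(x, u, rng, n_sims=200, certainty_equivalent=False):
    """Estimate the cost of applying u now and following the base policy afterwards.

    With certainty_equivalent=True the zero-mean disturbance is replaced by its
    mean (0), so a single deterministic simulation is enough.
    """
    if certainty_equivalent:
        n_sims = 1
    total = 0.0
    for _ in range(n_sims):
        w = 0 if certainty_equivalent else rng.choice((-1, 0, 1))
        state, cost = step(x, u, w)
        for _ in range(HORIZON - 1):
            w = 0 if certainty_equivalent else rng.choice((-1, 0, 1))
            state, stage_cost = step(state, base_policy(state), w)
            cost += stage_cost
        cost += state * state  # terminal cost
        total += cost
    return total / n_sims


def rollout_policy(x, rng, **kwargs):
    """One-step lookahead: pick the action with the lowest estimated rollout cost."""
    return min(ACTIONS, key=lambda u: rollout_q(x, u, rng, **kwargs))


if __name__ == "__main__":
    rng = random.Random(0)
    print(rollout_policy(3, rng, certainty_equivalent=True))
    print(rollout_policy(3, rng))  # Monte Carlo variant
```

In the certainty equivalent mode, starting from x = 3, the estimated costs for the actions -1, 0, and 1 are 30, 54, and 90, so the rollout policy picks -1 whereas the base policy would do nothing. This is the policy improvement idea in miniature: one step of lookahead on top of a weak base policy yields a policy that is no worse in this toy setting.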