Abstract: This lecture introduces how to use reinforcement learning to solve an impulse control problem. First, the impulse control problem is converted into an optimal stopping problem. Reinforcement learning is then used to solve the optimal stopping problem, and finally the verification theorem is applied to obtain the optimal impulse control.