Smooth Actor-Critic Algorithm for End-to-End Autonomous Driving

2020 
For intelligent sequential decision-making tasks such as autonomous driving, the actions an agent takes over a short period of time should be smooth rather than choppy. To help the agent learn smooth actions (steering, accelerating, braking) for autonomous driving, this paper proposes a smooth actor-critic algorithm for both deterministic-policy and stochastic-policy systems. Specifically, a regularization term is added to the objective function of actor-critic methods to constrain the difference between neighbouring actions to a small region without affecting the convergence of the overall system. A theoretical analysis and proof of the modified methods are then given, so that iterative improvement is theoretically guaranteed. Moreover, experiments in different simulation systems show that the methods generate much smoother actions and achieve more robust performance for reinforcement learning-based end-to-end autonomous driving.
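The abstract does not give the exact form of the regularizer, so the following is only a minimal PyTorch sketch of the general idea: a DDPG-style deterministic actor loss augmented with a penalty on the gap between actions produced at consecutive states. The squared-L2 penalty, the network shapes, and the coefficient name `smooth_coef` are assumptions for illustration, not details taken from the paper.

```python
import torch
import torch.nn as nn

class Actor(nn.Module):
    """Deterministic policy: maps a state to a bounded continuous action."""
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim), nn.Tanh(),  # e.g. steering/throttle in [-1, 1]
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

class Critic(nn.Module):
    """Q-network: maps a (state, action) pair to a scalar value."""
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([state, action], dim=-1))

def smooth_actor_loss(actor, critic, state, next_state, smooth_coef=0.1):
    """Standard deterministic actor loss plus an (assumed) smoothness penalty.

    The first term maximizes Q(s, pi(s)); the second constrains pi(s_{t+1})
    to stay close to pi(s_t), discouraging choppy neighbouring actions.
    """
    action = actor(state)
    next_action = actor(next_state)
    q_loss = -critic(state, action).mean()                  # usual actor objective
    reg = ((next_action - action) ** 2).sum(dim=-1).mean()  # neighbouring-action gap
    return q_loss + smooth_coef * reg
```

In a full training loop this loss would sit alongside the usual TD update for the critic; `smooth_coef` trades off action smoothness against the standard actor objective, and setting it to zero recovers the unregularized method.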