A Sim2real method based on DDQN for training a self-driving scale car

* Corresponding author: Tao Du
Abstract
Self-driving based on deep reinforcement learning, as one of the most important applications of artificial intelligence, has become a popular research topic. Most current self-driving methods focus on directly learning an end-to-end control strategy from raw sensory data. Essentially, this strategy is a mapping from images to driving behavior, and it usually suffers from low generalization ability. Improving that generalization through reinforcement learning requires extrinsic reward from the real environment, which may damage the car. To obtain good generalization ability safely, a virtual simulation environment in which different driving scenes can be constructed is designed in Unity. A model is established and analyzed in this simulation environment and trained with a double Deep Q-network (DDQN; the update is sketched below). The trained model is then migrated to a scale car in the real world; this process is called a sim2real method. The sim2real training method efficiently handles both problems: low generalization ability and the risk of damaging the car. Simulations and experiments are carried out to evaluate the performance and effectiveness of the proposed algorithm. Finally, it is demonstrated that the scale car in the real world acquires the capability for autonomous driving.

    Mathematics Subject Classification: Primary: 58F15, 58F17; Secondary: 53C35.

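The training update behind this pipeline is double Q-learning [20]: the online network picks the greedy next action, while a separate, periodically synchronized target network evaluates it, which curbs the overestimation bias of plain Q-learning [22]. Below is a minimal sketch of that update, assuming a PyTorch setup; `online_net`, `target_net`, and the replay-batch layout are illustrative placeholders rather than the authors' implementation.

```python
# A minimal double-DQN TD update, assuming PyTorch. The networks and
# batch tensors are hypothetical stand-ins, not the paper's code.
import torch
import torch.nn.functional as F

def ddqn_loss(online_net, target_net, batch, gamma=0.99):
    """Compute the double-DQN loss for one replay batch.

    batch: (states, actions, rewards, next_states, dones) tensors,
    with actions of dtype long and dones of dtype float.
    """
    states, actions, rewards, next_states, dones = batch

    # Q(s, a) from the online network for the actions actually taken.
    q_sa = online_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)

    with torch.no_grad():
        # The online network selects the greedy next action...
        next_actions = online_net(next_states).argmax(dim=1, keepdim=True)
        # ...and the target network evaluates it (the "double" step).
        next_q = target_net(next_states).gather(1, next_actions).squeeze(1)
        target = rewards + gamma * (1.0 - dones) * next_q

    return F.smooth_l1_loss(q_sa, target)
```

Copying the online weights into `target_net` every few thousand steps (or Polyak-averaging them) completes the scheme.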
Figure 1.  The reinforcement learning scale car based on DDQN

Figure 2.  A 1:16 scale car. It is built on donkeycar [18], an open-source DIY self-driving platform for small-scale cars

    Figure 3.  The process of reinforcement learning

    Figure 4.  The architecture of the network

Figure 5.  Examples of raw images transformed into segmented images (one such preprocessing step is sketched after the figure list)

Figure 6.  The learning curve of average reward versus training episodes

Figure 7.  The scale car in the Unity simulation

Figure 8.  The road for the self-driving scale car, which contains two sharp curves and two gentle curves

Figure 9.  The trained self-driving scale car

Figure 10.  An obstacle added to the road; the car's view is shown in the lower left of the figure

Figure 11.  Five obstacles on the road in the left figure and three obstacles in the right figure
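Figure 5 points at the sim2real device used here: both simulated and real camera frames are reduced to segmented images before reaching the network, so the policy never depends on domain-specific texture. One way such a preprocessing step could look is sketched below, using a simple OpenCV color threshold; the HSV bounds and the helper `segment_frame` are assumptions for illustration, not the paper's actual segmentation.

```python
# A hypothetical raw-to-segmented preprocessing step in the spirit of
# Figure 5; the threshold values are illustrative, not the authors'.
import cv2
import numpy as np

def segment_frame(bgr_frame: np.ndarray) -> np.ndarray:
    """Return a binary mask that keeps bright, low-saturation pixels
    (e.g. white lane markings) and discards everything else."""
    hsv = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, (0, 0, 180), (180, 60, 255))
    # Morphological opening removes speckle noise, so masks from the
    # simulator and the real camera resemble each other more closely.
    return cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))
```

Feeding such masks to the network in both domains is one way to let a policy trained only in Unity transfer to the physical car.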

Table 1.  Performance of CNN and DDQN on the same road. Each entry is the number of times the car ran off the road.

Laps (lighting)    CNN    DDQN
5 (night)            3       0
10 (daylight)        4       1
10 (night)           8       2
15 (daylight)        9       1
15 (night)          12       3

Table 2.  Performance of CNN and DDQN on the same road with obstacle(s), over five laps. Each entry is the number of times the car hit the obstacle(s).

Obstacles    CNN    DDQN
3              3      0
5              4      0
References

[1] H. Abraham, C. Lee, S. Brady, C. Fitzgerald, B. Mehler, B. Reimer and J. F. Coughlin, Autonomous vehicles, trust, and driving alternatives: a survey of consumer preferences, Massachusetts Inst. Technol., AgeLab, Cambridge, (2016), 1–16.
    [2] K. J. Aditya, Working model of self-driving car using Convolutional Neural Network, Raspberry Pi and Arduino, in 2018 Second International Conference on Electronics, Communication and Aerospace Technology, IEEE, 2018, 1630–1635.
[3] P. Andhare and S. Rawat, Pick and place industrial robot controller with computer vision, in 2016 International Conference on Computing Communication Control and Automation, 2016, 1–4. doi: 10.1109/ICCUBEA.2016.7860048.
    [4] C. Chen, A. Seff, A. Kornhauser and J. Xiao, Deepdriving: Learning affordance for direct perception in autonomous driving, in IEEE International Conference on Computer Vision, 2015, 2722–2730. doi: 10.1109/ICCV.2015.312.
[5] Z. Chen and X. Huang, End-to-end learning for lane keeping of self-driving cars, in IEEE Intelligent Vehicles Symposium, IEEE, 2017, 1856–1860. doi: 10.1109/IVS.2017.7995975.
    [6] F. Codevilla, M. Miiller, A. Lopez, V. Koltun and A. Dosovitskiy, End-to-end driving via conditional imitation learning, in IEEE International Conference on Robotics and Automation, IEEE, 2018, 4693–4700. doi: 10.1109/ICRA.2018.8460487.
    [7] D. Dorr, D. Grabengiesser and F. Gauterin, Online driving style recognition using fuzzy logic, in 17th International IEEE Conference on Intelligent Transportation Systems, IEEE, 2014, 1021–1026. doi: 10.1109/ITSC.2014.6957822.
    [8] X. Liang, T. Wang, L. Yang and E. Xing, CIRL: controllable imitative reinforcement learning for vision-based self-driving, in Proceedings of the European Conference on Computer Vision, 2018, 604–620. doi: 10.1007/978-3-030-01234-2_36.
[9] L. J. Lin, Reinforcement Learning for Robots Using Neural Networks, Ph.D. thesis, Carnegie Mellon University, Pittsburgh, 1993.
    [10] R. R. Meganathan, A. A. Kasi and S. Jagannath, Computer vision based novel steering angle calculation for autonomous vehicles, in 2018 Second IEEE International Conference on Robotic Computing, 2018, 143–146.
    [11] https://github.com/naokishibuya/car-behavioral-cloning
[12] V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra and M. Riedmiller, Playing Atari with deep reinforcement learning, preprint, arXiv: 1312.5602.
[13] V. Mnih et al., Human-level control through deep reinforcement learning, Nature, 518 (2015), 529–533. doi: 10.1038/nature14236.
    [14] C. J. Pretorius, M. C. du Plessis and J. W. Gonsalves, The transferability of evolved hexapod locomotion controllers from simulation to real hardware, in 2017 IEEE International Conference on Real-time Computing and Robotics, 2017, 567–574. doi: 10.1109/RCAR.2017.8311923.
    [15] Understanding the Fatal Tesla Accident on Autopilot and the NHTSA Probe, Electrek, 2016. Available from: https://electrek.co/2016/07/01/understanding-fatal-tesla-accident-autopilot-nhtsa-probe/.
[16] M. Sadeghzadeh, D. Calvert and H. A. Abdullah, Self-learning visual servoing of robot manipulator using explanation-based fuzzy neural networks and Q-learning, Journal of Intelligent and Robotic Systems, 78 (2015), 83–104. doi: 10.1007/s10846-014-0151-5.
[17] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, 2nd edition, Adaptive Computation and Machine Learning, MIT Press, Cambridge, MA, 2018.
    [18] https://docs.donkeycar.com/
    [19] https://github.com/autorope/donkeycar
[20] H. van Hasselt, A. Guez and D. Silver, Deep reinforcement learning with double Q-learning, in Thirtieth AAAI Conference on Artificial Intelligence, 2016, 2094–2100.
[21] D. Wang, J. Wen, Y. Wang, X. Huang and F. Pei, End-to-end self-driving using deep neural networks with multi-auxiliary tasks, Automotive Innovation, 2 (2019), 127–136. doi: 10.1007/s42154-019-00057-1.
    [22] C. J. Watkins and P. Dayan, Q-learning, Machine Learning, 8 (1992), 279-292.  doi: 10.1007/BF00992698.
    [23] T. Yamawaki and M. Yashima, Application of adam to iterative learning for an in-hand manipulation task, ROMANSY 22 Robot Design, Dynamics and Control, 584 (2019), 272-279.  doi: 10.1007/978-3-319-78963-7_35.