
Heuristic dynamic programming-based learning control for discrete-time disturbed multi-agent systems

  • Research Article
  • Published in: Control Theory and Technology

Abstract

Owing to its extensive applications in many fields, the synchronization problem for multi-agent systems has been widely investigated. Synchronization is a pivotal issue: under the designed control policy, the output or the state of each following agent should become consistent with that of the leader. This paper investigates heuristic dynamic programming (HDP)-based learning tracking control for discrete-time multi-agent systems to achieve synchronization in the presence of disturbances. Because the coupled Hamilton–Jacobi–Bellman equation is difficult to solve analytically, an improved HDP learning control algorithm is proposed to realize synchronization between the leader and all following agents. The algorithm is implemented with an action-critic neural network structure, in which the action network and the critic network learn the optimal control policy and the cost function, respectively, and an auxiliary action network is introduced as well. Finally, two numerical examples and a practical application to mobile robots are presented to demonstrate the control performance of the HDP-based learning control algorithm.
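The abstract outlines the standard HDP actor-critic structure: a critic network is trained toward the Bellman (temporal-difference) target formed from the stage cost and the next-step cost estimate, and an action network is updated along the gradient of that estimated cost. The short Python sketch below illustrates this generic update pattern for a single follower tracking a leader. The linear dynamics, quadratic critic features, disturbance level and learning rates are assumptions made purely for illustration; the paper's auxiliary action network and the multi-agent (graph-coupled) error terms are omitted, so this is not the authors' algorithm.

    # Minimal HDP actor-critic sketch for discrete-time leader-following tracking
    # with an additive disturbance. All dynamics, features, and learning rates are
    # assumptions made for illustration; this is not the paper's algorithm.
    import numpy as np

    rng = np.random.default_rng(0)

    # Assumed follower dynamics x_{k+1} = A x_k + B u_k + d_k; the leader follows
    # x0_{k+1} = A x0_k, so the tracking error obeys e_{k+1} = A e_k + B u_k + d_k.
    A = np.array([[1.0, 0.1],
                  [0.0, 0.95]])
    B = np.array([[0.0],
                  [0.1]])
    Q, R, gamma = np.eye(2), np.array([[0.1]]), 0.95

    def phi(e):                        # quadratic critic features of the error
        return np.array([e[0]**2, e[0]*e[1], e[1]**2])

    def dphi(e):                       # Jacobian d(phi)/d(e), shape (3, 2)
        return np.array([[2*e[0], 0.0],
                         [e[1],  e[0]],
                         [0.0, 2*e[1]]])

    w = np.zeros(3)                    # critic weights:  J_hat(e) = w . phi(e)
    K = np.zeros((1, 2))               # action network:  u = -K e
    alpha_c, alpha_a = 0.05, 0.02      # assumed learning rates

    e = np.array([1.0, -0.5])          # initial tracking error
    for k in range(2000):
        u = -K @ e                                     # action network output
        r = e @ Q @ e + u @ R @ u                      # stage cost
        d = 0.01 * rng.standard_normal(2)              # bounded disturbance
        e_next = A @ e + B @ u + d

        # Critic step: one gradient descent step on the Bellman (TD) error
        #   delta = J_hat(e_k) - [ r_k + gamma * J_hat(e_{k+1}) ].
        delta = w @ phi(e) - (r + gamma * w @ phi(e_next))
        w -= alpha_c * delta * phi(e)

        # Action step: descend r(e,u) + gamma * J_hat(e_{k+1}) with respect to u,
        # then propagate to K through u = -K e (so du/dK = -e^T).
        dJ_du = 2 * (R @ u) + gamma * (B.T @ (dphi(e_next).T @ w))
        K += alpha_a * np.outer(dJ_du, e)

        e = e_next

    print("final tracking error norm:", np.linalg.norm(e))
    print("learned feedback gain K:", K)

In the multi-agent setting of the paper, each follower would maintain its own critic and action networks defined on a local, neighborhood-based tracking error, and the auxiliary action network mentioned in the abstract would be trained alongside them; those couplings are left out of the sketch for brevity.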




Acknowledgements

This work was supported by the Tianjin Natural Science Foundation under Grant 20JCYBJC00880, the Beijing Key Laboratory Open Fund of Long-Life Technology of Precise Rotation and Transmission Mechanisms, and the Guangdong Provincial Key Laboratory of Intelligent Decision and Cooperative Control.

Author information


Corresponding author

Correspondence to Chaoxu Mu.


About this article


Cite this article

Zhang, Y., Mu, C., Zhang, Y. et al. Heuristic dynamic programming-based learning control for discrete-time disturbed multi-agent systems. Control Theory Technol. 19, 339–353 (2021). https://doi.org/10.1007/s11768-021-00049-9

