Abstract
Owing to its extensive applications in many fields, the synchronization problem for multi-agent systems has been widely investigated. Synchronization is a pivotal issue for multi-agent systems: under the designed control policy, the output or the state of every following agent should become consistent with that of the leader. This paper investigates a heuristic dynamic programming (HDP)-based learning tracking control for discrete-time multi-agent systems subject to disturbances, with the aim of achieving synchronization. Moreover, since the coupled Hamilton–Jacobi–Bellman equation is difficult to solve analytically, an improved HDP learning control algorithm is proposed to realize synchronization between the leader and all following agents; it is implemented by an action-critic neural network structure. The action and critic neural networks are used to learn the optimal control policy and the cost function, respectively, with an auxiliary action network introduced. Finally, two numerical examples and a practical application to mobile robots demonstrate the control performance of the HDP-based learning control algorithm.
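To illustrate the actor-critic iteration underlying HDP, the following is a minimal sketch on an assumed scalar discrete-time system (not the paper's multi-agent setting): the neural action and critic networks are replaced by scalar parameterizations, the critic weight `p` approximating the cost function V(x) ≈ p·x² and the actor gain `k` giving the control u = −k·x. The dynamics, costs, and update rule here are illustrative assumptions, not taken from the paper.

```python
# Assumed scalar system x_{k+1} = a*x_k + b*u_k with stage cost q*x^2 + r*u^2
# (illustrative stand-in for the paper's multi-agent error dynamics).
a, b = 0.9, 1.0
q, r = 1.0, 1.0

p = 0.0  # critic weight: V(x) ~ p * x^2, initialized at zero
k = 0.0  # actor gain:    u(x) = -k * x
for _ in range(200):
    # Actor update: minimize q*x^2 + r*u^2 + p*(a*x + b*u)^2 over u,
    # which for this quadratic case gives a closed-form gain.
    k = a * b * p / (r + b ** 2 * p)
    # Critic update (HDP-style value-iteration target):
    # V(x) <- stage cost under u = -k*x plus cost-to-go of the next state.
    p = q + r * k ** 2 + p * (a - b * k) ** 2

print(round(p, 3), round(k, 3))
```

For this scalar case the iteration converges to the fixed point of the discrete-time Riccati equation (p ≈ 1.484, k ≈ 0.538); in the paper's setting the closed-form actor step is replaced by gradient-based training of the action network, and the critic target is built from the coupled multi-agent cost.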
Acknowledgements
This work was supported by the Tianjin Natural Science Foundation under Grant 20JCYBJC00880, the Beijing Key Laboratory Open Fund of Long-Life Technology of Precise Rotation and Transmission Mechanisms, and the Guangdong Provincial Key Laboratory of Intelligent Decision and Cooperative Control.
Cite this article
Zhang, Y., Mu, C., Zhang, Y. et al. Heuristic dynamic programming-based learning control for discrete-time disturbed multi-agent systems. Control Theory Technol. 19, 339–353 (2021). https://doi.org/10.1007/s11768-021-00049-9