Citation:
  M. I. Abouheaf, F. L. Lewis, M. S. Mahmoud, D. G. Mikulski. Discrete-time dynamic graphical games: model-free reinforcement learning solution [J]. Control Theory and Technology, 2015, 13(1): 55-69.



Received: December 31, 2014; Revised: January 15, 2015

Discrete-time dynamic graphical games: model-free reinforcement learning solution
M. I. Abouheaf, F. L. Lewis, M. S. Mahmoud, D. G. Mikulski
(Systems Engineering Department, King Fahd University of Petroleum & Minerals; UTA Research Institute, University of Texas at Arlington; State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University; Ground Vehicle Robotics (GVR), U.S. Army TARDEC)
Abstract: 
This paper introduces a model-free reinforcement learning technique for solving a class of dynamic games known as dynamic graphical games. The graphical game results from multi-agent dynamical systems, where pinning control is used to make all the agents synchronize to the state of a command generator or leader agent. Novel coupled Bellman equations and Hamiltonian functions are developed for the dynamic graphical games. Hamiltonian mechanics is used to derive the necessary conditions for optimality. The solution of the dynamic graphical game is given in terms of the solution to a set of coupled Hamilton-Jacobi-Bellman equations developed herein. The Nash equilibrium solution for the graphical game is likewise given in terms of the solution to the underlying coupled Hamilton-Jacobi-Bellman equations. An online model-free policy iteration algorithm is developed to learn the Nash solution for the dynamic graphical game. This algorithm does not require any knowledge of the agents’ dynamics. A proof of convergence for this multi-agent learning algorithm is given under a mild assumption about the interconnectivity properties of the graph. A gradient descent technique with critic network structures is used to implement the policy iteration algorithm and solve the graphical game online in real time.
Key words: Dynamic graphical games, Nash equilibrium, discrete mechanics, optimal control, model-free reinforcement learning, policy iteration
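
As a rough illustration of the synchronization setting described in the abstract (not the paper's policy iteration algorithm), the sketch below simulates pinning control on a small directed graph. Everything in it is an assumption made for illustration: scalar single-integrator agents, a constant leader state, a hand-picked feedback gain, and the local neighborhood tracking error commonly used in the cooperative control literature, eps_i = sum_j e_ij (x_j - x_i) + g_i (x_0 - x_i), where e_ij are edge weights and g_i are pinning gains. The graph used here contains a directed path from the leader to every agent, which is one common form of connectivity condition; the abstract itself does not state the exact assumption.

```python
# Illustrative sketch only: agent model, gain, graph, and leader value are
# assumptions chosen for this example and do not come from the paper.

def tracking_errors(x, x0, adjacency, pinning):
    """Local neighborhood tracking error of every agent.

    eps_i = sum_j e_ij * (x_j - x_i) + g_i * (x0 - x_i)
    """
    n = len(x)
    errors = []
    for i in range(n):
        neighbor_term = sum(adjacency[i][j] * (x[j] - x[i]) for j in range(n))
        pinning_term = pinning[i] * (x0 - x[i])
        errors.append(neighbor_term + pinning_term)
    return errors


def simulate(x, x0, adjacency, pinning, gain=0.3, steps=200):
    """Iterate x_i(k+1) = x_i(k) + gain * eps_i(k) for a fixed number of steps."""
    x = list(x)
    for _ in range(steps):
        eps = tracking_errors(x, x0, adjacency, pinning)
        x = [xi + gain * ei for xi, ei in zip(x, eps)]
    return x


# Directed chain: agent 0 listens to the leader, agent 1 to agent 0,
# and agent 2 to agent 1. Row i lists the weights e_ij of agent i's in-edges.
ADJACENCY = [
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
]
PINNING = [1, 0, 0]  # only agent 0 is pinned to the leader
LEADER_STATE = 2.0

final_states = simulate([0.0, -1.0, 5.0], LEADER_STATE, ADJACENCY, PINNING)
```

With this fixed linear feedback, all three states approach the leader state even though only one agent observes the leader directly. The paper replaces such a hand-tuned gain with policies learned online, without a model of the agents' dynamics.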

