Data-Based Optimal Control of Multiagent Systems: A Reinforcement Learning Design Approach

IEEE Trans Cybern. 2019 Dec;49(12):4441-4449. doi: 10.1109/TCYB.2018.2868715. Epub 2018 Sep 26.

Abstract

This paper studies an optimal consensus tracking problem for heterogeneous linear multiagent systems. By introducing tracking error dynamics, the optimal tracking problem is reformulated as finding a Nash-equilibrium solution to multiplayer games, which can be done by solving the associated coupled Hamilton-Jacobi equations. A data-based error estimator is designed to obtain the data-based control for the multiagent systems. Using a quadratic functional to approximate every agent's value function, the optimal cooperative control is obtained by an input-output (I/O) Q-learning algorithm with a value iteration technique in the least-squares sense. The resulting control law solves the optimal consensus problem for multiagent systems using only measured I/O information and does not rely on a model of the multiagent systems. A numerical example is provided to illustrate the effectiveness of the proposed algorithm.
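
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea behind least-squares Q-learning with value iteration and a quadratic value approximation. It is not the paper's method: it assumes a single agent, full state measurements instead of I/O data, no error estimator, and no coupling between agents. The example system (A, B), the weights (Qc, R), and all function names are hypothetical choices made for illustration. The model is used only to generate data; the learner sees only the samples (x, u, x_next).

```python
import numpy as np

def quad_features(z):
    """Monomials z_i*z_j (i <= j), so that z' H z is linear in the unknowns."""
    i, j = np.triu_indices(z.shape[-1])
    return z[..., i] * z[..., j]

def unpack_symmetric(theta, n):
    """Rebuild the symmetric matrix H from the monomial coefficients."""
    H = np.zeros((n, n))
    i, j = np.triu_indices(n)
    H[i, j] = theta
    # Off-diagonal coefficients equal 2*H_ij, so symmetrizing halves them.
    return (H + H.T) / 2.0

def q_learning_vi(X, U, Xn, Qc, R, iters=500):
    """Value-iteration Q-learning with Q(x, u) = [x; u]' H [x; u].

    X, U, Xn hold sampled states, inputs, and successor states (one per row).
    Returns the learned H and the greedy gain K, with u = -K x.
    """
    n, m = X.shape[1], U.shape[1]
    Phi = quad_features(np.hstack([X, U]))
    # Stage cost x' Qc x + u' R u for every sample.
    stage = (np.einsum('ki,ij,kj->k', X, Qc, X)
             + np.einsum('ki,ij,kj->k', U, R, U))
    P = np.zeros((n, n))  # value function V(x) = x' P x, initialized to zero
    for _ in range(iters):
        # Bellman target: stage cost plus current value at the next state.
        target = stage + np.einsum('ki,ij,kj->k', Xn, P, Xn)
        theta, *_ = np.linalg.lstsq(Phi, target, rcond=None)
        H = unpack_symmetric(theta, n + m)
        Hxx, Hxu, Huu = H[:n, :n], H[:n, n:], H[n:, n:]
        K = np.linalg.solve(Huu, Hxu.T)  # minimizer of Q over u is u = -K x
        P = Hxx - Hxu @ K                # V(x) = min_u Q(x, u)
    return H, K

# Hypothetical example system (a discretized double integrator).
rng = np.random.default_rng(0)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Qc, R = np.eye(2), np.eye(1)

# Exploratory data: random states and inputs, successor states from the plant.
X = rng.normal(size=(60, 2))
U = rng.normal(size=(60, 1))
Xn = X @ A.T + U @ B.T

H, K = q_learning_vi(X, U, Xn, Qc, R)
print("learned gain K =", K)
```

With noise-free data, each least-squares step in this sketch reproduces one step of the model-based Riccati recursion, so the learned gain can be checked against the gain computed from A and B directly. In the multiagent setting described in the abstract, each agent would carry its own quadratic approximation and the regression would be built from measured I/O data rather than states.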