Multi-Agent Reinforcement Learning in Stochastic Games: From AlphaGo to Robust Control

APA

Zhang, K. (2022). Multi-Agent Reinforcement Learning in Stochastic Games: From AlphaGo to Robust Control. The Simons Institute for the Theory of Computing. https://simons.berkeley.edu/talks/multi-agent-reinforcement-learning-stochastic-games-alphago-robust-control

MLA

Zhang, Kaiqing. "Multi-Agent Reinforcement Learning in Stochastic Games: From AlphaGo to Robust Control." The Simons Institute for the Theory of Computing, 11 Feb. 2022, https://simons.berkeley.edu/talks/multi-agent-reinforcement-learning-stochastic-games-alphago-robust-control

BibTex

@misc{scivideos_19620,
  author    = {Zhang, Kaiqing},
  title     = {Multi-Agent Reinforcement Learning in Stochastic Games: From AlphaGo to Robust Control},
  publisher = {The Simons Institute for the Theory of Computing},
  year      = {2022},
  month     = {feb},
  language  = {en},
  url       = {https://simons.berkeley.edu/talks/multi-agent-reinforcement-learning-stochastic-games-alphago-robust-control},
  note      = {19620; see \url{https://scivideos.org/Simons-Institute/19620}}
}
Kaiqing Zhang (MIT)
Source Repository: Simons Institute

Abstract

Reinforcement learning (RL) has recently achieved tremendous success in several artificial intelligence applications. Many of the forefront applications of RL involve "multiple agents", e.g., playing chess and Go, autonomous driving, and robotics. In this talk, I will introduce several recent works on multi-agent reinforcement learning (MARL) with theoretical guarantees. Specifically, we focus on solving the most basic multi-agent RL setting, infinite-horizon zero-sum stochastic games (Shapley 1953), using three common RL approaches: model-based, value-based, and policy-based methods. We first show that, in the tabular setting, "model-based multi-agent RL" (estimating the model first and then planning) can achieve near-optimal sample complexity when a generative model of the game environment is available. Second, we show that a simple variant of "Q-learning" (value-based) can find the Nash equilibrium of the game, even when the agents run it independently, i.e., in a "fully decentralized" fashion. Third, we show that "policy gradient" methods (policy-based) can solve zero-sum stochastic games with linear dynamics and quadratic costs, which equivalently solves a robust and risk-sensitive control problem. Through this connection to robust control, we discover that our policy gradient methods automatically preserve the robustness of the system throughout the iterations, a phenomenon we refer to as "implicit regularization". Time permitting, I will also discuss some ongoing and future directions along these lines.
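
To make the "estimate the model, then plan" setting in the abstract concrete, the sketch below (not the speaker's implementation) runs Shapley-style value iteration on a finite zero-sum stochastic game with a known or estimated model, solving a zero-sum matrix game at every state with a linear program. The array names (P, R), the problem sizes, and the random toy instance are illustrative assumptions; in the sample-complexity results mentioned above, P would be an empirical model built from generative-model samples.

# Minimal sketch, assuming a tabular zero-sum stochastic game with model arrays
#   P[s, a, b, s'] : transition probabilities, R[s, a, b] : maximizer's reward.
import numpy as np
from scipy.optimize import linprog

def matrix_game_value(M):
    """Value of the zero-sum matrix game M for the row (maximizing) player."""
    m, n = M.shape
    # Variables: the row player's mixed strategy x (m entries) and the game value v.
    c = np.zeros(m + 1); c[-1] = -1.0               # minimize -v  <=>  maximize v
    A_ub = np.hstack([-M.T, np.ones((n, 1))])       # v - x^T M[:, b] <= 0 for every column b
    b_ub = np.zeros(n)
    A_eq = np.zeros((1, m + 1)); A_eq[0, :m] = 1.0  # mixed strategy sums to one
    b_eq = np.ones(1)
    bounds = [(0, None)] * m + [(None, None)]       # x >= 0, v unconstrained
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[-1]

def shapley_value_iteration(P, R, gamma=0.95, tol=1e-6, max_iter=1000):
    """Value iteration for a discounted zero-sum stochastic game (Shapley 1953)."""
    n_states = P.shape[0]
    V = np.zeros(n_states)
    for _ in range(max_iter):
        V_new = np.empty(n_states)
        for s in range(n_states):
            Q = R[s] + gamma * P[s] @ V             # stage-game payoff matrix at state s
            V_new[s] = matrix_game_value(Q)
        if np.max(np.abs(V_new - V)) < tol:
            break
        V = V_new
    return V

# Toy usage on a random game with 3 states and 2 actions per player.
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(3), size=(3, 2, 2))       # random transition kernel
R = rng.standard_normal((3, 2, 2))
print(shapley_value_iteration(P, R))

The per-state LP is the textbook way to compute a matrix-game value; the value-based and policy-based results in the abstract (decentralized Q-learning and policy gradient for linear-quadratic games) replace this planning step with sample-based updates and are not shown here.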