Multi-agent Reinforcement Learning (AlphaStar)

Posted Oct 16, 2024

By Zheng Yuan



Key Components

V-Trace
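V-trace is an off-policy correction: trajectories are generated by a behaviour policy \(\mu\) but used to update a target policy \(\pi\), so each temporal-difference error is reweighted by a truncated importance ratio. The sketch below computes V-trace value targets with the backward recursion \(v_s - V(x_s) = \delta_s V + \gamma c_s (v_{s+1} - V(x_{s+1}))\), where \(\delta_s V = \rho_s (r_s + \gamma V(x_{s+1}) - V(x_s))\), \(\rho_s = \min(\bar{\rho}, \pi/\mu)\) and \(c_s = \min(\bar{c}, \pi/\mu)\). The function name, argument names and default hyperparameters are illustrative assumptions, not values taken from AlphaStar.

```python
def vtrace_targets(rewards, values, bootstrap_value, ratios,
                   gamma=0.99, rho_bar=1.0, c_bar=1.0):
    """Compute V-trace value targets for one trajectory.

    rewards[s]      : reward r_s
    values[s]       : current estimate V(x_s)
    bootstrap_value : V(x_n) for the state after the last step
    ratios[s]       : importance ratio pi(a_s | x_s) / mu(a_s | x_s)
    All names and defaults here are hypothetical choices for illustration.
    """
    values = list(values)
    n = len(rewards)
    next_values = values[1:] + [bootstrap_value]
    targets = [0.0] * n
    acc = 0.0  # running value of v_{s+1} - V(x_{s+1})
    for s in reversed(range(n)):
        rho = min(rho_bar, ratios[s])  # truncated weight on the TD error
        c = min(c_bar, ratios[s])      # truncated "trace" weight
        delta = rho * (rewards[s] + gamma * next_values[s] - values[s])
        acc = delta + gamma * c * acc
        targets[s] = values[s] + acc
    return targets
```

With all ratios equal to 1 (on-policy data) and both truncation levels at 1, the targets reduce to ordinary n-step bootstrapped returns; with all ratios equal to 0 they collapse to the current value estimates.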

TD(\(\lambda\))
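TD(\(\lambda\)) interpolates between one-step bootstrapping (\(\lambda = 0\)) and full Monte Carlo returns (\(\lambda = 1\)). A minimal sketch of the \(\lambda\)-return used as a value target, via the recursion \(G_t^{\lambda} = r_t + \gamma\left[(1-\lambda) V(s_{t+1}) + \lambda G_{t+1}^{\lambda}\right]\), follows. The function name, argument layout and default values are assumptions made for illustration only.

```python
def lambda_returns(rewards, next_values, gamma=0.99, lam=0.8):
    """Compute lambda-returns for one trajectory.

    rewards[t]     : reward r_t
    next_values[t] : value estimate V(s_{t+1}); the last entry bootstraps
                     the return beyond the end of the trajectory
    Names and defaults are hypothetical, chosen for this sketch.
    """
    g = next_values[-1]  # G beyond the final step falls back to the bootstrap value
    returns = [0.0] * len(rewards)
    for t in reversed(range(len(rewards))):
        # Mix the one-step bootstrap with the longer return from step t+1.
        g = rewards[t] + gamma * ((1.0 - lam) * next_values[t] + lam * g)
        returns[t] = g
    return returns
```

Setting `lam=0.0` recovers the one-step TD target \(r_t + \gamma V(s_{t+1})\), and `lam=1.0` recovers the discounted sum of rewards plus the final bootstrap value.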

Architecture

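\[\pi_{\theta}(a_t \vert s_t, z) = \mathbb{P}[a_t \vert s_t, z]\]

Here \(s_t = (o_{1:t}, a_{1:t-1})\) is the history of observations and actions up to step \(t\), and \(z\) is a statistic summarizing a strategy sampled from human data (for example, a build order).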

General-purpose Neural Network Components

Observation of units -> Self-attention mechanism
Spatial and non-spatial information -> Scatter connections
Partial observability -> Deep LSTM
Structured, combinatorial action space -> Auto-regressive policy and recurrent pointer network
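
To make two of these components concrete, here is a toy sketch of self-attention over unit feature vectors followed by an auto-regressive action choice whose second step "points" at one of the input units. It is a simplified illustration under stated assumptions, not AlphaStar's network. The attention uses identity query/key/value projections, the pointer is a single dot-product softmax rather than a recurrent pointer network, and `type_logits` and `type_queries` are hypothetical inputs standing in for learned quantities.

```python
import math
import random


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def self_attention(units):
    """Scaled dot-product self-attention with identity Q/K/V projections
    (an assumption for brevity; a real model learns these projections)."""
    d = len(units[0])
    encoded = []
    for q in units:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in units])
        encoded.append([sum(w * v[j] for w, v in zip(weights, units))
                        for j in range(d)])
    return encoded


def pointer_distribution(query, unit_keys):
    """Pointer-style head: a softmax over the input units themselves,
    so the output size follows the number of units."""
    return softmax([dot(query, k) for k in unit_keys])


def sample_action(units, type_logits, type_queries, rng):
    """Auto-regressive sampling: pick an action type first, then a target
    unit conditioned on that type. Returns (type, target, joint probability)."""
    encoded = self_attention(units)
    type_probs = softmax(type_logits)
    a_type = rng.choices(range(len(type_probs)), weights=type_probs)[0]
    target_probs = pointer_distribution(type_queries[a_type], encoded)
    target = rng.choices(range(len(encoded)), weights=target_probs)[0]
    return a_type, target, type_probs[a_type] * target_probs[target]
```

The joint probability factorizes as \(p(\text{type}) \cdot p(\text{target} \vert \text{type})\), which is what makes a large combinatorial action space tractable: each head only has to model one argument given the ones already chosen. A call such as `sample_action([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.5], [[1.0, 0.0], [0.0, 1.0]], random.Random(0))` returns one sampled pair together with its probability.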


This post is licensed under CC BY 4.0 by the author.