TensorLearn
Back to Course
Reinforcement Learning: Agents
Module 7 of 8

7. Model-Based RL (AlphaZero)

1. Planning to Win

Standard RL guesses. AlphaZero simulates. It uses MCTS (Monte Carlo Tree Search) to look ahead into the future.

2. Self-Play

It starts knowing nothing. It plays millions of games against itself. The "Winner" teaches the "Loser".

Mark as Completed

TensorLearn - AI Engineering for Professionals