TensorLearn
Reinforcement Learning: Agents
Module 8 of 8

8. Offline RL

1. Learning from History

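In robotics, exploration is dangerous: a policy that learns by trial and error can crash the hardware, and crashing serves no one. Offline RL instead learns from a static dataset of previously logged transitions (in effect, a fixed replay buffer) without any further interaction with the world.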
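To make "learning without interacting" concrete, here is a minimal sketch of tabular Q-learning that only ever reads a fixed log. The 3-state chain, the `logged` transitions, and the hyperparameters are made-up assumptions for illustration, not part of the course material.

```python
import random

def offline_q_learning(dataset, n_states, n_actions,
                       gamma=0.9, lr=0.5, epochs=200, seed=0):
    """Fit a tabular Q-function from logged transitions only (no environment calls)."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(epochs):
        # Reshuffle the same fixed log each epoch; nothing new is ever collected.
        for s, a, r, s_next, done in rng.sample(dataset, len(dataset)):
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += lr * (target - q[s][a])
    return q

# Hypothetical log: (state, action, reward, next_state, done).
# Action 1 moves right along the chain, action 0 stays in place.
logged = [
    (0, 1, 0.0, 1, False),
    (1, 1, 1.0, 2, True),
    (0, 0, 0.0, 0, False),
    (1, 0, 0.0, 1, False),
]

q_table = offline_q_learning(logged, n_states=3, n_actions=2)
# Greedy action in each non-terminal state, read off the learned table.
greedy = [max(range(2), key=lambda a: q_table[s][a]) for s in range(2)]
```

This toy log happens to cover every state-action pair that matters. Real logs rarely do, and that gap is where plain Q-learning on offline data tends to go wrong.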

2. Conservative Q-Learning (CQL)

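Assumption: "If I haven't seen this action in the data, it's probably bad." CQL builds this pessimism into training: on top of the usual Q-learning loss, it adds a penalty that pushes Q-values down for actions the dataset does not contain and up for the actions it does. As a result, the learned policy steers away from unseen actions whose values would otherwise be overestimated.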
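The pessimism assumption can be sketched in tabular form. The snippet below uses one common form of the CQL penalty (log-sum-exp of Q over all actions, minus Q of the logged action) added to a squared TD error. The function name `cql_update`, the weight `alpha`, and the single-state example are illustrative assumptions, not a reference implementation.

```python
import math

def cql_update(q, transition, alpha=1.0, gamma=0.9, lr=0.1):
    """One gradient step on: 0.5 * TD_error^2 + alpha * (logsumexp_a Q(s,a) - Q(s,a_logged))."""
    s, a, r, s_next, done = transition
    target = r if done else r + gamma * max(q[s_next])  # treated as a constant
    # The gradient of logsumexp over actions is the softmax of Q(s, .).
    m = max(q[s])
    exps = [math.exp(v - m) for v in q[s]]
    z = sum(exps)
    softmax = [e / z for e in exps]
    td_error = q[s][a] - target
    for b in range(len(q[s])):
        grad = alpha * softmax[b]      # penalty: push every action's value down
        if b == a:
            grad += td_error - alpha   # TD term, plus push the logged action back up
        q[s][b] -= lr * grad

# Hypothetical log: one state, two actions, and only action 0 was ever recorded.
log = [(0, 0, 1.0, 0, True)]

q_cql = [[0.0, 0.0]]
q_plain = [[0.0, 0.0]]
for _ in range(500):
    for t in log:
        cql_update(q_cql, t, alpha=1.0)    # conservative update
        cql_update(q_plain, t, alpha=0.0)  # alpha = 0 reduces to plain Q-learning
```

With `alpha=0.0` the unseen action keeps its initial value of zero, so its ranking depends entirely on initialization. With `alpha=1.0` its value is driven below zero, which is the "probably bad" assumption expressed as a loss term.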

