DreamerV1

This repository contains my implementation of DreamerV1.

The implementation was built to better understand how latent imagination works in practice and to experimentally evaluate its behavior in environments outside the original DeepMind Control Suite, particularly Gridworld-style tasks.

I wrote a detailed breakdown of the implementation, experiments, and insights here:

👉 My Medium article:
Planning by Dreaming: Why DreamerV1 Breaks in Simple Worlds

Overview

DreamerV1 learns:

a world model (encoder, decoder, RSSM, reward model), trained on real experience from the environment
an actor-critic policy, trained entirely on latent imagination rollouts rather than real environment interaction

Key ideas implemented:

Recurrent State Space Model (deterministic + stochastic latent state)
Latent imagination rollouts
Actor-critic training inside the learned latent space
Policy updates that need no environment interaction (behavior is learned in imagination rather than by online planning)
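
To make the idea of a latent imagination rollout concrete, here is a minimal sketch. It is not the code from this repository: the function `imagine_rollout`, the tuple-based `State`, and the toy policy, transition, and reward functions are all hypothetical stand-ins for the learned actor, RSSM prior, and reward model.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-ins: a real latent state would hold (h, s) tensors.
State = Tuple[float, ...]
Action = int


def imagine_rollout(
    start: State,
    policy: Callable[[State], Action],
    transition: Callable[[State, Action], State],
    reward_model: Callable[[State], float],
    horizon: int,
) -> Tuple[List[State], List[Action], List[float]]:
    """Roll the learned model forward `horizon` steps without calling the environment."""
    states, actions, rewards = [start], [], []
    state = start
    for _ in range(horizon):
        action = policy(state)              # actor picks an action from the latent state
        state = transition(state, action)   # model predicts the next latent state
        states.append(state)
        actions.append(action)
        rewards.append(reward_model(state))  # reward is predicted, not observed
    return states, actions, rewards


# Toy components (assumptions for illustration only).
def toy_policy(state: State) -> Action:
    return 1 if state[0] < 3.0 else 0


def toy_transition(state: State, action: Action) -> State:
    return (state[0] + action,)


def toy_reward(state: State) -> float:
    return 1.0 if state[0] >= 3.0 else 0.0


states, actions, rewards = imagine_rollout(
    (0.0,), toy_policy, toy_transition, toy_reward, horizon=5
)
print(actions)  # [1, 1, 1, 0, 0]
print(rewards)  # [0.0, 0.0, 1.0, 1.0, 1.0]
```

The point of the sketch is that every quantity the actor and critic see during a policy update comes from the model, so any model error is carried directly into the training signal.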

Architecture

The implementation follows the original DreamerV1 structure:

Encoder / Decoder – compress observations into a latent representation and reconstruct them
RSSM – combines two latent states:
- Deterministic state (h) for memory
- Stochastic state (s) for uncertainty
Transition Model – predicts future latent states
Reward Model – predicts rewards from latent states
Actor & Critic – trained purely on imagined trajectories
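
As a rough illustration of how the deterministic state (h) and stochastic state (s) interact, the sketch below performs one prior (transition) step in NumPy. It is a simplified assumption, not this repository's code: the sizes `H_DIM`, `S_DIM`, `A_DIM`, the random weights, and the function `rssm_prior_step` are made up, and a single tanh layer stands in for the GRU used in the original paper. The posterior step, which also conditions on the encoded observation, is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
H_DIM, S_DIM, A_DIM = 8, 4, 3  # hypothetical sizes

# Random weights stand in for trained parameters.
W_h = rng.normal(scale=0.1, size=(H_DIM, H_DIM + S_DIM + A_DIM))
W_mu = rng.normal(scale=0.1, size=(S_DIM, H_DIM))
W_sigma = rng.normal(scale=0.1, size=(S_DIM, H_DIM))


def softplus(x):
    return np.log1p(np.exp(x))


def rssm_prior_step(h, s, a):
    """One imagined step: update h deterministically, then sample s from a Gaussian."""
    # Deterministic path (memory): a tanh layer replaces the paper's GRU here.
    h_next = np.tanh(W_h @ np.concatenate([h, s, a]))
    # Stochastic path (uncertainty): diagonal Gaussian conditioned on h_next.
    mean = W_mu @ h_next
    std = softplus(W_sigma @ h_next) + 0.1  # assumed minimum std keeps sampling stable
    s_next = mean + std * rng.standard_normal(S_DIM)
    return h_next, s_next, mean, std


h = np.zeros(H_DIM)
s = np.zeros(S_DIM)
a = np.eye(A_DIM)[1]  # one-hot action, as a discrete Gridworld action might be encoded
h, s, mean, std = rssm_prior_step(h, s, a)
print(h.shape, s.shape)  # (8,) (4,)
```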

World model components are trained on real trajectories sampled from a replay buffer, while the policy and value function are trained using imagination rollouts.
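
In the DreamerV1 paper, the value targets computed over imagined trajectories are λ-returns, which blend multi-step returns with bootstrapped value estimates. The sketch below uses a common recursive form of that target. The function name `lambda_returns` and its argument layout are assumptions for illustration and may differ from what this repository does; the defaults of 0.99 and 0.95 follow the paper's reported discount and λ.

```python
from typing import List


def lambda_returns(
    rewards: List[float],
    next_values: List[float],
    gamma: float = 0.99,
    lam: float = 0.95,
) -> List[float]:
    """Compute lambda-returns over one imagined trajectory.

    rewards[t]     : predicted reward at imagined step t
    next_values[t] : critic estimate for the state reached after step t
    """
    if len(rewards) != len(next_values):
        raise ValueError("rewards and next_values must have the same length")
    if not rewards:
        return []
    returns = [0.0] * len(rewards)
    next_return = next_values[-1]  # bootstrap from the final value estimate
    for t in reversed(range(len(rewards))):
        returns[t] = rewards[t] + gamma * (
            (1.0 - lam) * next_values[t] + lam * next_return
        )
        next_return = returns[t]
    return returns


# lam=0 reduces to one-step TD targets; lam=1 to a bootstrapped Monte Carlo return.
print(lambda_returns([1.0, 2.0], [10.0, 20.0], gamma=0.5, lam=0.0))  # [6.0, 12.0]
print(lambda_returns([1.0, 1.0, 1.0], [0.0, 0.0, 0.0], gamma=0.5, lam=1.0))  # [1.75, 1.5, 1.0]
```

Because both the rewards and the values come from learned models, these targets are only as reliable as the imagined states they are computed on.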

Experiments

The primary experimental focus of this repository is evaluating DreamerV1 on Gridworld-like environments with:

partial observability
discrete state transitions
long-horizon dependencies
randomized layouts

Despite strong reconstruction quality, the agent fails to learn meaningful behavior due to breakdowns in latent imagination over long horizons. Detailed results, reconstructions, and analysis are discussed in the linked article.
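
One way to build intuition for why long imagined horizons are fragile is a back-of-the-envelope calculation. It is not a measurement from these experiments, and it rests on a strong simplifying assumption: each imagined step is correct independently with the same probability. The function `rollout_success_probability` is hypothetical.

```python
def rollout_success_probability(per_step_accuracy: float, horizon: int) -> float:
    """Probability that every step of an imagined rollout is correct,
    assuming independent, identically accurate steps (a simplification)."""
    if not 0.0 <= per_step_accuracy <= 1.0:
        raise ValueError("per_step_accuracy must be in [0, 1]")
    if horizon < 0:
        raise ValueError("horizon must be non-negative")
    return per_step_accuracy ** horizon


for horizon in (5, 15, 50):
    p = rollout_success_probability(0.95, horizon)
    print(f"horizon={horizon:>2}: {p:.3f}")
```

Under this toy model, a transition model that is right 95% of the time per step produces a fully correct 15-step rollout less than half the time. In a discrete Gridworld, a single wrong transition can place the imagined agent in the wrong cell, which is a much larger error than a small drift in a continuous-control state.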

References

Hafner et al., Dream to Control: Learning Behaviors by Latent Imagination, 2019
