Exploration-Driven Generative Interactive Environments

1 INSAIT, Sofia University, 2 ETH Zurich, 3 TU Munich
CVPR 2025

We present a framework for training multi-environment world models, comprising our RetroAct dataset of annotated virtual environments, GenieRedux, our open implementation of Genie, and AutoExplore Agent, our novel environment exploration method.

Abstract

Modern world models require costly and time-consuming collection of large video datasets with action demonstrations by people or by environment-specific agents. To simplify training, we focus on using many virtual environments for inexpensive, automatically collected interaction data. Genie, a recent multi-environment world model, demonstrates the ability to simulate many environments with shared behavior. Unfortunately, training its model requires expensive demonstrations. We therefore propose a training framework that uses merely a random agent in virtual environments. While the model trained in this manner exhibits good controls, it is limited by what random exploration can reach. To address this limitation, we propose AutoExplore Agent, an exploration agent that relies entirely on the uncertainty of the world model, delivering diverse data from which the model learns best. Our agent is fully independent of environment-specific rewards and thus adapts easily to new environments. With this approach, the pretrained multi-environment model can quickly adapt to new environments, improving video fidelity and controllability.

Our Framework

Teaser

Multi-Environment Control with GenieRedux-G

We avoid the costly human demonstrations and video curation usually needed to train multi-environment world models by instead annotating the behavior and controls of hundreds of virtual environments, yielding our RetroAct dataset. Each annotated environment provides many action demonstrations without the need for extra labeling.

We train our GenieRedux-G model on interactions from many environments at once, and even with a random agent the model learns to execute the actions accurately.
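As a rough sketch of this data-collection setup (the `ToyEnv` class and the gym-style `reset`/`step` interface are illustrative stand-ins, not the actual RetroAct API), a random agent gathering transitions from many environments at once could look like:

```python
import random

class ToyEnv:
    """Stand-in for one annotated virtual environment (illustrative only)."""
    def __init__(self, n_actions, seed=0):
        self.n_actions = n_actions      # annotated action space size
        self.rng = random.Random(seed)
        self.t = 0

    def reset(self):
        self.t = 0
        return self._frame()

    def step(self, action):
        self.t += 1
        return self._frame()

    def _frame(self):
        # A real environment would return an image frame; here just a token.
        return ("frame", self.t)

def collect_random_episodes(envs, steps_per_env):
    """Gather (observation, action, next_observation) tuples with a random policy."""
    data = []
    for env in envs:
        obs = env.reset()
        for _ in range(steps_per_env):
            action = env.rng.randrange(env.n_actions)  # uniform random agent
            next_obs = env.step(action)
            data.append((obs, action, next_obs))
            obs = next_obs
    return data

envs = [ToyEnv(n_actions=8, seed=i) for i in range(3)]
dataset = collect_random_episodes(envs, steps_per_env=10)
print(len(dataset))  # 3 environments x 10 steps = 30 transitions
```

The resulting (observation, action, next observation) tuples are exactly the supervision a world model needs, which is why no human labeling is required once an environment's controls are annotated.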

Exploring Environments with AutoExplore Agent

To replace the limited random agent, we design AutoExplore Agent, an exploration agent that navigates virtual environments driven only by a reward based on the uncertainty of our world model. AutoExplore Agent is independent of environment-specific rewards and can be applied to any environment.
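The idea of rewarding uncertainty can be sketched with a toy ensemble standing in for the world model (the `TinyPredictor` ensemble and the variance-based reward are our illustrative assumptions; the paper derives uncertainty from its own world model, not from an ensemble):

```python
import statistics

class TinyPredictor:
    """Stand-in for one world-model prediction head (scalar 'next-frame code')."""
    def __init__(self, weight):
        self.weight = weight

    def predict(self, obs_code, action):
        return self.weight * obs_code + action

def uncertainty_reward(ensemble, obs_code, action):
    """Reward = disagreement (variance) among the model's predictions.
    High variance marks transitions the world model is unsure about,
    so the exploration agent is steered toward them."""
    preds = [m.predict(obs_code, action) for m in ensemble]
    return statistics.pvariance(preds)

ensemble = [TinyPredictor(w) for w in (0.9, 1.0, 1.1)]
# A familiar transition (the models agree) earns less reward than a novel one.
r_small = uncertainty_reward(ensemble, obs_code=0.0, action=1)
r_large = uncertainty_reward(ensemble, obs_code=10.0, action=1)
print(r_small < r_large)  # True
```

Because the reward comes from the world model itself rather than from the game, the same agent transfers to any new environment without reward engineering.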

We demonstrate that the interaction data obtained from AutoExplore Agent is considerably more diverse than data from a random agent.

Better Environment Simulation with AutoExplore Agent

Finetuning our pretrained GenieRedux-G model on the data from AutoExplore Agent leads to better visual fidelity and controllability.

Our framework combines diverse annotated environments with automatic exploration, and we demonstrate that this approach achieves strong visual fidelity and controllability without relying on human demonstrations.

BibTeX


@inproceedings{savov2024exploration,
  title={Exploration-Driven Generative Interactive Environments},
  author={Savov, Nedko and Kazemi, Naser and Mahdi, Mohammad and Paudel, Danda Pani and Wang, Xi and Gool, Luc Van},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2025}
}