Google DeepMind introduced Genie 3 on 5 August 2025, the first real-time interactive general-purpose world model capable of generating multi-minute 3D environments from text instructions. According to the research team, this technology represents a crucial step towards artificial general intelligence (AGI) by providing unlimited simulation environments for training AI agents.
Genie 3 marks a significant technical advance over its predecessor, Genie 2, which could only create environments lasting 10-20 seconds. DeepMind researcher Jack Parker-Holder stated that the new model makes it possible to simulate real-world scenarios for embodied agents, which is a particularly difficult challenge. The model's auto-regressive architecture allows it to remember previously generated content: it can reference information from up to one minute earlier to keep the environment consistent. Research director Shlomi Fruchter emphasised that Genie 3 surpasses previous narrow world models because it is not tied to any single environment and can create both photorealistic and imaginary worlds.
Genie 3's potential applications range from education to game development, but researchers see the real breakthrough primarily in training AI agents for general-purpose tasks. The model is available as a limited preview, with Google DeepMind providing early access to a small number of academics and creators. Current limitations include a restricted action space, difficulty modelling interactions between multiple independent agents, and support for only a few minutes of continuous interaction rather than the hours required for extensive training.

