Google DeepMind has introduced Genie 3, the latest version of its experimental AI “world model”, which generates fully interactive 3D digital environments where both human users and AI agents can move, explore, and engage with objects and surroundings in real time. The model marks a significant step toward persistent, AI-generated virtual spaces. Unlike traditional game development, which relies on manually created 3D assets and environments, Genie 3 constructs entire scenes dynamically from a user's prompt, making it a versatile tool for applications ranging from education and entertainment to training autonomous systems and robots.
One of the most notable upgrades in Genie 3 is its enhanced ability to sustain immersive interaction over time. According to DeepMind, the model can now support continuous engagement for several minutes, compared to the brief 10–20 seconds of interaction allowed by its predecessor, Genie 2, which was launched in December 2024. Additionally, Genie 3 can retain visual memory for around a minute, allowing for a more realistic and seamless experience. For example, if a user walks away from a painted wall or a chalkboard and returns later, the visual details are expected to remain exactly as they were left, making the environment feel more stable and lifelike.
In terms of output quality, the model renders worlds at 720p resolution and 24 frames per second, which makes for smoother visuals and a more engaging experience. These improvements are key for maintaining immersion in applications like simulated learning environments, storytelling platforms, and AI-human interaction labs.
Another innovative addition to Genie 3 is a capability known as “promptable world events”. This feature enables users to alter the conditions of the environment simply by typing in a command. For instance, one could change the weather, add new characters, or trigger specific in-world events without needing to manipulate complex settings or code. This type of intuitive interaction brings the vision of "living, prompt-driven virtual worlds" one step closer to reality.
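Genie 3's interface has not been published, so the sketch below is purely hypothetical and is not DeepMind's API. All names (WorldSession, step, inject_event) are invented for illustration; the snippet only conveys the general idea of a prompt-driven loop in which free-text “world events” are injected between rendered frames instead of edited assets or configuration.

```python
from dataclasses import dataclass, field

@dataclass
class WorldSession:
    """Hypothetical stand-in for a prompt-generated interactive world.

    Illustrative only: Genie 3's real interface has not been made public.
    """
    prompt: str                                  # the world description that created the scene
    events: list[str] = field(default_factory=list)
    frame_count: int = 0

    def step(self, action: str) -> str:
        """Advance the simulation by one frame in response to a user action."""
        self.frame_count += 1
        return f"frame {self.frame_count}: {action} in world '{self.prompt}'"

    def inject_event(self, event: str) -> None:
        """A 'promptable world event': free text that alters the running scene,
        e.g. changing the weather or adding a character."""
        self.events.append(event)


if __name__ == "__main__":
    session = WorldSession(prompt="a coastal village at dusk")
    print(session.step("walk toward the harbour"))
    session.inject_event("a thunderstorm rolls in")   # plain text, no settings or code
    print(session.step("turn to face the sea"))
    print("active events:", session.events)
```

The point of the sketch is simply that the environment's state is steered at runtime by natural-language text rather than by manipulating complex settings, which is what the “promptable world events” feature described above is meant to enable.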
Genie 2, though revolutionary at the time, had its limitations. It could generate interactive environments based on a single input image, but the stability of those visuals was limited, and the interactions were extremely short-lived, which affected usability and immersion. Other similar models developed by different organizations have also faced difficulties in maintaining consistent imagery, often generating flickering, warped, or shifting visuals that make it difficult to stay engaged or to use the environments for practical purposes.
However, Genie 3 is not being made publicly available at this stage. Google DeepMind is initially releasing it as a limited research preview, granting access to a select group of academic researchers and creative professionals. The goal is to study the system’s potential risks, identify any misuse concerns, and begin formulating appropriate safety protocols before broader deployment. Some functionality, including certain types of user interaction, will be deliberately restricted during this trial phase. DeepMind also notes a current limitation of the model: clear, readable text tends to appear inside the generated environments only when it is included in the initial world description given to the system.
Interestingly, the world models team behind Genie 3 is led by a former co-lead of OpenAI’s Sora project, which focused on AI video generation, suggesting that Genie 3 draws on expertise in both generative video and real-time interactive simulation.
As of now, there is no confirmed timeline for wider release, but Google has indicated that access could expand in the future, depending on the outcomes of this initial testing phase. If Genie 3 proves successful, it may pave the way for a new era of AI-generated virtual realities, blurring the lines between simulated and real-world experiences.