Runway, a pioneer in AI-powered multimedia creation tools, released Gen-4 Turbo this week, an enhanced version of the Gen-4 model introduced in late March. Capable of producing 5- or 10-second video sequences from an input image and a text description, the Gen-4 family caters to a wide range of users, from independent creators to audiovisual professionals and advertisers.
The Gen-4 series is designed to produce coherent and expressive visual sequences from a reference image and a text description. According to Runway, it represents a further step toward what the company calls a "General World Model": an AI system that builds an internal representation of an environment and uses it to simulate future events within that environment. Such a model would be capable of representing and simulating a wide range of situations and interactions, similar to those encountered in the real world.
Gen-4 integrates naturally into audiovisual production pipelines, alongside live-action, animated, or visual-effects content. The system generates 5- or 10-second videos at 24 frames per second, in aspect ratios suited to digital platforms (16:9, 9:16, 1:1, 21:9, among others). The process relies on a mandatory input image, which acts as a visual starting point, and a text prompt focused on describing the desired motion. No custom training phase is required: the models are immediately operational.
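To make these constraints concrete, here is a minimal sketch of a request container that enforces them. The class and field names are hypothetical, chosen only to mirror the parameters described in this article:

```python
from dataclasses import dataclass

# Aspect ratios mentioned in the article; Gen-4 supports others as well.
SUPPORTED_RATIOS = {"16:9", "9:16", "1:1", "21:9"}


@dataclass
class Gen4Request:
    """Hypothetical container mirroring the documented Gen-4 inputs."""
    input_image: str    # mandatory visual starting point (path or URL)
    prompt_text: str    # should describe the desired motion
    duration: int = 5   # clips are 5 or 10 seconds, at 24 fps
    ratio: str = "16:9"

    def validate(self) -> None:
        if not self.input_image:
            raise ValueError("an input image is mandatory")
        if len(self.prompt_text) > 1000:
            raise ValueError("text prompts are limited to 1,000 characters")
        if self.duration not in (5, 10):
            raise ValueError("Gen-4 generates 5- or 10-second clips only")
        if self.ratio not in SUPPORTED_RATIOS:
            raise ValueError(f"unsupported aspect ratio: {self.ratio}")


# Example: a 10-second cinematic clip in 21:9.
req = Gen4Request(
    input_image="https://example.com/start.png",
    prompt_text="The camera slowly pans across a misty forest.",
    duration=10,
    ratio="21:9",
)
req.validate()
```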
Two Models for Complementary Uses
Gen-4 Turbo is optimized for rapid iteration, at a reduced cost of 5 credits per second. It takes only about 30 seconds to generate a 10-second video, making it practical to explore multiple variations quickly. The standard Gen-4 is more expensive (12 credits/second) and may take up to a few minutes to generate a video of the same duration, but it offers higher quality, useful for final versions.
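As a quick worked example of the pricing difference, the helper below (hypothetical names, rates taken from the figures above) compares the credit cost of a 10-second clip on each model:

```python
# Credit pricing from the article: Gen-4 Turbo costs 5 credits/second,
# standard Gen-4 costs 12 credits/second.
PRICING = {"gen4_turbo": 5, "gen4": 12}


def credits_needed(model: str, seconds: int) -> int:
    """Return the credit cost of one generation of the given length."""
    return PRICING[model] * seconds


# A 10-second clip: 50 credits on Turbo vs. 120 on standard Gen-4.
for model in PRICING:
    print(model, credits_needed(model, 10))
```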
Runway therefore recommends testing ideas in Turbo first, then refining them with Gen-4 if necessary. Generations are unlimited in Explore Mode, which does not consume credits, facilitating experimentation.
Generation Process
The user must first upload, select, or create the input image, then follow these three steps:
- Writing the Prompt: The imported image defines the initial visual frame (style, composition, colors, lighting), while the text prompt must specify the expected dynamics (movement, transformation, interaction). The prompt is limited to 1,000 characters.
- Setting Parameters: It is then possible to define the duration and resolution, and to opt for a fixed seed, which ensures generations with a similar style and movement.
- Generation and Iteration: The user can then launch the generation. Videos can be reviewed in the current session or retrieved from the personal project library, and refined by modifying the input image or the text prompt (a scripted version of this flow is sketched after this list).
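For readers who prefer to script this flow rather than use the web interface, here is a minimal sketch of the same three steps using Runway's `runwayml` Python SDK. The general shape (create an image-to-video task, then poll it) follows the SDK's documented flow, but the model identifier, ratio value, and seed option should be treated as assumptions to verify against the current API docs:

```python
import time

from runwayml import RunwayML  # pip install runwayml

# The client reads the RUNWAYML_API_SECRET environment variable.
client = RunwayML()

# Steps 1-2: the mandatory input image, a motion-focused prompt,
# and the generation parameters.
task = client.image_to_video.create(
    model="gen4_turbo",                            # assumed model identifier
    prompt_image="https://example.com/frame.png",  # mandatory starting image
    prompt_text="Slow dolly-in as fog rolls across the valley.",
    duration=5,                                    # 5 or 10 seconds
    ratio="1280:720",    # the SDK's 16:9 option is expressed in pixels (assumption)
    seed=42,             # fixed seed for consistent style and movement
)

# Step 3: poll until the generation finishes, then iterate as needed.
while True:
    task = client.tasks.retrieve(task.id)
    if task.status in ("SUCCEEDED", "FAILED"):
        break
    time.sleep(5)

if task.status == "SUCCEEDED":
    print("Video URL(s):", task.output)
```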
Post-Generation Features
Several options are offered to enrich or adjust the generated content. They allow:
- Applying a new visual style;
- Extending a scene;
- Adjusting the video to correct composition or rhythm;
- Aligning it with dialogue through lip-sync;
- Upscaling to 4K for a high-resolution version;
- Using the current frame as a starting point for a new generation (see the sketch below).
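As an illustration of that last feature, the snippet below extracts the final frame of a downloaded clip with OpenCV so it can serve as the input image for a new generation; the file names are placeholders:

```python
import cv2  # pip install opencv-python


def last_frame(video_path: str, image_path: str) -> None:
    """Save the final frame of a video so it can seed a new generation."""
    cap = cv2.VideoCapture(video_path)
    frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    cap.set(cv2.CAP_PROP_POS_FRAMES, frame_count - 1)  # seek to last frame
    ok, frame = cap.read()
    cap.release()
    if not ok:
        raise RuntimeError(f"could not read the last frame of {video_path}")
    cv2.imwrite(image_path, frame)


# Placeholder file names: chain a finished clip into the next generation.
last_frame("gen4_clip.mp4", "next_start_frame.png")
```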
All productions are archived by session, with options for renaming, sharing, or downloading. These tools promote an iterative approach, focused on visual precision without technical complexity.
Initial feedback is very positive. Runway, which has just raised $308 million in a funding round led by General Atlantic, valuing the company at over $3 billion, is democratizing technologies once reserved for large productions, opening up new opportunities for content creators.
To better understand
What is a 'General World Model' in artificial intelligence and why is it important?
A 'General World Model' is an AI system capable of simulating future events by constructing an internal representation of an environment. This enables AI systems to better understand and interact with the real world, paving the way for more advanced and versatile applications.
What is the historical evolution of AI media generation technologies and what are the key milestones?
AI media generation has evolved from simple image-manipulation techniques to today's advanced models capable of producing realistic video sequences. Key milestones include the development of deep learning, the adoption of convolutional neural networks, and the invention of GANs, each transforming how media is generated by AI.