The Video Generator node transforms still images into AI-generated videos. You choose from multiple state-of-the-art models, select a generation mode, and control motion to produce professional video content directly within your workflow.
What It Does
This node accepts an image (and optionally a prompt) as input and generates a video clip. Depending on the mode, you can create videos from scratch, animate an existing image, remix a source video, extend a clip, apply motion control, or use multi-reference inputs. Some models support dialogue with automatic lip-sync via video voice profiles.
Configuration
| Parameter | Description | Options / Range |
|---|---|---|
| Model | The AI video model used for generation | See models table below |
| Mode | The type of video generation to perform | standard, image-to-video, remix, extend, motion-control, omni-reference |
| Prompt | Optional text guidance for the video's content and motion | Free-form text |
| Duration | Length of the output video | Model-dependent |
Available Models
All video pricing is per second of generated video, shown in the tables below as credits per second (cr/s). Cost varies by model, resolution, and whether audio is included.
Seedance 1.5 Pro
| Resolution | Without Audio | With Audio |
|---|---|---|
| 480p (SD) | 12 cr/s | 24 cr/s |
| 720p (HD) | 30 cr/s | 60 cr/s |
| 1080p (FHD) | 60 cr/s | 120 cr/s |
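
As a rough illustration of per-second pricing, the sketch below multiplies a Seedance 1.5 Pro rate by a clip length. The rates are copied from the table above; the function name `estimate_credits`, the 5-second duration, and the assumption that billing is strictly linear with no rounding are illustrative and not confirmed by this page.

```python
# Per-second rates for Seedance 1.5 Pro, copied from the table above (credits per second).
SEEDANCE_15_PRO_RATES = {
    "480p": {"without_audio": 12, "with_audio": 24},
    "720p": {"without_audio": 30, "with_audio": 60},
    "1080p": {"without_audio": 60, "with_audio": 120},
}


def estimate_credits(resolution: str, seconds: int, audio: bool) -> int:
    """Estimate cost as rate * duration (assumes linear billing, no rounding)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    if resolution not in SEEDANCE_15_PRO_RATES:
        raise ValueError(f"unknown resolution: {resolution}")
    rates = SEEDANCE_15_PRO_RATES[resolution]
    rate = rates["with_audio"] if audio else rates["without_audio"]
    return rate * seconds


# A hypothetical 5-second clip at two settings.
print(estimate_credits("720p", 5, audio=False))  # 150
print(estimate_credits("1080p", 5, audio=True))  # 600
```
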
Kling 2.6
| Mode | Without Audio | With Audio |
|---|---|---|
| 720p (Text/Image-to-Video) | 70 cr/s | 140 cr/s |
| 720p Motion Control (Standard) | 70 cr/s | -- |
| 1080p Motion Control (Pro) | 120 cr/s | -- |
Kling V3
| Resolution | Standard | With Audio |
|---|---|---|
| 720p | 168 cr/s | 224 cr/s |
| 1080p | 224 cr/s | 280 cr/s |
Kling V3 supports video voices — assign up to 3 voice profiles for automatic dialogue lip-sync. Wrap dialogue in double quotes within your prompt (e.g., "Hello world").
Kling O3
| Resolution | Standard | With Audio | Video-to-Video |
|---|---|---|---|
| 720p | 168 cr/s | 224 cr/s | 252 cr/s |
| 1080p | 224 cr/s | 280 cr/s | 336 cr/s |
Seedance 2.0
| Mode | Resolution | Credits/s |
|---|---|---|
| Standard / Image-to-Video | 720p | 150 cr/s |
| Standard / Image-to-Video | 1080p | 300 cr/s |
| Omni-Reference | 1080p | 300 cr/s |
Seedance 2.0 introduces omni-reference mode, supporting up to 9 image references, 3 video references, and 3 audio references simultaneously. Duration: 4–15 seconds.
Seedance 2.0 is currently in limited beta. Availability may vary.
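
The omni-reference limits above can be checked before a generation is submitted. The following is a minimal sketch: the limits (9 images, 3 videos, 3 audio references, 4 to 15 seconds) come from this page, while the function `find_problems` and its parameters are hypothetical and not part of the node itself.

```python
# Limits stated above for Seedance 2.0 omni-reference mode.
MAX_IMAGE_REFS = 9
MAX_VIDEO_REFS = 3
MAX_AUDIO_REFS = 3
MIN_SECONDS, MAX_SECONDS = 4, 15


def find_problems(image_refs: int, video_refs: int, audio_refs: int, seconds: int) -> list:
    """Return a list of human-readable limit violations (empty if none)."""
    problems = []
    if image_refs > MAX_IMAGE_REFS:
        problems.append(f"too many image references: {image_refs} > {MAX_IMAGE_REFS}")
    if video_refs > MAX_VIDEO_REFS:
        problems.append(f"too many video references: {video_refs} > {MAX_VIDEO_REFS}")
    if audio_refs > MAX_AUDIO_REFS:
        problems.append(f"too many audio references: {audio_refs} > {MAX_AUDIO_REFS}")
    if not MIN_SECONDS <= seconds <= MAX_SECONDS:
        problems.append(f"duration {seconds}s is outside {MIN_SECONDS}-{MAX_SECONDS}s")
    return problems


print(find_problems(image_refs=9, video_refs=3, audio_refs=3, seconds=15))   # []
print(find_problems(image_refs=10, video_refs=0, audio_refs=4, seconds=20))  # three problems
```
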
Sora 2
| Tier | Resolution | Credits/s |
|---|---|---|
| Standard | 720p (HD) | 100 cr/s |
| Pro | 720p (HD) | 300 cr/s |
| Pro | 1080p (FHD) | 500 cr/s |
Veo 3.1 Lite
| Resolution | All Modes (audio included) |
|---|---|
| 720p (HD) | 50 cr/s |
| 1080p (FHD) | 80 cr/s |
Veo 3.1 Lite is a cost-efficient variant with audio bundled in the per-second price. Supported modes: standard, image-to-video (first/last frame), references, and extend. Available durations: 4, 6, or 8 seconds. Resolutions: 720p and 1080p in 16:9 or 9:16.
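
The Veo 3.1 Lite constraints listed above lend themselves to a simple pre-flight check. This sketch encodes only the values stated on this page; the function name `check_veo_lite_settings` is a hypothetical helper, not part of the product.

```python
# Supported values for Veo 3.1 Lite, as listed above.
VEO_31_LITE_RESOLUTIONS = ("720p", "1080p")
VEO_31_LITE_DURATIONS = (4, 6, 8)
VEO_31_LITE_ASPECT_RATIOS = ("16:9", "9:16")


def check_veo_lite_settings(resolution: str, seconds: int, aspect_ratio: str) -> list:
    """Return a list of unsupported settings (empty if everything is supported)."""
    problems = []
    if resolution not in VEO_31_LITE_RESOLUTIONS:
        problems.append(f"unsupported resolution: {resolution}")
    if seconds not in VEO_31_LITE_DURATIONS:
        problems.append(f"unsupported duration: {seconds}s")
    if aspect_ratio not in VEO_31_LITE_ASPECT_RATIOS:
        problems.append(f"unsupported aspect ratio: {aspect_ratio}")
    return problems


print(check_veo_lite_settings("1080p", 8, "16:9"))  # []
print(check_veo_lite_settings("480p", 5, "1:1"))    # three problems
```
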
Veo 3.1 Fast
| Resolution | Without Audio | With Audio |
|---|---|---|
| 720p (HD) | 100 cr/s | 150 cr/s |
| 1080p (FHD) | 300 cr/s | 350 cr/s |
Veo 3.1
| Resolution | Without Audio | With Audio |
|---|---|---|
| 720p / 1080p | 200 cr/s | 400 cr/s |
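
To compare the three Veo tiers side by side, the sketch below totals the cost of one clip at 1080p with audio using the rates from the Veo tables above. The 8-second length is illustrative: this page lists supported durations only for Veo 3.1 Lite, so treat the same length on the other tiers as an assumption. The helper `compare_costs` is hypothetical.

```python
# Credits per second at 1080p with audio, copied from the Veo tables above.
# Veo 3.1 Lite bundles audio, so it has a single rate per resolution.
VEO_1080P_WITH_AUDIO = {
    "Veo 3.1 Lite": 80,
    "Veo 3.1 Fast": 350,
    "Veo 3.1": 400,
}


def compare_costs(seconds: int) -> list:
    """Return (model, total credits) pairs sorted from cheapest to most expensive."""
    totals = [(model, rate * seconds) for model, rate in VEO_1080P_WITH_AUDIO.items()]
    return sorted(totals, key=lambda pair: pair[1])


for model, credits in compare_costs(8):
    print(f"{model}: {credits} credits")
# Veo 3.1 Lite: 640 credits
# Veo 3.1 Fast: 2800 credits
# Veo 3.1: 3200 credits
```
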
Generation Modes
| Mode | Description |
|---|---|
| standard | Generates a video entirely from a text prompt |
| image-to-video | Animates a still image into a video clip |
| remix | Takes an existing video and re-generates it with modified style or content |
| extend | Adds additional frames to the end of an existing video clip |
| motion-control | Provides fine-grained control over camera movement and subject motion |
| omni-reference | Multi-reference mode with up to 15 inputs (Seedance 2.0) |
| first-last-frame | Generates a video that interpolates between a start and end frame |
| video-to-video | Edits or re-styles an existing video |
Video Voices
Models that support video voices (currently Kling V3) let you assign voice profiles to dialogue segments for automatic lip-sync. Wrap each dialogue line in double quotes within your prompt.
You can assign up to 3 voices per generation. With a single voice, it is applied to all dialogue automatically. Voice profiles can be system voices or custom cloned voices.
To clone a voice for video use, see the Audio Generator Node. Voice cloning for video costs 7 credits.
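
The quoting and voice-limit rules above can be sketched as a small prompt check. This example assumes dialogue is marked with straight double quotes and that a single voice applies to every line, as described in this section. How lines map to voices when several are assigned is not specified here, so the sketch requires an explicit voice index per line in that case. The names `extract_dialogue` and `plan_dialogue` and the voice labels are hypothetical.

```python
import re

MAX_VOICES = 3  # limit stated above


def extract_dialogue(prompt: str) -> list:
    """Return each span wrapped in straight double quotes, in prompt order."""
    return re.findall(r'"([^"]*)"', prompt)


def plan_dialogue(prompt: str, voices: list, voice_index_per_line=None) -> list:
    """Pair each dialogue line with a voice profile name."""
    if not 1 <= len(voices) <= MAX_VOICES:
        raise ValueError(f"expected 1 to {MAX_VOICES} voices, got {len(voices)}")
    lines = extract_dialogue(prompt)
    if len(voices) == 1:
        # A single voice is applied to all dialogue.
        return [(line, voices[0]) for line in lines]
    # Assumption: with several voices, the caller states which voice speaks each line.
    if voice_index_per_line is None or len(voice_index_per_line) != len(lines):
        raise ValueError("provide one voice index per dialogue line")
    return [(line, voices[i]) for line, i in zip(lines, voice_index_per_line)]


prompt = 'A woman waves and says "Hello world". A man replies "Good morning".'
print(plan_dialogue(prompt, ["narrator"]))
print(plan_dialogue(prompt, ["voice_a", "voice_b"], voice_index_per_line=[0, 1]))
```
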
Additional Parameters
| Parameter | Description |
|---|---|
| Resolution | 480p (SD), 720p (HD), or 1080p (FHD), depending on the model |
| Audio | Some models support generating audio alongside video |
| Negative Prompt | Describe elements to exclude from the output (Veo models) |
| Reference Images | Up to 3 reference images for style and content guidance (Veo 3.1) |
Usage
- Drag a Video Generator node onto the canvas.
- Connect an Image Generator or Asset Node to provide the source image.
- Optionally connect a Prompt Generator for text guidance.
- Select a Model and Mode that fit your needs.
- Execute the node to produce your video.
For the best results with image-to-video mode, use a high-quality source image at the same aspect ratio as your target video. This reduces visual artifacts and improves motion coherence.
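
One way to check the aspect-ratio advice above before running the node is to compare the source image's pixel dimensions against the target ratio. This is a minimal sketch: the function names and the 1% tolerance are illustrative choices, not values defined by the node.

```python
from fractions import Fraction


def parse_aspect(ratio: str) -> Fraction:
    """Turn a string such as '16:9' into an exact fraction."""
    width, height = ratio.split(":")
    return Fraction(int(width), int(height))


def matches_target(width_px: int, height_px: int, target_ratio: str,
                   tolerance: float = 0.01) -> bool:
    """Return True if the image ratio is within `tolerance` (relative) of the target."""
    image_ratio = Fraction(width_px, height_px)
    target = parse_aspect(target_ratio)
    return float(abs(image_ratio - target) / target) <= tolerance


print(matches_target(1920, 1080, "16:9"))  # True
print(matches_target(1080, 1920, "9:16"))  # True
print(matches_target(1080, 1080, "16:9"))  # False
```
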