Image to Video
Transform static images into dynamic videos using AI
v1.1 — AI Video
What’s New — April 2026 — Added Seedance 2.0 and Seedance 2.0 Reference models. Seedance 2.0 supports end frame, audio generation, flexible aspect ratios (including 21:9, 4:3, 3:4), and auto duration. Seedance 2.0 Reference generates video from multi-modal references — up to 9 images, 3 videos, and 3 audio files referenced in the prompt as @Image1, @Video1, @Audio1. No start frame needed for Reference mode.
Previous Updates
April 2026 (v1.1) — Multi-provider support
Added multi-provider support (Veo 3.1, Sora 2/Pro, Kling v3/o3/o3 Ref), AI audio generation, prompt enhancement, end frame and reference image support, an elements system for subject/style consistency, multi-prompt mode for multi-shot videos, resolution options up to 4K, and negative prompt support.
What does this node do?
The Image to Video node transforms static images into dynamic videos using AI. It supports multiple model providers, each offering different capabilities such as audio generation, end frame control, reference images, and multi-shot narratives.
Common uses:
- Animate product photos into polished showcase videos
- Create engaging social media videos from static visuals
- Generate cinematic clips with camera motion and effects
- Build multi-shot narratives from a series of images using multi-prompt mode
Quick setup
Add the Image to Video node
Find it in AI Nodes → AI_VIDEO → Image to Video
Connect a start frame
Connect an image output to the input_start_frame input. This is the image that will be animated.
Select a model and describe the motion
Choose a model (e.g. Veo 3.1, Sora 2, Kling v3, Seedance 2.0) and write a prompt describing the desired motion and animation.
Run the workflow
Execute the workflow. The node outputs a video file.
Configuration
Model
modelName LLM selection required The AI model to use for video generation. Each model family offers different capabilities — see the comparison table below.
Prompt
prompt string required Description of the desired motion and animation. Supports {{variables}} for dynamic content. You can reference connected inputs using @Element1, @Element2 (Kling), or @Image1, @Video1, @Audio1 (Seedance 2.0 Reference).
Examples:
- “Slow cinematic zoom in, soft lighting transitions”
- “Product rotates 360 degrees on a white background”
- “Camera pans left to right across the landscape, clouds moving”
- “@Image1 is walking through a forest in the style of @Image2. The ambient soundtrack from @Audio1 plays throughout.” (Seedance 2.0 Ref)
Audio
generate_audio boolean default: true Enable AI-generated audio for the video. Supported by Veo 3.1, Kling v3, Kling o3, and Seedance 2.0 models (enabled by default on Seedance). See the model comparison table for per-model support.
Enhance Prompt
enhance_prompt boolean default: true Let the AI enhance your prompt for better results. The model rewrites your prompt with more detail and cinematic direction.
Aspect Ratio
aspect_ratio string default: 16:9 Output video aspect ratio. Available options vary by model:
- Veo 3.1: 16:9, 9:16
- Sora 2 / Sora 2 Pro: Auto, 9:16, 16:9
- Kling v3 / o3 / o3 Ref: 16:9, 9:16, 1:1
- Seedance 2.0 / 2.0 Ref: Auto, 21:9, 16:9, 4:3, 1:1, 3:4, 9:16
Duration
duration_seconds number default: 8 Video duration in seconds. Range varies by model:
- Veo 3.1: 4–8s
- Sora 2 / Sora 2 Pro: 4, 8, or 12s
- Kling v3 / o3 / o3 Ref: 3–15s
- Seedance 2.0 / 2.0 Ref: Auto, or 4–15s
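The per-model ranges above can be captured in a small lookup for a pre-flight check before running the workflow. The following is a minimal sketch, not part of the node itself: `DURATION_RULES` and `is_valid_duration` are hypothetical names, and it assumes durations are whole seconds and that the Auto option is passed as the string `"auto"`.

```python
# Hypothetical pre-flight check; the limits are copied from the list above.
# Assumptions: durations are whole seconds, and Auto is passed as "auto".
DURATION_RULES = {
    "Veo 3.1": {"min": 4, "max": 8},
    "Sora 2": {"allowed": {4, 8, 12}},
    "Sora 2 Pro": {"allowed": {4, 8, 12}},
    "Kling v3": {"min": 3, "max": 15},
    "Kling o3": {"min": 3, "max": 15},
    "Kling o3 Ref": {"min": 3, "max": 15},
    "Seedance 2.0": {"min": 4, "max": 15, "auto": True},
    "Seedance 2.0 Ref": {"min": 4, "max": 15, "auto": True},
}


def is_valid_duration(model, duration):
    """Return True if `duration` is documented as valid for `model`."""
    rule = DURATION_RULES[model]
    if duration == "auto":
        # Only the Seedance models list an Auto duration.
        return rule.get("auto", False)
    if "allowed" in rule:
        # Sora models accept a fixed set of values rather than a range.
        return duration in rule["allowed"]
    return rule["min"] <= duration <= rule["max"]
```

For example, `is_valid_duration("Sora 2", 6)` is rejected because Sora accepts only 4, 8, or 12 seconds, while the default of 8 seconds passes for every model listed.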
Number of Videos
num_videos number default: 1 Number of videos to generate (1–2).
Resolution
resolution string default: 1080p Output video resolution. Available options vary by model (up to 4K on supported models).
Negative Prompt
negative_prompt string Describe what you want to avoid in the generated video. Only supported by Kling models.
Example: “blurry, low quality, distorted faces, watermark”
End Frame
use_end_frame boolean default: false Enable end frame support. When turned on, a dynamic input_end_frame input appears. Connect an image to define how the video should end. Supported by Veo 3.1, Kling models, and Seedance 2.0.
Reference Images
use_reference_images boolean default: false Enable reference images for visual consistency. When turned on, a dynamic input_reference_images input appears. Supported by Veo 3.1 (up to 3 images) and Kling o3 Ref (up to 4 images).
Elements
elements_config json Array of element configurations, each with an id and type (image or video). Connected element inputs can be referenced in the prompt via @Element1, @Element2, etc. to maintain subject or style consistency across the video. Supported by Kling v3 (up to 4) and Kling o3 Ref (up to 4).
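As an illustration of how the element array and the prompt tags relate, the sketch below builds a two-entry `elements_config` and reports any `@ElementN` tag that has no matching entry. Only the `id` and `type` keys come from the description above; the `id` values, the helper name `unresolved_element_tags`, and the assumption that `@ElementN` refers to the N-th array entry (1-based) are illustrative.

```python
import json
import re

MAX_ELEMENTS = 4  # documented limit for Kling v3 and Kling o3 Ref

# Hypothetical element entries; only the "id" and "type" keys come from the docs.
elements_config = [
    {"id": "element_1", "type": "image"},
    {"id": "element_2", "type": "video"},
]


def unresolved_element_tags(prompt, elements):
    """Return @ElementN indices used in the prompt that have no matching entry.

    Assumes @ElementN refers to the N-th entry of the array (1-based).
    """
    used = {int(n) for n in re.findall(r"@Element(\d+)", prompt)}
    return sorted(n for n in used if n < 1 or n > len(elements))


prompt = "@Element1 walks past @Element2 while @Element3 watches"
assert len(elements_config) <= MAX_ELEMENTS
print(json.dumps(elements_config))
print(unresolved_element_tags(prompt, elements_config))  # prints [3]
```

Here `@Element3` is flagged because only two elements are configured. A check like this can catch a mismatched prompt before a generation run is spent on it.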
Seedance References
seedance_refs_config json Configure multi-modal reference inputs for Seedance 2.0 Reference. This model does not use a start frame — instead, all media is provided as named references and cited in the prompt.
Use the counter controls in the settings panel to add references:
- Images (@Image1–@Image9): Up to 9 reference images. JPEG, PNG, or WebP. Max 30 MB each.
- Videos (@Video1–@Video3): Up to 3 reference videos. MP4 or MOV. Resolution 480p–720p, combined duration 2–15s, total size under 50 MB.
- Audio (@Audio1–@Audio3): Up to 3 audio files. MP3 or WAV. Max 15 MB each, combined duration max 15s. Requires at least 1 image or video.
Total across all modalities must not exceed 12. Each reference creates a connector input on the node. Reference them in your prompt using @Image1, @Video1, @Audio1, etc.
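The count limits interact: the per-modality maximums add up to 15, but the overall cap is 12, and audio cannot be used on its own. A minimal sketch of these count rules follows. The function name `check_seedance_refs` is hypothetical, and the sketch deliberately ignores the file-size, format, and duration limits, which would need access to the media files.

```python
# Documented count limits for Seedance 2.0 Reference.
MAX_PER_MODALITY = {"image": 9, "video": 3, "audio": 3}
MAX_TOTAL = 12


def check_seedance_refs(images=0, videos=0, audio=0):
    """Return a list of problems; an empty list means the counts fit the limits."""
    counts = {"image": images, "video": videos, "audio": audio}
    problems = []
    for kind, count in counts.items():
        if count > MAX_PER_MODALITY[kind]:
            problems.append(
                f"too many {kind} references: {count} > {MAX_PER_MODALITY[kind]}"
            )
    if sum(counts.values()) > MAX_TOTAL:
        problems.append(f"total references exceed {MAX_TOTAL}")
    if audio > 0 and images + videos == 0:
        problems.append("audio references require at least 1 image or video")
    return problems
```

For instance, 9 images, 3 videos, and 3 audio files each respect their own limit but fail the total check, since 15 exceeds 12.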
Example prompt:
@Image1 is walking through a forest in the style of @Image2.
The camera follows her from behind as she moves along the path shown in @Video1.
The ambient soundtrack from @Audio1 plays throughout the scene.
Multi-Prompt
multi_prompt_enabled boolean default: false Enable multi-shot video generation. When turned on, the video is composed of multiple sequential shots, each with its own prompt and duration.
multi_prompt_config json Array of shot definitions, each containing a prompt and duration. Used when multi_prompt_enabled is true. Supported by Kling v3 and Kling o3 Ref.
Example:
[
{ "prompt": "Close-up of the product on a table", "duration": 5 },
{ "prompt": "Camera pulls back to reveal the full scene", "duration": 5 }
]
Model comparison
| Feature | Veo 3.1 | Sora 2 | Sora 2 Pro | Kling v3 | Kling o3 | Kling o3 Ref | Seedance 2.0 | Seedance 2.0 Ref |
|---|---|---|---|---|---|---|---|---|
| Start Frame | Optional | Optional | Optional | Required | Required | Optional | Required | No |
| Aspect Ratios | 16:9, 9:16 | Auto, 9:16, 16:9 | Auto, 9:16, 16:9 | 16:9, 9:16, 1:1 | 16:9, 9:16, 1:1 | 16:9, 9:16, 1:1 | Auto, 21:9, 16:9, 4:3, 1:1, 3:4, 9:16 | Auto, 21:9, 16:9, 4:3, 1:1, 3:4, 9:16 |
| Duration | 4–8s | 4, 8, 12s | 4, 8, 12s | 3–15s | 3–15s | 3–15s | Auto, 4–15s | Auto, 4–15s |
| Resolution | 4K, 1080p, 720p | Auto, 720p | Auto, 1080p, 720p | 1080p | 1080p | 1080p | 720p, 480p | 720p, 480p |
| Audio | Yes | No | No | Yes | Yes | No | Yes | Yes |
| End Frame | Yes | No | No | Yes | Yes | Yes | Yes | No |
| References | 3 images | No | No | No | No | 4 images | No | 9 images, 3 videos, 3 audio |
| Elements | No | No | No | 4 max | No | 4 max | No | No |
| Multi-Prompt | No | No | No | Yes | No | Yes | No | No |
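
The feature rows of the table can be encoded as data so that a configuration is checked against the chosen model before a run. This is a sketch under stated assumptions: the dictionary layout, the feature keys, and the helper `unsupported_features` are illustrative and are not the node's API. The values are transcribed from the Audio, End Frame, Elements, and Multi-Prompt rows.

```python
# Feature flags transcribed from the comparison table.
# Tuple order: (audio, end_frame, elements, multi_prompt).
_ROWS = {
    "Veo 3.1": (True, True, False, False),
    "Sora 2": (False, False, False, False),
    "Sora 2 Pro": (False, False, False, False),
    "Kling v3": (True, True, True, True),
    "Kling o3": (True, True, False, False),
    "Kling o3 Ref": (False, True, True, True),
    "Seedance 2.0": (True, True, False, False),
    "Seedance 2.0 Ref": (True, False, False, False),
}
_FEATURES = ("audio", "end_frame", "elements", "multi_prompt")

CAPABILITIES = {
    model: dict(zip(_FEATURES, flags)) for model, flags in _ROWS.items()
}


def unsupported_features(model, requested):
    """Return the requested features that the table marks unsupported for `model`."""
    caps = CAPABILITIES[model]
    return sorted(f for f in requested if not caps.get(f, False))
```

For example, `unsupported_features("Seedance 2.0 Ref", ["end_frame", "audio"])` returns `["end_frame"]`, matching the table: Reference mode generates audio but has no end frame.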
Output
output video The generated video file.
Examples
Product animation with Veo 3.1
Model: Veo 3.1
Start frame: Product photo on a clean background
Prompt: “Product slowly rotates with soft studio lighting, gentle reflections on surface, ambient background music”
Audio: Enabled
Duration: 6s
The node generates a polished product showcase video with synchronized AI audio.
Social media clip with Sora 2
Model: Sora 2
Start frame: Landscape photograph
Prompt: “Cinematic camera pan from left to right, clouds drifting in the sky, sun rays breaking through”
Aspect ratio: 9:16
Duration: 8s
Produces a vertical video ready for social media platforms.
Multi-shot narrative with Kling v3
Model: Kling v3
Start frame: Character portrait
Multi-prompt enabled: true
Shots:
- “Close-up of the character looking at the camera, subtle smile” — 5s
- “Camera pulls back to reveal a city skyline at sunset behind the character” — 5s
- “Wide aerial shot of the city as the sun sets” — 5s
Creates a 15-second narrative video with three sequential shots, maintaining visual consistency.
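The three shots above can be written in the documented `prompt`/`duration` shape of `multi_prompt_config`, as in the sketch below. It also sums the shot durations. Checking that sum against the 15-second upper end of the Kling range is an assumption made for illustration; the documentation does not state how per-shot durations relate to the overall limit.

```python
import json

# The three shots from the example, in the documented
# {"prompt", "duration"} shape of multi_prompt_config.
shots = [
    {"prompt": "Close-up of the character looking at the camera, subtle smile",
     "duration": 5},
    {"prompt": "Camera pulls back to reveal a city skyline at sunset behind the character",
     "duration": 5},
    {"prompt": "Wide aerial shot of the city as the sun sets",
     "duration": 5},
]

KLING_MAX_SECONDS = 15  # upper end of the documented 3 to 15 s Kling range

total_seconds = sum(shot["duration"] for shot in shots)
# Assumption: the combined shot length must fit within the model's range.
fits = total_seconds <= KLING_MAX_SECONDS

# JSON text suitable for pasting into the multi_prompt_config field.
multi_prompt_config = json.dumps(shots, indent=2)
print(total_seconds, fits)  # prints: 15 True
```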
Reference-driven video with Seedance 2.0 Reference
Model: Seedance 2.0 Reference
References: 2 images, 1 video, 1 audio
Prompt: “@Image1 is walking through a forest in the style of @Image2. The camera follows her from behind as she moves along the path shown in @Video1. The ambient soundtrack from @Audio1 plays throughout the scene with birds chirping.”
Duration: Auto
Audio: Enabled
No start frame is needed. The model composes the video entirely from the referenced media and the prompt description. Each @Image, @Video, and @Audio tag maps to a connector input on the node.
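Because each tag maps to a connector, the number of references a prompt needs can be derived from the tags it cites. The sketch below does this with a regular expression. The helper name `required_connectors` is hypothetical, and it assumes connectors are numbered contiguously from 1, so citing `@Image2` implies `@Image1` exists as well.

```python
import re
from collections import defaultdict

TAG_PATTERN = re.compile(r"@(Image|Video|Audio)(\d+)")


def required_connectors(prompt):
    """Return the highest index cited per modality, e.g. {"Image": 2}.

    Assumes connectors are numbered contiguously from 1.
    """
    highest = defaultdict(int)
    for kind, index in TAG_PATTERN.findall(prompt):
        highest[kind] = max(highest[kind], int(index))
    return dict(highest)


prompt = (
    "@Image1 is walking through a forest in the style of @Image2. "
    "The camera follows her from behind as she moves along the path "
    "shown in @Video1. The ambient soundtrack from @Audio1 plays "
    "throughout the scene with birds chirping."
)
print(required_connectors(prompt))
```

For the example prompt this yields 2 images, 1 video, and 1 audio reference, which matches the reference counts listed in the example.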
Best practices
- Start with high-quality images. The output quality directly depends on the input image resolution and clarity.
- Be specific in your prompts. Describe camera motion, lighting changes, and subject movement explicitly rather than using vague terms.
- Match the model to your needs. Use Veo 3.1 for high-resolution output (up to 4K), Sora 2 for fixed-length clips of 4, 8, or 12s, Kling for multi-shot narratives, element consistency, or clips up to 15s, and Seedance 2.0 Reference when you need multi-modal references (images + videos + audio).
- Use end frames for controlled transitions. When you need the video to arrive at a specific final state, provide an end frame image.
- Keep multi-prompt shots coherent. Each shot should flow naturally into the next. Describe transitions in the prompts.
Common issues
Video quality is low or blurry
Use a higher-resolution source image and increase the output resolution setting. Avoid upscaling small images before input.
Motion does not match the prompt
Be more explicit about the type of motion. Instead of “make it move,” describe the exact camera movement or subject action. Enable prompt enhancement to let the model refine your description.
Audio is missing from the output
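AI audio generation is not available on every model. Verify that generate_audio is enabled and that the selected model supports audio: per the model comparison table, these are Veo 3.1, Kling v3, Kling o3, Seedance 2.0, and Seedance 2.0 Reference.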
Elements are not reflected in the video
Ensure you reference elements in the prompt using @Element1, @Element2, etc. Elements are only supported by Kling v3 and Kling o3 Ref.
Seedance 2.0 Reference: audio is ignored
Audio references require at least one image or video reference to be provided. Make sure you have added at least one @Image or @Video before adding @Audio inputs.