
Models

Every base AI generation model available in the Playground — from cinematic video to still image generation.

Models are the underlying AI engines that power everything in Ad Studio. In the Playground, you interact with them directly — no guided format, just raw generation with full control over settings. Each model has different strengths, speed profiles, and output styles.

Nano Banana 2

Advanced image generation with support for up to 16 reference images and 4K resolution output. Faster than Nano Banana Pro with very similar quality — a strong option when you need high-quality images without the longer wait time.

Provider: Google Gemini

Nano Banana Pro

The most advanced image generation model available on the platform. Takes 2 to 3 minutes per image but produces exceptional detail and accuracy. Supports up to 16 reference images and 4K resolution.

Google Search Grounding

A toggle available in Nano Banana Pro. Turn it on when you reference real-world products, brands, or objects — like a Bentley interior, the latest iPhone, or new AirPods. It searches Google Images to understand your reference and produce more accurate results.

Provider: Google Gemini

Sora 2 Pro

The most advanced, most popular, and most expensive video generation model on the platform. Produces the most natural, human-like movement — including the small imperfections that make a video look like it was shot on an iPhone. Best for realistic UGC, authentic social media ads, and any generation where natural lip sync and actor consistency matter.

  • Duration: Up to 20 seconds at full HD
  • Cost: Highest tier — up to 1,200 credits per video

Provider: OpenAI

Sora 2 Remix

Select a Sora 2 video and describe the visual edits you want to make. Think of it as a visual editor for Sora-generated content — change the setting, lighting, background, or look without regenerating from scratch.

Provider: OpenAI

Kling 2.6

Last year's leading video model. Reliable for B-rolls and low-movement videos. Not well-suited for realistic talking heads or authentic UGC — it tends to produce smoother, more polished movement that can look less natural than Sora.

Provider: Kling

Kling 3.0

The newest Kling model, released February 2026. Significantly better quality, sound, and lip sync compared to Kling 2.6. Much better at keeping text consistent across frames — for example, text printed on a T-shirt stays readable and stable. Still not the best option for natural, authentic UGC movement.

  • Duration: Up to 15 seconds

Provider: Kling

VEO 3.1

Google's video generation model. Shorter maximum duration than most models but higher resolution output. Good for B-rolls, controlled product animations, and visuals where resolution quality is the priority. More expensive per second than comparable models.

  • Duration: Up to 8 seconds at up to 4K resolution

Provider: Google Veo

VEO Reference

VEO 3.1 with support for up to 3 reference images. Excellent for multi-character scenes where you want to reference specific people and describe interactions between them.

Provider: Google Veo

Seedance 1.5 Pro

Strong for B-rolls and static-to-motion content. If you have an image of a person and want subtle, natural movement — a slight head turn, a live-photo feel, someone glancing up — Seedance handles this well. Similar quality tier to Kling for B-roll work. Not the best choice for authentic UGC or social media content.

Provider: ByteDance / Fal

Motion Control

Upload a video of yourself performing an action, then upload an image of a different person or setting. The AI maps the movement from your video onto the target image. Useful for specific movements that are difficult to describe in a prompt — meme-style content, creative transitions, or character animations.

Provider: Kling

Kling V3 OMNI

Multi-shot scene generation — up to five shots per video, each with an independent duration and its own prompt. Best for cinematic ads, brand awareness sequences, and product showcases that require multiple connected shots (for example: someone reaching into a bag, pulling out a product, and using it).

Elements

A unique feature of Kling V3 OMNI. Create named elements by uploading 2 to 4 reference images with a description. Then reference them in your prompts using @elementname tags. This keeps characters, products, or objects visually consistent across every shot. You can use multiple elements per generation and combine them with multi-shot for complex sequences.
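As an illustration, a multi-shot generation that combines two elements might look like this (the element names, reference counts, and shot descriptions here are hypothetical examples, not a fixed syntax):

```
Elements:
  @founder — 3 reference photos, "woman in her 30s, casual studio look"
  @serum   — 2 product shots of the serum bottle

Shot 1 (3s): @founder reaches into a tote bag on a kitchen counter, morning light.
Shot 2 (2s): Close-up as @founder pulls out @serum and holds it toward the camera.
Shot 3 (4s): @founder applies @serum, smiling, with a soft-focus background.
```

Because each shot references the same named elements, the character and product stay visually consistent across the whole sequence.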

Provider: Kling

GPT Image

Standard image generation powered by OpenAI. Faster and cheaper than Nano Banana Pro. Best for quick visualizations, concept iterations, or any time you need an image fast without requiring maximum quality.

Provider: OpenAI