Vidgen API v2

Introduction

The Vidgen API provides AI-powered video generation using multi-scene diffusion models, with built-in prompt enhancement, content compliance checking, content credentials, and watermarking. All prompt parsing, workflow construction, and post-processing are handled server-side — clients submit raw scene prompts and receive signed MP4 videos hosted on S3.

  • Multi-scene video generation (1 to 12 scenes per request, videos up to 60 seconds)
  • Text-to-video (t2v) and image-to-video (i2v) modes per scene via start/end frames
  • Optional LLM-powered prompt enhancement
  • Automated pre-generation and post-generation content compliance review
  • Optional AI-generated background music and FX sound effects
  • C2PA manifest signing, invisible watermark embedding, and fingerprinting
  • S3-hosted output with pre-signed URLs for the signed video and thumbnail

Base URL

All actions are sent as POST requests to a single URL. The body is JSON (Content-Type: application/json) and must contain an action field that selects the operation.

POST https://vidgen.api.efficientstack.com/api/v2
Content-Type: application/json
Authorization: Bearer <API-KEY>

{
  "action": "generate",
  ...
}

Authentication

All requests must include a Bearer token in the Authorization header:

Authorization: Bearer <API-KEY>

Requests without a valid, enabled key receive a 401 Unauthorized response.

Prompt Syntax

The API uses a unified raw prompt format per scene. The server parses the following tokens:

| Token | Syntax | Example | Description |
|---|---|---|---|
| Negative | -term | -blurry | Comma-separated term prefixed with - is moved to the negative prompt. |
| Positive | everything else | woman dancing, smooth camera | Descriptive prompt text for the scene. |

Example

cinematic ocean waves crashing on rocks, golden hour, slow motion, -blurry, -low quality, -shaky

Parsed as: positive=cinematic ocean waves crashing on rocks, golden hour, slow motion, negative=blurry, low quality, shaky.
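
The split can be reproduced client-side to preview how a prompt will be interpreted. The following is a minimal sketch, not the server's parser: the function name split_prompt is hypothetical, and the comma-splitting and whitespace-trimming rules are assumptions inferred from the example above.

```python
def split_prompt(raw: str) -> tuple[str, str]:
    """Split a raw scene prompt into (positive, negative) strings.

    Assumption: terms are comma-separated, and a term is negative only
    when it starts with "-" after surrounding whitespace is trimmed.
    """
    positive, negative = [], []
    for term in raw.split(","):
        term = term.strip()
        if not term:
            continue  # ignore empty segments such as a trailing comma
        if term.startswith("-"):
            stripped = term[1:].strip()
            if stripped:
                negative.append(stripped)
        else:
            positive.append(term)
    return ", ".join(positive), ", ".join(negative)


raw = ("cinematic ocean waves crashing on rocks, golden hour, "
       "slow motion, -blurry, -low quality, -shaky")
positive, negative = split_prompt(raw)
print("positive =", positive)
print("negative =", negative)
```

A hyphen inside a term (for example "slow-motion") does not trigger the negative rule in this sketch, because only a leading "-" is checked.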

Endpoints

| Action | Description |
|---|---|
| models | List available models and their labels |
| generate | Submit a multi-scene video generation job |
| status | Poll a generation job for completion |
| optimize | Enhance a prompt with LLM |
| random | Get a random prompt from configured seeds |

POST models

Returns available video generation models and the maximum number of scenes allowed per request. No parameters beyond action.

{ "action": "models" }

Response

{
  "models": {
    "scenex": { "label": "SceneX" }
  },
  "max_scenes": 5
}

POST generate

Submit a video generation job. Returns a job ID for polling. Each scene produces approximately 5 seconds of video; scenes are stitched into a single continuous MP4.

Request Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| action | string | yes | | Must be "generate". |
| scenes | array | yes | | Array of 1 to max_scenes scene objects. |
| scenes[].prompt | string | yes | | Scene prompt. Supports -negative syntax. |
| scenes[].negative | string | no | "" | Additional negative terms for this scene. |
| scenes[].start_frame | string | no | | Base64 data URI of the start frame (enables i2v mode). |
| scenes[].end_frame | string | no | | Base64 data URI of the end frame (first-last frame mode). |
| model | string | no | first model | Model key from models response. |
| optimize | boolean | no | false | LLM-enhance each scene prompt before generation. |
| size | integer | no | 720 | Max dimension for i2v frame resize (720–1280). |
| width | integer | no | 832 | Width for t2v mode. Must be divisible by 8. |
| height | integer | no | 480 | Height for t2v mode. Must be divisible by 8. |
| upscale | boolean | no | false | Enable 2× AI upscaling. |
| music.enabled | boolean | no | false | Enable AI-generated background music. |
| music.tags | string | no | "" | Music style tags (e.g. "R&B, slow jam"). |
| music.lyrics | string | no | "[Inst]" | Lyrics or [Inst] for instrumental. |
| music.bpm | integer | no | 85 | Tempo in beats per minute. |
| music.keyscale | string | no | "Eb minor" | Musical key signature. |
| fx_sound.enabled | boolean | no | false | Enable AI-generated FX sound effects. |
| fx_sound.prompt | string | no | "" | FX audio description (e.g. "wind, footsteps"). |
| fx_sound.negative_prompt | string | no | "" | FX audio terms to avoid. |
| wm_image | string | no | "" | Custom watermark as base64 data URI. |
| wm_position | string | no | "bottom-right" | Watermark anchor: top-left, top-right, center, bottom-left, bottom-right. |
| wm_scale | integer | no | 5 | Watermark scale (1–100). |
| wm_transparency | integer | no | 100 | Watermark opacity (0–100). |
| wm_rotation | integer | no | 0 | Watermark rotation in degrees (-360 to 360). |
| wm_padding_x | integer | no | 10 | Horizontal padding from anchor (0–500). |
| wm_padding_y | integer | no | 10 | Vertical padding from anchor (0–500). |
| cs_author | string | no | "" | Override C2PA author. |
| cs_title | string | no | "" | Override C2PA title. |
| cs_description | string | no | "" | Override C2PA description. |
| cs_organization | string | no | "" | Override C2PA organization. |
| cs_vendor | string | no | "" | Override C2PA vendor. |
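
Checking a request body against the documented ranges before sending it can save a round trip. The sketch below is a hypothetical client-side helper (check_generate_body is not part of the API). It covers only the ranges listed in the table, and the server may apply further rules.

```python
ANCHORS = {"top-left", "top-right", "center", "bottom-left", "bottom-right"}

# (min, max) ranges copied from the parameter table.
RANGES = {
    "size": (720, 1280),
    "wm_scale": (1, 100),
    "wm_transparency": (0, 100),
    "wm_rotation": (-360, 360),
    "wm_padding_x": (0, 500),
    "wm_padding_y": (0, 500),
}


def check_generate_body(body: dict, max_scenes: int) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    scenes = body.get("scenes") or []
    if not 1 <= len(scenes) <= max_scenes:
        problems.append(f"scenes must contain 1 to {max_scenes} items")
    for i, scene in enumerate(scenes):
        if not str(scene.get("prompt", "")).strip():
            problems.append(f"scenes[{i}].prompt is empty")
    for key, (lo, hi) in RANGES.items():
        if key in body and not lo <= body[key] <= hi:
            problems.append(f"{key} must be between {lo} and {hi}")
    if "wm_position" in body and body["wm_position"] not in ANCHORS:
        problems.append("wm_position is not a documented anchor")
    return problems


body = {"action": "generate", "scenes": [{"prompt": "ocean waves"}], "wm_scale": 150}
print(check_generate_body(body, max_scenes=5))
```

The max_scenes argument is meant to come from the models response rather than being hard-coded.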

Scene Modes

The generation mode for each scene is determined by which frames are provided:

| Start Frame | End Frame | Mode | Description |
|---|---|---|---|
| Omitted | Omitted | Text-to-Video | Generates video from text using width × height. |
| Provided | Omitted | Image-to-Video | Animates from the start frame. Frame is resized to size. |
| Provided | Provided | First-Last Frame | Generates video transitioning between both frames. |
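
As an illustration of the table, the sketch below encodes image bytes as a base64 data URI and reports which mode a scene object would select. Both helper names are hypothetical, the default MIME type is an assumption (the accepted image formats are not listed here), and an end frame without a start frame is reported as undocumented because the table does not cover it.

```python
import base64


def to_data_uri(image_bytes: bytes, mime: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URI."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def scene_mode(scene: dict) -> str:
    """Name the mode the table implies for a scene object."""
    has_start = bool(scene.get("start_frame"))
    has_end = bool(scene.get("end_frame"))
    if has_start and has_end:
        return "first-last-frame"
    if has_start:
        return "image-to-video"
    if has_end:
        return "undocumented"  # end frame alone is not in the table
    return "text-to-video"


frame = to_data_uri(b"placeholder bytes, not a real image")
print(scene_mode({"prompt": "waves"}))
print(scene_mode({"prompt": "waves", "start_frame": frame}))
print(scene_mode({"prompt": "waves", "start_frame": frame, "end_frame": frame}))
```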

Valid Dimensions (t2v mode)

width and height must each be between 320 and 1280 and should be divisible by 8. Common pairs: 832×480 (approximately 16:9), 480×832 (approximately 9:16), 768×512 (3:2), 512×768 (2:3), 640×480 (4:3), 640×640 (1:1). Values not divisible by 8 are rounded to the nearest multiple of 8, and out-of-range values are clamped to 320–1280.
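
To predict the dimensions the server will actually use, the rounding and clamping can be mirrored locally. This is a sketch under assumptions: snap_dimension is a hypothetical helper, and the tie-breaking rule (a remainder of exactly 4 rounds up) is a guess, since the server's behavior for ties is not documented.

```python
def snap_dimension(value: int, lo: int = 320, hi: int = 1280, step: int = 8) -> int:
    """Round to the nearest multiple of `step`, then clamp to [lo, hi].

    Assumption: exact ties (remainder 4) round up.
    """
    snapped = ((value + step // 2) // step) * step
    return max(lo, min(hi, snapped))


for requested in (832, 835, 836, 300, 1300):
    print(requested, "->", snap_dimension(requested))
```

Sending values that are already multiples of 8 within range avoids relying on the silent adjustment at all.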

Response

{
  "id": "run_abc123xyz",
  "gen_id": "a1b2c3d4e5f6g7h8i9j0",
  "model": "scenex",
  "scenes": [
    { "prompt": "cinematic ocean waves crashing on rocks, golden hour, slow motion, -blurry" }
  ]
}

POST status

Poll a generation job. Pass the id and model returned by generate.

{
  "action": "status",
  "id": "run_abc123xyz",
  "model": "scenex"
}

Pending

{ "status": "IN_QUEUE" }
{ "status": "IN_PROGRESS" }

Completed (Safe)

{
  "status": "COMPLETED",
  "meta": {
    "id": "run_abc123xyz",
    "gen_id": "a1b2c3d4e5f6g7h8i9j0",
    "time": "2025-06-20T14:30:00Z",
    "prompt": "cinematic ocean waves crashing on rocks, golden hour, slow motion, -blurry",
    "scenes": [
      {
        "prompt": "cinematic ocean waves crashing on rocks, golden hour, slow motion, -blurry",
        "positive": "cinematic ocean waves crashing on rocks, golden hour, slow motion",
        "negative": "base negatives, blurry",
        "has_start_frame": false,
        "has_end_frame": false
      }
    ],
    "model": "scenex",
    "size": 720,
    "width": 832,
    "height": 480,
    "upscale": false,
    "music_enabled": false,
    "fx_sound_enabled": false,
    "scene_count": 1,
    "optimized": false,
    "compliance": "safe",
    "compliance_codes": [],
    "fingerprint": "a4f8e2c1b9d0...",
    "flagged": false,
    "images": [
      { "url": "https://s3.../video.mp4?...", "filename": "genid_fp_00001.mp4", "type": "signed" },
      { "url": "https://s3.../thumb.jpg?...", "filename": "genid_fp_00001_thumb.jpg", "type": "thumbnail" }
    ],
    "enable_watermark": true,
    "enable_c2pa": true,
    "compute_fingerprint": true,
    "wm_position": "bottom-right",
    "wm_scale": 5,
    "wm_transparency": 100,
    "cs_author": "Author",
    "cs_title": "AI Video",
    "cs_description": "Text to Video - Generative AI",
    "cs_organization": "",
    "cs_vendor": ""
  }
}

Completed (Blocked)

{
  "status": "COMPLETED",
  "compliance": "unsafe",
  "error": "Content blocked — flagged for: Category Name.",
  "filter_categories": ["id1", "id2"]
}

Failed

{ "status": "FAILED", "error": "Generation failed" }

POST optimize

Enhance a prompt independently of generation. Uses the model’s configured LLM enhancement prompt.

{
  "action": "optimize",
  "prompt": "woman walking in rain",
  "model": "scenex"
}

Response

{
  "prompt": "elegant woman walking gracefully through a gentle downpour, cobblestone street glistening with rain, cinematic lighting, shallow depth of field"
}

POST random

Returns a random prompt assembled from configured seed categories (actions, clothing, framing, locations). No parameters beyond action.

{ "action": "random" }

Response

{
  "prompt": "dancing gracefully, flowing silk dress, medium close-up, on a sunlit rooftop terrace"
}

Prompt Construction

The server constructs the final positive and negative prompts for each scene from the raw input. Understanding this helps write more effective prompts.

Positive Prompt

{user prompt text, with -negative terms removed}

If optimize is enabled, the user prompt is first enhanced by an LLM using the model’s configured enhancement system prompt.

Negative Prompt

{model base negatives}[, parsed -terms from prompt][, explicit scene negative]

Negatives are assembled in this order: the model’s built-in base negatives, then any -term tokens parsed from the raw prompt, then any explicit negative string provided in the scene object.
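
The ordering can be sketched as a small function. This is an illustration only: build_negative is a hypothetical name, the base negatives string is a placeholder, and whether the server removes duplicate terms is not known, so the sketch leaves them in.

```python
def build_negative(base: str, parsed_terms: list[str], scene_negative: str = "") -> str:
    """Join the three negative sources in the documented order."""
    parts = [base.strip(), *[t.strip() for t in parsed_terms], scene_negative.strip()]
    return ", ".join(p for p in parts if p)  # skip sources that are empty


print(build_negative("base negatives", ["blurry"]))
print(build_negative("base negatives", ["blurry"], "grainy, flat colors"))
```

The first call reproduces the negative value shown in the completed status response ("base negatives, blurry").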

Content Credentials

All generated videos can include three layers of content provenance, configurable via the admin panel:

  • C2PA manifest signing — Embeds a signed C2PA manifest into the MP4 container, recording the model, software agent, generation type, author, and other metadata. Verifiable at contentcredentials.org/verify.
  • Invisible watermark — Embeds a DCT-domain watermark carrying the generation ID into every N-th frame. The watermark is imperceptible but extractable with the corresponding private key.
  • Fingerprinting — Computes a perceptual fingerprint of the output video, returned in meta.fingerprint and embedded in the output filename.

Override C2PA metadata per request using: cs_author, cs_title, cs_description, cs_organization, cs_vendor. When omitted, server-configured defaults are used.
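
A request that overrides only some C2PA fields might be assembled as follows. The helper name with_c2pa_overrides is hypothetical. Empty values are dropped on the assumption that leaving a key out is the most reliable way to keep the server default; how the server treats an explicit empty string is not stated here.

```python
C2PA_KEYS = ("cs_author", "cs_title", "cs_description", "cs_organization", "cs_vendor")


def with_c2pa_overrides(body: dict, **overrides: str) -> dict:
    """Return a copy of `body` with non-empty C2PA overrides added."""
    unknown = set(overrides) - set(C2PA_KEYS)
    if unknown:
        raise ValueError(f"unknown C2PA field(s): {sorted(unknown)}")
    merged = dict(body)  # leave the caller's dict untouched
    merged.update({k: v for k, v in overrides.items() if v})
    return merged


request_body = with_c2pa_overrides(
    {"action": "generate", "scenes": [{"prompt": "ocean waves at sunset"}]},
    cs_author="Jane Doe",
    cs_title="",  # empty, so the server default is kept
)
print(request_body)
```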

Polling Strategy

| Setting | Recommended |
|---|---|
| Interval | 5 seconds |
| Max polls | 360 (30 min timeout) |
| Overlap guard | Wait for each poll response before starting the next |

Terminal states: COMPLETED, FAILED, TIMED_OUT, CANCELLED. On COMPLETED, always check the top-level compliance field before accessing meta: a value of "unsafe" means the output was blocked by post-generation compliance review, and the response carries error and filter_categories instead of meta.
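
The decision a poller makes on each response can be isolated into one function, which keeps the polling loop itself trivial. The sketch below uses the response shapes shown under POST status; the function name classify and its four return labels are hypothetical.

```python
PENDING = {"IN_QUEUE", "IN_PROGRESS"}


def classify(response: dict) -> str:
    """Map a status response to 'pending', 'ready', 'blocked', or 'failed'."""
    status = response.get("status")
    if status in PENDING:
        return "pending"
    if status == "COMPLETED":
        # A blocked job reports compliance at the top level and has no meta.
        if response.get("compliance") == "unsafe":
            return "blocked"
        return "ready"
    # FAILED, TIMED_OUT, CANCELLED, and anything unrecognized.
    return "failed"


print(classify({"status": "IN_QUEUE"}))
print(classify({"status": "COMPLETED", "meta": {"compliance": "safe"}}))
print(classify({"status": "COMPLETED", "compliance": "unsafe", "error": "Content blocked"}))
print(classify({"status": "TIMED_OUT"}))
```

Treating unknown statuses as failures is a conservative choice: the loop stops instead of polling forever on a state it does not understand.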

Generation Times

Video generation typically takes ~100 seconds per scene at 480p. Upscaling and higher resolutions increase time by 4–5×. Multi-scene videos are proportionally longer. Music and FX sound add an additional 15–30 seconds.
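
These figures can be combined into a rough estimate for sizing a polling timeout. The sketch below is back-of-the-envelope arithmetic, not a guarantee: estimate_seconds is a hypothetical helper, the 4–5× factor is applied once when upscaling or a higher resolution is used, and the 15–30 second audio cost is counted once whether music, FX sound, or both are enabled.

```python
def estimate_seconds(scenes: int, slow_path: bool = False, audio: bool = False) -> tuple[int, int]:
    """Return a (low, high) estimate in seconds from the quoted figures.

    slow_path: upscaling or a resolution above 480p is requested.
    audio: music and/or FX sound is enabled.
    """
    base = 100 * scenes                  # ~100 s per scene at 480p
    low, high = base, base
    if slow_path:
        low, high = base * 4, base * 5   # 4-5x slower
    if audio:
        low, high = low + 15, high + 30  # one-off audio cost (assumption)
    return low, high


print(estimate_seconds(3))
print(estimate_seconds(3, slow_path=True, audio=True))
print(estimate_seconds(5, slow_path=True))
```

By these numbers, an upscaled job with five scenes could take 2000 seconds or more, which is longer than the 30-minute window implied by 360 polls at 5-second intervals, so the poll limit may need raising for such jobs.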

Error Handling

| Status | Error | Cause |
|---|---|---|
| 400 | Missing action | No action field in request body |
| 400 | scenes array is required | Missing or empty scenes array |
| 400 | Each scene requires a prompt | A scene object has an empty prompt |
| 400 | Maximum N scenes allowed | Too many scenes for current configuration |
| 400 | Content blocked | Pre-generation compliance check failed |
| 400 | Bad ID | Invalid job ID format in status request |
| 401 | Missing or invalid Authorization header | No Bearer token provided |
| 401 | Invalid API key | Key not found or does not match |
| 401 | API key is disabled | Key exists but has been disabled |
| 500 | No models configured | No models set up in admin panel |
| 502 | Generation failed | Upstream RunPod error |
| 502 | Enhancement failed | LLM prompt enhancement error |

Silent fallbacks: invalid model → first configured model, non-divisible-by-8 dimensions → nearest multiple of 8, out-of-range dimensions → clamped to 320–1280.
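
Client code can turn these responses into typed exceptions so callers can tell a bad request from an upstream failure. The sketch below is one possible approach, not part of the API: VidgenError and raise_for_error are hypothetical names, the error body is assumed to be a JSON object with an error field (as the code examples in this document expect), and treating only 502 as retryable is an assumption.

```python
class VidgenError(Exception):
    """Raised for HTTP error responses; the class name is hypothetical."""

    def __init__(self, status: int, message: str):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message
        # Assumption: only upstream (502) failures are worth retrying unchanged.
        self.retryable = status == 502


def raise_for_error(status: int, body: dict) -> dict:
    """Return `body` for non-error statuses, otherwise raise VidgenError."""
    if status < 400:
        return body
    raise VidgenError(status, str(body.get("error", "Unknown error")))


try:
    raise_for_error(400, {"error": "scenes array is required"})
except VidgenError as exc:
    print(exc, "| retryable:", exc.retryable)
```

A 400 or 401 will fail the same way on every attempt, so retrying only makes sense after the request or key has been fixed.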

Code Examples

JavaScript

const API = 'https://vidgen.api.efficientstack.com/api/v2';
const TOKEN = 'your-api-key';

async function api(body) {
  const r = await fetch(API, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', 'Authorization': `Bearer ${TOKEN}` },
    body: JSON.stringify(body)
  });
  return r.json();
}

async function generateVideo(scenes, opts = {}) {
  const job = await api({
    action: 'generate', scenes, model: opts.model || 'scenex',
    size: opts.size || 720, width: opts.width || 832, height: opts.height || 480,
    upscale: opts.upscale || false, optimize: opts.optimize || false
  });
  if (job.error) throw new Error(job.error);

  for (let i = 0; i < 360; i++) {
    await new Promise(r => setTimeout(r, 5000));
    const s = await api({ action: 'status', id: job.id, model: job.model });
    if (s.status === 'IN_QUEUE' || s.status === 'IN_PROGRESS') continue;
    if (s.status === 'COMPLETED') {
      if (s.compliance === 'unsafe') throw new Error(s.error);
      return s.meta;
    }
    throw new Error(s.error || 'Failed: ' + s.status);
  }
  throw new Error('Timeout');
}

// Single scene (text-to-video)
const meta = await generateVideo([
  { prompt: 'cinematic ocean waves crashing at sunset, golden hour, -blurry' }
]);

// Multi-scene
const multi = await generateVideo([
  { prompt: 'woman walking down a neon-lit street at night, rain reflections' },
  { prompt: 'close-up of her face looking up, rain drops on skin, slow motion' },
  { prompt: 'wide aerial shot pulling away from the city, -shaky, -low quality' }
], { upscale: true });

const video = meta.images.find(i => i.type === 'signed');
console.log('Video URL:', video?.url);

Python

import time

import requests

API = "https://vidgen.api.efficientstack.com/api/v2"
TOKEN = "your-api-key"

def api(body):
    # The timeout keeps a stalled connection from hanging the poll loop indefinitely.
    return requests.post(API, json=body,
        headers={"Authorization": f"Bearer {TOKEN}"}, timeout=30).json()

def generate(scenes, model="scenex", size=720, upscale=False):
    job = api({
        "action": "generate", "scenes": scenes,
        "model": model, "size": size, "upscale": upscale
    })
    if "error" in job:
        raise Exception(job["error"])
    for _ in range(360):
        time.sleep(5)
        s = api({"action": "status", "id": job["id"], "model": job["model"]})
        if s["status"] in ("IN_QUEUE", "IN_PROGRESS"):
            continue
        if s["status"] == "COMPLETED":
            if s.get("compliance") == "unsafe":
                raise Exception(s.get("error"))
            return s["meta"]
        raise Exception(s.get("error", f"Failed: {s['status']}"))
    raise TimeoutError("Polling timed out")

# Single scene
meta = generate([{"prompt": "cinematic ocean waves crashing at sunset"}])

# Multi-scene with upscaling
meta = generate(
    [
        {"prompt": "woman dancing in a ballroom, elegant dress, warm lighting"},
        {"prompt": "spinning in slow motion, camera orbiting around her"},
    ],
    model="scenex",
    upscale=True,
)

video = next(i for i in meta["images"] if i["type"] == "signed")
print("URL:", video["url"])

cURL

TOKEN="your-api-key"
BASE="https://vidgen.api.efficientstack.com/api/v2"

# List models
curl -s -X POST "$BASE" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"action":"models"}'

# Generate (single scene, text-to-video)
curl -s -X POST "$BASE" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "action": "generate",
    "model": "scenex",
    "scenes": [{"prompt": "cinematic ocean waves crashing at sunset, -blurry"}],
    "width": 832,
    "height": 480
  }'

# Generate (multi-scene with music)
curl -s -X POST "$BASE" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "action": "generate",
    "model": "scenex",
    "scenes": [
      {"prompt": "woman walking through autumn forest, golden leaves falling"},
      {"prompt": "close-up of leaves crunching underfoot, shallow depth of field"}
    ],
    "music": {"enabled": true, "tags": "ambient, cinematic", "bpm": 72}
  }'

# Poll status (replace JOB_ID with the id from generate response)
curl -s -X POST "$BASE" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"action":"status","id":"JOB_ID","model":"scenex"}'

# Enhance prompt
curl -s -X POST "$BASE" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"action":"optimize","prompt":"woman walking in rain","model":"scenex"}'

# Random prompt
curl -s -X POST "$BASE" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"action":"random"}'

PHP

<?php
$api = 'https://vidgen.api.efficientstack.com/api/v2';
$token = 'your-api-key';

function apiCall($api, $token, $body) {
    $ch = curl_init($api);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST => true,
        CURLOPT_POSTFIELDS => json_encode($body),
        CURLOPT_HTTPHEADER => [
            'Content-Type: application/json',
            "Authorization: Bearer $token",
        ],
    ]);
    $r = curl_exec($ch);
    curl_close($ch);
    return json_decode($r, true);
}

// Generate a single-scene video
$job = apiCall($api, $token, [
    'action' => 'generate',
    'model' => 'scenex',
    'scenes' => [['prompt' => 'cinematic ocean waves crashing at sunset, -blurry']],
    'width' => 832,
    'height' => 480,
]);

if (isset($job['error'])) {
    die('Error: ' . $job['error']);
}

// Poll for completion
for ($i = 0; $i < 360; $i++) {
    sleep(5);
    $s = apiCall($api, $token, [
        'action' => 'status',
        'id' => $job['id'],
        'model' => $job['model'],
    ]);

    if (in_array($s['status'], ['IN_QUEUE', 'IN_PROGRESS'])) {
        continue;
    }

    if ($s['status'] === 'COMPLETED') {
        if (($s['compliance'] ?? '') === 'unsafe') {
            die('Blocked: ' . ($s['error'] ?? 'unsafe'));
        }
        $signed = array_values(array_filter(
            $s['meta']['images'],
            fn($i) => $i['type'] === 'signed'
        ))[0];
        echo "Video URL: " . $signed['url'];
        break;
    }

    die($s['error'] ?? 'Failed: ' . $s['status']);
}