Vidgen API v2

Introduction

The Vidgen API provides AI-powered video generation using multi-scene diffusion models, with built-in prompt enhancement, content compliance checking, content credentials, and watermarking. All prompt parsing, workflow construction, and post-processing are handled server-side — clients submit raw scene prompts and receive signed MP4 videos hosted on S3.

Multi-scene video generation (1 to 12 scenes per request, subject to the configured max_scenes limit; videos up to 60 seconds)
Text-to-video (t2v) and image-to-video (i2v) modes per scene via start/end frames
Optional LLM-powered prompt enhancement
Automated pre-generation and post-generation content compliance review
Optional AI-generated background music and FX sound effects
C2PA manifest signing, invisible watermark embedding, and fingerprinting
S3-hosted output with pre-signed URLs for the signed video and thumbnail

Base URL

All actions share a single URL. Send a POST request with a Content-Type: application/json header and a JSON body containing an action field.

POST https://vidgen.api.efficientstack.com/api/v2
Content-Type: application/json
Authorization: Bearer <API-KEY>

{
  "action": "generate",
  ...
}

Authentication

All requests must include a Bearer token in the Authorization header:

Authorization: Bearer <API-KEY>

Requests without a valid, enabled key receive a 401 Unauthorized response.

Prompt Syntax

The API uses a unified raw prompt format per scene. The server parses the following tokens:

Token	Syntax	Example	Description
Negative	`-term`	`-blurry`	A comma-separated term prefixed with `-` is removed from the positive prompt and added to the negative prompt.
Positive	everything else	`woman dancing, smooth camera`	Descriptive prompt text for the scene.

Example

cinematic ocean waves crashing on rocks, golden hour, slow motion, -blurry, -low quality, -shaky

Parsed as: positive=cinematic ocean waves crashing on rocks, golden hour, slow motion, negative=blurry, low quality, shaky.
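
The parsing rule above can be illustrated with a short client-side sketch. This is not the server's implementation; the function name parse_prompt and the handling of empty segments are assumptions made for illustration.

```python
def parse_prompt(raw: str) -> tuple[str, str]:
    """Split a raw scene prompt into (positive, negative) strings.

    Hypothetical helper: mirrors the documented rule that comma-separated
    terms starting with "-" go to the negative prompt.
    """
    positive, negative = [], []
    for term in raw.split(","):
        term = term.strip()
        if not term:
            continue  # skip empty segments such as a trailing comma
        if term.startswith("-"):
            stripped = term[1:].strip()
            if stripped:  # ignore a bare "-" (assumption)
                negative.append(stripped)
        else:
            positive.append(term)
    return ", ".join(positive), ", ".join(negative)
```

Applied to the example prompt above, this returns the same positive and negative strings shown in the "Parsed as" line.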

Endpoints

Action	Description
`models`	List available models and their labels
`generate`	Submit a multi-scene video generation job
`status`	Poll a generation job for completion
`optimize`	Enhance a prompt with LLM
`random`	Get a random prompt from configured seeds

POST models

Returns available video generation models and the maximum number of scenes allowed per request. No parameters beyond action.

{ "action": "models" }

Response

{
  "models": {
    "scenex": { "label": "SceneX" }
  },
  "max_scenes": 5
}
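
A client might use this response to pick a default model and to validate the scene count before submitting a job. The sketch below assumes that "first model" means the first key in the models object as returned; both function names are hypothetical.

```python
def default_model(models_response: dict) -> str:
    """Return the first model key in the response (assumed to be the default)."""
    models = models_response.get("models", {})
    if not models:
        raise ValueError("no models configured")
    return next(iter(models))  # dicts keep the JSON key order


def check_scene_count(scenes: list, models_response: dict) -> None:
    """Raise ValueError if the scene count is outside 1..max_scenes."""
    limit = models_response["max_scenes"]
    if not 1 <= len(scenes) <= limit:
        raise ValueError(f"expected 1 to {limit} scenes, got {len(scenes)}")
```

Checking the count locally avoids a round trip that would end in a 400 response.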

POST generate

Submit a video generation job. Returns a job ID for polling. Each scene produces approximately 5 seconds of video; scenes are stitched into a single continuous MP4.

Request Parameters

Parameter	Type	Required	Default	Description
`action`	string	yes	—	`"generate"`
`scenes`	array	yes	—	Array of 1 to `max_scenes` scene objects.
`scenes[].prompt`	string	yes	—	Scene prompt. Supports `-negative` syntax.
`scenes[].negative`	string	no	`""`	Additional negative terms for this scene.
`scenes[].start_frame`	string	no	—	Base64 data URI of the start frame (enables i2v mode).
`scenes[].end_frame`	string	no	—	Base64 data URI of the end frame (first-last frame mode).
`model`	string	no	first model	Model key from `models` response.
`optimize`	boolean	no	`false`	LLM-enhance each scene prompt before generation.
`size`	integer	no	`720`	Max dimension for i2v frame resize (720–1280).
`width`	integer	no	`832`	Width for t2v mode. Must be divisible by 8.
`height`	integer	no	`480`	Height for t2v mode. Must be divisible by 8.
`upscale`	boolean	no	`false`	Enable 2× AI upscaling.
`music.enabled`	boolean	no	`false`	Enable AI-generated background music.
`music.tags`	string	no	`""`	Music style tags (e.g. `"R&B, slow jam"`).
`music.lyrics`	string	no	`"[Inst]"`	Lyrics or `[Inst]` for instrumental.
`music.bpm`	integer	no	`85`	Tempo in beats per minute.
`music.keyscale`	string	no	`"Eb minor"`	Musical key signature.
`fx_sound.enabled`	boolean	no	`false`	Enable AI-generated FX sound effects.
`fx_sound.prompt`	string	no	`""`	FX audio description (e.g. `"wind, footsteps"`).
`fx_sound.negative_prompt`	string	no	`""`	FX audio terms to avoid.
`wm_image`	string	no	`""`	Custom watermark as base64 data URI.
`wm_position`	string	no	`"bottom-right"`	Watermark anchor: `top-left`, `top-right`, `center`, `bottom-left`, `bottom-right`.
`wm_scale`	integer	no	`5`	Watermark scale (1–100).
`wm_transparency`	integer	no	`100`	Watermark opacity (0–100).
`wm_rotation`	integer	no	`0`	Watermark rotation in degrees (-360–360).
`wm_padding_x`	integer	no	`10`	Horizontal padding from anchor (0–500).
`wm_padding_y`	integer	no	`10`	Vertical padding from anchor (0–500).
`cs_author`	string	no	`""`	Override C2PA author.
`cs_title`	string	no	`""`	Override C2PA title.
`cs_description`	string	no	`""`	Override C2PA description.
`cs_organization`	string	no	`""`	Override C2PA organization.
`cs_vendor`	string	no	`""`	Override C2PA vendor.
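
The start_frame, end_frame, and wm_image parameters take base64 data URIs. A minimal way to build one from raw image bytes is sketched below; the helper name and the default MIME type are assumptions, and the table does not state which image formats the server accepts.

```python
import base64


def to_data_uri(data: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URI (hypothetical helper)."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

For example, a scene could be built as {"prompt": "...", "start_frame": to_data_uri(image_bytes)}, where image_bytes holds the contents of an image file read in binary mode.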

Scene Modes

The generation mode for each scene is determined by which frames are provided:

Start Frame	End Frame	Mode	Description
—	—	Text-to-Video	Generates video from text using `width` × `height`.
Provided	—	Image-to-Video	Animates from the start frame. Frame is resized to `size`.
Provided	Provided	First-Last Frame	Generates video transitioning between both frames.
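
The table can be expressed as a small lookup. This is an illustrative sketch only: the function name and the returned labels are hypothetical, and the table does not say what happens when only an end frame is supplied, so the sketch rejects that case.

```python
def scene_mode(scene: dict) -> str:
    """Return the generation mode implied by a scene's frames."""
    has_start = bool(scene.get("start_frame"))
    has_end = bool(scene.get("end_frame"))
    if has_start and has_end:
        return "first-last-frame"
    if has_start:
        return "image-to-video"
    if has_end:
        # Not covered by the table; rejecting it is this sketch's choice.
        raise ValueError("end_frame without start_frame is not documented")
    return "text-to-video"
```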

Valid Dimensions (t2v mode)

width and height should each be a multiple of 8 between 320 and 1280. Common pairs: 832×480 (approx. 16:9), 480×832 (approx. 9:16), 768×512 (3:2), 512×768 (2:3), 640×480 (4:3), 640×640 (1:1). Values that are not a multiple of 8 are rounded to the nearest multiple, and out-of-range values are clamped to 320–1280.
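
To predict what the server will do with a given dimension, the rounding and range rules can be approximated client-side. The sketch below assumes that ties round up and that rounding happens before clamping; neither detail is stated in this document.

```python
def normalize_dimension(value: int, lo: int = 320, hi: int = 1280) -> int:
    """Round to the nearest multiple of 8, then clamp to [lo, hi].

    Assumptions: a tie (remainder of exactly 4) rounds up, and rounding
    is applied before clamping.
    """
    rounded = ((value + 4) // 8) * 8
    return max(lo, min(hi, rounded))
```

With these assumptions, normalize_dimension(830) gives 832 and normalize_dimension(2000) gives 1280.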

Response

{
  "id": "run_abc123xyz",
  "gen_id": "a1b2c3d4e5f6g7h8i9j0",
  "model": "scenex",
  "scenes": [
    { "prompt": "cinematic ocean waves crashing on rocks, golden hour, slow motion, -blurry" }
  ]
}

POST status

Poll a generation job. Pass the id and model returned by generate.

{
  "action": "status",
  "id": "run_abc123xyz",
  "model": "scenex"
}

Pending

{ "status": "IN_QUEUE" }
{ "status": "IN_PROGRESS" }

Completed (Safe)

{
  "status": "COMPLETED",
  "meta": {
    "id": "run_abc123xyz",
    "gen_id": "a1b2c3d4e5f6g7h8i9j0",
    "time": "2025-06-20T14:30:00Z",
    "prompt": "cinematic ocean waves crashing on rocks, golden hour, slow motion, -blurry",
    "scenes": [
      {
        "prompt": "cinematic ocean waves crashing on rocks, golden hour, slow motion, -blurry",
        "positive": "cinematic ocean waves crashing on rocks, golden hour, slow motion",
        "negative": "base negatives, blurry",
        "has_start_frame": false,
        "has_end_frame": false
      }
    ],
    "model": "scenex",
    "size": 720,
    "width": 832,
    "height": 480,
    "upscale": false,
    "music_enabled": false,
    "fx_sound_enabled": false,
    "scene_count": 1,
    "optimized": false,
    "compliance": "safe",
    "compliance_codes": [],
    "fingerprint": "a4f8e2c1b9d0...",
    "flagged": false,
    "images": [
      { "url": "https://s3.../video.mp4?...", "filename": "genid_fp_00001.mp4", "type": "signed" },
      { "url": "https://s3.../thumb.jpg?...", "filename": "genid_fp_00001_thumb.jpg", "type": "thumbnail" }
    ],
    "enable_watermark": true,
    "enable_c2pa": true,
    "compute_fingerprint": true,
    "wm_position": "bottom-right",
    "wm_scale": 5,
    "wm_transparency": 100,
    "cs_author": "Author",
    "cs_title": "AI Video",
    "cs_description": "Text to Video - Generative AI",
    "cs_organization": "",
    "cs_vendor": ""
  }
}

Completed (Blocked)

{
  "status": "COMPLETED",
  "compliance": "unsafe",
  "error": "Content blocked — flagged for: Category Name.",
  "filter_categories": ["id1", "id2"]
}

Failed

{ "status": "FAILED", "error": "Generation failed" }

POST optimize

Enhance a prompt independently of generation. Uses the model’s configured LLM enhancement prompt.

{
  "action": "optimize",
  "prompt": "woman walking in rain",
  "model": "scenex"
}

Response

{
  "prompt": "elegant woman walking gracefully through a gentle downpour, cobblestone street glistening with rain, cinematic lighting, shallow depth of field"
}

POST random

Returns a random prompt assembled from configured seed categories (actions, clothing, framing, locations). No parameters beyond action.

{ "action": "random" }

Response

{
  "prompt": "dancing gracefully, flowing silk dress, medium close-up, on a sunlit rooftop terrace"
}

Prompt Construction

The server constructs the final positive and negative prompts for each scene from the raw input. Understanding this helps write more effective prompts.

Positive Prompt

{user prompt text, with -negative terms removed}

If optimize is enabled, the user prompt is first enhanced by an LLM using the model’s configured enhancement system prompt.

Negative Prompt

{model base negatives}[, parsed -terms from prompt][, explicit scene negative]

Negatives are assembled in this order: the model’s built-in base negatives, then any -term tokens parsed from the raw prompt, then any explicit negative string provided in the scene object.
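
The assembly order can be sketched as follows. The function name is hypothetical, and the sketch does not deduplicate terms because the document does not say whether the server does.

```python
def build_negative(base: str, parsed_terms: list[str], scene_negative: str = "") -> str:
    """Join negatives in the documented order: base, parsed -terms, explicit."""
    parts = [base.strip(), *(t.strip() for t in parsed_terms), scene_negative.strip()]
    return ", ".join(p for p in parts if p)  # drop empty pieces
```

For instance, build_negative("base negatives", ["blurry"]) yields "base negatives, blurry", which matches the negative field in the sample status response.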

Content Credentials

All generated videos can include three layers of content provenance, configurable via the admin panel:

C2PA manifest signing — Embeds a signed C2PA manifest into the MP4 container, recording the model, software agent, generation type, author, and other metadata. Verifiable at contentcredentials.org/verify.
Invisible watermark — Embeds a DCT-domain watermark carrying the generation ID into every N-th frame. The watermark is imperceptible but extractable with the corresponding private key.
Fingerprinting — Computes a perceptual fingerprint of the output video, returned in meta.fingerprint and embedded in the output filename.

Override C2PA metadata per request using: cs_author, cs_title, cs_description, cs_organization, cs_vendor. When omitted, server-configured defaults are used.
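
One way to attach overrides to a generate payload is sketched below. The helper name is hypothetical; dropping empty values so that server defaults apply is an assumption based on the "when omitted" rule above.

```python
CS_FIELDS = ("cs_author", "cs_title", "cs_description", "cs_organization", "cs_vendor")


def with_c2pa_overrides(payload: dict, **overrides: str) -> dict:
    """Return a copy of payload with non-empty C2PA overrides added."""
    unknown = set(overrides) - set(CS_FIELDS)
    if unknown:
        raise ValueError(f"unknown C2PA fields: {sorted(unknown)}")
    result = dict(payload)  # leave the caller's payload untouched
    result.update({k: v for k, v in overrides.items() if v})
    return result
```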

Polling Strategy

Setting	Recommended
Interval	5 seconds
Max polls	360 (30 min timeout)
Overlap guard	Wait for each poll response before starting the next

Terminal states: COMPLETED, FAILED, TIMED_OUT, CANCELLED. On COMPLETED, always check the compliance field before accessing meta — a value of "unsafe" means the output was blocked by post-generation compliance review.
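
The decision a client makes on each poll can be reduced to a small classifier. The function name and the returned labels are hypothetical; the status values are the ones listed above.

```python
PENDING = {"IN_QUEUE", "IN_PROGRESS"}
TERMINAL = {"COMPLETED", "FAILED", "TIMED_OUT", "CANCELLED"}


def classify_status(resp: dict) -> str:
    """Map a status response to 'pending', 'ok', 'blocked', 'failed', or 'unknown'."""
    status = resp.get("status")
    if status in PENDING:
        return "pending"
    if status == "COMPLETED":
        # Check compliance before touching meta, as recommended above.
        return "blocked" if resp.get("compliance") == "unsafe" else "ok"
    if status in TERMINAL:
        return "failed"
    return "unknown"  # a status value not listed in this document
```

A polling loop would keep going on "pending", read meta on "ok", and stop with an error for every other result.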

Generation Times

Video generation typically takes ~100 seconds per scene at 480p. Upscaling and higher resolutions increase time by 4–5×. Multi-scene videos are proportionally longer. Music and FX sound add an additional 15–30 seconds.
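
These figures combine into a rough planning range. The sketch below is arithmetic only, not a service guarantee: it assumes the 4–5× factor applies to the whole job and that the 15–30 second audio overhead is added once per job, neither of which the text states explicitly. The function and parameter names are hypothetical.

```python
def estimate_seconds(scenes: int, slow_path: bool = False, audio: bool = False) -> tuple[int, int]:
    """Return a (low, high) estimate in seconds for a job.

    slow_path: upscaling or a higher resolution is requested.
    audio: music and/or FX sound is enabled.
    """
    low = high = scenes * 100  # ~100 s per scene at 480p
    if slow_path:
        low, high = low * 4, high * 5
    if audio:
        low, high = low + 15, high + 30
    return low, high
```

Under these assumptions, a three-scene job with upscaling and music comes out at roughly 1215 to 1530 seconds, which still fits within the recommended 30-minute polling window.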

Error Handling

Status	Error	Cause
400	Missing action	No `action` field in request body
400	scenes array is required	Missing or empty `scenes` array
400	Each scene requires a prompt	A scene object has an empty prompt
400	Maximum N scenes allowed	Too many scenes for current configuration
400	Content blocked	Pre-generation compliance check failed
400	Bad ID	Invalid job ID format in status request
401	Missing or invalid Authorization header	No Bearer token provided
401	Invalid API key	Key not found or does not match
401	API key is disabled	Key exists but has been disabled
500	No models configured	No models set up in admin panel
502	Generation failed	Upstream RunPod error
502	Enhancement failed	LLM prompt enhancement error

Silent fallbacks: invalid model → first configured model, non-divisible-by-8 dimensions → nearest multiple of 8, out-of-range dimensions → clamped to 320–1280.
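
A client can fold the table above into a single exception type. This is a sketch under assumptions: the class and function names are hypothetical, the grouping into "request", "auth", and "server" is an illustrative choice, and error bodies are assumed to carry the message in a JSON error field, as the code examples in this document read it.

```python
class VidgenError(Exception):
    """Raised for HTTP-level API errors (hypothetical client-side type)."""

    def __init__(self, status_code: int, kind: str, message: str):
        super().__init__(f"{status_code} ({kind}): {message}")
        self.status_code = status_code
        self.kind = kind
        self.message = message


def check_response(status_code: int, body: dict) -> dict:
    """Return body for successful responses, raise VidgenError otherwise."""
    if status_code < 400:
        return body  # job-level failures are reported by the status action
    kind = {400: "request", 401: "auth"}.get(status_code, "server")
    raise VidgenError(status_code, kind, body.get("error", "unknown error"))
```

Only the HTTP status code is checked here, so a status response that reports a blocked or failed job passes through unchanged and is handled by the polling logic.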

Code Examples

JavaScript

const API = 'https://vidgen.api.efficientstack.com/api/v2';
const TOKEN = 'your-api-key';

async function api(body) {
  const r = await fetch(API, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', 'Authorization': `Bearer ${TOKEN}` },
    body: JSON.stringify(body)
  });
  return r.json();
}

async function generateVideo(scenes, opts = {}) {
  const job = await api({
    action: 'generate', scenes, model: opts.model || 'scenex',
    size: opts.size || 720, width: opts.width || 832, height: opts.height || 480,
    upscale: opts.upscale || false, optimize: opts.optimize || false
  });
  if (job.error) throw new Error(job.error);

  for (let i = 0; i < 360; i++) {
    await new Promise(r => setTimeout(r, 5000));
    const s = await api({ action: 'status', id: job.id, model: job.model });
    if (s.status === 'IN_QUEUE' || s.status === 'IN_PROGRESS') continue;
    if (s.status === 'COMPLETED') {
      if (s.compliance === 'unsafe') throw new Error(s.error);
      return s.meta;
    }
    throw new Error(s.error || 'Failed: ' + s.status);
  }
  throw new Error('Timeout');
}

// Single scene (text-to-video)
const meta = await generateVideo([
  { prompt: 'cinematic ocean waves crashing at sunset, golden hour, -blurry' }
]);

// Multi-scene
const multi = await generateVideo([
  { prompt: 'woman walking down a neon-lit street at night, rain reflections' },
  { prompt: 'close-up of her face looking up, rain drops on skin, slow motion' },
  { prompt: 'wide aerial shot pulling away from the city, -shaky, -low quality' }
], { upscale: true });

const video = meta.images.find(i => i.type === 'signed');
console.log('Video URL:', video?.url);

Python

import requests, time

API = "https://vidgen.api.efficientstack.com/api/v2"
TOKEN = "your-api-key"

def api(body):
    # The timeout keeps a stalled connection from hanging the poll loop.
    return requests.post(API, json=body,
        headers={"Authorization": f"Bearer {TOKEN}"}, timeout=30).json()

def generate(scenes, model="scenex", size=720, upscale=False):
    job = api({
        "action": "generate", "scenes": scenes,
        "model": model, "size": size, "upscale": upscale
    })
    if "error" in job:
        raise Exception(job["error"])
    for _ in range(360):
        time.sleep(5)
        s = api({"action": "status", "id": job["id"], "model": job["model"]})
        if s["status"] in ("IN_QUEUE", "IN_PROGRESS"):
            continue
        if s["status"] == "COMPLETED":
            if s.get("compliance") == "unsafe":
                raise Exception(s.get("error"))
            return s["meta"]
        raise Exception(s.get("error", f"Failed: {s['status']}"))
    raise TimeoutError("Polling timed out")

# Single scene
meta = generate([{"prompt": "cinematic ocean waves crashing at sunset"}])

# Multi-scene with upscaling
meta = generate(
    [
        {"prompt": "woman dancing in a ballroom, elegant dress, warm lighting"},
        {"prompt": "spinning in slow motion, camera orbiting around her"},
    ],
    model="scenex",
    upscale=True,
)

video = next(i for i in meta["images"] if i["type"] == "signed")
print("URL:", video["url"])

cURL

TOKEN="your-api-key"
BASE="https://vidgen.api.efficientstack.com/api/v2"

# List models
curl -s -X POST "$BASE" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"action":"models"}'

# Generate (single scene, text-to-video)
curl -s -X POST "$BASE" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "action": "generate",
    "model": "scenex",
    "scenes": [{"prompt": "cinematic ocean waves crashing at sunset, -blurry"}],
    "width": 832,
    "height": 480
  }'

# Generate (multi-scene with music)
curl -s -X POST "$BASE" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "action": "generate",
    "model": "scenex",
    "scenes": [
      {"prompt": "woman walking through autumn forest, golden leaves falling"},
      {"prompt": "close-up of leaves crunching underfoot, shallow depth of field"}
    ],
    "music": {"enabled": true, "tags": "ambient, cinematic", "bpm": 72}
  }'

# Poll status (replace JOB_ID with the id from generate response)
curl -s -X POST "$BASE" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"action":"status","id":"JOB_ID","model":"scenex"}'

# Enhance prompt
curl -s -X POST "$BASE" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"action":"optimize","prompt":"woman walking in rain","model":"scenex"}'

# Random prompt
curl -s -X POST "$BASE" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"action":"random"}'

PHP

<?php
$api = 'https://vidgen.api.efficientstack.com/api/v2';
$token = 'your-api-key';

function apiCall($api, $token, $body) {
    $ch = curl_init($api);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST => true,
        CURLOPT_POSTFIELDS => json_encode($body),
        CURLOPT_HTTPHEADER => [
            'Content-Type: application/json',
            "Authorization: Bearer $token",
        ],
    ]);
    $r = curl_exec($ch);
    curl_close($ch);
    return json_decode($r, true);
}

// Generate a single-scene video
$job = apiCall($api, $token, [
    'action' => 'generate',
    'model' => 'scenex',
    'scenes' => [['prompt' => 'cinematic ocean waves crashing at sunset, -blurry']],
    'width' => 832,
    'height' => 480,
]);

if (isset($job['error'])) {
    die('Error: ' . $job['error']);
}

// Poll for completion
for ($i = 0; $i < 360; $i++) {
    sleep(5);
    $s = apiCall($api, $token, [
        'action' => 'status',
        'id' => $job['id'],
        'model' => $job['model'],
    ]);

    if (in_array($s['status'], ['IN_QUEUE', 'IN_PROGRESS'])) {
        continue;
    }

    if ($s['status'] === 'COMPLETED') {
        if (($s['compliance'] ?? '') === 'unsafe') {
            die('Blocked: ' . ($s['error'] ?? 'unsafe'));
        }
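        // Pick the signed video entry; $img avoids shadowing the loop counter $i
        $signed = array_values(array_filter(
            $s['meta']['images'],
            fn($img) => $img['type'] === 'signed'
        ))[0];
        echo "Video URL: " . $signed['url'];
        break;
    }

    die($s['error'] ?? 'Failed: ' . $s['status']);
}

// $i reaches 360 only when the loop ended without a terminal status
if ($i === 360) {
    die('Polling timed out');
}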