Vidgen API v2
Introduction
The Vidgen API provides AI-powered video generation using multi-scene diffusion models, with built-in prompt enhancement, content compliance checking, content credentials, and watermarking. All prompt parsing, workflow construction, and post-processing are handled server-side — clients submit raw scene prompts and receive signed MP4 videos hosted on S3.
- Multi-scene video generation (1 to 12 scenes per request, videos up to 60 seconds)
- Text-to-video (t2v) and image-to-video (i2v) modes per scene via start/end frames
- Optional LLM-powered prompt enhancement
- Automated pre-generation and post-generation content compliance review
- Optional AI-generated background music and FX sound effects
- C2PA manifest signing, invisible watermark embedding, and fingerprinting
- S3-hosted output with pre-signed URLs for the signed video and thumbnail
Base URL
Every call is a POST to the same URL with a JSON body (Content-Type: application/json); the action field in the body selects the operation.
POST https://vidgen.api.efficientstack.com/api/v2
Content-Type: application/json
Authorization: Bearer <API-KEY>
{
  "action": "generate",
  ...
}
Authentication
All requests must include a Bearer token in the Authorization header:
Authorization: Bearer <API-KEY>
Requests without a valid, enabled key receive a 401 Unauthorized response.
Prompt Syntax
The API uses a unified raw prompt format per scene. The server parses the following tokens:
| Token | Syntax | Example | Description |
|---|---|---|---|
| Negative | -term | -blurry | A comma-separated term prefixed with - is moved to the negative prompt. |
| Positive | everything else | woman dancing, smooth camera | Descriptive prompt text for the scene. |
Example
cinematic ocean waves crashing on rocks, golden hour, slow motion, -blurry, -low quality, -shaky
Parsed as: positive=cinematic ocean waves crashing on rocks, golden hour, slow motion, negative=blurry, low quality, shaky.
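The token rules above can be mirrored client-side, for example to preview how a prompt will be split before it is submitted. The Python sketch below approximates the documented behaviour and is not the server's actual parser. `split_prompt` is a hypothetical helper, and it assumes that terms are separated by commas and that surrounding whitespace is ignored.

```python
def split_prompt(raw: str) -> tuple[str, str]:
    """Split a raw scene prompt into (positive, negative) strings."""
    positive, negative = [], []
    for term in raw.split(","):
        term = term.strip()
        if not term:
            continue  # skip empty segments such as a trailing comma
        if term.startswith("-"):
            stripped = term[1:].strip()
            if stripped:  # a bare "-" carries no term, so it is dropped
                negative.append(stripped)
        else:
            positive.append(term)
    return ", ".join(positive), ", ".join(negative)


raw = ("cinematic ocean waves crashing on rocks, golden hour, slow motion, "
       "-blurry, -low quality, -shaky")
positive, negative = split_prompt(raw)
print("positive =", positive)
print("negative =", negative)
```

Run on the example prompt, this reproduces the positive and negative strings listed above.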
Endpoints
| Action | Description |
|---|---|
| models | List available models and their labels |
| generate | Submit a multi-scene video generation job |
| status | Poll a generation job for completion |
| optimize | Enhance a prompt with LLM |
| random | Get a random prompt from configured seeds |
POST models
Returns available video generation models and the maximum number of scenes allowed per request. No parameters beyond action.
{ "action": "models" }
Response
{
  "models": {
    "scenex": { "label": "SceneX" }
  },
  "max_scenes": 5
}
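A client can use this response to choose a model key before calling generate. The Python sketch below is a hypothetical helper (`pick_model` is not part of the API). It assumes the response has the shape shown above and mirrors the documented fallback to the first configured model when the requested key is unknown.

```python
from typing import Optional


def pick_model(models_response: dict, requested: Optional[str] = None) -> str:
    """Return the requested model key if it is listed, else the first listed key."""
    models = models_response.get("models", {})
    if not models:
        raise ValueError("models response lists no models")
    if requested in models:
        return requested
    # Python dicts keep insertion order, so this is the first key in the JSON object.
    return next(iter(models))


response = {"models": {"scenex": {"label": "SceneX"}}, "max_scenes": 5}
print(pick_model(response, "unknown-model"))  # falls back to the first key
```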
POST generate
Submit a video generation job. Returns a job ID for polling. Each scene produces approximately 5 seconds of video; scenes are stitched into a single continuous MP4.
Request Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| action | string | yes | — | "generate" |
| scenes | array | yes | — | Array of 1 to max_scenes scene objects. |
| scenes[].prompt | string | yes | — | Scene prompt. Supports -negative syntax. |
| scenes[].negative | string | no | "" | Additional negative terms for this scene. |
| scenes[].start_frame | string | no | — | Base64 data URI of the start frame (enables i2v mode). |
| scenes[].end_frame | string | no | — | Base64 data URI of the end frame (first-last frame mode). |
| model | string | no | first model | Model key from models response. |
| optimize | boolean | no | false | LLM-enhance each scene prompt before generation. |
| size | integer | no | 720 | Max dimension for i2v frame resize (720–1280). |
| width | integer | no | 832 | Width for t2v mode. Must be divisible by 8. |
| height | integer | no | 480 | Height for t2v mode. Must be divisible by 8. |
| upscale | boolean | no | false | Enable 2× AI upscaling. |
| music.enabled | boolean | no | false | Enable AI-generated background music. |
| music.tags | string | no | "" | Music style tags (e.g. "R&B, slow jam"). |
| music.lyrics | string | no | "[Inst]" | Lyrics or [Inst] for instrumental. |
| music.bpm | integer | no | 85 | Tempo in beats per minute. |
| music.keyscale | string | no | "Eb minor" | Musical key signature. |
| fx_sound.enabled | boolean | no | false | Enable AI-generated FX sound effects. |
| fx_sound.prompt | string | no | "" | FX audio description (e.g. "wind, footsteps"). |
| fx_sound.negative_prompt | string | no | "" | FX audio terms to avoid. |
| wm_image | string | no | "" | Custom watermark as base64 data URI. |
| wm_position | string | no | "bottom-right" | Watermark anchor: top-left, top-right, center, bottom-left, bottom-right. |
| wm_scale | integer | no | 5 | Watermark scale (1–100). |
| wm_transparency | integer | no | 100 | Watermark opacity (0–100). |
| wm_rotation | integer | no | 0 | Watermark rotation in degrees (-360–360). |
| wm_padding_x | integer | no | 10 | Horizontal padding from anchor (0–500). |
| wm_padding_y | integer | no | 10 | Vertical padding from anchor (0–500). |
| cs_author | string | no | "" | Override C2PA author. |
| cs_title | string | no | "" | Override C2PA title. |
| cs_description | string | no | "" | Override C2PA description. |
| cs_organization | string | no | "" | Override C2PA organization. |
| cs_vendor | string | no | "" | Override C2PA vendor. |
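start_frame, end_frame, and wm_image all expect a base64 data URI. A minimal Python sketch for building one from raw image bytes follows. `to_data_uri` is a hypothetical helper, and because this document does not state which image formats are accepted, the PNG default is an assumption.

```python
import base64


def to_data_uri(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URI."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# In practice the bytes would come from a file, e.g. Path("frame.png").read_bytes().
# The 8-byte PNG signature stands in for real image data here.
png_signature = b"\x89PNG\r\n\x1a\n"
scene = {
    "prompt": "slow zoom on the harbor at dawn",
    "start_frame": to_data_uri(png_signature),
}
print(scene["start_frame"])
```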
Scene Modes
The generation mode for each scene is determined by which frames are provided:
| Start Frame | End Frame | Mode | Description |
|---|---|---|---|
| — | — | Text-to-Video | Generates video from text using width × height. |
| Provided | — | Image-to-Video | Animates from the start frame. Frame is resized to size. |
| Provided | Provided | First-Last Frame | Generates video transitioning between both frames. |
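The table can be expressed as a small lookup, which is handy for validating scenes before submission. This Python sketch is illustrative only: `scene_mode` is a hypothetical helper, and the table does not cover a scene with an end frame but no start frame, so raising an error for that case is an assumption.

```python
def scene_mode(scene: dict) -> str:
    """Return the generation mode implied by which frames a scene provides."""
    has_start = bool(scene.get("start_frame"))
    has_end = bool(scene.get("end_frame"))
    if has_start and has_end:
        return "first-last-frame"
    if has_start:
        return "image-to-video"
    if has_end:
        # Not listed in the table; treated as invalid here (assumption).
        raise ValueError("end_frame without start_frame is not a documented mode")
    return "text-to-video"


print(scene_mode({"prompt": "ocean waves at sunset"}))
```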
Valid Dimensions (t2v mode)
width and height should each be a multiple of 8 between 320 and 1280. Common pairs: 832×480 (≈16:9), 480×832 (≈9:16), 768×512 (3:2), 512×768 (2:3), 640×480 (4:3), 640×640 (1:1). Values that are not multiples of 8 are silently rounded to the nearest multiple, and out-of-range values are clamped to 320–1280.
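To predict what the server will do with an arbitrary size, the rounding rule above and the clamping listed under Error Handling can be reproduced locally. The Python sketch below is an approximation: `normalize_dimension` is a hypothetical helper, and since neither the tie-breaking rule nor the order of rounding and clamping is specified here, rounding halves up and rounding before clamping are both assumptions.

```python
def normalize_dimension(value: int, lo: int = 320, hi: int = 1280, step: int = 8) -> int:
    """Round to the nearest multiple of `step`, then clamp to [lo, hi]."""
    rounded = ((value + step // 2) // step) * step  # exact halves round up (assumption)
    return max(lo, min(hi, rounded))


for requested in (830, 483, 100, 2000):
    print(requested, "->", normalize_dimension(requested))
```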
Response
{
  "id": "run_abc123xyz",
  "gen_id": "a1b2c3d4e5f6g7h8i9j0",
  "model": "scenex",
  "scenes": [
    { "prompt": "cinematic ocean waves crashing on rocks, golden hour, slow motion, -blurry" }
  ]
}
POST status
Poll a generation job. Pass the id and model returned by generate.
{
  "action": "status",
  "id": "run_abc123xyz",
  "model": "scenex"
}
Pending
{ "status": "IN_QUEUE" }
{ "status": "IN_PROGRESS" }
Completed (Safe)
{
  "status": "COMPLETED",
  "meta": {
    "id": "run_abc123xyz",
    "gen_id": "a1b2c3d4e5f6g7h8i9j0",
    "time": "2025-06-20T14:30:00Z",
    "prompt": "cinematic ocean waves crashing on rocks, golden hour, slow motion, -blurry",
    "scenes": [
      {
        "prompt": "cinematic ocean waves crashing on rocks, golden hour, slow motion, -blurry",
        "positive": "cinematic ocean waves crashing on rocks, golden hour, slow motion",
        "negative": "base negatives, blurry",
        "has_start_frame": false,
        "has_end_frame": false
      }
    ],
    "model": "scenex",
    "size": 720,
    "width": 832,
    "height": 480,
    "upscale": false,
    "music_enabled": false,
    "fx_sound_enabled": false,
    "scene_count": 1,
    "optimized": false,
    "compliance": "safe",
    "compliance_codes": [],
    "fingerprint": "a4f8e2c1b9d0...",
    "flagged": false,
    "images": [
      { "url": "https://s3.../video.mp4?...", "filename": "genid_fp_00001.mp4", "type": "signed" },
      { "url": "https://s3.../thumb.jpg?...", "filename": "genid_fp_00001_thumb.jpg", "type": "thumbnail" }
    ],
    "enable_watermark": true,
    "enable_c2pa": true,
    "compute_fingerprint": true,
    "wm_position": "bottom-right",
    "wm_scale": 5,
    "wm_transparency": 100,
    "cs_author": "Author",
    "cs_title": "AI Video",
    "cs_description": "Text to Video - Generative AI",
    "cs_organization": "",
    "cs_vendor": ""
  }
}
Completed (Blocked)
{
  "status": "COMPLETED",
  "compliance": "unsafe",
  "error": "Content blocked — flagged for: Category Name.",
  "filter_categories": ["id1", "id2"]
}
Failed
{ "status": "FAILED", "error": "Generation failed" }
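The response shapes above (pending, completed and safe, completed but blocked, failed) can be handled in one place. The following Python sketch shows one way to do that, assuming responses look exactly like the examples in this section. `GenerationError` and `signed_video_url` are hypothetical names, and the URL in the usage lines is a placeholder.

```python
from typing import Optional

PENDING = {"IN_QUEUE", "IN_PROGRESS"}


class GenerationError(Exception):
    """Raised when a job ends without a usable video."""


def signed_video_url(response: dict) -> Optional[str]:
    """Return the signed video URL, None while pending, or raise on failure."""
    status = response.get("status")
    if status in PENDING:
        return None  # caller should keep polling
    if status != "COMPLETED":
        raise GenerationError(response.get("error", f"Job ended with status {status}"))
    if response.get("compliance") == "unsafe":
        raise GenerationError(response.get("error", "Content blocked"))
    for item in response["meta"]["images"]:
        if item.get("type") == "signed":
            return item["url"]
    raise GenerationError("Completed response has no signed video entry")


done = {
    "status": "COMPLETED",
    "meta": {"images": [{"url": "https://example.com/video.mp4", "type": "signed"}]},
}
print(signed_video_url({"status": "IN_QUEUE"}))
print(signed_video_url(done))
```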
POST optimize
Enhance a prompt independently of generation. Uses the model’s configured LLM enhancement prompt.
{
  "action": "optimize",
  "prompt": "woman walking in rain",
  "model": "scenex"
}
Response
{
  "prompt": "elegant woman walking gracefully through a gentle downpour, cobblestone street glistening with rain, cinematic lighting, shallow depth of field"
}
POST random
Returns a random prompt assembled from configured seed categories (actions, clothing, framing, locations). No parameters beyond action.
{ "action": "random" }
Response
{
  "prompt": "dancing gracefully, flowing silk dress, medium close-up, on a sunlit rooftop terrace"
}
Prompt Construction
The server constructs the final positive and negative prompts for each scene from the raw input. Understanding this helps write more effective prompts.
Positive Prompt
{user prompt text, with -negative terms removed}
If optimize is enabled, the user prompt is first enhanced by an LLM using the model’s configured enhancement system prompt.
Negative Prompt
{model base negatives}[, parsed -terms from prompt][, explicit scene negative]
Negatives are assembled in this order: the model’s built-in base negatives, then any -term tokens parsed from the raw prompt, then any explicit negative string provided in the scene object.
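The assembly order can be illustrated with a short Python sketch. `build_negative` is a hypothetical helper rather than the server's code, and joining the parts with ", " while dropping empty ones is an assumption based on the template above.

```python
def build_negative(base_negatives: str, parsed_terms: list[str], scene_negative: str = "") -> str:
    """Join base negatives, parsed -terms, and the explicit negative, in that order."""
    parts = [base_negatives.strip()]
    parts.extend(term.strip() for term in parsed_terms)
    parts.append(scene_negative.strip())
    return ", ".join(part for part in parts if part)  # empty parts are skipped


print(build_negative("base negatives", ["blurry"]))  # base negatives, blurry
print(build_negative("base negatives", ["blurry", "shaky"], "text, logos"))
```

The first call matches the negative field shown in the Completed (Safe) status example.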
Content Credentials
All generated videos can include three layers of content provenance, configurable via the admin panel:
- C2PA manifest signing — Embeds a signed C2PA manifest into the MP4 container, recording the model, software agent, generation type, author, and other metadata. Verifiable at contentcredentials.org/verify.
- Invisible watermark — Embeds a DCT-domain watermark carrying the generation ID into every N-th frame. The watermark is imperceptible but extractable with the corresponding private key.
- Fingerprinting — Computes a perceptual fingerprint of the output video, returned in meta.fingerprint and embedded in the output filename.
Override C2PA metadata per request using: cs_author, cs_title, cs_description, cs_organization, cs_vendor. When omitted, server-configured defaults are used.
Polling Strategy
| Setting | Recommended |
|---|---|
| Interval | 5 seconds |
| Max polls | 360 (30 min timeout) |
| Overlap guard | Wait for each poll response before starting the next |
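Terminal states: COMPLETED, FAILED, TIMED_OUT, CANCELLED. On COMPLETED, always check the top-level compliance field before accessing meta: a value of "unsafe" means the output was blocked by post-generation compliance review, and the blocked response shown under Completed (Blocked) carries an error message rather than a meta object.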
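The recommended settings translate into a simple sequential loop. The Python sketch below takes the status call as a parameter so it can be exercised without network access. `poll_until_terminal` and its arguments are hypothetical names, and `fetch_status` stands in for whatever function sends the status request.

```python
import time
from typing import Callable

TERMINAL = {"COMPLETED", "FAILED", "TIMED_OUT", "CANCELLED"}


def poll_until_terminal(
    fetch_status: Callable[[], dict],
    interval: float = 5.0,
    max_polls: int = 360,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll sequentially until a terminal status arrives or the poll budget runs out."""
    for _ in range(max_polls):
        sleep(interval)
        response = fetch_status()  # blocking call, so polls never overlap
        if response.get("status") in TERMINAL:
            return response
    raise TimeoutError(f"No terminal status after {max_polls} polls")


# Simulated responses; the no-op sleep keeps the demonstration instant.
fake = iter([
    {"status": "IN_QUEUE"},
    {"status": "IN_PROGRESS"},
    {"status": "FAILED", "error": "Generation failed"},
])
result = poll_until_terminal(lambda: next(fake), sleep=lambda _: None)
print(result["status"])  # FAILED
```

With the defaults, the loop gives up after 360 polls at 5-second intervals, which is the 30-minute budget from the table.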
Error Handling
| Status | Error | Cause |
|---|---|---|
| 400 | Missing action | No action field in request body |
| 400 | scenes array is required | Missing or empty scenes array |
| 400 | Each scene requires a prompt | A scene object has an empty prompt |
| 400 | Maximum N scenes allowed | Too many scenes for current configuration |
| 400 | Content blocked | Pre-generation compliance check failed |
| 400 | Bad ID | Invalid job ID format in status request |
| 401 | Missing or invalid Authorization header | No Bearer token provided |
| 401 | Invalid API key | Key not found or does not match |
| 401 | API key is disabled | Key exists but has been disabled |
| 500 | No models configured | No models set up in admin panel |
| 502 | Generation failed | Upstream RunPod error |
| 502 | Enhancement failed | LLM prompt enhancement error |
Silent fallbacks: invalid model → first configured model, non-divisible-by-8 dimensions → nearest multiple of 8, out-of-range dimensions → clamped to 320–1280.
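Clients may find it convenient to turn the error rows above into a single exception type. The Python sketch below is one possible approach, not a prescribed one: `VidgenAPIError` and `check_response` are hypothetical names, it assumes error bodies carry the message in an `error` field (as the code examples in this document do), and treating only 502 responses as retryable is an assumption rather than documented guidance.

```python
class VidgenAPIError(Exception):
    """An HTTP-level error returned by the API."""

    def __init__(self, status_code: int, message: str):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code
        self.message = message

    @property
    def retryable(self) -> bool:
        # Assumption: only upstream failures (502) are worth retrying;
        # 400 and 401 indicate a problem with the request or the key.
        return self.status_code == 502


def check_response(status_code: int, body: dict) -> dict:
    """Return the body on success, or raise VidgenAPIError for 4xx/5xx codes."""
    if status_code >= 400:
        raise VidgenAPIError(status_code, body.get("error", "Unknown error"))
    return body


try:
    check_response(502, {"error": "Generation failed"})
except VidgenAPIError as exc:
    print(exc, "| retryable:", exc.retryable)
```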
Code Examples
JavaScript
const API = 'https://vidgen.api.efficientstack.com/api/v2';
const TOKEN = 'your-api-key';

async function api(body) {
  const r = await fetch(API, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', 'Authorization': `Bearer ${TOKEN}` },
    body: JSON.stringify(body)
  });
  return r.json();
}

async function generateVideo(scenes, opts = {}) {
  const job = await api({
    action: 'generate', scenes, model: opts.model || 'scenex',
    size: opts.size || 720, width: opts.width || 832, height: opts.height || 480,
    upscale: opts.upscale || false, optimize: opts.optimize || false
  });
  if (job.error) throw new Error(job.error);
  for (let i = 0; i < 360; i++) {
    await new Promise(r => setTimeout(r, 5000));
    const s = await api({ action: 'status', id: job.id, model: job.model });
    if (s.status === 'IN_QUEUE' || s.status === 'IN_PROGRESS') continue;
    if (s.status === 'COMPLETED') {
      if (s.compliance === 'unsafe') throw new Error(s.error);
      return s.meta;
    }
    throw new Error(s.error || 'Failed: ' + s.status);
  }
  throw new Error('Timeout');
}

// Single scene (text-to-video)
const meta = await generateVideo([
  { prompt: 'cinematic ocean waves crashing at sunset, golden hour, -blurry' }
]);

// Multi-scene
const multi = await generateVideo([
  { prompt: 'woman walking down a neon-lit street at night, rain reflections' },
  { prompt: 'close-up of her face looking up, rain drops on skin, slow motion' },
  { prompt: 'wide aerial shot pulling away from the city, -shaky, -low quality' }
], { upscale: true });

const video = meta.images.find(i => i.type === 'signed');
console.log('Video URL:', video?.url);
Python
import time

import requests

API = "https://vidgen.api.efficientstack.com/api/v2"
TOKEN = "your-api-key"

def api(body):
    # requests sets Content-Type: application/json when json= is used
    return requests.post(API, json=body,
                         headers={"Authorization": f"Bearer {TOKEN}"}).json()

def generate(scenes, model="scenex", size=720, upscale=False):
    job = api({
        "action": "generate", "scenes": scenes,
        "model": model, "size": size, "upscale": upscale
    })
    if "error" in job:
        raise Exception(job["error"])
    for _ in range(360):
        time.sleep(5)
        s = api({"action": "status", "id": job["id"], "model": job["model"]})
        if s["status"] in ("IN_QUEUE", "IN_PROGRESS"):
            continue
        if s["status"] == "COMPLETED":
            if s.get("compliance") == "unsafe":
                raise Exception(s.get("error"))
            return s["meta"]
        raise Exception(s.get("error", f"Failed: {s['status']}"))
    raise TimeoutError("Polling timed out")

# Single scene
meta = generate([{"prompt": "cinematic ocean waves crashing at sunset"}])

# Multi-scene with upscaling
meta = generate(
    [
        {"prompt": "woman dancing in a ballroom, elegant dress, warm lighting"},
        {"prompt": "spinning in slow motion, camera orbiting around her"},
    ],
    model="scenex",
    upscale=True,
)
video = next(i for i in meta["images"] if i["type"] == "signed")
print("URL:", video["url"])
cURL
TOKEN="your-api-key"
BASE="https://vidgen.api.efficientstack.com/api/v2"

# List models
curl -s -X POST "$BASE" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"action":"models"}'

# Generate (single scene, text-to-video)
curl -s -X POST "$BASE" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "action": "generate",
    "model": "scenex",
    "scenes": [{"prompt": "cinematic ocean waves crashing at sunset, -blurry"}],
    "width": 832,
    "height": 480
  }'

# Generate (multi-scene with music)
curl -s -X POST "$BASE" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "action": "generate",
    "model": "scenex",
    "scenes": [
      {"prompt": "woman walking through autumn forest, golden leaves falling"},
      {"prompt": "close-up of leaves crunching underfoot, shallow depth of field"}
    ],
    "music": {"enabled": true, "tags": "ambient, cinematic", "bpm": 72}
  }'

# Poll status (replace JOB_ID with the id from generate response)
curl -s -X POST "$BASE" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"action":"status","id":"JOB_ID","model":"scenex"}'

# Enhance prompt
curl -s -X POST "$BASE" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"action":"optimize","prompt":"woman walking in rain","model":"scenex"}'

# Random prompt
curl -s -X POST "$BASE" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"action":"random"}'
PHP
<?php
$api = 'https://vidgen.api.efficientstack.com/api/v2';
$token = 'your-api-key';

function apiCall($api, $token, $body) {
    $ch = curl_init($api);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST => true,
        CURLOPT_POSTFIELDS => json_encode($body),
        CURLOPT_HTTPHEADER => [
            'Content-Type: application/json',
            "Authorization: Bearer $token",
        ],
    ]);
    $r = curl_exec($ch);
    curl_close($ch);
    return json_decode($r, true);
}

// Generate a single-scene video
$job = apiCall($api, $token, [
    'action' => 'generate',
    'model' => 'scenex',
    'scenes' => [['prompt' => 'cinematic ocean waves crashing at sunset, -blurry']],
    'width' => 832,
    'height' => 480,
]);
if (isset($job['error'])) {
    die('Error: ' . $job['error']);
}

// Poll for completion
for ($i = 0; $i < 360; $i++) {
    sleep(5);
    $s = apiCall($api, $token, [
        'action' => 'status',
        'id' => $job['id'],
        'model' => $job['model'],
    ]);
    if (in_array($s['status'], ['IN_QUEUE', 'IN_PROGRESS'])) {
        continue;
    }
    if ($s['status'] === 'COMPLETED') {
        if (($s['compliance'] ?? '') === 'unsafe') {
            die('Blocked: ' . ($s['error'] ?? 'unsafe'));
        }
        $signed = array_values(array_filter(
            $s['meta']['images'],
            fn($i) => $i['type'] === 'signed'
        ))[0];
        echo "Video URL: " . $signed['url'];
        break;
    }
    die($s['error'] ?? 'Failed: ' . $s['status']);
}