New on Truffle
Type a scene.
The agent shoots it.
The eval grades it.
Take is a small text-to-video studio where the evaluation loop is the product. Describe a shot. An agent writes the direction, Luma Ray3.2 renders it, and a four-level cascade decides whether the take survives: deterministic gates, computer-vision lanes, a vision-language judge, and a bounded retake policy. You watch every step happen live.
The cascade
Generation is the easy half. Take's pipeline treats every render as a claim that has to survive four levels of scrutiny before it gets published.
- L0
Deterministic gates
ffprobe confirms the clip decodes, then checks duration, resolution, and frame rate. Pixel gates follow: not black, not frozen. A clip that fails here never wastes a judge call.
- L1
Computer-vision lanes
Frame-difference flicker, Farneback optical flow, CLIPScore prompt alignment, DINOv2 identity drift against frame zero. Temporal questions that a vision-language model answers no better than chance are handled by code instead.
- L2
Vision-language judge
A timestamped contact sheet goes to a frontier judge that writes its rationale before its rating, on six axes: fidelity, aesthetics, consistency, motion, semantics, physics. Discrete levels, not fake-precision decimals.
- L3
Bounded decision
Deterministic accept-or-retake. Any axis rated poor or below triggers a retake, with the judge's advice folded into the next composition. After three attempts, the best surviving take stands.
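The L3 rule is small enough to sketch in a few lines. What follows is a minimal illustration, not Take's actual code. The five-level scale, the names `Take`, `decide`, and `compose_retake`, and the tie-break for "best" (highest weakest axis, then highest total) are all assumptions made for the example. It also assumes every attempt reaches the judge.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical ordinal scale, worst to best. The page only says the levels
# are discrete and that "poor or below" triggers a retake.
LEVELS = ["bad", "poor", "fair", "good", "excellent"]
AXES = ["fidelity", "aesthetics", "consistency", "motion", "semantics", "physics"]
RETAKE_AT_OR_BELOW = LEVELS.index("poor")
MAX_ATTEMPTS = 3


@dataclass
class Take:
    attempt: int
    ratings: Dict[str, str]  # axis name -> level name
    advice: str = ""         # the judge's note for the next composition


def weak_axes(take: Take) -> List[str]:
    """Axes rated poor or below, in the fixed AXES order."""
    return [a for a in AXES if LEVELS.index(take.ratings[a]) <= RETAKE_AT_OR_BELOW]


def rank(take: Take) -> Tuple[int, int]:
    """Assumed ordering for 'best': weakest axis first, then the total."""
    scores = [LEVELS.index(take.ratings[a]) for a in AXES]
    return (min(scores), sum(scores))


def decide(takes: List[Take]) -> Tuple[str, Take]:
    """Deterministic decision over the graded takes so far, in attempt order."""
    latest = takes[-1]
    if not weak_axes(latest):
        return ("accept", latest)
    if len(takes) < MAX_ATTEMPTS:
        # The caller folds latest.advice into the next composition.
        return ("retake", latest)
    # Attempts exhausted: the best take so far stands.
    return ("accept_best", max(takes, key=rank))


def compose_retake(prompt: str, take: Take) -> str:
    """Fold the judge's advice into the next prompt (format is illustrative)."""
    return f"{prompt}\nJudge's note: {take.advice}" if take.advice else prompt
```

Because the decision reads only discrete ratings and an attempt count, the same inputs always give the same outcome, and the loop cannot run past three renders.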
Gallery
Every tile is a take that survived the cascade. Hover to play; open for the full eval receipt.
Nothing here yet. The first graded takes arrive soon.