Take

An old wooden rowboat tied to a misty lake dock at dawn, ripples spre...

eval 19/24 · 540p / 5s / 16:9 · 1 attempt · 154s wall

The prompt

An old wooden rowboat tied to a misty lake dock at dawn, ripples spreading slowly as a heron lands on the bow

What the agent wrote

An old weathered wooden rowboat tied with a frayed rope to a misty lake dock at dawn, soft pink and gold light filtering through low fog over still gray water. A great gray heron glides down and lands gently on the bow, its wings folding as the boat dips slightly, sending slow concentric ripples spreading outward across the glassy surface. The camera holds in a slow steady dolly-in toward the boat, mist drifting lazily. Calm, serene, painterly atmosphere with gentle reflections and dewy reeds swaying faintly at the water's edge.

Why this framing: A wide tranquil lakescape with horizontal water and drifting mist reads best in 16:9, giving room for the heron's descent and the ripples to spread.

The eval receipt

Every Take clip ships with the measurements that admitted it. Deterministic gates and CV lanes run first; the VLM judge only rules on what those lanes cannot measure, and it must justify each level before naming it. How the cascade works.
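The ordering described above can be sketched as a small orchestrator. This is an illustration only: `run_cascade`, the callable shapes, and the receipt layout are hypothetical names, and stopping at the first failed gate is an assumption, not something the page states.

```python
from typing import Callable, Dict, Tuple

# Hypothetical stage signatures, chosen for this sketch only.
Gate = Callable[[str], Tuple[bool, str]]               # clip path -> (passed, reading)
Lane = Callable[[str], float]                          # clip path -> scalar reading
Judge = Callable[[str, Dict[str, float]], Dict[str, str]]


def run_cascade(clip: str, gates: Dict[str, Gate],
                lanes: Dict[str, Lane], judge: Judge) -> dict:
    """Run cheap checks first and call the judge last."""
    receipt = {"gates": {}, "lanes": {}, "verdict": None}

    # L0: deterministic gates.
    for name, gate in gates.items():
        passed, reading = gate(clip)
        receipt["gates"][name] = {"passed": passed, "reading": reading}
        if not passed:
            # Assumption: a failed gate rejects the clip before costlier stages.
            return receipt

    # L1: CV lanes produce the numbers the judge is allowed to cite.
    for name, lane in lanes.items():
        receipt["lanes"][name] = lane(clip)

    # L2: the judge sees the clip plus the lane readings.
    receipt["verdict"] = judge(clip, receipt["lanes"])
    return receipt
```

The point of the shape is cost: the judge is the most expensive stage, so it runs only on clips that already cleared the measurable checks.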

L0 · Deterministic gates

Gate        Result  Reading                 Check
decodes     pass    h264                    ffprobe parsed a video stream
duration    pass    5.04s                   expected 5s ± 1s
resolution  pass    960x540                 expected height 540
framerate   pass    24.00 fps               sane range 12-60
not black   pass    mean luma 86.6          0/121 frames under 12.0
not frozen  pass    mean frame diff 3.744   static-cheat detector, threshold 0.35
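A minimal sketch of these gates as pure checks over already-extracted statistics. The thresholds (5s ± 1s, height 540, 12-60 fps, luma 12.0, frame diff 0.35) come from the table; `ClipStats`, `check_gates`, and `max_dark_fraction` are hypothetical names, and the rule for how many dark frames fail the "not black" gate is an assumption, since the receipt only reports the count.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ClipStats:
    """Values assumed to be extracted beforehand (e.g. from ffprobe and decoded frames)."""
    duration_s: float
    height: int
    fps: float
    frame_luma: List[float]   # mean luma of each frame
    mean_frame_diff: float    # mean abs luma diff between consecutive frames


def check_gates(s: ClipStats, target_s: float = 5.0, target_height: int = 540,
                max_dark_fraction: float = 0.5) -> Dict[str, bool]:
    # Count frames whose mean luma falls under the 12.0 threshold from the table.
    dark = sum(1 for v in s.frame_luma if v < 12.0)
    return {
        "duration": abs(s.duration_s - target_s) <= 1.0,
        "resolution": s.height == target_height,
        "framerate": 12.0 <= s.fps <= 60.0,
        # Assumption: the gate fails only when too large a share of frames is dark.
        "not black": dark / len(s.frame_luma) <= max_dark_fraction,
        # Static-cheat detector: a near-zero diff means the clip is a still image.
        "not frozen": s.mean_frame_diff > 0.35,
    }
```

Fed the readings from this receipt (5.04s, 540 high, 24.00 fps, no dark frames, diff 3.744), every gate in the sketch passes.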

L1 · CV lanes

Flicker 3.74
min 2.24 · max 5.60 · σ 0.87

mean abs luma diff between consecutive frames

How much each frame differs from the next, checked on every frame. A high average usually just means a busy scene; sudden spikes far above the average mean strobing.
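As a rough sketch of this lane: frames are taken as flat lists of luma values, each consecutive pair yields one mean absolute difference, and the summary reports the same mean, min, max, and σ shown above. The spike rule (more than `spike_sigmas` standard deviations above the mean) is an assumption for illustration; the page does not say how "far above the average" is defined.

```python
from statistics import mean, pstdev
from typing import Dict, List, Sequence


def frame_diffs(frames: Sequence[Sequence[float]]) -> List[float]:
    """Mean absolute luma difference for each consecutive frame pair."""
    diffs = []
    for prev, cur in zip(frames, frames[1:]):
        diffs.append(mean(abs(a - b) for a, b in zip(prev, cur)))
    return diffs


def flicker_summary(diffs: Sequence[float], spike_sigmas: float = 3.0) -> Dict[str, object]:
    mu = mean(diffs)
    sigma = pstdev(diffs)
    # Assumed spike rule: a pair whose diff sits far above the clip's own average.
    spikes = [i for i, d in enumerate(diffs) if d > mu + spike_sigmas * sigma]
    return {"mean": mu, "min": min(diffs), "max": max(diffs),
            "sigma": sigma, "spikes": spikes}
```

Under that assumed rule this clip would show no spikes: its max of 5.60 is about 2.1σ above the mean of 3.74.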

Optical flow 4.54
min 3.78 · max 5.79 · σ 0.68

Farneback flow magnitude, motion energy per frame pair

How far pixels move between frames: one number for how much is actually happening. Under 0.3 is basically a still image; 2 to 8 is normal motion.
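To make the bands concrete, here is a small sketch that assumes the dense flow field has already been computed (for example by OpenCV's Farneback implementation) and is available as (dx, dy) vectors. `mean_flow_magnitude` and `flow_band` are hypothetical helpers; the page names only two bands, so readings outside them are reported as unspecified rather than guessed.

```python
import math
from typing import Iterable, Tuple


def mean_flow_magnitude(flow: Iterable[Tuple[float, float]]) -> float:
    """Average length of the per-pixel motion vectors for one frame pair."""
    mags = [math.hypot(dx, dy) for dx, dy in flow]
    return sum(mags) / len(mags)


def flow_band(magnitude: float) -> str:
    if magnitude < 0.3:
        return "still"          # basically a still image
    if 2.0 <= magnitude <= 8.0:
        return "normal"         # normal motion
    return "unspecified"        # the page gives no label for 0.3-2 or above 8


print(flow_band(4.54))  # this clip's mean flow
```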

CLIPScore 0.332
min 0.306 · max 0.365 · σ 0.020

cosine of CLIP ViT-B/32 prompt and frame embeddings

How well the frames match what you typed, scored by CLIP. 0.30 and up is well on-prompt; below 0.24 the model probably wandered off.
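A minimal sketch of the scoring step, assuming the prompt and frame embeddings have already been produced by a CLIP model and are passed in as plain lists of floats. The function names and the "in between" label are hypothetical; the 0.30 and 0.24 cut-offs are the ones quoted above.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def clip_score(prompt_emb: Sequence[float],
               frame_embs: List[Sequence[float]]) -> Tuple[float, float, float]:
    """Return (mean, min, max) of prompt-to-frame cosine similarity."""
    sims = [cosine(prompt_emb, f) for f in frame_embs]
    return sum(sims) / len(sims), min(sims), max(sims)


def clip_band(score: float) -> str:
    if score >= 0.30:
        return "on-prompt"
    if score < 0.24:
        return "off-prompt"
    return "in between"   # the page does not name this range


print(clip_band(0.332))  # this clip's mean CLIPScore
```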

DINO drift 0.777
min 0.517 · max 1.000 · σ 0.110

DINOv2 cosine of each sampled frame against frame 0

Whether the subject stays the same subject, with every frame compared against the first. Watch the min: a single frame below 0.5 means identity broke.
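The "watch the min" rule can be sketched as follows, assuming each sampled frame already has a DINOv2 embedding as a list of floats. `drift_report` is a hypothetical helper; the 0.5 floor is the one stated above.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def drift_report(frame_embs: List[Sequence[float]], floor: float = 0.5) -> Dict[str, object]:
    """Compare every sampled frame against frame 0 and flag frames under the floor."""
    anchor = frame_embs[0]
    sims = [cosine(anchor, emb) for emb in frame_embs]
    return {
        "mean": sum(sims) / len(sims),
        "min": min(sims),
        "max": max(sims),   # frame 0 against itself is always 1.0
        "broken": [i for i, s in enumerate(sims) if s < floor],
    }
```

Because frame 0 is compared with itself, the max is always 1.000, which matches the reading above. This clip's min of 0.517 stays just above the floor, so the sketch would flag no frames.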

These bands come from a 14-clip calibration set we ran before launch. See the full thresholds and the clips that set them.

L2 · Judge verdict

fidelity · good · The heron is rendered with convincing plumage and natural wing geometry across the landing sequence; body and legs read cleanly in #8-#9. The main softness is motion blur on the wings during the flare (#2-#3, #6-#7), which is expected for a flapping bird rather than an artifact. No mangled anatomy, extra limbs, or garbled textures on the boat's weathered planks. Overall clean.
aesthetics · good · Strong painterly mood: layered fog, muted palette, soft reflections in the still water, and dewy reeds filling the background. Composition is well-balanced with the boat anchored low-right and the bird as focal point. The one miss versus intent is the lighting: it reads cool gray-mauve rather than the prompt's warm pink-and-gold dawn, so it's atmospheric but not as luminous as specified.
consistency · good · The rowboat, reeds, and water hold steady in form and texture across all 10 frames. dino_drift mean 0.777 (dipping to 0.517 at #9) looks alarming but is fully explained: the single subject is a heron transforming from a wide-winged glide (#0-#2) to a fully folded perched pose (#9), plus a slow dolly that reframes the boat; both legitimately collapse frame-0 identity cosine without any morphing. The scene itself is coherent.
motion · excellent · Flow mean 4.54 confirms steady, moderate motion consistent with a gentle landing and drifting mist. The pose progression is highly plausible: descending glide (#0-#1), braking flare with wings raised (#2-#5), folding (#6-#8), settled perch (#9). This is a believable heron landing cycle with no teleporting or frozen limbs.
semantics · good · Core elements are present: weathered wooden rowboat, great gray heron gliding in and landing on the bow with wings folding, concentric ripples spreading from the hull (visible #4-#9), misty water, and reeds at the edge. clipscore 0.33 supports good prompt alignment. Shortfalls: no clearly visible dock, the frayed mooring rope is not legible, and the warm dawn light is largely absent.
physics · good · Ripples emanate correctly from around the boat as the bird settles (#4-#9), the boat casts a coherent reflection on the glassy surface, and the heron's contact point on the bow looks grounded by #8-#9. Object permanence holds. No visible gravity or shadow violations.

A serene, painterly clip of a gray heron gliding down and landing on the bow of a weathered rowboat, with believable wing mechanics and ripples spreading across still misty water. Scene consistency is solid despite a low dino_drift number, which is an artifact of the bird's pose change and the dolly move rather than morphing. The main gap is mood lighting — it reads cool and gray instead of the warm pink-and-gold dawn the prompt calls for, and the dock and frayed rope aren't clearly shown.
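The 19/24 headline can be reproduced from the six levels above under one assumption: each dimension is worth up to 4 points, with "good" at 3 and "excellent" at 4. That mapping is inferred from the arithmetic, not stated on the page, and the points for any other level are left out because the page does not show them.

```python
# Assumed mapping, inferred from the receipt: five "good" plus one "excellent" gives 19.
LEVEL_POINTS = {"good": 3, "excellent": 4}
MAX_POINTS_PER_DIMENSION = 4  # assumption: 6 dimensions x 4 points = 24

verdict = {
    "fidelity": "good",
    "aesthetics": "good",
    "consistency": "good",
    "motion": "excellent",
    "semantics": "good",
    "physics": "good",
}

total = sum(LEVEL_POINTS[level] for level in verdict.values())
maximum = MAX_POINTS_PER_DIMENSION * len(verdict)
print(f"{total}/{maximum}")
```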

What the judge saw

The timestamped contact sheet, exactly as handed to the judge. ffmpeg pulled ten evenly spaced frames from the clip (#0-#9 in the verdict above), stamped each with its timestamp, and tiled them into this one grid; it is the only image the judge reads. Why a grid beats a video.
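One way to pick evenly spaced sample times is sketched below. It assumes each frame is taken at the midpoint of an equal slice of the clip, which avoids sampling the very first and very last frame; the page does not say which spacing rule the pipeline uses, and `even_timestamps` is a hypothetical helper.

```python
from typing import List


def even_timestamps(duration_s: float, count: int) -> List[float]:
    """Midpoints of `count` equal slices of the clip, in seconds."""
    step = duration_s / count
    return [(i + 0.5) * step for i in range(count)]


# A small count for illustration; the real sheet's frame count is set by the pipeline.
for i, t in enumerate(even_timestamps(5.04, 4)):
    print(f"#{i} at {t:.2f}s")
```

Each timestamp could then be passed to a frame extractor and drawn onto the frame before tiling.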

Timestamped contact sheet

Attempt history

Attempt 1 accepted score 19/24
gen 86.6s · eval 12.3s