Levenshtein edit distance

003-levenshtein-distance. Read two short ASCII strings `A` and `B` from standard input (one per line),

Prompt

This is the natural-language brief given to every model, verbatim. The harness prefixes a language-specific calling-convention block and suffixes a "return only the source code" instruction. Nothing else.

## Task: Levenshtein edit distance

Read two short ASCII strings `A` and `B` from standard input (one per line),
each at most 50 characters and using only printable ASCII (`0x20..=0x7E`).
Compute the **Levenshtein edit distance** between them: the minimum number
of single-character insertions, deletions, or substitutions to transform
`A` into `B`. Each edit has cost 1. Write the distance to standard output
followed by a single newline. Exit with status 0.

Edge cases:

- If `A == B` the distance is `0`.
- If either string is empty the distance is the length of the other.

## Acceptance

- Stdin: two lines. Line 1 is `A`, line 2 is `B`. Either may be empty.
- Stdout: the decimal digits of the distance, followed by `\n`.
- Stderr: empty.
- Exit code: 0.
- The program must complete each test case within 5 seconds.

## Examples

| A          | B          | distance |
| ---------- | ---------- | -------- |
| `kitten`   | `sitting`  | `3`      |
| `saturday` | `sunday`   | `3`      |
| `flaw`     | `lawn`     | `2`      |
| `intention`| `execution`| `5`      |
| empty      | `abc`      | `3`      |
| empty      | empty      | `0`      |

Acceptance

A task counts as passed only when every public and hidden test case agrees on these fields. No fuzzy matching, no "off by one trailing newline is fine."

stdout (byte-exact, per case)	true
stderr (exact bytes)	""
exit code	0
wall time max (ms)	5000
tags	computation, dp, strings

Results

Each cell is one attempt. Pass means stdout matched byte-exact on every test case, stderr empty, exit zero. Hover a failure to see the captured first line of the diagnostic.

Model	Zero	TypeScript	Rust	Go	Python
gpt-4o	compile	✓	✓	✓	✓
gpt-4o-mini	compile	other	✓	✓	✓
gpt-5	compile	✓	✓	✓	✓
opus	compile	✓	✓	✓	✓
sonnet	wrong output	✓	✓	✓	✓

Failure excerpts

6 of 25 attempts failed. Each card is one attempt, with the captured first line of the diagnostic.

gpt-4o Zero compile

ref.zero:1:1 IMP001: unknown package-local import 'std'

gpt-4o-mini Zero compile

ref.zero:9:1 PAR100: expected '{' before block

gpt-4o-mini TypeScript other
```
(no diagnostic captured)
```

gpt-5 Zero compile

ref.zero:3:8 PAR100: expected '{' before block

opus Zero compile

ref.zero:1:11 PAR100: unexpected character '@'

sonnet Zero wrong output
```
(no diagnostic captured)
```

Reference implementations

The hand-written reference each language ships with. Every reference passes the same public and hidden test suite under the pinned toolchain before any model touches the task.

Click a language to expand

Zero n/a

No reference implementation in this language.

TypeScript 31 lines

// Levenshtein edit distance, TypeScript reference.
// Reads A and B from stdin (one per line), runs a two-row DP.
import { readFileSync } from "node:fs";

const raw = readFileSync(0, "utf8");
const lines = raw.split("\n");
const a = lines[0] ?? "";
const b = lines[1] ?? "";
if (a.length > 50 || b.length > 50) {
  process.stderr.write("each string must be at most 50 characters\n");
  process.exit(1);
}

const m = a.length;
const n = b.length;
let prev = new Array<number>(n + 1);
let curr = new Array<number>(n + 1);
for (let j = 0; j <= n; j++) prev[j] = j;
for (let i = 0; i < m; i++) {
  curr[0] = i + 1;
  const ai = a.charCodeAt(i);
  for (let k = 0; k < n; k++) {
    const delCost = prev[k + 1] + 1;
    const insCost = curr[k] + 1;
    const subCost = prev[k] + (ai === b.charCodeAt(k) ? 0 : 1);
    curr[k + 1] = Math.min(delCost, insCost, subCost);
  }
  [prev, curr] = [curr, prev];
}
process.stdout.write(`${prev[n]}\n`);

Rust 38 lines

// Levenshtein edit distance, Rust reference.
// Reads A and B from stdin (one per line), runs a two-row DP.
use std::io::{self, Read, Write};
use std::process::ExitCode;

fn main() -> ExitCode {
    let mut input = String::new();
    if io::stdin().read_to_string(&mut input).is_err() {
        let _ = writeln!(io::stderr(), "failed to read stdin");
        return ExitCode::from(1);
    }
    let mut lines = input.split('\n');
    let a = lines.next().unwrap_or("");
    let b = lines.next().unwrap_or("");
    if a.chars().count() > 50 || b.chars().count() > 50 {
        let _ = writeln!(io::stderr(), "each string must be at most 50 characters");
        return ExitCode::from(1);
    }
    let a_bytes = a.as_bytes();
    let b_bytes = b.as_bytes();
    let m = a_bytes.len();
    let n = b_bytes.len();
    let mut prev: Vec<u32> = (0..=n as u32).collect();
    let mut curr: Vec<u32> = vec![0; n + 1];
    for i in 0..m {
        curr[0] = (i + 1) as u32;
        for k in 0..n {
            let del_cost = prev[k + 1] + 1;
            let ins_cost = curr[k] + 1;
            let sub_cost = prev[k] + if a_bytes[i] == b_bytes[k] { 0 } else { 1 };
            curr[k + 1] = del_cost.min(ins_cost).min(sub_cost);
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    let _ = writeln!(io::stdout(), "{}", prev[n]);
    ExitCode::SUCCESS
}

Go 59 lines

// Levenshtein edit distance, Go reference.
// Reads A and B from stdin (one per line), runs a two-row DP.
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"strings"
)

func main() {
	raw, err := io.ReadAll(bufio.NewReader(os.Stdin))
	if err != nil {
		fmt.Fprintln(os.Stderr, "failed to read stdin")
		os.Exit(1)
	}
	lines := strings.SplitN(string(raw), "\n", 3)
	var a, b string
	if len(lines) >= 1 {
		a = lines[0]
	}
	if len(lines) >= 2 {
		b = lines[1]
	}
	if len(a) > 50 || len(b) > 50 {
		fmt.Fprintln(os.Stderr, "each string must be at most 50 characters")
		os.Exit(1)
	}
	m, n := len(a), len(b)
	prev := make([]int, n+1)
	curr := make([]int, n+1)
	for j := 0; j <= n; j++ {
		prev[j] = j
	}
	for i := 0; i < m; i++ {
		curr[0] = i + 1
		for k := 0; k < n; k++ {
			delCost := prev[k+1] + 1
			insCost := curr[k] + 1
			subCost := prev[k]
			if a[i] != b[k] {
				subCost++
			}
			best := delCost
			if insCost < best {
				best = insCost
			}
			if subCost < best {
				best = subCost
			}
			curr[k+1] = best
		}
		prev, curr = curr, prev
	}
	fmt.Print(prev[n], "\n")
}

Python 33 lines

"""Levenshtein edit distance, Python reference.

Reads two lines from stdin (A and B), computes the classic two-row DP,
and writes the distance plus a trailing newline.
"""
import sys


def main() -> None:
    data = sys.stdin.read().split("\n")
    a = data[0] if len(data) >= 1 else ""
    b = data[1] if len(data) >= 2 else ""
    if len(a) > 50 or len(b) > 50:
        sys.stderr.write("each string must be at most 50 characters\n")
        sys.exit(1)
    m, n = len(a), len(b)
    prev = list(range(n + 1))
    curr = [0] * (n + 1)
    for i in range(m):
        curr[0] = i + 1
        ai = a[i]
        for k in range(n):
            del_cost = prev[k + 1] + 1
            ins_cost = curr[k] + 1
            sub_cost = prev[k] + (0 if ai == b[k] else 1)
            curr[k + 1] = min(del_cost, ins_cost, sub_cost)
        prev, curr = curr, prev
    sys.stdout.write(f"{prev[n]}\n")


if __name__ == "__main__":
    main()

Design notes

Algorithm, failure modes, cross-language parity, and where Zero needed a workaround. From corpus/003-levenshtein-distance/notes.md.

Why this task

Third pure-computation task. The first two corpus tasks exercised single-integer arithmetic (recursion + memoization for 001, array marking for 002). This one is the first that takes two inputs and produces a number that depends on a two-dimensional dynamic programming relation. The model must (a) read two strings, not one integer, (b) materialize the standard Levenshtein recurrence (insert, delete, substitute, each cost 1, take the minimum), (c) keep the table small enough to run cheaply on the 50-char cap, (d) emit a single decimal followed by \n.

Difficulty: easy. The recurrence is canonical algorithm material but catches models that get the row/column indexing wrong, that forget to seed the first row and first column with the lengths, or that forget the equal-character zero-cost short circuit.

Input shape changes

Tasks 001 and 002 read a single integer N from stdin (or argv[1] for Zero). This task reads two strings, one per line. For Zero that means argv[1] is A and argv[2] is B, mirroring the per-line convention of the other four languages.

The 50-character cap on each string keeps the DP table inside Zero's stack-allocated [51]u32 rolling buffers (the prior tasks used [N]u8 because the spec's data fit in bytes; here we need u32 because the distance for a 50/50 string pair could theoretically reach 50 and a u8 would still cover that, but I'd rather keep the cell type the same as the conventional textbook value).

Edge cases probed

Both strings empty: distance 0. Catches off-by-one initialization.
One empty, one non-empty: distance equals the length of the other. Catches models that skip the boundary seeding entirely.
Equal strings (not tested in the matrix here but covered transitively by case-002 with empty/empty == "abc"/"abc" == 0 logic).
Classic kitten -> sitting (distance 3) and saturday -> sunday (distance 3) cover the substitution + insertion + deletion mix that shows up in every algorithms textbook.
intention -> execution (distance 5) tests a deeper DP path with no obvious shared prefix to short-circuit.
The long pair the quick brown fox jumps over the lazy dog vs the quick brown dog jumps over the lazy fox is interesting because the naive intuition is "6 substitutions, 3 per word" but the right answer is 4: o is shared between fox and dog at the middle position, so only f<->d and x<->g substitute. Easy hand-error.

Zero implementation notes

Same input convention as 001 and 002: argv-based because Zero 0.1.2 has no exposed stdin capability. All logic stays inside pub fun main.

The DP uses two rolling [51]u32 buffers (current row and previous row) rather than a full [51][51]u32 table. Two reasons:

The direct backend's array-element-type matrix doesn't include nested arrays as cleanly as a single flat allocation.
Two rolling buffers are the textbook memory optimization and cut the stack footprint from ~10KB to ~408 bytes.

The decimal renderer reuses the pattern from 001/002 (digits buffer, walk the value, reverse into the output buffer) with distance: u32 as the source value. Same small-divisor (10) divmod, so it stays clear of the 2^32-divisor SIGFPE documented in ~/repos/zero-bugs/u64-divmod-cast-sigfpe/.

Future revisions

Higher per-string cap. The current [51]u32 allocation tracks the spec's 50-char limit. A 256-char cap would push the rolling buffer to about 2KB still, well inside the stack budget.
Unicode. Today's spec is ASCII-only to keep the byte indexing trivial. A future revision could ask for Unicode-codepoint-level edit distance, which would surface whether the model knows the difference between len(bytes) and len(codepoints).
Damerau-Levenshtein (add a transposition rule). Same recurrence with one extra case for adjacent swap. Defer to v0.2 if a model consistently confuses the two variants.

Cost

Model	Prompt tokens	Completion tokens	API ms
gpt-4o	2,955	1,963	15,340
gpt-4o-mini	2,955	1,645	44,590
gpt-5	2,950	15,759	193,755
opus	14	1,989	78,204
sonnet	12	1,346	104,872

Tokens and API ms are summed across the five languages this model attempted for this task.

Compare

Model deep-dives: gpt-4o · gpt-4o-mini · gpt-5 · opus · sonnet . Back to the leaderboard and methodology.

Truffle · 003-levenshtein-distance · run captured 2026-06-27