Run-length encode the input as count/byte pairs

011-rle-encode. Read all bytes from standard input.

Prompt

This is the natural-language brief given to every model, verbatim. The harness prefixes a language-specific calling-convention block and suffixes a "return only the source code" instruction. Nothing else.

# Task 011 — rle-encode

## Problem

Read all bytes from standard input. Walk left to right; for every
maximal run of identical consecutive bytes, write one line of the
form `<count> <byte_decimal>\n` to standard output where `<count>`
is the run length in decimal and `<byte_decimal>` is the byte value
(0..255) in decimal. The lines must appear in input order. Exit with
status 0.

## Output format

- Each line is `<count>` (decimal, no padding), one ASCII space
  (0x20), `<byte_decimal>` (0..255, no padding), one newline (0x0A).
- No header. No trailing extra newline beyond the per-record one.

## Edge cases

- Empty input → empty output.
- Single byte → one line, count 1.
- All identical bytes → one line with the full count.
- Each maximal run is emitted exactly once; a byte that recurs after
  a gap forms a new run with its own line.

## Examples

- Input `aaa` → `3 97\n`
- Input `abc` → `1 97\n1 98\n1 99\n`
- Input `aaabbc` → `3 97\n2 98\n1 99\n`
- Input `aabbaa` → `2 97\n2 98\n2 97\n` (the second `aa` is a
  separate run because there is a different byte between them).

## Acceptance

The stdout produced by your program must match the expected bytes
exactly for every test case. Stderr must be empty and the process
must exit 0.

## Input convention

- **stdin** (TypeScript, Rust, Go, Python): read all bytes until EOF.
- **argv[1]** (Zero 0.1.2 has no exposed stdin capability): read the
  bytes from the first command-line argument.

## Token budget

1600 tokens.

Acceptance

A task counts as passed only when every public and hidden test case agrees on these fields. No fuzzy matching, no "off by one trailing newline is fine."

stdout (byte-exact, per case)	true
stderr (exact bytes)	""
exit code	0
wall time max (ms)	5000
tags	io, encoding, single-pass, multi-record-output

Results

Each cell is one attempt. Pass means stdout matched byte-exact on every test case, stderr empty, exit zero. Hover a failure to see the captured first line of the diagnostic.

Model	Zero	TypeScript	Rust	Go	Python
gpt-4o	compile	✓	✓	✓	✓
gpt-4o-mini	compile	✓	✓	✓	✓
gpt-5	compile	✓	✓	✓	✓
opus	compile	✓	✓	✓	✓
sonnet	wrong output	✓	✓	✓	✓

Failure excerpts

5 of 25 attempts failed. Each card is one attempt, with the captured first line of the diagnostic.

gpt-4o Zero compile

ref.zero:1:1 IMP001: unknown package-local import 'std'

gpt-4o-mini Zero compile

ref.zero:1:15 PAR100: expected expression

gpt-5 Zero compile

ref.zero:1:1 IMP001: unknown package-local import 'std'

opus Zero compile

ref.zero:26:34 PAR100: unexpected character '@'

sonnet Zero wrong output
```
(no diagnostic captured)
```

Reference implementations

The hand-written reference each language ships with. Every reference passes the same public and hidden test suite under the pinned toolchain before any model touches the task.

Click a language to expand

Zero n/a

No reference implementation in this language.

TypeScript 26 lines

// Run-length encode the input, TypeScript reference.

import { readFileSync } from "node:fs";

function main(): void {
  const data = readFileSync(0);
  if (data.length === 0) return;
  const out: string[] = [];
  let curByte = data[0];
  let curCount = 1;
  for (let i = 1; i < data.length; i++) {
    const b = data[i];
    if (b === curByte) {
      curCount++;
    } else {
      out.push(`${curCount} ${curByte}\n`);
      curByte = b;
      curCount = 1;
    }
  }
  out.push(`${curCount} ${curByte}\n`);
  process.stdout.write(out.join(""));
}

main();

Rust 26 lines

// Run-length encode the input, Rust reference.

use std::io::{self, Read, Write};

fn main() {
    let mut data = Vec::new();
    io::stdin().read_to_end(&mut data).expect("read stdin");
    if data.is_empty() {
        return;
    }
    let stdout = io::stdout();
    let mut stdout = stdout.lock();
    let mut cur_byte: u8 = data[0];
    let mut cur_count: u32 = 1;
    for &b in &data[1..] {
        if b == cur_byte {
            cur_count += 1;
        } else {
            writeln!(stdout, "{} {}", cur_count, cur_byte).expect("write");
            cur_byte = b;
            cur_count = 1;
        }
    }
    writeln!(stdout, "{} {}", cur_count, cur_byte).expect("write");
}

Go 36 lines

// Run-length encode the input, Go reference.

package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
)

func main() {
	data, err := io.ReadAll(os.Stdin)
	if err != nil {
		os.Exit(1)
	}
	if len(data) == 0 {
		return
	}
	w := bufio.NewWriter(os.Stdout)
	defer w.Flush()
	curByte := data[0]
	curCount := 1
	for i := 1; i < len(data); i++ {
		b := data[i]
		if b == curByte {
			curCount++
		} else {
			fmt.Fprintf(w, "%d %d\n", curCount, curByte)
			curByte = b
			curCount = 1
		}
	}
	fmt.Fprintf(w, "%d %d\n", curCount, curByte)
}

Python 27 lines

#!/usr/bin/env python3
"""Run-length encode the input, Python reference."""

import sys


def main() -> None:
    data = sys.stdin.buffer.read()
    if not data:
        return
    parts = []
    cur_byte = data[0]
    cur_count = 1
    for b in data[1:]:
        if b == cur_byte:
            cur_count += 1
        else:
            parts.append(f"{cur_count} {cur_byte}\n")
            cur_byte = b
            cur_count = 1
    parts.append(f"{cur_count} {cur_byte}\n")
    sys.stdout.write("".join(parts))


if __name__ == "__main__":
    main()

Design notes

Algorithm, failure modes, cross-language parity, and where Zero needed a workaround. From corpus/011-rle-encode/notes.md.

Algorithm

Single forward walk with one open run at a time:

Open the first run at byte 0; cur_byte = bytes[0], cur_count = 1.
For each subsequent byte: if it equals cur_byte, increment cur_count; otherwise emit the open run, then start a new one anchored on this byte.
After the loop, emit the last open run.

The four reference implementations in TS/Rust/Go/Python all use the emit-on-change-plus-final-flush shape directly. The Zero ref uses a slightly different shape — an outer loop that picks up one run per iteration via an inner-flag scan — because the emit-on-change shape needs the "scan past the matching run" step to live somewhere, and the explicit-flag inner-scan pattern from task 008 reuses cleanly here.

Edge cases

Empty input → emit nothing.
Single byte → one line with count=1.
Two-byte input with identical bytes → one line with count=2.
Two-byte input with different bytes → two lines, each count=1.
Recurring byte after a gap (aabbaa) forms a new run — the third run is still 2 97\n even though byte 97 already appeared.

Zero-specific notes

argv[1] is the input bytes; an absent argv falls back to an empty span.
Inner-loop scan uses the explicit-flag pattern (no && short-circuit available in Zero 0.1.2 direct backend).
Decimal renderer is inlined twice per emit site (once for the count, once for the byte value) because Zero 0.1.2 does not support Span<u8>/MutSpan<u8> as function parameters; the function extraction shape is unavailable. The count renderer assumes positive (cur_count >= 1), so it omits the v == 0 branch; the byte renderer keeps the v == 0 branch because byte value 0 is valid input.
Output buffer at [8192]u8: worst case is the 1000-byte input cap with every byte different, producing 1000 lines of "1 NNN\n" = 7 bytes max each, total 7000 bytes. Comfortably under 8192.

Cross-implementation parity

All five references walk the input byte-by-byte and emit one record per maximal run in input order. Byte-exact agreement on every case.

Cost

Model	Prompt tokens	Completion tokens	API ms
gpt-4o	3,350	893	9,811
gpt-4o-mini	3,350	782	16,224
gpt-5	3,345	16,102	151,686
opus	14	1,163	73,211
sonnet	14	957	64,134

Tokens and API ms are summed across the five languages this model attempted for this task.

Compare

Model deep-dives: gpt-4o · gpt-4o-mini · gpt-5 · opus · sonnet . Back to the leaderboard and methodology.

Truffle · 011-rle-encode · run captured 2026-06-27