agentlang-index · task easy

Run-length encode the input as count/byte pairs

011-rle-encode. Read all bytes from standard input.

Prompt

This is the natural-language brief given to every model, verbatim. The harness prefixes a language-specific calling-convention block and suffixes a "return only the source code" instruction. Nothing else.

# Task 011 — rle-encode

## Problem

Read all bytes from standard input. Walk left to right; for every
maximal run of identical consecutive bytes, write one line of the
form `<count> <byte_decimal>\n` to standard output where `<count>`
is the run length in decimal and `<byte_decimal>` is the byte value
(0..255) in decimal. The lines must appear in input order. Exit with
status 0.

## Output format

- Each line is `<count>` (decimal, no padding), one ASCII space
  (0x20), `<byte_decimal>` (0..255, no padding), one newline (0x0A).
- No header. No trailing extra newline beyond the per-record one.

## Edge cases

- Empty input → empty output.
- Single byte → one line, count 1.
- All identical bytes → one line with the full count.
- Each maximal run is emitted exactly once; a byte that recurs after
  a gap forms a new run with its own line.

## Examples

- Input `aaa` → `3 97\n`
- Input `abc` → `1 97\n1 98\n1 99\n`
- Input `aaabbc` → `3 97\n2 98\n1 99\n`
- Input `aabbaa` → `2 97\n2 98\n2 97\n` (the second `aa` is a
  separate run because there is a different byte between them).

## Acceptance

The stdout produced by your program must match the expected bytes
exactly for every test case. Stderr must be empty and the process
must exit 0.

## Input convention

- **stdin** (TypeScript, Rust, Go, Python): read all bytes until EOF.
- **argv[1]** (Zero 0.1.2 has no exposed stdin capability): read the
  bytes from the first command-line argument.

## Token budget

1600 tokens.

Acceptance

A task counts as passed only when every public and hidden test case agrees on these fields. No fuzzy matching, no "off by one trailing newline is fine."

stdout (byte-exact, per case) true
stderr (exact bytes) ""
exit code 0
wall time max (ms) 5000
tags io, encoding, single-pass, multi-record-output

Results

Each cell is one attempt. Pass means stdout matched byte-exact on every test case, stderr empty, exit zero. Hover a failure to see the captured first line of the diagnostic.

Model ZeroTypeScriptRustGoPython
gpt-4o compile
gpt-4o-mini compile
gpt-5 compile

Failure excerpts

3 of 15 attempts failed. Each card is one attempt, with the captured first line of the diagnostic.

  1. gpt-4o Zero compile
    ref.zero:1:1 IMP001: unknown package-local import 'std'
  2. gpt-4o-mini Zero compile
    ref.zero:1:15 PAR100: expected expression
  3. gpt-5 Zero compile
    ref.zero:1:1 IMP001: unknown package-local import 'std'

Reference implementations

The hand-written reference each language ships with. Every reference passes the same public and hidden test suite under the pinned toolchain before any model touches the task.

Click a language to expand

Zero 88 lines
// Run-length encode the input, Zero 0.1.2 direct backend.
//
// argv[1] is the input bytes (Zero 0.1.2 has no exposed stdin capability).
// All logic stays inside `pub fun main` to avoid Span/MutSpan parameter
// restrictions.
//
// Outer loop steps run-by-run: pick up the current byte, walk forward
// while it matches, then emit `<count> <byte_decimal>\n`. Inner scan
// uses the explicit-flag pattern (no `&&` short-circuit available in
// Zero 0.1.2). Two inline decimal renderers per run because the MVP
// subset cannot factor them into a function that writes back into the
// outer output buffer (no MutSpan parameters).
pub fun main(world: World) -> Void raises {
    let line_opt = std.args.get(1)
    let mut bytes: Span<u8> = std.mem.span("")
    if line_opt.has {
        bytes = std.mem.span(line_opt.value)
    }
    let n: usize = std.mem.len(bytes)

    let mut out_buf: [8192]u8 = [0_u8; 8192]
    let mut written: usize = 0

    let mut i: usize = 0
    while i < n {
        let cur_byte: u8 = bytes[i]
        let mut cur_count: u32 = 1_u32
        i = i + 1
        let mut scanning: Bool = true
        while scanning {
            if i >= n {
                scanning = false
            } else {
                if bytes[i] == cur_byte {
                    cur_count = cur_count + 1_u32
                    i = i + 1
                } else {
                    scanning = false
                }
            }
        }

        let mut tmp: [16]u8 = [0_u8; 16]
        let mut tn: usize = 0
        let mut v: u32 = cur_count
        while v > 0_u32 {
            tmp[tn] = (48_u32 + (v % 10_u32)) as u8
            tn = tn + 1
            v = v / 10_u32
        }
        let mut j: usize = 0
        while j < tn {
            out_buf[written] = tmp[tn - 1 - j]
            written = written + 1
            j = j + 1
        }
        out_buf[written] = 32_u8
        written = written + 1

        let b_val: u32 = cur_byte as u32
        if b_val == 0_u32 {
            out_buf[written] = 48_u8
            written = written + 1
        } else {
            let mut tmp2: [4]u8 = [0_u8; 4]
            let mut tn2: usize = 0
            let mut v2: u32 = b_val
            while v2 > 0_u32 {
                tmp2[tn2] = (48_u32 + (v2 % 10_u32)) as u8
                tn2 = tn2 + 1
                v2 = v2 / 10_u32
            }
            let mut j2: usize = 0
            while j2 < tn2 {
                out_buf[written] = tmp2[tn2 - 1 - j2]
                written = written + 1
                j2 = j2 + 1
            }
        }
        out_buf[written] = 10_u8
        written = written + 1
    }

    let out: Span<u8> = out_buf[0..written]
    check world.out.write(out)
    return
}
TypeScript 26 lines
// Run-length encode the input, TypeScript reference.

import { readFileSync } from "node:fs";

function main(): void {
  const data = readFileSync(0);
  if (data.length === 0) return;
  const out: string[] = [];
  let curByte = data[0];
  let curCount = 1;
  for (let i = 1; i < data.length; i++) {
    const b = data[i];
    if (b === curByte) {
      curCount++;
    } else {
      out.push(`${curCount} ${curByte}\n`);
      curByte = b;
      curCount = 1;
    }
  }
  out.push(`${curCount} ${curByte}\n`);
  process.stdout.write(out.join(""));
}

main();
Rust 26 lines
// Run-length encode the input, Rust reference.

use std::io::{self, Read, Write};

fn main() {
    let mut data = Vec::new();
    io::stdin().read_to_end(&mut data).expect("read stdin");
    if data.is_empty() {
        return;
    }
    let stdout = io::stdout();
    let mut stdout = stdout.lock();
    let mut cur_byte: u8 = data[0];
    let mut cur_count: u32 = 1;
    for &b in &data[1..] {
        if b == cur_byte {
            cur_count += 1;
        } else {
            writeln!(stdout, "{} {}", cur_count, cur_byte).expect("write");
            cur_byte = b;
            cur_count = 1;
        }
    }
    writeln!(stdout, "{} {}", cur_count, cur_byte).expect("write");
}
Go 36 lines
// Run-length encode the input, Go reference.

package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
)

func main() {
	data, err := io.ReadAll(os.Stdin)
	if err != nil {
		os.Exit(1)
	}
	if len(data) == 0 {
		return
	}
	w := bufio.NewWriter(os.Stdout)
	defer w.Flush()
	curByte := data[0]
	curCount := 1
	for i := 1; i < len(data); i++ {
		b := data[i]
		if b == curByte {
			curCount++
		} else {
			fmt.Fprintf(w, "%d %d\n", curCount, curByte)
			curByte = b
			curCount = 1
		}
	}
	fmt.Fprintf(w, "%d %d\n", curCount, curByte)
}
Python 27 lines
#!/usr/bin/env python3
"""Run-length encode the input, Python reference."""

import sys


def main() -> None:
    data = sys.stdin.buffer.read()
    if not data:
        return
    parts = []
    cur_byte = data[0]
    cur_count = 1
    for b in data[1:]:
        if b == cur_byte:
            cur_count += 1
        else:
            parts.append(f"{cur_count} {cur_byte}\n")
            cur_byte = b
            cur_count = 1
    parts.append(f"{cur_count} {cur_byte}\n")
    sys.stdout.write("".join(parts))


if __name__ == "__main__":
    main()

Design notes

Algorithm, failure modes, cross-language parity, and where Zero needed a workaround. From corpus/011-rle-encode/notes.md.

Algorithm

Single forward walk with one open run at a time:

  1. Open the first run at byte 0; cur_byte = bytes[0], cur_count = 1.
  2. For each subsequent byte: if it equals cur_byte, increment cur_count; otherwise emit the open run, then start a new one anchored on this byte.
  3. After the loop, emit the last open run.

The four reference implementations in TS/Rust/Go/Python all use the emit-on-change-plus-final-flush shape directly. The Zero ref uses a slightly different shape — an outer loop that picks up one run per iteration via an inner-flag scan — because the emit-on-change shape needs the "scan past the matching run" step to live somewhere, and the explicit-flag inner-scan pattern from task 008 reuses cleanly here.

Edge cases

  • Empty input → emit nothing.
  • Single byte → one line with count=1.
  • Two-byte input with identical bytes → one line with count=2.
  • Two-byte input with different bytes → two lines, each count=1.
  • Recurring byte after a gap (aabbaa) forms a new run — the third run is still 2 97\n even though byte 97 already appeared.

Zero-specific notes

  • argv[1] is the input bytes; an absent argv falls back to an empty span.
  • Inner-loop scan uses the explicit-flag pattern (no && short-circuit available in Zero 0.1.2 direct backend).
  • Decimal renderer is inlined twice per emit site (once for the count, once for the byte value) because Zero 0.1.2 does not support Span<u8>/MutSpan<u8> as function parameters; the function extraction shape is unavailable. The count renderer assumes positive (cur_count >= 1), so it omits the v == 0 branch; the byte renderer keeps the v == 0 branch because byte value 0 is valid input.
  • Output buffer at [8192]u8: worst case is the 1000-byte input cap with every byte different, producing 1000 lines of "1 NNN\n" = 7 bytes max each, total 7000 bytes. Comfortably under 8192.

Cross-implementation parity

All five references walk the input byte-by-byte and emit one record per maximal run in input order. Byte-exact agreement on every case.


Cost

Model Prompt tokens Completion tokens API ms
gpt-4o 3,350 893 9,811
gpt-4o-mini 3,350 782 16,224
gpt-5 3,345 16,102 151,686

Tokens and API ms are summed across the five languages this model attempted for this task.


Compare

Model deep-dives: gpt-4o · gpt-4o-mini · gpt-5 . Back to the leaderboard and methodology.