agentlang-index · task easy

Run-length encode a lowercase ASCII string, or error on bad input

019-run-length-encode. Read one line from standard input.

Prompt

This is the natural-language brief given to every model, verbatim. The harness prefixes a language-specific calling-convention block and suffixes a "return only the source code" instruction. Nothing else.

# 019-run-length-encode

Read one line from standard input. It is a non-empty string of
lowercase ASCII letters (bytes `a` through `z`), terminated by a
single `\n`.

Walk the input from left to right grouping consecutive identical
bytes into maximal runs. For each run, write the byte followed by
the decimal run length (no leading zeros, no separator). After all
runs are written, append a single trailing `\n`.

If the input line is empty, or if any byte is outside `a` through
`z` (uppercase, digits, punctuation, whitespace, or non-ASCII), write
the literal string `error\n` instead and exit.

Do not write to standard error. Exit with status 0 in every case.

## Examples

Input (stdin): `aaabbc\n`

Output: `a3b2c1\n`

Input (stdin): `abcdef\n`

Output: `a1b1c1d1e1f1\n` (every run has length one)

Input (stdin): `z\n`

Output: `z1\n` (single byte is a single run)

Input (stdin): `aaaaaaaaaaaaa\n` (thirteen `a`s)

Output: `a13\n` (multi-digit run length)

Input (stdin): `\n` (empty line)

Output: `error\n`

Input (stdin): `Hello\n`

Output: `error\n` (uppercase H rejects)

## Zero input convention

Zero 0.1.2 has no exposed stdin capability. The Zero reference is a
multi-file project under `zero/`: `zero.json` is the package manifest,
`src/main.0` is the driver, and `src/lib.0` exports
`is_lowercase_letter`, `digit_byte`, and `decimal_length`. The driver
reads the plaintext from `argv[1]`; the value is interpreted exactly
as the stdin line for other languages (taken verbatim). Invoked as
`zero run zero -- <plaintext>`.

Acceptance

A task counts as passed only when every public and hidden test case agrees on these fields. No fuzzy matching, no "off by one trailing newline is fine."

| Field | Value |
|---|---|
| stdout (byte-exact, per case) | true |
| stderr (exact bytes) | "" |
| exit code | 0 |
| wall time max (ms) | 5000 |
| tags | string-transform, module-boundary, multi-file, run-length-encoding |
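
As a rough illustration of how strict these fields are, the check for a single test case could be sketched as below. The `Attempt` record and `case_passes` function are hypothetical names, not the harness's actual code, and treating the wall-time limit as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    """Captured result of one run (hypothetical record, field names are illustrative)."""
    stdout: bytes
    stderr: bytes
    exit_code: int
    wall_ms: float


def case_passes(attempt: Attempt, expected_stdout: bytes, max_wall_ms: float = 5000) -> bool:
    """Apply the acceptance fields to one test case with no fuzzy matching."""
    return (
        attempt.stdout == expected_stdout   # byte-exact, trailing newline included
        and attempt.stderr == b""           # nothing written to standard error
        and attempt.exit_code == 0          # status 0 even on the error path
        and attempt.wall_ms <= max_wall_ms  # assumption: the limit is inclusive
    )
```

Under this sketch an output of `a3b2c1` with no trailing newline fails against an expected `a3b2c1\n`, and a correct `error\n` fails if the process exits non-zero.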

Results

Each cell is one attempt. Pass means stdout matched byte-exact on every test case, stderr empty, exit zero.

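| Model | Zero | TypeScript | Rust | Go | Python |
|---|---|---|---|---|---|
| gpt-4o | compile | wrong output | pass | pass | pass |
| gpt-4o-mini | compile | wrong output | other | wrong output | pass |
| gpt-5 | compile | pass | pass | pass | pass |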

Failure excerpts

7 of 15 attempts failed. Each card is one attempt, with the captured first line of the diagnostic.

  1. gpt-4o Zero compile
    zero/src/lib.0:28:1 PAR100: unexpected character '`'
  2. gpt-4o TypeScript wrong output
    (no diagnostic captured)
  3. gpt-4o-mini Zero compile
    zero/src/lib.0:23:1 PAR100: unexpected character '`'
  4. gpt-4o-mini TypeScript wrong output
    (no diagnostic captured)
  5. gpt-4o-mini Rust other
    (no diagnostic captured)
  6. gpt-4o-mini Go wrong output
    (no diagnostic captured)
  7. gpt-5 Zero compile
    zero/src/main.0:1:1 IMP001: unknown package-local import 'world'

Reference implementations

The hand-written reference each language ships with. Every reference passes the same public and hidden test suite under the pinned toolchain before any model touches the task.


Zero 158 lines
// src/main.0
// Run-length encoder for AgentLang Index slot 019.
//
// Zero 0.1.2 has no exposed stdin, so plaintext comes from argv[1].
// Lib exports is_lowercase_letter, digit_byte, decimal_length.
//
// Failure modes collapse to "error\n":
//   - argv[1] missing
//   - plaintext is empty
//   - any plaintext byte is outside [a-z]
//
// Output: for each maximal run of consecutive identical bytes, emit
// the byte followed by the run length in decimal. Final newline.
// Examples: "aaabbc" -> "a3b2c1\n", "abcdef" -> "a1b1c1d1e1f1\n",
//           "aaaaaaaaaaaaa" -> "a13\n".

use lib

pub fun main(world: World) -> Void raises {
    let maybe_text = std.args.get(1)
    if maybe_text.has == false {
        check world.out.write("error\n")
        return
    }
    let text_in = std.mem.span(maybe_text.value)

    let t_len = std.mem.len(text_in)
    if t_len == 0 {
        check world.out.write("error\n")
        return
    }

    // Validate all bytes in [a-z] first.
    let mut text_ok: Bool = true
    let mut v_i: usize = 0
    while v_i < t_len {
        let b = text_in[v_i]
        if is_lowercase_letter(b) == false {
            text_ok = false
            v_i = t_len
        } else {
            v_i = v_i + 1
        }
    }
    if text_ok == false {
        check world.out.write("error\n")
        return
    }

    // Walk maximal runs and emit byte + decimal count.
    // Output buffer: each run emits at most 1 + 10 bytes (letter + up
    // to 10 decimal digits for u32 max). Plus newline.
    let mut out: [16384]u8 = [0_u8; 16384]
    let mut out_n: usize = 0

    let mut run_start: usize = 0
    while run_start < t_len {
        let run_byte = text_in[run_start]
        let mut run_len: usize = 1
        let mut probe: usize = run_start + 1
        let mut scanning: Bool = true
        while scanning {
            if probe >= t_len {
                scanning = false
            } else {
                if text_in[probe] == run_byte {
                    run_len = run_len + 1
                    probe = probe + 1
                } else {
                    scanning = false
                }
            }
        }

        let run_len_u32: u32 = run_len as u32
        let digit_count = decimal_length(run_len_u32)

        // Emit the byte.
        out[out_n] = run_byte
        out_n = out_n + 1

        // Emit decimal digits left-to-right by iterating positions.
        let mut emit_i: u32 = digit_count
        while emit_i > 0_u32 {
            let pos = emit_i - 1_u32
            let mut divisor: u32 = 1_u32
            let mut p: u32 = 0_u32
            while p < pos {
                divisor = divisor * 10_u32
                p = p + 1_u32
            }
            let d8: u8 = ((run_len_u32 / divisor) % 10_u32) as u8
            out[out_n] = digit_byte(d8)
            out_n = out_n + 1
            emit_i = emit_i - 1_u32
        }

        run_start = run_start + run_len
    }

    out[out_n] = 10_u8
    out_n = out_n + 1
    check world.out.write(out[0..out_n])
    return
}


// src/lib.0
// Run-length encoder helpers. Pure scalar exports keeping the
// seventh codegen quirk happy at the module boundary.

pub fun is_lowercase_letter(b: u8) -> Bool {
    if b < 97_u8 {
        return false
    }
    if b > 122_u8 {
        return false
    }
    return true
}

pub fun digit_byte(d: u8) -> u8 {
    // Caller passes d in 0..=9. Returns the ASCII byte for that digit.
    return 48_u8 + d
}

pub fun decimal_length(n: u32) -> u32 {
    // Number of decimal digits needed to render n. Always at least 1.
    if n < 10_u32 {
        return 1_u32
    }
    if n < 100_u32 {
        return 2_u32
    }
    if n < 1000_u32 {
        return 3_u32
    }
    if n < 10000_u32 {
        return 4_u32
    }
    if n < 100000_u32 {
        return 5_u32
    }
    if n < 1000000_u32 {
        return 6_u32
    }
    if n < 10000000_u32 {
        return 7_u32
    }
    if n < 100000000_u32 {
        return 8_u32
    }
    if n < 1000000000_u32 {
        return 9_u32
    }
    return 10_u32
}
TypeScript 50 lines
function isLowercaseLetter(b: number): boolean {
    return b >= 97 && b <= 122;
}

async function readAll(): Promise<string> {
    const chunks: Buffer[] = [];
    for await (const chunk of process.stdin) {
        chunks.push(chunk as Buffer);
    }
    return Buffer.concat(chunks).toString("utf8");
}

async function main(): Promise<void> {
    const data = await readAll();
    const lines = data.split("\n");
    if (lines.length < 1) {
        process.stdout.write("error\n");
        return;
    }
    const text = lines[0];
    if (text.length === 0) {
        process.stdout.write("error\n");
        return;
    }
    const bytes = Buffer.from(text, "utf8");
    for (let i = 0; i < bytes.length; i++) {
        if (!isLowercaseLetter(bytes[i])) {
            process.stdout.write("error\n");
            return;
        }
    }
    const out: string[] = [];
    let i = 0;
    while (i < bytes.length) {
        const runByte = bytes[i];
        let runLen = 1;
        let j = i + 1;
        while (j < bytes.length && bytes[j] === runByte) {
            runLen++;
            j++;
        }
        out.push(String.fromCharCode(runByte));
        out.push(String(runLen));
        i = j;
    }
    process.stdout.write(out.join("") + "\n");
}

main();
Rust 47 lines
use std::io::{self, Read, Write};

fn is_lowercase_letter(b: u8) -> bool {
    (b'a'..=b'z').contains(&b)
}

fn main() {
    let mut input = String::new();
    if io::stdin().read_to_string(&mut input).is_err() {
        let _ = io::stdout().write_all(b"error\n");
        return;
    }
    let mut lines = input.split('\n');
    let line_text = lines.next();
    let Some(lt) = line_text else {
        let _ = io::stdout().write_all(b"error\n");
        return;
    };
    let text_bytes = lt.as_bytes();
    if text_bytes.is_empty() {
        let _ = io::stdout().write_all(b"error\n");
        return;
    }
    for &b in text_bytes {
        if !is_lowercase_letter(b) {
            let _ = io::stdout().write_all(b"error\n");
            return;
        }
    }
    let mut out: Vec<u8> = Vec::with_capacity(text_bytes.len() * 2 + 1);
    let mut i = 0usize;
    while i < text_bytes.len() {
        let run_byte = text_bytes[i];
        let mut run_len = 1usize;
        let mut j = i + 1;
        while j < text_bytes.len() && text_bytes[j] == run_byte {
            run_len += 1;
            j += 1;
        }
        out.push(run_byte);
        out.extend_from_slice(run_len.to_string().as_bytes());
        i = j;
    }
    out.push(b'\n');
    let _ = io::stdout().write_all(&out);
}
Go 68 lines
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

func isLowercaseLetter(b byte) bool {
	return b >= 'a' && b <= 'z'
}

func readAll(r *bufio.Reader) string {
	var sb strings.Builder
	buf := make([]byte, 4096)
	for {
		n, err := r.Read(buf)
		if n > 0 {
			sb.Write(buf[:n])
		}
		if err != nil {
			break
		}
	}
	return sb.String()
}

func main() {
	w := bufio.NewWriter(os.Stdout)
	defer w.Flush()
	r := bufio.NewReader(os.Stdin)
	data := readAll(r)
	lines := strings.SplitN(data, "\n", 2)
	if len(lines) < 1 {
		fmt.Fprint(w, "error\n")
		return
	}
	text := lines[0]
	if len(text) == 0 {
		fmt.Fprint(w, "error\n")
		return
	}
	for i := 0; i < len(text); i++ {
		if !isLowercaseLetter(text[i]) {
			fmt.Fprint(w, "error\n")
			return
		}
	}
	out := make([]byte, 0, len(text)*2+1)
	i := 0
	for i < len(text) {
		runByte := text[i]
		runLen := 1
		j := i + 1
		for j < len(text) && text[j] == runByte {
			runLen++
			j++
		}
		out = append(out, runByte)
		out = append(out, strconv.Itoa(runLen)...)
		i = j
	}
	out = append(out, '\n')
	w.Write(out)
}
Python 42 lines
#!/usr/bin/env python3
import sys


def is_lowercase_letter(b: int) -> bool:
    return 97 <= b <= 122


def main() -> None:
    data = sys.stdin.read()
    lines = data.split("\n")
    if len(lines) < 1:
        sys.stdout.write("error\n")
        return
    text = lines[0]
    if not text:
        sys.stdout.write("error\n")
        return
    text_bytes = text.encode("utf-8")
    for b in text_bytes:
        if not is_lowercase_letter(b):
            sys.stdout.write("error\n")
            return
    out = []
    i = 0
    n = len(text_bytes)
    while i < n:
        run_byte = text_bytes[i]
        run_len = 1
        j = i + 1
        while j < n and text_bytes[j] == run_byte:
            run_len += 1
            j += 1
        out.append(bytes([run_byte]).decode("ascii"))
        out.append(str(run_len))
        i = j
    sys.stdout.write("".join(out) + "\n")


if __name__ == "__main__":
    main()

Design notes

Algorithm, failure modes, cross-language parity, and where Zero needed a workaround. From corpus/019-run-length-encode/notes.md.

Algorithm

Read plaintext (lowercase a-z, non-empty, one line). Walk left to right grouping consecutive identical bytes into maximal runs. For each run, emit the byte followed by the decimal run length (no leading zeros, no separator). Final trailing \n. If the line is empty or contains any byte outside a..z, write error\n. Process exit is 0 in every case.
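
The same algorithm can be sketched compactly as a pure function over the line's bytes. This is an illustrative variant, not the shipped Python reference: it groups runs with `itertools.groupby` instead of an explicit probe loop, and `rle_line` is a hypothetical name.

```python
from itertools import groupby


def rle_line(line: bytes) -> bytes:
    """Encode one input line (trailing newline already removed) per the brief."""
    # An empty line or any byte outside a..z collapses to the error output.
    if not line or any(b < 0x61 or b > 0x7A for b in line):
        return b"error\n"
    parts = []
    for byte, group in groupby(line):
        run_len = sum(1 for _ in group)  # length of this maximal run
        parts.append(bytes([byte]) + str(run_len).encode("ascii"))
    return b"".join(parts) + b"\n"
```

Keeping the transformation separate from I/O makes the six examples from the prompt easy to check directly against the function.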

Multi-file arc closer

This task closes the multi-file arc that 018-caesar-cipher opened. The Zero implementation lives under zero/ as a proper Zero project:

zero/
  zero.json       # package metadata (cli exe target, linux-musl-x64)
  src/
    main.0        # parses argv, validates, walks runs, renders decimals
    lib.0         # is_lowercase_letter, digit_byte, decimal_length

The driver imports the library with use lib, then calls is_lowercase_letter(b) for validation, decimal_length(n) to size the digit emit, and digit_byte(d) to render each decimal digit. The project is invoked as zero run zero -- <plaintext> — project directory, double-dash, one argv value.

Lib signature constraint (still scalar-only)

Lib signatures stay pure scalars to stay clear of the seventh quirk (Span/MutSpan/shape values at the module boundary are rejected on the direct ELF64 backend):

pub fun is_lowercase_letter(b: u8) -> Bool
pub fun digit_byte(d: u8) -> u8
pub fun decimal_length(n: u32) -> u32

Decimal length uses a hard-coded chain of if n < 10/100/.../10^9 returning 1..10 — covers any u32 with no log calls and no allocator.
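
For readers who want to check the thresholds, here is a minimal Python sketch of the same idea. It folds the unrolled chain into a loop, which is a presentational choice made here and not how lib.0 is written, and it assumes the argument is in the u32 range.

```python
def decimal_length(n: int) -> int:
    """Digit count of n for 0 <= n < 2**32, using thresholds only (no log, no allocation)."""
    threshold = 10
    digits = 1
    # Equivalent to the chain: n < 10 -> 1, n < 100 -> 2, ..., otherwise 10.
    while digits < 10 and n >= threshold:
        threshold *= 10
        digits += 1
    return digits
```

The boundaries that matter are the powers of ten: 9 and 10 must land on different sides, as must 999,999,999 and 1,000,000,000, and the largest u32 value 4,294,967,295 needs all ten digits.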

Tenth quirk — if-expressions in let bindings

A first draft of main.0 ended an inner loop with:

let actual_end = if run_end > t_len { run_end - 1 } else { run_end }

This is rejected with PAR100 "expected ';' after expression". Zero 0.1.2 does not support if-expressions as r-value forms — if is statement-only.

The workaround is the standard scanning-flag pattern: a mut accumulator plus a scanning: Bool sentinel, with state mutated inside the if arms instead of returned from them:

let mut run_len: usize = 1
let mut probe: usize = run_start + 1
let mut scanning: Bool = true
while scanning {
    if probe >= t_len {
        scanning = false
    } else {
        if text_in[probe] == run_byte {
            run_len = run_len + 1
            probe = probe + 1
        } else {
            scanning = false
        }
    }
}

This restriction surfaces under the same PAR100 diagnostic as earlier quirks, but it is tracked on its own line because it is a distinct language restriction with a different workaround. The known tally is now ten.

Cross-implementation parity

All five share the same dispatch:

  1. read input (stdin line 1 or argv[1])
  2. validate non-empty
  3. validate each byte in [a-z]
  4. walk maximal runs: for each, emit byte + decimal run length
  5. emit trailing \n

Byte-exact agreement on every case across zero/ts/rust/go/python.

Zero-specific notes

  • argv[1] carries plaintext (no exposed stdin in Zero 0.1.2).
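  • Output buffer is [16384]u8 = [0_u8; 16384], a generous fixed upper bound. Each run emits at most 1 + 10 bytes (letter + up to 10 decimal digits for u32 max); the worst case is all runs of length 1, which costs 2 output bytes per input byte plus the trailing newline, so the buffer covers inputs of up to 8,191 bytes. Real corpus inputs are far below this.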
  • Decimal digits are emitted left-to-right by dividing by 10^pos for each position from digit_count - 1 down to 0, mod 10. This avoids any inline reversal or scratch buffer.
  • main ends with explicit return to dodge the trailing-write byte-count-as-exit-code codegen quirk from task 012.
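
The left-to-right digit emission described in the notes above can be illustrated in Python as follows. This is a sketch of the technique only: `emit_decimal` is a hypothetical name, and the caller is assumed to pass the digit count that `decimal_length` would return for `n`.

```python
def emit_decimal(n: int, digit_count: int) -> bytes:
    """Render n as ASCII decimal, most significant digit first, without reversing."""
    out = bytearray()
    # Visit positions digit_count - 1 down to 0; position p has weight 10**p.
    for pos in range(digit_count - 1, -1, -1):
        divisor = 10 ** pos
        digit = (n // divisor) % 10
        out.append(48 + digit)  # 48 is ASCII '0', matching digit_byte in lib.0
    return bytes(out)
```

For a run length of 13 with a digit count of 2, the first pass yields (13 // 10) % 10 = 1 and the second yields (13 // 1) % 10 = 3, giving the bytes `13`. The trade-off is that the divisor is recomputed per position, which costs a few extra multiplications but needs no scratch buffer.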

v1.0 corpus closure

This is task 20 of 20. The corpus reaches v1.0 with all twenty tasks shipping byte-exact across all five languages. Tag and release notes land in the v1.0.0 commit.


Cost

| Model | Prompt tokens | Completion tokens | API ms |
|---|---|---|---|
| gpt-4o | 3,542 | 1,474 | 16,267 |
| gpt-4o-mini | 3,542 | 1,423 | 23,874 |
| gpt-5 | 3,537 | 16,132 | 145,997 |

Tokens and API ms are summed across the five languages this model attempted for this task.

