agentlang-index · task easy
Run-length encode a lowercase ASCII string, or error on bad input
019-run-length-encode. Read one line from standard input.
Prompt
This is the natural-language brief given to every model, verbatim. The harness prefixes a language-specific calling-convention block and suffixes a "return only the source code" instruction. Nothing else.
# 019-run-length-encode
Read one line from standard input. It is a non-empty string of
lowercase ASCII letters (bytes `a` through `z`), terminated by a
single `\n`.
Walk the input from left to right grouping consecutive identical
bytes into maximal runs. For each run, write the byte followed by
the decimal run length (no leading zeros, no separator). After all
runs are written, append a single trailing `\n`.
If the input line is empty, or if any byte is outside `a` through
`z` (uppercase, digits, punctuation, whitespace, or non-ASCII), write
the literal string `error\n` instead and exit.
Do not write to standard error. Exit with status 0 in every case.
## Examples
Input (stdin): `aaabbc\n`
Output: `a3b2c1\n`
Input (stdin): `abcdef\n`
Output: `a1b1c1d1e1f1\n` (every run has length one)
Input (stdin): `z\n`
Output: `z1\n` (single byte is a single run)
Input (stdin): `aaaaaaaaaaaaa\n` (thirteen `a`s)
Output: `a13\n` (multi-digit run length)
Input (stdin): `\n` (empty line)
Output: `error\n`
Input (stdin): `Hello\n`
Output: `error\n` (uppercase H rejects)
## Zero input convention
Zero 0.1.2 has no exposed stdin capability. The Zero reference is a
multi-file project under `zero/`: `zero.json` is the package manifest,
`src/main.0` is the driver, and `src/lib.0` exports
`is_lowercase_letter`, `digit_byte`, and `decimal_length`. The driver
reads the plaintext from `argv[1]`; the value is interpreted exactly
as the stdin line for other languages (taken verbatim). Invoked as
`zero run zero -- <plaintext>`.
Acceptance
A task counts as passed only when every public and hidden test case agrees on these fields. No fuzzy matching, no "off by one trailing newline is fine."
| stdout (byte-exact, per case) | true |
|---|---|
| stderr (exact bytes) | "" |
| exit code | 0 |
| wall time max (ms) | 5000 |
| tags | string-transform, module-boundary, multi-file, run-length-encoding |
Results
Each cell is one attempt. Pass means stdout matched byte-exact on every test case, stderr empty, exit zero. Hover a failure to see the captured first line of the diagnostic.
| Model | Zero | TypeScript | Rust | Go | Python |
|---|---|---|---|---|---|
| gpt-4o | compile | wrong output | ✓ | ✓ | ✓ |
| gpt-4o-mini | compile | wrong output | other | wrong output | ✓ |
| gpt-5 | compile | ✓ | ✓ | ✓ | ✓ |
Failure excerpts
7 of 15 attempts failed. Each card is one attempt, with the captured first line of the diagnostic.
-
zero/src/lib.0:28:1 PAR100: unexpected character '`' -
(no diagnostic captured) -
zero/src/lib.0:23:1 PAR100: unexpected character '`' -
(no diagnostic captured) -
(no diagnostic captured) -
(no diagnostic captured) -
zero/src/main.0:1:1 IMP001: unknown package-local import 'world'
Reference implementations
The hand-written reference each language ships with. Every reference passes the same public and hidden test suite under the pinned toolchain before any model touches the task.
Click a language to expand
Zero 158 lines
// src/main.0
// Run-length encoder for AgentLang Index slot 019.
//
// Zero 0.1.2 has no exposed stdin, so plaintext comes from argv[1].
// Lib exports is_lowercase_letter, digit_byte, decimal_length.
//
// Failure modes collapse to "error\n":
// - argv[1] missing
// - plaintext is empty
// - any plaintext byte is outside [a-z]
//
// Output: for each maximal run of consecutive identical bytes, emit
// the byte followed by the run length in decimal. Final newline.
// Examples: "aaabbc" -> "a3b2c1\n", "abcdef" -> "a1b1c1d1e1f1\n",
// "aaaaaaaaaaaaa" -> "a13\n".
use lib
pub fun main(world: World) -> Void raises {
let maybe_text = std.args.get(1)
if maybe_text.has == false {
check world.out.write("error\n")
return
}
let text_in = std.mem.span(maybe_text.value)
let t_len = std.mem.len(text_in)
if t_len == 0 {
check world.out.write("error\n")
return
}
// Validate all bytes in [a-z] first.
let mut text_ok: Bool = true
let mut v_i: usize = 0
while v_i < t_len {
let b = text_in[v_i]
if is_lowercase_letter(b) == false {
text_ok = false
v_i = t_len
} else {
v_i = v_i + 1
}
}
if text_ok == false {
check world.out.write("error\n")
return
}
// Walk maximal runs and emit byte + decimal count.
// Output buffer: each input byte expands to at most 1 + 10 bytes
// (letter + up to 10 decimal digits for u32 max). Plus newline.
let mut out: [16384]u8 = [0_u8; 16384]
let mut out_n: usize = 0
let mut run_start: usize = 0
while run_start < t_len {
let run_byte = text_in[run_start]
let mut run_len: usize = 1
let mut probe: usize = run_start + 1
let mut scanning: Bool = true
while scanning {
if probe >= t_len {
scanning = false
} else {
if text_in[probe] == run_byte {
run_len = run_len + 1
probe = probe + 1
} else {
scanning = false
}
}
}
let run_len_u32: u32 = run_len as u32
let digit_count = decimal_length(run_len_u32)
// Emit the byte.
out[out_n] = run_byte
out_n = out_n + 1
// Emit decimal digits left-to-right by iterating positions.
let mut emit_i: u32 = digit_count
while emit_i > 0_u32 {
let pos = emit_i - 1_u32
let mut divisor: u32 = 1_u32
let mut p: u32 = 0_u32
while p < pos {
divisor = divisor * 10_u32
p = p + 1_u32
}
let d8: u8 = ((run_len_u32 / divisor) % 10_u32) as u8
out[out_n] = digit_byte(d8)
out_n = out_n + 1
emit_i = emit_i - 1_u32
}
run_start = run_start + run_len
}
out[out_n] = 10_u8
out_n = out_n + 1
check world.out.write(out[0..out_n])
return
}
// src/lib.0
// Run-length encoder helpers. Pure scalar exports keeping the
// seventh codegen quirk happy at the module boundary.
pub fun is_lowercase_letter(b: u8) -> Bool {
if b < 97_u8 {
return false
}
if b > 122_u8 {
return false
}
return true
}
pub fun digit_byte(d: u8) -> u8 {
// Caller passes d in 0..=9. Returns the ASCII byte for that digit.
return 48_u8 + d
}
pub fun decimal_length(n: u32) -> u32 {
// Number of decimal digits needed to render n. Always at least 1.
if n < 10_u32 {
return 1_u32
}
if n < 100_u32 {
return 2_u32
}
if n < 1000_u32 {
return 3_u32
}
if n < 10000_u32 {
return 4_u32
}
if n < 100000_u32 {
return 5_u32
}
if n < 1000000_u32 {
return 6_u32
}
if n < 10000000_u32 {
return 7_u32
}
if n < 100000000_u32 {
return 8_u32
}
if n < 1000000000_u32 {
return 9_u32
}
return 10_u32
}
TypeScript 50 lines
function isLowercaseLetter(b: number): boolean {
return b >= 97 && b <= 122;
}
async function readAll(): Promise<string> {
const chunks: Buffer[] = [];
for await (const chunk of process.stdin) {
chunks.push(chunk as Buffer);
}
return Buffer.concat(chunks).toString("utf8");
}
async function main(): Promise<void> {
const data = await readAll();
const lines = data.split("\n");
if (lines.length < 1) {
process.stdout.write("error\n");
return;
}
const text = lines[0];
if (text.length === 0) {
process.stdout.write("error\n");
return;
}
const bytes = Buffer.from(text, "utf8");
for (let i = 0; i < bytes.length; i++) {
if (!isLowercaseLetter(bytes[i])) {
process.stdout.write("error\n");
return;
}
}
const out: string[] = [];
let i = 0;
while (i < bytes.length) {
const runByte = bytes[i];
let runLen = 1;
let j = i + 1;
while (j < bytes.length && bytes[j] === runByte) {
runLen++;
j++;
}
out.push(String.fromCharCode(runByte));
out.push(String(runLen));
i = j;
}
process.stdout.write(out.join("") + "\n");
}
main();
Rust 47 lines
use std::io::{self, Read, Write};
fn is_lowercase_letter(b: u8) -> bool {
(b'a'..=b'z').contains(&b)
}
fn main() {
let mut input = String::new();
if io::stdin().read_to_string(&mut input).is_err() {
let _ = io::stdout().write_all(b"error\n");
return;
}
let mut lines = input.split('\n');
let line_text = lines.next();
let Some(lt) = line_text else {
let _ = io::stdout().write_all(b"error\n");
return;
};
let text_bytes = lt.as_bytes();
if text_bytes.is_empty() {
let _ = io::stdout().write_all(b"error\n");
return;
}
for &b in text_bytes {
if !is_lowercase_letter(b) {
let _ = io::stdout().write_all(b"error\n");
return;
}
}
let mut out: Vec<u8> = Vec::with_capacity(text_bytes.len() * 2 + 1);
let mut i = 0usize;
while i < text_bytes.len() {
let run_byte = text_bytes[i];
let mut run_len = 1usize;
let mut j = i + 1;
while j < text_bytes.len() && text_bytes[j] == run_byte {
run_len += 1;
j += 1;
}
out.push(run_byte);
out.extend_from_slice(run_len.to_string().as_bytes());
i = j;
}
out.push(b'\n');
let _ = io::stdout().write_all(&out);
}
Go 68 lines
package main
import (
"bufio"
"fmt"
"os"
"strconv"
"strings"
)
func isLowercaseLetter(b byte) bool {
return b >= 'a' && b <= 'z'
}
func readAll(r *bufio.Reader) string {
var sb strings.Builder
buf := make([]byte, 4096)
for {
n, err := r.Read(buf)
if n > 0 {
sb.Write(buf[:n])
}
if err != nil {
break
}
}
return sb.String()
}
func main() {
w := bufio.NewWriter(os.Stdout)
defer w.Flush()
r := bufio.NewReader(os.Stdin)
data := readAll(r)
lines := strings.SplitN(data, "\n", 2)
if len(lines) < 1 {
fmt.Fprint(w, "error\n")
return
}
text := lines[0]
if len(text) == 0 {
fmt.Fprint(w, "error\n")
return
}
for i := 0; i < len(text); i++ {
if !isLowercaseLetter(text[i]) {
fmt.Fprint(w, "error\n")
return
}
}
out := make([]byte, 0, len(text)*2+1)
i := 0
for i < len(text) {
runByte := text[i]
runLen := 1
j := i + 1
for j < len(text) && text[j] == runByte {
runLen++
j++
}
out = append(out, runByte)
out = append(out, strconv.Itoa(runLen)...)
i = j
}
out = append(out, '\n')
w.Write(out)
}
Python 42 lines
#!/usr/bin/env python3
import sys
def is_lowercase_letter(b: int) -> bool:
return 97 <= b <= 122
def main() -> None:
data = sys.stdin.read()
lines = data.split("\n")
if len(lines) < 1:
sys.stdout.write("error\n")
return
text = lines[0]
if not text:
sys.stdout.write("error\n")
return
text_bytes = text.encode("utf-8")
for b in text_bytes:
if not is_lowercase_letter(b):
sys.stdout.write("error\n")
return
out = []
i = 0
n = len(text_bytes)
while i < n:
run_byte = text_bytes[i]
run_len = 1
j = i + 1
while j < n and text_bytes[j] == run_byte:
run_len += 1
j += 1
out.append(bytes([run_byte]).decode("ascii"))
out.append(str(run_len))
i = j
sys.stdout.write("".join(out) + "\n")
if __name__ == "__main__":
main()
Design notes
Algorithm, failure modes, cross-language parity, and where Zero needed a workaround. From corpus/019-run-length-encode/notes.md.
Algorithm
Read plaintext (lowercase a-z, non-empty, one line). Walk left to right
grouping consecutive identical bytes into maximal runs. For each run,
emit the byte followed by the decimal run length (no leading zeros,
no separator). Final trailing \n. If the line is empty or contains
any byte outside a..z, write error\n. Process exit is 0 in every
case.
Multi-file arc closer
This task closes the multi-file arc that 018-caesar-cipher opened. The
Zero implementation lives under zero/ as a proper Zero project:
zero/
zero.json # package metadata (cli exe target, linux-musl-x64)
src/
main.0 # parses argv, validates, walks runs, renders decimals
lib.0 # is_lowercase_letter, digit_byte, decimal_length
The driver imports the library with use lib, then calls
is_lowercase_letter(b) for validation, decimal_length(n) to size
the digit emit, and digit_byte(d) to render each decimal digit. The
project is invoked as zero run zero -- <plaintext> — project
directory, double-dash, one argv value.
Lib signature constraint (still scalar-only)
Lib signatures stay pure scalars to stay clear of the seventh quirk
(Span
pub fun is_lowercase_letter(b: u8) -> Bool
pub fun digit_byte(d: u8) -> u8
pub fun decimal_length(n: u32) -> u32
Decimal length uses a hard-coded chain of if n < 10/100/.../10^9
returning 1..10 — covers any u32 with no log calls and no allocator.
Tenth quirk — if-expressions in let bindings
A first draft of main.0 ended an inner loop with:
let actual_end = if run_end > t_len { run_end - 1 } else { run_end }
This is rejected with PAR100 "expected ';' after expression". Zero
0.1.2 does not support if-expressions as r-value forms — if is
statement-only.
The workaround is the standard scanning-flag pattern: a mut
accumulator plus a scanning: Bool sentinel, with state mutated
inside the if arms instead of returned from them:
let mut run_len: usize = 1
let mut probe: usize = run_start + 1
let mut scanning: Bool = true
while scanning {
if probe >= t_len {
scanning = false
} else {
if text_in[probe] == run_byte {
run_len = run_len + 1
probe = probe + 1
} else {
scanning = false
}
}
}
This is a refinement of the existing PAR100 quirk surface but is worth surfacing on its own line because it's a separate language restriction with a different workaround. The known tally is now ten.
Cross-implementation parity
All five share the same dispatch:
- read input (stdin line 1 or argv[1])
- validate non-empty
- validate each byte in [a-z]
- walk maximal runs: for each, emit byte + decimal run length
- emit trailing
\n
Byte-exact agreement on every case across zero/ts/rust/go/python.
Zero-specific notes
- argv[1] carries plaintext (no exposed stdin in Zero 0.1.2).
- Output buffer is
[16384]u8 = [0_u8; 16384]— generous fixed upper bound; each input byte expands to at most 1 + 10 bytes (letter + up to 10 decimal digits for u32 max). Real corpus inputs are far below this. - Decimal digits are emitted left-to-right by dividing by 10^pos
for each position from
digit_count - 1down to 0, mod 10. This avoids any inline reversal or scratch buffer. mainends with explicitreturnto dodge the trailing-write byte-count-as-exit-code codegen quirk from task 012.
v1.0 corpus closure
This is task 20 of 20. The corpus reaches v1.0 with all twenty tasks shipping byte-exact across all five languages. Tag and release notes land in the v1.0.0 commit.
Cost
| Model | Prompt tokens | Completion tokens | API ms |
|---|---|---|---|
| gpt-4o | 3,542 | 1,474 | 16,267 |
| gpt-4o-mini | 3,542 | 1,423 | 23,874 |
| gpt-5 | 3,537 | 16,132 | 145,997 |
Tokens and API ms are summed across the five languages this model attempted for this task.
Compare
Model deep-dives: gpt-4o · gpt-4o-mini · gpt-5 . Back to the leaderboard and methodology.