agentlang-index · task easy
Per-byte frequency table sorted by byte value
010-byte-frequency. Read all bytes from standard input.
Prompt
This is the natural-language brief given to every model, verbatim. The harness prefixes a language-specific calling-convention block and suffixes a "return only the source code" instruction. Nothing else.
# Task 010 — byte-frequency
## Problem
Read all bytes from standard input. Count how often each byte value
(0 through 255) appears. For every byte value with a non-zero count,
write one line of the form `<byte_decimal> <count>\n` to standard
output. The lines must be in ascending byte-value order. Exit with
status 0.
## Output format
- Each line is `<byte_decimal>` (the byte value 0..255 rendered in
decimal, no padding), one ASCII space (0x20), `<count>` (the
occurrence count in decimal, no padding), one newline (0x0A).
- No header, no trailing extra newline beyond the per-record one.
## Edge cases
- Empty input → empty output (no header, no `0\n`, nothing).
- Single byte input → one line.
- Repeated identical bytes → one line with the matching count.
- High-bit bytes (0x80..0xFF) appear with their decimal value
(128..255) just like ASCII bytes.
## Acceptance
The stdout produced by your program must match the expected bytes
exactly for every test case. Stderr must be empty and the process
must exit 0.
## Input convention
- **stdin** (TypeScript, Rust, Go, Python): read all bytes until EOF.
- **argv[1]** (Zero 0.1.2 has no exposed stdin capability): read the
bytes from the first command-line argument.
## Token budget
1600 tokens.
Acceptance
A task counts as passed only when every public and hidden test case agrees on these fields. No fuzzy matching, no "off by one trailing newline is fine."
| stdout (byte-exact, per case) | true |
|---|---|
| stderr (exact bytes) | "" |
| exit code | 0 |
| wall time max (ms) | 5000 |
| tags | io, counting, sorting, multi-record-output |
Results
Each cell is one attempt. Pass means stdout matched byte-exact on every test case, stderr empty, exit zero. Hover a failure to see the captured first line of the diagnostic.
| Model | Zero | TypeScript | Rust | Go | Python |
|---|---|---|---|---|---|
| gpt-4o | compile | ✓ | ✓ | ✓ | ✓ |
| gpt-4o-mini | compile | ✓ | wrong output | ✓ | wrong output |
| gpt-5 | compile | ✓ | ✓ | ✓ | ✓ |
Failure excerpts
5 of 15 attempts failed. Each card is one attempt, with the captured first line of the diagnostic.
-
ref.zero:6:8 PAR100: expected '{' before block -
ref.zero:1:15 PAR100: expected expression -
(no diagnostic captured) -
(no diagnostic captured) -
ref.zero:1:1 IMP001: unknown package-local import 'std'
Reference implementations
The hand-written reference each language ships with. Every reference passes the same public and hidden test suite under the pinned toolchain before any model touches the task.
Click a language to expand
Zero 83 lines
// Per-byte frequency table, Zero 0.1.2 direct backend.
//
// argv[1] is the input bytes (Zero 0.1.2 has no exposed stdin capability).
// All logic stays inside `pub fun main` to avoid Span/MutSpan parameter
// restrictions.
//
// One pass to populate a [256]u32 counts table, then walk byte values
// 0..255 in ascending order and emit `<byte_decimal> <count>\n` per
// non-zero entry. Decimal renderer uses small-divisor u32 divmod
// (avoids the 2^32-divisor narrowing-cast SIGFPE) and lives inline in
// both rendering sites because Zero 0.1.2 does not support Span/MutSpan
// parameters and the buffer-write style is tightest as inline blocks.
pub fun main(world: World) -> Void raises {
let line_opt = std.args.get(1)
let mut bytes: Span<u8> = std.mem.span("")
if line_opt.has {
bytes = std.mem.span(line_opt.value)
}
let n: usize = std.mem.len(bytes)
let mut counts: [256]u32 = [0_u32; 256]
let mut i: usize = 0
while i < n {
let idx: usize = bytes[i] as usize
counts[idx] = counts[idx] + 1_u32
i = i + 1
}
let mut out_buf: [4096]u8 = [0_u8; 4096]
let mut written: usize = 0
let mut bi: usize = 0
while bi < 256 {
let c: u32 = counts[bi]
if c > 0_u32 {
let b_val: u32 = bi as u32
if b_val == 0_u32 {
out_buf[written] = 48_u8
written = written + 1
} else {
let mut tmp: [4]u8 = [0_u8; 4]
let mut tn: usize = 0
let mut v: u32 = b_val
while v > 0_u32 {
tmp[tn] = (48_u32 + (v % 10_u32)) as u8
tn = tn + 1
v = v / 10_u32
}
let mut j: usize = 0
while j < tn {
out_buf[written] = tmp[tn - 1 - j]
written = written + 1
j = j + 1
}
}
out_buf[written] = 32_u8
written = written + 1
let mut tmp2: [16]u8 = [0_u8; 16]
let mut tn2: usize = 0
let mut v2: u32 = c
while v2 > 0_u32 {
tmp2[tn2] = (48_u32 + (v2 % 10_u32)) as u8
tn2 = tn2 + 1
v2 = v2 / 10_u32
}
let mut j2: usize = 0
while j2 < tn2 {
out_buf[written] = tmp2[tn2 - 1 - j2]
written = written + 1
j2 = j2 + 1
}
out_buf[written] = 10_u8
written = written + 1
}
bi = bi + 1
}
let out: Span<u8> = out_buf[0..written]
check world.out.write(out)
return
}
TypeScript 22 lines
// Per-byte frequency table, TypeScript reference.
import { readFileSync } from "node:fs";
function main(): void {
const data = readFileSync(0);
const counts = new Uint32Array(256);
for (let i = 0; i < data.length; i++) {
counts[data[i]]++;
}
const out: string[] = [];
for (let b = 0; b < 256; b++) {
const c = counts[b];
if (c > 0) {
out.push(`${b} ${c}\n`);
}
}
process.stdout.write(out.join(""));
}
main();
Rust 20 lines
// Per-byte frequency table, Rust reference.
use std::io::{self, Read, Write};
fn main() {
let mut data = Vec::new();
io::stdin().read_to_end(&mut data).expect("read stdin");
let mut counts = [0u32; 256];
for &b in &data {
counts[b as usize] += 1;
}
let stdout = io::stdout();
let mut stdout = stdout.lock();
for (b, &c) in counts.iter().enumerate() {
if c > 0 {
writeln!(stdout, "{} {}", b, c).expect("write stdout");
}
}
}
Go 29 lines
// Per-byte frequency table, Go reference.
package main
import (
"bufio"
"fmt"
"io"
"os"
)
func main() {
data, err := io.ReadAll(os.Stdin)
if err != nil {
os.Exit(1)
}
var counts [256]uint32
for _, b := range data {
counts[b]++
}
w := bufio.NewWriter(os.Stdout)
defer w.Flush()
for b := 0; b < 256; b++ {
if counts[b] > 0 {
fmt.Fprintf(w, "%d %d\n", b, counts[b])
}
}
}
Python 22 lines
#!/usr/bin/env python3
"""Per-byte frequency table, Python reference."""
import sys
def main() -> None:
data = sys.stdin.buffer.read()
counts = [0] * 256
for b in data:
counts[b] += 1
out_parts = []
for b in range(256):
c = counts[b]
if c > 0:
out_parts.append(f"{b} {c}\n")
sys.stdout.write("".join(out_parts))
if __name__ == "__main__":
main()
Design notes
Algorithm, failure modes, cross-language parity, and where Zero needed a workaround. From corpus/010-byte-frequency/notes.md.
Algorithm
Two phases:
- Count. Walk every input byte, incrementing
counts[byte]in a length-256 table.countsisu32everywhere; the spec caps input at 100000 bytes, so the max single-byte count is 100000 (well inside u32 range). - Emit. Walk byte values 0..255 in ascending order; for every
entry with a non-zero count, emit
<byte_decimal> <count>\n.
The byte value itself is rendered in decimal (range 0..255 → 1..3 digits), so a 4-byte temp buffer is sufficient. The count renderer uses a 16-byte temp buffer (max u32 = 4294967295 = 10 digits).
Edge cases
- Empty input → no bytes counted → emit nothing.
- Single byte → one line.
- All identical bytes → one line.
- High-bit bytes (0x80..0xFF, decimal 128..255) appear with their decimal values; no special handling.
Zero-specific notes
- argv[1] is the input bytes; an absent argv falls back to an empty span. Embedded tabs, newlines, and high-bit bytes in argv[1] pass through unchanged.
- Decimal rendering is inlined twice (once for the byte value with a
4-byte temp, once for the count with a 16-byte temp) because Zero
0.1.2 does not support
Span<u8>/MutSpan<u8>as function parameters; the function-extraction shape is unavailable, so the buffer-write code lives inline. - Output buffer sized at
[4096]u8: the worst case is all 256 byte values present, each line"<3 digits> <10 digits>\n"= 15 bytes, total 3840 bytes. The spec's 100000-byte input cap caps the max count at 6 digits (100000), so realistic worst case is 11 bytes per line × 256 = 2816, well under 4096. - Uses the small-divisor u32 divmod renderer (
% 10_u32// 10_u32) which dodges the 2^32-divisor narrowing-cast SIGFPE.
Cross-implementation parity
All five references walk the input byte-by-byte and emit one record
per non-zero byte value in ascending byte order. TypeScript uses
Uint32Array(256) for the counts table; Rust uses [0u32; 256]; Go
uses [256]uint32; Python uses a 256-element list. The output format
is the same across all of them: <byte_decimal> <count>\n per
record, no header, no trailing extra newline.
Cost
| Model | Prompt tokens | Completion tokens | API ms |
|---|---|---|---|
| gpt-4o | 2,875 | 872 | 10,021 |
| gpt-4o-mini | 2,875 | 612 | 11,239 |
| gpt-5 | 2,870 | 12,779 | 128,318 |
Tokens and API ms are summed across the five languages this model attempted for this task.
Compare
Model deep-dives: gpt-4o · gpt-4o-mini · gpt-5 . Back to the leaderboard and methodology.