agentlang-index · task easy
Prime count via Sieve of Eratosthenes
002-sieve-prime-count. Read a single non-negative integer `N` from standard input (`0 <= N <= 10000`).
Prompt
This is the natural-language brief given to every model, verbatim. The harness prefixes a language-specific calling-convention block and suffixes a "return only the source code" instruction. Nothing else.
## Task: Prime count via Sieve of Eratosthenes
Read a single non-negative integer `N` from standard input (`0 <= N <= 10000`).
Compute the count of prime numbers in the closed interval `[2, N]` using a
**Sieve of Eratosthenes** (a brute-force `is_prime(k)` per candidate is
considered out of band; the task is about the sieve algorithm). Write the
count to standard output followed by a single newline. Exit with status 0.
Edge cases:
- `N = 0` and `N = 1` both yield `0` (no primes in the interval).
- `N = 2` yields `1` (just `2`).
## Acceptance
- Stdin: one line containing the integer `N`.
- Stdout: the decimal digits of the count, followed by `\n`.
- Stderr: empty.
- Exit code: 0.
- The program must complete `N = 10000` within 5 seconds.
## Examples
| N | stdout |
| ----- | --------- |
| 0 | `0\n` |
| 2 | `1\n` |
| 10 | `4\n` |
| 100 | `25\n` |
| 1000 | `168\n` |
| 10000 | `1229\n` |
## Language scaffold
{language_scaffold}
Acceptance
A task counts as passed only when every public and hidden test case agrees on these fields. No fuzzy matching, no "off by one trailing newline is fine."
| stdout (byte-exact, per case) | true |
|---|---|
| stderr (exact bytes) | "" |
| exit code | 0 |
| wall time max (ms) | 5000 |
| requires_sieve | true |
| tags | computation, sieve, arrays |
Results
Each cell is one attempt. Pass means stdout matched byte-exact on every test case, stderr empty, exit zero. Hover a failure to see the captured first line of the diagnostic.
| Model | Zero | TypeScript | Rust | Go | Python |
|---|---|---|---|---|---|
| gpt-4o | compile | ✓ | ✓ | ✓ | ✓ |
| gpt-4o-mini | compile | ✓ | other | ✓ | ✓ |
| gpt-5 | compile | ✓ | ✓ | ✓ | ✓ |
Failure excerpts
4 of 15 attempts failed. Each card is one attempt, with the captured first line of the diagnostic.
-
ref.zero:1:1 IMP001: unknown package-local import 'std' -
ref.zero:4:1 PAR100: expected '{' before block -
(no diagnostic captured) -
ref.zero:1:1 IMP001: unknown package-local import 'std'
Reference implementations
The hand-written reference each language ships with. Every reference passes the same public and hidden test suite under the pinned toolchain before any model touches the task.
Click a language to expand
Zero 92 lines
// Prime count via Sieve of Eratosthenes, Zero 0.1.2 direct backend.
//
// Same input convention as task 001: N is read from argv[1] because
// Zero 0.1.2 has no exposed stdin capability. All logic stays inside
// `pub fun main` to avoid the Span/MutSpan-as-parameter restriction.
//
// The composite-flag table is fixed-size [10001]u8 so the spec cap
// of N <= 10000 is enforced at the type level. Element type is u8
// to stay inside the direct backend's allowed local-array element
// types (u8, u32, i32).
pub fun main(world: World) -> Void raises {
let arg = std.args.get(1)
if arg.has == false {
check world.err.write("usage: pass N as argv[1] (Zero 0.1.2 has no stdin capability)\n")
return
}
let arg_bytes: Span<u8> = std.mem.span(arg.value)
let arg_len: usize = std.mem.len(arg_bytes)
if arg_len == 0 {
check world.err.write("N must be a non-negative integer\n")
return
}
let mut parse_i: usize = 0
let mut n: u32 = 0_u32
while parse_i < arg_len {
let ch: u8 = arg_bytes[parse_i]
if ch < 48_u8 || ch > 57_u8 {
check world.err.write("N must be a non-negative integer\n")
return
}
n = n * 10_u32 + ((ch - 48_u8) as u32)
parse_i = parse_i + 1
}
if n > 10000_u32 {
check world.err.write("N must be <= 10000\n")
return
}
let mut count: u32 = 0_u32
if n >= 2_u32 {
let mut composite: [10001]u8 = [0_u8; 10001]
let mut i: u32 = 2_u32
while i * i <= n {
if composite[i as usize] == 0_u8 {
let mut j: u32 = i * i
while j <= n {
composite[j as usize] = 1_u8
j = j + i
}
}
i = i + 1_u32
}
let mut k: u32 = 2_u32
while k <= n {
if composite[k as usize] == 0_u8 {
count = count + 1_u32
}
k = k + 1_u32
}
}
let mut buf: [16]u8 = [0_u8; 16]
let mut written: usize = 0
if count == 0_u32 {
buf[0] = 48_u8
buf[1] = 10_u8
written = 2
} else {
let mut digits: [16]u8 = [0_u8; 16]
let mut rem: u32 = count
let mut digit_count: usize = 0
while rem > 0_u32 {
let dq: u32 = rem / 10_u32
let dr: u32 = rem % 10_u32
digits[digit_count] = (dr as u8) + 48_u8
rem = dq
digit_count = digit_count + 1
}
let mut o: usize = 0
while o < digit_count {
buf[o] = digits[digit_count - 1 - o]
o = o + 1
}
buf[digit_count] = 10_u8
written = digit_count + 1
}
let out: Span<u8> = buf[0..written]
check world.out.write(out)
return
}
TypeScript 29 lines
// Prime count via Sieve of Eratosthenes, TypeScript reference.
// Reads N from stdin, runs a byte-flag sieve, counts unmarked [2, N].
import { readFileSync } from "node:fs";
const input = readFileSync(0, "utf8").trim();
const n = Number.parseInt(input, 10);
if (!Number.isInteger(n) || n < 0) {
process.stderr.write("N must be a non-negative integer\n");
process.exit(1);
}
if (n < 2) {
process.stdout.write("0\n");
} else {
const composite = new Uint8Array(n + 1);
for (let i = 2; i * i <= n; i++) {
if (composite[i] === 0) {
for (let j = i * i; j <= n; j += i) {
composite[j] = 1;
}
}
}
let count = 0;
for (let k = 2; k <= n; k++) {
if (composite[k] === 0) count++;
}
process.stdout.write(`${count}\n`);
}
Rust 40 lines
// Prime count via Sieve of Eratosthenes, Rust reference.
// Reads N from stdin, runs a byte-flag sieve, counts unmarked [2, N].
use std::io::{self, Read, Write};
use std::process::ExitCode;
fn main() -> ExitCode {
let mut input = String::new();
if io::stdin().read_to_string(&mut input).is_err() {
let _ = writeln!(io::stderr(), "failed to read stdin");
return ExitCode::from(1);
}
let n: i64 = match input.trim().parse() {
Ok(v) if v >= 0 => v,
_ => {
let _ = writeln!(io::stderr(), "N must be a non-negative integer");
return ExitCode::from(1);
}
};
let n_usize = n as usize;
let count = if n < 2 {
0
} else {
let mut composite = vec![0u8; n_usize + 1];
let mut i: usize = 2;
while i * i <= n_usize {
if composite[i] == 0 {
let mut j = i * i;
while j <= n_usize {
composite[j] = 1;
j += i;
}
}
i += 1;
}
(2..=n_usize).filter(|&k| composite[k] == 0).count()
};
let _ = writeln!(io::stdout(), "{}", count);
ExitCode::SUCCESS
}
Go 41 lines
// Prime count via Sieve of Eratosthenes, Go reference.
// Reads N from stdin, runs a byte-flag sieve, counts unmarked [2, N].
package main
import (
"bufio"
"fmt"
"os"
"strconv"
"strings"
)
func main() {
reader := bufio.NewReader(os.Stdin)
line, _ := reader.ReadString('\n')
n, err := strconv.Atoi(strings.TrimSpace(line))
if err != nil || n < 0 {
fmt.Fprintln(os.Stderr, "N must be a non-negative integer")
os.Exit(1)
}
if n < 2 {
fmt.Print("0\n")
return
}
composite := make([]byte, n+1)
for i := 2; i*i <= n; i++ {
if composite[i] == 0 {
for j := i * i; j <= n; j += i {
composite[j] = 1
}
}
}
count := 0
for k := 2; k <= n; k++ {
if composite[k] == 0 {
count++
}
}
fmt.Print(count, "\n")
}
Python 33 lines
"""Prime count via Sieve of Eratosthenes, Python reference.
Reads N from stdin, runs a standard byte-flag sieve over [0, N], counts
unmarked indices in [2, N]. N is bounded by 10000 by the task spec but
nothing here would change for a larger N apart from the allocation.
"""
import sys
def main() -> None:
n = int(sys.stdin.readline().strip())
if n < 0:
sys.stderr.write("N must be a non-negative integer\n")
sys.exit(1)
if n < 2:
sys.stdout.write("0\n")
return
composite = bytearray(n + 1)
i = 2
while i * i <= n:
if composite[i] == 0:
j = i * i
while j <= n:
composite[j] = 1
j += i
i += 1
count = sum(1 for k in range(2, n + 1) if composite[k] == 0)
sys.stdout.write(f"{count}\n")
if __name__ == "__main__":
main()
Design notes
Algorithm, failure modes, cross-language parity, and where Zero needed a workaround. From corpus/002-sieve-prime-count/notes.md.
Why this task
Second pure-computation task. Differs from 001-fibonacci-memoized in shape: there is no recursion to memoize and no per-call subproblem to cache; instead the solution exercises array marking and a nested loop with the inner bound depending on the outer index. The model must (a) allocate a flag buffer of size N+1, (b) iterate the right range for the outer pointer, (c) stop the inner mark loop at the right cap, (d) iterate again to count, all without off-by-ones.
Difficulty: easy. The sieve is canonical algorithm material, but it
catches models that confuse the count's upper bound (<= N vs < N)
or that overflow the inner-loop start (i * i rather than 2 * i is
the optimization the spec rewards).
The acceptance criteria reduce to a byte-exact stdout match per test case, same as 001. No subjective scoring.
Edge cases probed
N = 0andN = 1both yield0. A model that emits0\nor\nonly onN < 2rather than always must still handle both.N = 2yields1. Catches models that initializei = 3thinking they can hand-fold the only-even-prime case.N = 10000yields1229. Ais_prime(k)trial-division fallback would still pass on this corpus (O(N * sqrt(N))=O(10^6)for the largest case), so the spec asks for a sieve specifically andacceptance.requires_sievelifts that into a metadata flag that the harness will check against (we'll add a structural check on the model's output later — for v0.1 we trust the prompt).
Zero implementation notes
Same input convention as 001: N is read from argv[1] because Zero
0.1.2 has no exposed stdin capability. All logic stays inside
pub fun main. The flag buffer is [10001]u8 — a stack-allocated
fixed-size array of u8, which is one of the direct backend's
allowed element types. The size cap of 10001 doubles as the spec
upper bound (N <= 10000), so the program returns an error for
N > 10000 rather than risking an out-of-bounds write.
The decimal renderer reuses the pattern from 001 (digits buffer,
walk the value, reverse into the output buffer) but with count: u32
as the source value, so the divmod stays on small u32 divisors and
does not trip the 2^32-divisor SIGFPE documented in
~/repos/zero-bugs/u64-divmod-cast-sigfpe/.
Future revisions
- Property test: a structural check that the model's output uses a
sieve, not trial division — possible candidates are time-budget
shrinking (set
wall_time_max_mssuch that trial division would blow it on largeN) or AST inspection of the generated code. Defer to v0.2. - Higher
N. The sieve allocation isO(N); today's[10001]u8cap is easy to lift to[100001]u8once we confirm the direct backend handles ~100KB stack-local arrays without surprises.
Cost
| Model | Prompt tokens | Completion tokens | API ms |
|---|---|---|---|
| gpt-4o | 2,805 | 1,166 | 10,735 |
| gpt-4o-mini | 2,805 | 932 | 46,297 |
| gpt-5 | 2,800 | 11,245 | 151,004 |
Tokens and API ms are summed across the five languages this model attempted for this task.
Compare
Model deep-dives: gpt-4o · gpt-4o-mini · gpt-5 . Back to the leaderboard and methodology.