Prime count via Sieve of Eratosthenes

002-sieve-prime-count. Read a single non-negative integer `N` from standard input (`0 <= N <= 10000`).

Prompt

This is the natural-language brief given to every model, verbatim. The harness prefixes a language-specific calling-convention block and suffixes a "return only the source code" instruction. Nothing else.

## Task: Prime count via Sieve of Eratosthenes

Read a single non-negative integer `N` from standard input (`0 <= N <= 10000`).
Compute the count of prime numbers in the closed interval `[2, N]` using a
**Sieve of Eratosthenes** (a brute-force `is_prime(k)` per candidate is
considered out of band; the task is about the sieve algorithm). Write the
count to standard output followed by a single newline. Exit with status 0.

Edge cases:

- `N = 0` and `N = 1` both yield `0` (no primes in the interval).
- `N = 2` yields `1` (just `2`).

## Acceptance

- Stdin: one line containing the integer `N`.
- Stdout: the decimal digits of the count, followed by `\n`.
- Stderr: empty.
- Exit code: 0.
- The program must complete `N = 10000` within 5 seconds.

## Examples

| N     | stdout    |
| ----- | --------- |
| 0     | `0\n`     |
| 2     | `1\n`     |
| 10    | `4\n`     |
| 100   | `25\n`    |
| 1000  | `168\n`   |
| 10000 | `1229\n`  |

Acceptance

A task counts as passed only when every public and hidden test case agrees on these fields. No fuzzy matching, no "off by one trailing newline is fine."

stdout (byte-exact, per case)	true
stderr (exact bytes)	""
exit code	0
wall time max (ms)	5000
requires_sieve	true
tags	computation, sieve, arrays

Results

Each cell is one attempt. Pass means stdout matched byte-exact on every test case, stderr empty, exit zero. Hover a failure to see the captured first line of the diagnostic.

Model	Zero	TypeScript	Rust	Go	Python
gpt-4o	compile	✓	✓	✓	✓
gpt-4o-mini	compile	✓	other	✓	✓
gpt-5	compile	✓	✓	✓	✓
opus	compile	✓	✓	✓	✓
sonnet	wrong output	✓	✓	✓	✓

Failure excerpts

6 of 25 attempts failed. Each card is one attempt, with the captured first line of the diagnostic.

gpt-4o Zero compile

ref.zero:1:1 IMP001: unknown package-local import 'std'

gpt-4o-mini Zero compile

ref.zero:4:1 PAR100: expected '{' before block

gpt-4o-mini Rust other
```
(no diagnostic captured)
```

gpt-5 Zero compile

ref.zero:1:1 IMP001: unknown package-local import 'std'

opus Zero compile

ref.zero:72:16 PAR100: unexpected character '@'

sonnet Zero wrong output
```
(no diagnostic captured)
```

Reference implementations

The hand-written reference each language ships with. Every reference passes the same public and hidden test suite under the pinned toolchain before any model touches the task.

Click a language to expand

Zero n/a

No reference implementation in this language.

TypeScript 29 lines

// Prime count via Sieve of Eratosthenes, TypeScript reference.
// Reads N from stdin, runs a byte-flag sieve, counts unmarked [2, N].
import { readFileSync } from "node:fs";

const input = readFileSync(0, "utf8").trim();
const n = Number.parseInt(input, 10);
if (!Number.isInteger(n) || n < 0) {
  process.stderr.write("N must be a non-negative integer\n");
  process.exit(1);
}

if (n < 2) {
  process.stdout.write("0\n");
} else {
  const composite = new Uint8Array(n + 1);
  for (let i = 2; i * i <= n; i++) {
    if (composite[i] === 0) {
      for (let j = i * i; j <= n; j += i) {
        composite[j] = 1;
      }
    }
  }
  let count = 0;
  for (let k = 2; k <= n; k++) {
    if (composite[k] === 0) count++;
  }
  process.stdout.write(`${count}\n`);
}

Rust 40 lines

// Prime count via Sieve of Eratosthenes, Rust reference.
// Reads N from stdin, runs a byte-flag sieve, counts unmarked [2, N].
use std::io::{self, Read, Write};
use std::process::ExitCode;

fn main() -> ExitCode {
    let mut input = String::new();
    if io::stdin().read_to_string(&mut input).is_err() {
        let _ = writeln!(io::stderr(), "failed to read stdin");
        return ExitCode::from(1);
    }
    let n: i64 = match input.trim().parse() {
        Ok(v) if v >= 0 => v,
        _ => {
            let _ = writeln!(io::stderr(), "N must be a non-negative integer");
            return ExitCode::from(1);
        }
    };
    let n_usize = n as usize;
    let count = if n < 2 {
        0
    } else {
        let mut composite = vec![0u8; n_usize + 1];
        let mut i: usize = 2;
        while i * i <= n_usize {
            if composite[i] == 0 {
                let mut j = i * i;
                while j <= n_usize {
                    composite[j] = 1;
                    j += i;
                }
            }
            i += 1;
        }
        (2..=n_usize).filter(|&k| composite[k] == 0).count()
    };
    let _ = writeln!(io::stdout(), "{}", count);
    ExitCode::SUCCESS
}

Go 41 lines

// Prime count via Sieve of Eratosthenes, Go reference.
// Reads N from stdin, runs a byte-flag sieve, counts unmarked [2, N].
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

func main() {
	reader := bufio.NewReader(os.Stdin)
	line, _ := reader.ReadString('\n')
	n, err := strconv.Atoi(strings.TrimSpace(line))
	if err != nil || n < 0 {
		fmt.Fprintln(os.Stderr, "N must be a non-negative integer")
		os.Exit(1)
	}
	if n < 2 {
		fmt.Print("0\n")
		return
	}
	composite := make([]byte, n+1)
	for i := 2; i*i <= n; i++ {
		if composite[i] == 0 {
			for j := i * i; j <= n; j += i {
				composite[j] = 1
			}
		}
	}
	count := 0
	for k := 2; k <= n; k++ {
		if composite[k] == 0 {
			count++
		}
	}
	fmt.Print(count, "\n")
}

Python 33 lines

"""Prime count via Sieve of Eratosthenes, Python reference.

Reads N from stdin, runs a standard byte-flag sieve over [0, N], counts
unmarked indices in [2, N]. N is bounded by 10000 by the task spec but
nothing here would change for a larger N apart from the allocation.
"""
import sys


def main() -> None:
    n = int(sys.stdin.readline().strip())
    if n < 0:
        sys.stderr.write("N must be a non-negative integer\n")
        sys.exit(1)
    if n < 2:
        sys.stdout.write("0\n")
        return
    composite = bytearray(n + 1)
    i = 2
    while i * i <= n:
        if composite[i] == 0:
            j = i * i
            while j <= n:
                composite[j] = 1
                j += i
        i += 1
    count = sum(1 for k in range(2, n + 1) if composite[k] == 0)
    sys.stdout.write(f"{count}\n")


if __name__ == "__main__":
    main()

Design notes

Algorithm, failure modes, cross-language parity, and where Zero needed a workaround. From corpus/002-sieve-prime-count/notes.md.

Why this task

Second pure-computation task. Differs from 001-fibonacci-memoized in shape: there is no recursion to memoize and no per-call subproblem to cache; instead the solution exercises array marking and a nested loop with the inner bound depending on the outer index. The model must (a) allocate a flag buffer of size N+1, (b) iterate the right range for the outer pointer, (c) stop the inner mark loop at the right cap, (d) iterate again to count, all without off-by-ones.

Difficulty: easy. The sieve is canonical algorithm material, but it catches models that confuse the count's upper bound (<= N vs < N) or that overflow the inner-loop start (i * i rather than 2 * i is the optimization the spec rewards).

The acceptance criteria reduce to a byte-exact stdout match per test case, same as 001. No subjective scoring.

Edge cases probed

N = 0 and N = 1 both yield 0. A model that emits 0\n or \n only on N < 2 rather than always must still handle both.
N = 2 yields 1. Catches models that initialize i = 3 thinking they can hand-fold the only-even-prime case.
N = 10000 yields 1229. A is_prime(k) trial-division fallback would still pass on this corpus (O(N * sqrt(N)) = O(10^6) for the largest case), so the spec asks for a sieve specifically and acceptance.requires_sieve lifts that into a metadata flag that the harness will check against (we'll add a structural check on the model's output later — for v0.1 we trust the prompt).

Zero implementation notes

Same input convention as 001: N is read from argv[1] because Zero 0.1.2 has no exposed stdin capability. All logic stays inside pub fun main. The flag buffer is [10001]u8 — a stack-allocated fixed-size array of u8, which is one of the direct backend's allowed element types. The size cap of 10001 doubles as the spec upper bound (N <= 10000), so the program returns an error for N > 10000 rather than risking an out-of-bounds write.

The decimal renderer reuses the pattern from 001 (digits buffer, walk the value, reverse into the output buffer) but with count: u32 as the source value, so the divmod stays on small u32 divisors and does not trip the 2^32-divisor SIGFPE documented in ~/repos/zero-bugs/u64-divmod-cast-sigfpe/.

Future revisions

Property test: a structural check that the model's output uses a sieve, not trial division — possible candidates are time-budget shrinking (set wall_time_max_ms such that trial division would blow it on large N) or AST inspection of the generated code. Defer to v0.2.
Higher N. The sieve allocation is O(N); today's [10001]u8 cap is easy to lift to [100001]u8 once we confirm the direct backend handles ~100KB stack-local arrays without surprises.

Cost

Model	Prompt tokens	Completion tokens	API ms
gpt-4o	2,805	1,166	10,735
gpt-4o-mini	2,805	932	46,297
gpt-5	2,800	11,245	151,004
opus	14	1,606	74,931
sonnet	12	960	71,752

Tokens and API ms are summed across the five languages this model attempted for this task.

Compare

Model deep-dives: gpt-4o · gpt-4o-mini · gpt-5 · opus · sonnet . Back to the leaderboard and methodology.

Truffle · 002-sieve-prime-count · run captured 2026-06-27