agentlang-index · task easy
Non-overlapping substring count
006-substring-count. Read two lines from standard input:
Prompt
This is the natural-language brief given to every model, verbatim. The harness prefixes a language-specific calling-convention block and suffixes a "return only the source code" instruction. Nothing else.
## Task: Non-overlapping substring count
Read two lines from standard input:
1. A pattern `P` (1 to 100 printable-ASCII characters).
2. A text `T` (0 to 1000 printable-ASCII characters).
Count the number of **non-overlapping** occurrences of `P` inside `T`, using
a greedy left-to-right scan: once a match starts at index `i`, the next
candidate match starts at `i + len(P)`. Write the decimal count followed by
a newline. Exit with status 0.
## Acceptance
- Stdin: `P\nT\n`. Both lines printable-ASCII; `P` length in `[1, 100]`,
`T` length in `[0, 1000]`.
- Stdout: the count as a decimal integer (no leading zeros, no sign) followed
by exactly one `\n`.
- Stderr: empty.
- Exit code: 0.
## Examples
| pattern (P) | text (T) | output |
| ----------- | ------------------- | ------ |
| `ab` | `abababab` | `4` |
| `aa` | `aaaa` | `2` |
| `xyz` | `aaa` | `0` |
| `a` | `bbbbb` | `0` |
| `na` | `banana` | `2` |
| `hello` | `hello` | `1` |
| `x` | (empty) | `0` |
In `aa` over `aaaa`, the greedy non-overlapping rule yields 2 matches at
positions 0 and 2, not 3 matches. In `na` over `banana`, the matches are at
positions 2 and 4 (no overlap).
## Language scaffold
{language_scaffold}
Acceptance
A task counts as passed only when every public and hidden test case agrees on these fields. No fuzzy matching, no "off by one trailing newline is fine."
| stdout (byte-exact, per case) | true |
|---|---|
| stderr (exact bytes) | "" |
| exit code | 0 |
| wall time max (ms) | 5000 |
| tags | parsing, strings, search |
Results
Each cell is one attempt. Pass means stdout matched byte-exact on every test case, stderr empty, exit zero. Hover a failure to see the captured first line of the diagnostic.
| Model | Zero | TypeScript | Rust | Go | Python |
|---|---|---|---|---|---|
| gpt-4o | compile | ✓ | ✓ | ✓ | ✓ |
| gpt-4o-mini | compile | ✓ | ✓ | ✓ | ✓ |
| gpt-5 | compile | ✓ | ✓ | ✓ | ✓ |
Failure excerpts
3 of 15 attempts failed. Each card is one attempt, with the captured first line of the diagnostic.
-
ref.zero:1:1 IMP001: unknown package-local import 'std' -
ref.zero:1:1 PAR100: expected '{' before block -
ref.zero:1:1 IMP001: unknown package-local import 'std'
Reference implementations
The hand-written reference each language ships with. Every reference passes the same public and hidden test suite under the pinned toolchain before any model touches the task.
Click a language to expand
Zero 78 lines
// Non-overlapping substring count, Zero 0.1.2 direct backend.
//
// Same input convention as 001-005: arguments come from argv because
// Zero 0.1.2 has no exposed stdin capability. argv[1] is the pattern P
// and argv[2] is the text T. All logic stays inside `pub fun main`
// to avoid Span/MutSpan parameter restrictions.
//
// Greedy non-overlapping scan: once a match starts at index i, the
// next candidate starts at i + m. The count is rendered with the
// small-divisor u32 divmod renderer (avoids the 2^32-divisor SIGFPE).
pub fun main(world: World) -> Void raises {
let p_opt = std.args.get(1)
let t_opt = std.args.get(2)
let mut p_bytes: Span<u8> = std.mem.span("")
let mut t_bytes: Span<u8> = std.mem.span("")
if p_opt.has {
p_bytes = std.mem.span(p_opt.value)
}
if t_opt.has {
t_bytes = std.mem.span(t_opt.value)
}
let m: usize = std.mem.len(p_bytes)
let n: usize = std.mem.len(t_bytes)
let mut count: u32 = 0_u32
if m > 0 && m <= n {
let mut i: usize = 0
while i + m <= n {
let mut k: usize = 0
let mut equal: Bool = true
while k < m {
if t_bytes[i + k] != p_bytes[k] {
equal = false
k = m
} else {
k = k + 1
}
}
if equal {
count = count + 1_u32
i = i + m
} else {
i = i + 1
}
}
}
let mut out_buf: [16]u8 = [0_u8; 16]
let mut written: usize = 0
if count == 0_u32 {
out_buf[0] = 48_u8
written = 1
} else {
let mut digits: [16]u8 = [0_u8; 16]
let mut rem: u32 = count
let mut dc: usize = 0
while rem > 0_u32 {
let dq: u32 = rem / 10_u32
let dr: u32 = rem % 10_u32
digits[dc] = (dr as u8) + 48_u8
rem = dq
dc = dc + 1
}
let mut o: usize = 0
while o < dc {
out_buf[written] = digits[dc - 1 - o]
written = written + 1
o = o + 1
}
}
out_buf[written] = 10_u8
written = written + 1
let out: Span<u8> = out_buf[0..written]
check world.out.write(out)
return
}
TypeScript 29 lines
// Non-overlapping substring count, TypeScript reference.
// Reads pattern P from line 1 of stdin and text T from line 2.
import { readFileSync } from "node:fs";
function main(): void {
const raw = readFileSync(0, "utf8");
const lines = raw.split("\n");
const p = lines[0] ?? "";
const t = lines[1] ?? "";
let count = 0;
if (p.length > 0) {
let i = 0;
const n = t.length;
const m = p.length;
while (i + m <= n) {
if (t.substring(i, i + m) === p) {
count++;
i += m;
} else {
i++;
}
}
}
process.stdout.write(`${count}\n`);
}
main();
Rust 34 lines
// Non-overlapping substring count, Rust reference.
// Reads pattern P from line 1 of stdin and text T from line 2.
use std::io::{self, Read, Write};
fn main() {
let mut input = String::new();
io::stdin().read_to_string(&mut input).unwrap();
let mut lines = input.split('\n');
let p = lines.next().unwrap_or("");
let t = lines.next().unwrap_or("");
let mut count: u64 = 0;
if !p.is_empty() {
let pb = p.as_bytes();
let tb = t.as_bytes();
let m = pb.len();
let n = tb.len();
if m <= n {
let mut i = 0;
while i + m <= n {
if &tb[i..i + m] == pb {
count += 1;
i += m;
} else {
i += 1;
}
}
}
}
let out = io::stdout();
let mut h = out.lock();
writeln!(h, "{}", count).unwrap();
}
Go 40 lines
// Non-overlapping substring count, Go reference.
// Reads pattern P from line 1 of stdin and text T from line 2.
package main
import (
"bufio"
"fmt"
"io"
"os"
"strings"
)
func main() {
raw, _ := io.ReadAll(bufio.NewReader(os.Stdin))
parts := strings.SplitN(string(raw), "\n", 3)
var p, t string
if len(parts) >= 1 {
p = parts[0]
}
if len(parts) >= 2 {
t = parts[1]
}
count := 0
if len(p) > 0 {
m := len(p)
n := len(t)
i := 0
for i+m <= n {
if t[i:i+m] == p {
count++
i += m
} else {
i++
}
}
}
fmt.Printf("%d\n", count)
}
Python 29 lines
"""Non-overlapping substring count, Python reference.
Reads pattern P from line 1 of stdin and text T from line 2. Counts greedy
non-overlapping occurrences and writes the decimal count followed by \\n.
"""
import sys
def main() -> None:
raw = sys.stdin.read().split("\n", 2)
p = raw[0] if len(raw) >= 1 else ""
t = raw[1] if len(raw) >= 2 else ""
count = 0
if p:
i = 0
n = len(t)
m = len(p)
while i + m <= n:
if t[i:i + m] == p:
count += 1
i += m
else:
i += 1
sys.stdout.write(f"{count}\n")
if __name__ == "__main__":
main()
Design notes
Algorithm, failure modes, cross-language parity, and where Zero needed a workaround. From corpus/006-substring-count/notes.md.
Algorithm
Greedy left-to-right scan. For each position i in 0..=n-m, test
T[i..i+m] == P. On match, count++ and i += m. On miss, i += 1.
Empty T or m > n short-circuits to 0.
Why greedy non-overlap (not overlapping count)
The two reasonable count semantics are:
- Greedy non-overlapping (what we use):
aainaaaa= 2. - Overlapping:
aainaaaa= 3.
We chose the greedy non-overlapping shape because (a) it is the more common
default across str.count implementations (Python "aaaa".count("aa") = 2,
Rust str::matches does non-overlapping when used naively), and (b) it is
the deterministic shape that a model can derive without ambiguity from the
spec wording "once a match starts at index i, the next candidate match
starts at i + len(P)."
Models that conflate the two semantics will fail case 2 (the canonical
trap) and case 1 (which would yield 5, not 4, under the overlapping rule
applied to ab over abababab).
Zero-specific notes
- argv[1] = P, argv[2] = T. Both defaulted to empty
Span<u8>when the caller omits them (we branch on.has). - Inner equality check is a manual byte-loop (no
==onSpan<u8>in Zero 0.1.2 direct backend). Early-exit on mismatch by setting the loop index tom(Zero has nobreak). - The count fits in
u32(max isn/mandn <= 1000). The decimal renderer reuses the small-divisor pattern from tasks 001-004.
Cross-implementation parity
All five references produce byte-exact decimal-count followed by \n on
every case in both stdin (TS/Rust/Go/Python) and argv (Zero) input modes.
Cost
| Model | Prompt tokens | Completion tokens | API ms |
|---|---|---|---|
| gpt-4o | 3,275 | 870 | 8,086 |
| gpt-4o-mini | 3,275 | 679 | 15,692 |
| gpt-5 | 3,270 | 14,299 | 151,663 |
Tokens and API ms are summed across the five languages this model attempted for this task.
Compare
Model deep-dives: gpt-4o · gpt-4o-mini · gpt-5 . Back to the leaderboard and methodology.