agentlang-index · task easy

Non-overlapping substring count

006-substring-count. Read two lines from standard input:

Prompt

This is the natural-language brief given to every model, verbatim. The harness prefixes a language-specific calling-convention block and suffixes a "return only the source code" instruction. Nothing else.

## Task: Non-overlapping substring count

Read two lines from standard input:

1. A pattern `P` (1 to 100 printable-ASCII characters).
2. A text `T` (0 to 1000 printable-ASCII characters).

Count the number of **non-overlapping** occurrences of `P` inside `T`, using
a greedy left-to-right scan: once a match starts at index `i`, the next
candidate match starts at `i + len(P)`. Write the decimal count followed by
a newline. Exit with status 0.

## Acceptance

- Stdin: `P\nT\n`. Both lines printable-ASCII; `P` length in `[1, 100]`,
  `T` length in `[0, 1000]`.
- Stdout: the count as a decimal integer (no leading zeros, no sign) followed
  by exactly one `\n`.
- Stderr: empty.
- Exit code: 0.

## Examples

| pattern (P) | text (T)            | output |
| ----------- | ------------------- | ------ |
| `ab`        | `abababab`          | `4`    |
| `aa`        | `aaaa`              | `2`    |
| `xyz`       | `aaa`               | `0`    |
| `a`         | `bbbbb`             | `0`    |
| `na`        | `banana`            | `2`    |
| `hello`     | `hello`             | `1`    |
| `x`         | (empty)             | `0`    |

In `aa` over `aaaa`, the greedy non-overlapping rule yields 2 matches at
positions 0 and 2, not 3 matches. In `na` over `banana`, the matches are at
positions 2 and 4 (no overlap).

## Language scaffold

{language_scaffold}

Acceptance

A task counts as passed only when every public and hidden test case agrees on these fields. No fuzzy matching, no "off by one trailing newline is fine."

stdout (byte-exact, per case) true
stderr (exact bytes) ""
exit code 0
wall time max (ms) 5000
tags parsing, strings, search

Results

Each cell is one attempt. Pass means stdout matched byte-exact on every test case, stderr empty, exit zero. Hover a failure to see the captured first line of the diagnostic.

Model ZeroTypeScriptRustGoPython
gpt-4o compile
gpt-4o-mini compile
gpt-5 compile

Failure excerpts

3 of 15 attempts failed. Each card is one attempt, with the captured first line of the diagnostic.

  1. gpt-4o Zero compile
    ref.zero:1:1 IMP001: unknown package-local import 'std'
  2. gpt-4o-mini Zero compile
    ref.zero:1:1 PAR100: expected '{' before block
  3. gpt-5 Zero compile
    ref.zero:1:1 IMP001: unknown package-local import 'std'

Reference implementations

The hand-written reference each language ships with. Every reference passes the same public and hidden test suite under the pinned toolchain before any model touches the task.

Click a language to expand

Zero 78 lines
// Non-overlapping substring count, Zero 0.1.2 direct backend.
//
// Same input convention as 001-005: arguments come from argv because
// Zero 0.1.2 has no exposed stdin capability. argv[1] is the pattern P
// and argv[2] is the text T. All logic stays inside `pub fun main`
// to avoid Span/MutSpan parameter restrictions.
//
// Greedy non-overlapping scan: once a match starts at index i, the
// next candidate starts at i + m. The count is rendered with the
// small-divisor u32 divmod renderer (avoids the 2^32-divisor SIGFPE).
pub fun main(world: World) -> Void raises {
    let p_opt = std.args.get(1)
    let t_opt = std.args.get(2)
    let mut p_bytes: Span<u8> = std.mem.span("")
    let mut t_bytes: Span<u8> = std.mem.span("")
    if p_opt.has {
        p_bytes = std.mem.span(p_opt.value)
    }
    if t_opt.has {
        t_bytes = std.mem.span(t_opt.value)
    }
    let m: usize = std.mem.len(p_bytes)
    let n: usize = std.mem.len(t_bytes)

    let mut count: u32 = 0_u32
    if m > 0 && m <= n {
        let mut i: usize = 0
        while i + m <= n {
            let mut k: usize = 0
            let mut equal: Bool = true
            while k < m {
                if t_bytes[i + k] != p_bytes[k] {
                    equal = false
                    k = m
                } else {
                    k = k + 1
                }
            }
            if equal {
                count = count + 1_u32
                i = i + m
            } else {
                i = i + 1
            }
        }
    }

    let mut out_buf: [16]u8 = [0_u8; 16]
    let mut written: usize = 0
    if count == 0_u32 {
        out_buf[0] = 48_u8
        written = 1
    } else {
        let mut digits: [16]u8 = [0_u8; 16]
        let mut rem: u32 = count
        let mut dc: usize = 0
        while rem > 0_u32 {
            let dq: u32 = rem / 10_u32
            let dr: u32 = rem % 10_u32
            digits[dc] = (dr as u8) + 48_u8
            rem = dq
            dc = dc + 1
        }
        let mut o: usize = 0
        while o < dc {
            out_buf[written] = digits[dc - 1 - o]
            written = written + 1
            o = o + 1
        }
    }
    out_buf[written] = 10_u8
    written = written + 1

    let out: Span<u8> = out_buf[0..written]
    check world.out.write(out)
    return
}
TypeScript 29 lines
// Non-overlapping substring count, TypeScript reference.
// Reads pattern P from line 1 of stdin and text T from line 2.

import { readFileSync } from "node:fs";

function main(): void {
  const raw = readFileSync(0, "utf8");
  const lines = raw.split("\n");
  const p = lines[0] ?? "";
  const t = lines[1] ?? "";
  let count = 0;
  if (p.length > 0) {
    let i = 0;
    const n = t.length;
    const m = p.length;
    while (i + m <= n) {
      if (t.substring(i, i + m) === p) {
        count++;
        i += m;
      } else {
        i++;
      }
    }
  }
  process.stdout.write(`${count}\n`);
}

main();
Rust 34 lines
// Non-overlapping substring count, Rust reference.
// Reads pattern P from line 1 of stdin and text T from line 2.

use std::io::{self, Read, Write};

fn main() {
    let mut input = String::new();
    io::stdin().read_to_string(&mut input).unwrap();
    let mut lines = input.split('\n');
    let p = lines.next().unwrap_or("");
    let t = lines.next().unwrap_or("");
    let mut count: u64 = 0;
    if !p.is_empty() {
        let pb = p.as_bytes();
        let tb = t.as_bytes();
        let m = pb.len();
        let n = tb.len();
        if m <= n {
            let mut i = 0;
            while i + m <= n {
                if &tb[i..i + m] == pb {
                    count += 1;
                    i += m;
                } else {
                    i += 1;
                }
            }
        }
    }
    let out = io::stdout();
    let mut h = out.lock();
    writeln!(h, "{}", count).unwrap();
}
Go 40 lines
// Non-overlapping substring count, Go reference.
// Reads pattern P from line 1 of stdin and text T from line 2.

package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"strings"
)

func main() {
	raw, _ := io.ReadAll(bufio.NewReader(os.Stdin))
	parts := strings.SplitN(string(raw), "\n", 3)
	var p, t string
	if len(parts) >= 1 {
		p = parts[0]
	}
	if len(parts) >= 2 {
		t = parts[1]
	}
	count := 0
	if len(p) > 0 {
		m := len(p)
		n := len(t)
		i := 0
		for i+m <= n {
			if t[i:i+m] == p {
				count++
				i += m
			} else {
				i++
			}
		}
	}
	fmt.Printf("%d\n", count)
}
Python 29 lines
"""Non-overlapping substring count, Python reference.

Reads pattern P from line 1 of stdin and text T from line 2. Counts greedy
non-overlapping occurrences and writes the decimal count followed by \\n.
"""
import sys


def main() -> None:
    raw = sys.stdin.read().split("\n", 2)
    p = raw[0] if len(raw) >= 1 else ""
    t = raw[1] if len(raw) >= 2 else ""
    count = 0
    if p:
        i = 0
        n = len(t)
        m = len(p)
        while i + m <= n:
            if t[i:i + m] == p:
                count += 1
                i += m
            else:
                i += 1
    sys.stdout.write(f"{count}\n")


if __name__ == "__main__":
    main()

Design notes

Algorithm, failure modes, cross-language parity, and where Zero needed a workaround. From corpus/006-substring-count/notes.md.

Algorithm

Greedy left-to-right scan. For each position i in 0..=n-m, test T[i..i+m] == P. On match, count++ and i += m. On miss, i += 1. Empty T or m > n short-circuits to 0.

Why greedy non-overlap (not overlapping count)

The two reasonable count semantics are:

  1. Greedy non-overlapping (what we use): aa in aaaa = 2.
  2. Overlapping: aa in aaaa = 3.

We chose the greedy non-overlapping shape because (a) it is the more common default across str.count implementations (Python "aaaa".count("aa") = 2, Rust str::matches does non-overlapping when used naively), and (b) it is the deterministic shape that a model can derive without ambiguity from the spec wording "once a match starts at index i, the next candidate match starts at i + len(P)."

Models that conflate the two semantics will fail case 2 (the canonical trap) and case 1 (which would yield 5, not 4, under the overlapping rule applied to ab over abababab).

Zero-specific notes

  • argv[1] = P, argv[2] = T. Both defaulted to empty Span<u8> when the caller omits them (we branch on .has).
  • Inner equality check is a manual byte-loop (no == on Span<u8> in Zero 0.1.2 direct backend). Early-exit on mismatch by setting the loop index to m (Zero has no break).
  • The count fits in u32 (max is n/m and n <= 1000). The decimal renderer reuses the small-divisor pattern from tasks 001-004.

Cross-implementation parity

All five references produce byte-exact decimal-count followed by \n on every case in both stdin (TS/Rust/Go/Python) and argv (Zero) input modes.


Cost

Model Prompt tokens Completion tokens API ms
gpt-4o 3,275 870 8,086
gpt-4o-mini 3,275 679 15,692
gpt-5 3,270 14,299 151,663

Tokens and API ms are summed across the five languages this model attempted for this task.


Compare

Model deep-dives: gpt-4o · gpt-4o-mini · gpt-5 . Back to the leaderboard and methodology.