` block; `data-module` is two-digit and matches the filename's module number; exactly three `

` children with ids q1/q2/q3; each carries a non-empty `data-pq-answer` in canonical form (`value === value.trim().toLowerCase()`); if `data-pq-answer-alts` is present, every alt is non-empty/canonical, no alt duplicates the canonical answer, and no two alts are identical; each q-div ships a hint surface (`data-pq-hint` attribute or `

` child) plus an `` and a `.pq-status`; placeholder slots like `{{PT_Q1_ANSWER}}` fail loudly when shipped in a non-template file. Skips `_template/`, `*-smoke.*`, `module-9N-*`, and files with the D54 `` first-line marker.

### Property-based testing convention (sub-seed isolation)

Every Python code lab seeds its module-level RNG with `random.seed(0)` for reproducibility; every JS code lab declares a `mulberry32(0)` general-purpose stream. But the **property-based** block at the end of each lab (introduced in D29) must NOT consume from that shared stream. Reason: if a student adds a scratch test between the seed line and the property block, the 200 trials desynchronize between students. The "deterministic" property test becomes "deterministic conditional on the rest of the test set being unchanged," which is half-deterministic.

**Convention:** every property-based block opens its own dedicated stream seeded with a small per-block integer (`1` for the first property block in a lab, `2` for the second, etc.). Concretely:

```python
# Python
_pt_rng = random.Random(1)  # dedicated, isolated from module random
for _ in range(200):  # literal trial count, per D59
    u = [_pt_rng.uniform(...) for _ in range(d)]
    ...
```

```javascript
// JavaScript
const propertyRand = mulberry32(1);  // dedicated stream
for (let trial = 0; trial < 200; trial++) {  // literal trial count, per D59
  const u = Array.from({length: d}, () => propertyRand() * ... );
  ...
}
```

The general-purpose `random` / `rand` stays available for any non-property test that genuinely *wants* shared state. This separates "tests that pin specific inputs" from "tests that assert a property over a sweep" at the seed level.

**Trial-count rule (D59, 2026-05-13):** every property-based sweep must run at least **100 trials**, with **200 preferred** (the M1 prototype's calibration).
The trial count must appear as a *literal integer at the loop site* (`for _ in range(200):` in Python, `for (let trial = 0; trial < 200; trial++)` in JS), not as a named constant referenced from elsewhere in the snippet. The reason: a property test that runs 10 trials proves nothing about the property; statistical significance for catching Cauchy–Schwarz-class violations (rare, deterministic) sits comfortably above 100. A named constant hides the trial count at code-review time, the same readability failure mode the sub-seed rule fixes for seed values. Modules that go above 200 are fine; the convention is a *floor*, not a target.

**Implementation status (D56 + D60, 2026-05-13):** Mechanized by `scripts/check-property-seed.mjs` (wired into `npm run check` as the 9th lint). Per-runnable-snippet checks across every `` python and js block:

- (A) every `random.seed(N)` literal must be `N == 0`;
- (B) every `random.Random(N)` literal must be an integer `>= 1`;
- (C) every `const rand = mulberry32(N)` must be `N == 0`;
- (D) every `const propertyRand[\w]* = mulberry32(N)` must be an integer `>= 1`;
- **(E, D60) the first statement-level `for <var> in range(M):` after `random.Random(...)` must use an integer literal `M >= 100`**;
- **(F, D60) the first `for (let <var> = 0; <var> < M; <var>++)` after `const propertyRand[\w]* = mulberry32(...)` must use an integer literal `M >= 100`**.

Non-literal seeds (e.g. `random.Random(seed_var)`) and non-literal trial counts (`range(N)` where `N` is a name binding) fail loudly. The convention is about *visible* seed isolation and *visible* trial count, so a variable would hide the value at code-review time. T2/T3/T4 snippets (C, Rust, JAX, CUDA, Mojo, Triton) are skipped because the runtime never executes them in-browser. The `_template/` skip is **not** applied here (the template is the canonical seed-isolation example and must satisfy its own contract), but `LINT-SKIP-FIXTURE`, `SKIP_FILE_RE`, and `SKIP_DIRS` are honoured per the shared `_lint-utils.mjs` convention.
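
To make checks (A), (B), and (E) concrete, here is a minimal Python sketch of how such a lint could inspect one Python snippet. It is not the code in `scripts/check-property-seed.mjs`; the regexes and the names `literal_int` and `check_python_snippet` are assumptions made for illustration, and the JS-side checks (C), (D), and (F) are left out.

```python
import re

# Hypothetical patterns for this sketch; the real lint may match differently.
SEED_RE = re.compile(r"random\.seed\(([^)]*)\)")
STREAM_RE = re.compile(r"random\.Random\(([^)]*)\)")
# Statement-level loop: `for` must be the first token on its line.
LOOP_RE = re.compile(r"^[ \t]*for\s+\w+\s+in\s+range\(([^)]*)\):", re.MULTILINE)


def literal_int(text):
    """Return the value of a bare non-negative integer literal, else None."""
    text = text.strip()
    return int(text) if re.fullmatch(r"\d+", text) else None


def check_python_snippet(src):
    """Collect violations of checks (A), (B), and (E) in one Python snippet."""
    errors = []
    for m in SEED_RE.finditer(src):
        if literal_int(m.group(1)) != 0:
            errors.append(f"(A) random.seed({m.group(1)}): seed must be the literal 0")
    for m in STREAM_RE.finditer(src):
        seed = literal_int(m.group(1))
        if seed is None or seed < 1:
            errors.append(f"(B) random.Random({m.group(1)}): seed must be an integer literal >= 1")
            continue
        loop = LOOP_RE.search(src, m.end())  # first statement-level loop after the stream
        if loop is not None:
            trials = literal_int(loop.group(1))
            if trials is None or trials < 100:
                errors.append(f"(E) range({loop.group(1)}): trial count must be an integer literal >= 100")
    return errors


good = (
    "import random\n"
    "random.seed(0)\n"
    "_pt_rng = random.Random(1)\n"
    "for _ in range(200):\n"
    "    u = [_pt_rng.uniform(-1.0, 1.0) for _ in range(3)]\n"
    "    assert all(-1.0 <= x <= 1.0 for x in u)\n"
)
bad = good.replace("range(200)", "range(N)")  # named trial count

print(check_python_snippet(good))
print(check_python_snippet(bad))
```

The inner `range(3)` in the list comprehension is ignored because the loop pattern only matches a `for` that starts its own line, which is one plausible reading of "statement-level" in check (E).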
### Forge Gate input normalization (D10)

Forge Gate inputs grade as `sha256(canonical(input)) === expected_hash` where `canonical(s) = s.trim().toLowerCase().replace(/\s+/g, '')`. Whitespace-strip is what lets `"8,9"`, `"8, 9"`, and `" 8 , 9 "` all hash identically while `"9,8"` and `"(8,9)"` do not.

**Authoring rules:**

- Publish the SHA-256 of the *smallest canonical form*: a single integer, comma-separated integers without spaces, a single lowercase word, or a single named operator.
- Display the canonical format adjacent to the input field: a label like "Comma-separated integers, no spaces" prevents a format-error from reading as a math-error.
- Treat any non-alphanumeric character beyond `,` `.` `-` as a publication risk; parentheses, braces, equals signs, and unicode dashes are not stripped.
- Never share the Forge Gate hash with the Boss Fight answer (D11/D30): the Boss Fight is the harder applied check, the Forge Gate is the sanity check.

**Implementation status (rigor D48, closed by coder run 6 + coder E31 lint, 2026-05-12):** `assets/gate.js` (the shared `` element) and the inline `sha256()` in `module-01-vectors.html` both delegate through `canonical(s)` per the spec above. M1's published hash for `"8,9"` happened to already be in canonical form, so no re-mint was required; future modules should mint against `sha256(canonical(answer))` explicitly. `scripts/check-gate-canonical.mjs` (coder E31, wired into `npm run check`) now mechanizes the rigor invariant: it extracts every `canonical(s)` body in the repo and asserts each produces identical output to the spec across a 16-input matrix (8 whitespace/case variants of `"8,9"` plus 8 structurally-distinct controls). Authors who add a new gate should still mint with the one-liner `python3 -c "import hashlib, re; print(hashlib.sha256(re.sub(r'\s+', '', 'YOUR_ANSWER'.strip().lower()).encode()).hexdigest())"` (which strips every whitespace character, matching `/\s+/g`, not just spaces) and let the lint guard against drift in the runtime implementations.
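
The normalization contract can be checked by hand with a short Python sketch. This is an illustrative port of the `canonical(s)` spec, not the code in `assets/gate.js`; the helper name `gate_hash` is an assumption, and Python's `str.lower()` is assumed to agree with JavaScript's `toLowerCase()` for the ASCII answers the authoring rules recommend.

```python
import hashlib
import re


def canonical(s: str) -> str:
    """Python rendering of the D10 spec: trim, lowercase, drop all whitespace."""
    return re.sub(r"\s+", "", s.strip().lower())


def gate_hash(answer: str) -> str:
    """SHA-256 hex digest of the canonical form (hypothetical helper name)."""
    return hashlib.sha256(canonical(answer).encode("utf-8")).hexdigest()


expected = gate_hash("8,9")

# Whitespace variants collapse to the same canonical form and hash.
for variant in ["8,9", "8, 9", " 8 , 9 "]:
    assert gate_hash(variant) == expected

# Reordered or parenthesized answers are structurally different and do not match.
for control in ["9,8", "(8,9)"]:
    assert gate_hash(control) != expected

print(canonical(" 8 , 9 "))  # prints 8,9
```

Minting against `gate_hash(answer)` rather than a raw `sha256(answer)` means an author who types the answer with stray spaces still publishes the hash of the canonical form.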
### Code lab claim coverage (D42)

Every code lab ships **exactly six tests** (run-3 E20 contract): four behavioural (parallel · orthogonal · zero-vector · dimension-mismatch), one numerical-stability (D5/D32), and one property-based sweep (D29/D41). Each behavioural test should map to a prove-list claim where the math has one: for example, M1's "parallel" test reinforces `dotproof`+`cosbounds`, "orthogonal" reinforces `zero`, and the "200-trial sweep" reinforces `cs`. Stability and dim-mismatch tests are framework-level (no claim).

**Authoring rule:** the JS and Python lab snippets must reinforce the *same* claims. Per-language label drift (a JS test asserting `cs` while the Python equivalent asserts `cosbounds`) silently breaks cross-language tutor calibration: a student passing the lab in JS but failing in Python should re-encounter the same claim, not a different one. `scripts/lint-labs.mjs` (coder E23) is the natural enforcement point when shipped.

**Claim-sentinel convention (D57, 2026-05-13):** the claim-to-test mapping above is enforced at code-review time via in-snippet sentinel comments. Python snippets author `# claim: <slug>` on a line of its own immediately above the assert that reinforces that claim; JS snippets use `// claim: <slug>`. Multiple claims for the same test stack as adjacent sentinel lines (M1's parallel test reinforces both `dotproof` and `cosbounds`, so two `# claim:` lines precede the assert). Framework-level tests (numerical stability, dim-mismatch, zero-vec convention) carry **no** sentinel because they reinforce no prove-list claim.

The cross-language invariant: the *set* of slugs declared in the Python snippet must equal the set declared in the JS snippet of the same `` block. Set equality, not list equality; order is irrelevant. Each declared slug must be present in the §11 reserved table OR module-prefixed (`m{NN}-…`), the same naming rule the cards/prove-list family already enforces via D47.
The convention is **opt-in**: a snippet with zero `# claim:` / `// claim:` lines passes the lint trivially. This keeps runtime-smoke labs (`tools/smoke.html`), which test the runtime rather than a math claim, outside the contract, while every module-lab snippet that authors even one sentinel inherits the cross-language coupling automatically.

**Implementation status (D58, 2026-05-13):** Mechanized by `scripts/check-claim-tags.mjs` (wired into `npm run check` as the **10th lint**). Per `` block, the lint:

- extracts every Python `# claim: ` sentinel and every JS `// claim: ` sentinel from the runnable-tier snippets (T0 JS, T1 Python);
- if either snippet declares ≥1 sentinel, asserts set equality across the pair;
- validates every declared slug against the reserved table ∪ module-prefix regex (shared with `check-cards.mjs` D47).

Snippets with zero sentinels are opt-out (silently skipped). Auto-discovery and skip semantics mirror `lint-labs.mjs` exactly (same `walk` + `findLabBlocks` shape, same `_lint-utils.mjs` imports).

**Drift-detection footnote (coder E40, 2026-05-13):** The D57 sentinel regex now lives in *two* places: the build-time `extractClaims()` in `scripts/check-claim-tags.mjs` and the run-time `_parseClaims()` in `assets/runtime.js` (the tutor handoff path). `scripts/check-claim-regex.mjs` (coder E40, wired into `npm run check` as the **11th lint**) mechanizes the invariant that both regexes stay byte-identical and behavior-identical: it asserts the canonical regex source literal appears verbatim in both files, then compiles each pair and runs them on a 10-row test matrix asserting per-input agreement on captured slug arrays. This mirrors the `check-gate-canonical.mjs` (E31) pattern. It closes the failure mode where a refactor changes one regex but not the other, so that a student sees a sentinel sail through the build but get silently dropped by the runtime tutor handoff, or vice versa.
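
The set-equality and opt-in rules above can be sketched in a few lines of Python. The canonical sentinel regex is not reproduced in this document, so the two patterns below are assumptions, as is the function name `claim_sets_agree`; slug validation against the reserved table is omitted.

```python
import re

# Hypothetical sentinel patterns: a comment line holding exactly one slug.
PY_CLAIM = re.compile(r"^[ \t]*#[ \t]*claim:[ \t]*(\S+)[ \t]*$", re.MULTILINE)
JS_CLAIM = re.compile(r"^[ \t]*//[ \t]*claim:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def claim_sets_agree(py_src: str, js_src: str) -> bool:
    """True when both snippets declare the same set of claim slugs.

    Zero sentinels on both sides is the opt-out case and passes trivially.
    """
    py_claims = set(PY_CLAIM.findall(py_src))
    js_claims = set(JS_CLAIM.findall(js_src))
    if not py_claims and not js_claims:
        return True
    return py_claims == js_claims


py_lab = (
    "# claim: dotproof\n"
    "# claim: cosbounds\n"
    "assert abs(cos_sim([1, 0], [2, 0]) - 1.0) < 1e-12\n"
)
js_lab = (
    "// claim: cosbounds\n"
    "// claim: dotproof\n"
    "console.assert(Math.abs(cosSim([1, 0], [2, 0]) - 1) < 1e-12);\n"
)
js_drifted = js_lab.replace("cosbounds", "cs")

print(claim_sets_agree(py_lab, js_lab))      # same slugs, different order
print(claim_sets_agree(py_lab, js_drifted))  # per-language label drift
```

Comparing sets rather than lists is what lets the two snippets order their tests differently while still reinforcing the same claims.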
### Code lab minimum language coverage (E12)

Every `` MUST declare **at least `python` AND `js`** in its `langs="…"` attribute and ship the corresponding `` and `` children. These are the two runnable tiers (T1 Python via Pyodide, T0 JS native), the only languages whose snippets execute in-browser, write `forge.state.labs[labId]`, and surface to the AI Tutor's lab-failure handoff (E15/E42). Without both, a student on either preferred language hits a tab whose Run button degrades to "open external sandbox" and the lab-state pill never resolves to green.

**Rationale:**

- The hub stat row "Labs passed: N / M attempted" (E18) only counts runnable-tier passes; a JAX-only lab silently subtracts from the visible-progress denominator without offering any path to add to the numerator.
- The claim-sentinel cross-language coupling (D57) requires the same set of `# claim:` slugs in the Python snippet to appear as `// claim:` slugs in the JS snippet. A lab with only one runnable tier cannot satisfy or violate the contract; the rigor signal goes silent rather than green or red.
- The "slide between languages — your code in each tab is saved separately" pedagogy in the template's lab intro implies at least two tabs to slide between. One tab is a single tab.

**Authoring escape hatch (`lang-coverage-exempt`).** A lab may opt out by setting `lang-coverage-exempt="<reason>"` on the `` element, where `<reason>` is a one-line string explaining why the minimum does not apply (e.g., `"tier-IV demo: JAX-only intro to vmap"` for a future M14-style lab whose whole point is a single language's idiom). The presence of the attribute with a non-empty value disables check (E) for that block and is logged by the lint as a per-block exemption. The reason string is mandatory: silent opt-out is not allowed, because the next reader of the file deserves to know why this lab broke the floor.
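
As a rough illustration of the minimum-coverage rule and its escape hatch, the following Python sketch evaluates one lab's attributes. It is not the code in `scripts/lint-labs.mjs`: the function name `check_lang_coverage` is hypothetical, and the `langs` value is assumed to be a comma- or whitespace-separated list.

```python
import re

MINIMUM_LANGS = frozenset({"python", "js"})


def check_lang_coverage(langs_attr, exempt_reason=None):
    """Return (ok, message) for one lab block.

    langs_attr:    raw value of the langs attribute (separator is an assumption).
    exempt_reason: raw value of lang-coverage-exempt, or None when absent.
    """
    # A non-empty reason disables the check; an empty one does not.
    if exempt_reason is not None and exempt_reason.strip():
        return True, f"lang-coverage-exempt: {exempt_reason.strip()}"
    declared = {tok.lower() for tok in re.split(r"[,\s]+", langs_attr.strip()) if tok}
    missing = sorted(MINIMUM_LANGS - declared)
    if missing:
        return False, "missing minimum coverage of " + ", ".join(missing)
    return True, "ok"


print(check_lang_coverage("python, js, rust"))
print(check_lang_coverage("JAX"))
print(check_lang_coverage("jax", exempt_reason="tier-IV demo: JAX-only intro to vmap"))
print(check_lang_coverage("jax", exempt_reason=""))  # silent opt-out is rejected
```

Treating an empty reason the same as a missing attribute is what enforces the "reason string is mandatory" rule in this sketch.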
**Implementation status (E12, 2026-05-13):** Mechanized by `scripts/lint-labs.mjs` check (E), wired into `npm run check` via the existing 8th-lint slot (no new lint script; the rule extends an already-present `` author-contract walker). Per-block: every lang in `MINIMUM_LANGS = {python, js}` must appear in the parsed `langs="…"` set after lower-casing and trimming; missing entries fail the lint with a clear "missing minimum coverage of …" message. The `lang-coverage-exempt="…"` attribute (non-empty value) disables this check for the block and is reported as `lint-labs: : lab "" lang-coverage-exempt: `. Today every authored lab in the repo (`_template/module.html`'s 8-lang prototype, `tools/smoke.html`'s 2-lang runtime smoke fixture) satisfies the contract, so the rule landed green on first run; its purpose is to catch the *next* author who omits one of the two from a M2-M16 lab.

### Boss Fight design (D11/D30)

The Boss Fight (Section 06 in the template) is a *strictly harder* applied check than the Forge Gate (Section 07), never the same problem twice. M1's prototype originally re-used the Forge Gate puzzle as the Boss Fight, which let students who solved one auto-pass the other with no second moment of insight. Don't repeat M1's original sin.

**Authoring rules:**

- The Boss Fight should require *strictly more* of the same machinery than the Gate. If the Gate is a $2\times 2$ matrix-vector product, the Boss Fight is a $3\times 3$ *or* a $2\times 2$ with a follow-up step (residual, normalization, projection, second-derived-scalar).
- The Boss Fight may have multiple right answers (e.g., a vector AND a scalar derived from the vector). The Gate has exactly one canonical-form answer per the D10 normalization spec.
- The Boss Fight is *not gated*: there is no SHA-256 check. It is a self-paced applied test the student grades against the rendered solution. The Gate is the hash-checked unlock.
- If the Gate is a "compute X" puzzle, the Boss Fight should be a "compute X *and explain why*" puzzle. The explanation is what makes it strictly harder, even when the arithmetic floor is the same.
- The inline note in Section 06 of `_template/module.html` documents this in author-comments; this §11 sub-section is the project-level contract.

**Implementation status (D50, 2026-05-13):** Mechanized by `scripts/check-boss-vs-gate.mjs` (wired into `npm run check`). For each `module-NN-*.html` at the repo root, the lint extracts the Boss Fight prompt (either a `06 Boss Fight` block or a `` containing `BOSS FIGHT…`) and the Forge Gate prompt (a `` attribute, a §07 forge-section, or a `FORGE GATE` callout). Both are normalized (HTML stripped, whitespace collapsed, lowercased) and compared. Lint fails if either anchor is missing or the normalized prompts are identical. Skips `_template/`, `*-smoke.*`, and `module-9N-*`. Modules that author both sections distinctly inherit the guarantee automatically.

### Forge Gate answer design (D22)

A Forge Gate password should be derivable from the *concept* the module teaches, not from the student's ability to multiply. The M9 prototype's "vocab × dim = 38597376" tests arithmetic, not understanding: a student who plugged into a calculator passes; a student who deeply understands embedding sizes but typo'd the multiplication fails. Both outcomes are wrong signals for the tutor.

**Authoring rules:**

- Prefer answers that fall out of a 1-3 line *derivation* the student must write down. Examples: the cosine of the angle between two named vectors; the gradient of softmax-cross-entropy at a labelled point; the variance ratio that justifies $1/\sqrt{d_k}$.
- Avoid answers that are pure *numeric outputs* of long arithmetic chains (vocab × dim, batch × seq × hidden, parameter counts beyond ~5 digits). If the only failure mode is "I miscounted zeros," the gate is not testing rigor; it is testing calculator hygiene.
- When the answer *must* be numeric (because the module's whole point is the number, e.g., M3 SVD singular values), constrain the input to 2-3 significant figures and document the rounding rule next to the input field. This collapses the calculator-hygiene failure mode.
- Cross-reference D11/D30: the Boss Fight is where the multi-step arithmetic belongs. The Gate is the conceptual sanity check.
- Audit targets: M3, M6, M9, M10, M12, M14. When these modules land, every gate answer should be re-evaluated against this rule before the hash is published.
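
For the numeric-answer rule above, one possible rounding convention is "three significant figures, written in fixed-point notation". The Python sketch below shows how an author might derive the string to hash under that assumed convention; `to_sig_figs` is a hypothetical helper, and the choice of three figures and fixed-point output is an illustration, not a project requirement.

```python
import math
from decimal import Decimal


def to_sig_figs(x, figures=3):
    """Round x to `figures` significant figures and render it without an exponent."""
    # Format in scientific notation to fix the number of significant digits,
    # then let Decimal expand the exponent into plain fixed-point text.
    return format(Decimal(f"{x:.{figures - 1}e}"), "f")


# Cosine of the angle between (1, 0) and (1, 1): a 1-3 line derivation.
cos_theta = 1 / math.sqrt(2)

print(to_sig_figs(cos_theta))  # prints 0.707
print(to_sig_figs(38597376))   # prints 38600000
print(to_sig_figs(9.996))      # prints 10.0
```

Whatever rule is chosen, the label next to the input field should state it (for example "3 significant figures, e.g. 0.707"), so that a student who rounds differently reads the failure as a format error rather than a math error.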
**Implementation status (D69, 2026-05-14):** Mechanized by `scripts/check-gate-d22.mjs` (wired into `npm run check` as the 14th lint). D22 is qualitative ("answers test concept, not arithmetic") and resists static answer inspection because the answer is hashed; the lint therefore mechanizes the *author touchpoint* rather than the answer itself. Every Forge Gate element (either `` template style or legacy `` style inside ``) MUST declare `data-gate-d22="..."` with one of two allowed values:

- **`conceptual`**: the author has audited the gate against this rule and confirms the answer is conceptually derivable in 1-3 lines.
- **`boss-pivot`**: the author acknowledges the gate is arithmetic-heavy and cross-refs D11/D30 (multi-step arithmetic belongs in the Boss Fight). `boss-pivot` is intended as a temporary marker, with each instance a candidate for promotion to the Boss Fight side.

Lint fails on missing attribute, on out-of-set values, and reports a per-class tally on success. M1 ships `data-gate-d22="conceptual"` on its ``; the template ships `data-gate-d22="conceptual"` on its `` element so every M2-M16 author inherits the declaration as a forcing function. The lint failure message quotes this sub-section directly, so the author cannot ship without consulting D22.

### References convention (D9/D39)

Every module ships a closing References section (Section 09 in the template) with at least **three** entries spanning the intro/standard/advanced taxonomy. A rigorous course earns trust by pointing students at primary sources; a hand-wavy course buries them.

**Authoring rules:**

- **Minimum three entries**, ideally six (M1 ships six). At least one `intro` (a 3Blue1Brown video, a Khan Academy module, a Wikipedia article: something a beginner can open without intimidation), at least one `standard` (a textbook chapter, a foundational paper), and at least one `advanced` (the original paper, a numerical-analysis monograph, a language-spec section).
- **Every entry is hyperlinked** (D39 contract). No bare citation text; the student must be able to click and read. Prefer DOI links for textbooks, arXiv abstracts for papers, official YouTube playlists for video; vendor-blog mirrors are last-resort.
- **Primary sources preferred** over secondary explainers. If the module derives the He initialization, link to He et al. 2015, not a Medium post about it. Secondary explainers are fine to *supplement* a primary source, never to *replace* it: a student tracking the math should reach the source the field cites, not a paraphrase.
- **One-line pointer per entry**: annotate each `` with a short sentence telling the student *why* they would open this reference, not just *what* it is. "Chapter 3 — discusses the LoC proof Strang uses to motivate cosine similarity" beats "Strang Ch. 3" with no follow-up.
- **Level-tag markup**: every entry carries a `` where `{level}` is one of `intro`, `standard`, `advanced`. The hub surfaces this taxonomy in the appendix index (see §6); the per-module tag keeps the level visible inside the module page itself.

**Implementation status (D49, 2026-05-13):** Mechanized by `scripts/check-refs.mjs` (wired into `npm run check`). Five per-page checks: `` entries; every `` contains a non-empty ``; every `` carries a `` (with class-attribute fallback); at least one entry per level present. Auto-discovers `module-NN-*.html` at the repo root; skips `_template/`, filenames containing `-smoke`, and `module-9N-*` (smoke-fixture slot).

---

*This framework is the contract. Any module that violates it should change the framework first.*