# Conformance Runner Protocol
Defines the contract between the conformance corpus and any XTL implementation that wants to claim conformance.
## What a runner is
A conformance runner is a small program that:

- Iterates over fixtures in `conformance/fixtures/`
- Invokes the implementation under test on each fixture's `template.xlsx` + `data.xlsx`
- Compares the implementation's output to the fixture's `expected.xlsx` (or `expected/` directory)
- Reports pass / fail / skip per fixture in a standard format
Each implementation provides its own runner (since invocation is language-specific), but all runners produce comparable output.
## Fixture loading
A runner discovers fixtures by enumerating subdirectories of `conformance/fixtures/`. Each subdirectory is named `<NNN>-<slug>/` (e.g., `001-basic-substitution`).
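The discovery step can be sketched as follows. The `discover_fixtures` helper and its regex are illustrative, not part of the protocol:

```python
import re
from pathlib import Path

# <NNN>-<slug>: three digits, a hyphen, then a lowercase hyphenated slug.
FIXTURE_DIR_RE = re.compile(r"^(\d{3})-([a-z0-9-]+)$")

def discover_fixtures(fixtures_root):
    """Yield fixture directories under conformance/fixtures/, in numeric order.

    Non-matching entries (stray files, notes directories) are skipped.
    """
    for entry in sorted(Path(fixtures_root).iterdir()):
        if entry.is_dir() and FIXTURE_DIR_RE.match(entry.name):
            yield entry
```

Sorting by directory name gives the numeric fixture order because the `<NNN>` prefix is zero-padded.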
For each fixture, the runner reads:

- `template.xlsx` — input template
- `data.xlsx` — input source data
- `expected.xlsx` (single-output case) or `expected/` directory of `.xlsx` files (multi-file group case, including zero-output cases)
- `meta.yaml` — fixture metadata

Static fixtures that expect zero output files use an empty `expected/` directory.
Error fixtures omit `expected.xlsx` and `expected/`. They declare `expected_error` in `meta.yaml`; the expected result is that the implementation reports an error whose message contains the declared text.
Dynamic fixtures omit `expected.xlsx` and `expected/`. They declare `expected_dynamic` in `meta.yaml`; the expected result is computed by the runner from the runner-start timestamp and the declared assertion rules. Dynamic fixtures are reserved for behavior that is explicitly time-dependent in the spec, such as `TODAY()`.
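Taken together, the three fixture shapes can be classified from `meta.yaml` alone. A minimal sketch, assuming the metadata has already been parsed into a dict (the `fixture_kind` helper is hypothetical):

```python
def fixture_kind(meta):
    """Classify a fixture as static, error, or dynamic from its parsed meta.yaml.

    expected_error and expected_dynamic are mutually exclusive per the protocol.
    """
    has_error = "expected_error" in meta
    has_dynamic = "expected_dynamic" in meta
    if has_error and has_dynamic:
        raise ValueError("expected_error and expected_dynamic are mutually exclusive")
    if has_error:
        return "error"      # no expected workbook; assert on the error message
    if has_dynamic:
        return "dynamic"    # no expected workbook; assert on computed cell values
    return "static"         # compare against expected.xlsx or expected/
```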
## Required `meta.yaml` fields

```yaml
description: string   # one-line human description
spec_section: string  # the spec section this fixture exercises
spec_version: string  # minimum XTL version (e.g., "0.1")
tags: [string, ...]   # filter tags (e.g., [substitution, repeat, aggregate])
```
`tags` is a fixture-side convenience for the `--filter=<tag>` CLI flag. Tag values are NOT part of the conformance contract — runners MUST treat them as opaque strings and SHOULD NOT reject a fixture because its tag set differs from another fixture's. The reference corpus uses lowercase, hyphen-separated tokens but does not enforce a canonical taxonomy.
Optional fields:

```yaml
verified_by: [hand | excel-formulas | manual-script | reference-impl]
expected_warnings: [string, ...]  # warnings the impl should emit
expected_error: string       # expected error message substring; no expected output is required
expected_error_code: string  # optional ADR-0015 stable error code (e.g. "xl3/source/undeclared")
expected_dynamic: string     # dynamic assertion kind; no expected output is required
comparison_stage: 1 | 2      # minimum comparison stage for static-output fixtures; default is 1
skip_reason: string          # if fixture is currently broken
inputs:                      # host-supplied runtime inputs (ADR-0010)
  - name: region
    value: Seoul
```
The `inputs` block lists name/value pairs that the runner passes to the implementation as runtime inputs (per ADR-0010's `_inputs` sheet). Runners MUST forward these values to the implementation's conversion entry point. Templates without an `_inputs` sheet ignore the field.
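A sketch of collecting these pairs before invoking the implementation; the `runtime_inputs` helper is illustrative, and how the mapping reaches the conversion entry point is implementation-specific:

```python
def runtime_inputs(meta):
    """Collect ADR-0010 runtime inputs from parsed meta.yaml as a name -> value mapping.

    Returns an empty mapping when the fixture declares no inputs.
    """
    return {item["name"]: item["value"] for item in meta.get("inputs", [])}
```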
Stage-gating metadata:

- `comparison_stage` applies only to static-output fixtures. It defaults to `1`. Use `2` only when the fixture asserts workbook content that Stage 1 cannot observe, such as styles, merges, package parts, or binary media.
- `expected_error` fixtures and `expected_dynamic` fixtures do not use workbook comparison stages for pass/fail. A runner still reports the active run stage, but these fixtures keep their own error or dynamic assertion rules.
- `expected_dynamic` requires `dynamic_cells` for the currently defined `utc_today` assertion kind. Static-output and error fixtures omit `dynamic_cells`.
A runner MUST mark an `expected_error` fixture as:

- `pass` when the implementation reports an error containing `expected_error`
- `fail` when the implementation succeeds
- `fail` when the implementation reports a different error

`expected_error` and `expected_dynamic` are mutually exclusive.
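The error-fixture rules reduce to a small decision function. This sketch assumes the runner captures the implementation's error message as a string, or `None` on success:

```python
def error_fixture_status(expected_error, run_error):
    """Verdict for an expected_error fixture.

    expected_error: the declared substring from meta.yaml.
    run_error: the implementation's error message, or None if it succeeded.
    """
    if run_error is None:
        return "fail"   # implementation succeeded, but an error was expected
    if expected_error in run_error:
        return "pass"   # error message contains the declared substring
    return "fail"       # implementation reported a different error
```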
## Dynamic assertions
Dynamic assertions make render-time behavior testable without committing a stale `expected.xlsx`. A runner MUST capture a single runner-start timestamp before executing the first fixture and use that timestamp for every dynamic fixture in the run. This avoids midnight-boundary differences between fixtures within the same report.
XTL 0.1 defines one dynamic assertion kind:

```yaml
expected_dynamic: utc_today
dynamic_cells:
  - sheet: Report
    cell: A2
    format: YYYY-MM-DD
```
For `utc_today`, the expected value for each listed cell is the UTC calendar date from the runner-start timestamp, formatted with the listed XTL `TEXT()` date format. The implementation output MUST contain the expected string value at each listed sheet/cell coordinate.
A runner MUST mark an `expected_dynamic` fixture as:

- `pass` when the implementation succeeds and every listed dynamic cell matches
- `fail` when the implementation reports an error
- `fail` when any listed dynamic cell differs from the computed expected value

Runners that do not implement a declared `expected_dynamic` kind MUST mark the fixture as `skip` and include a reason. They MUST NOT report it as passed.
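Under the assumption that only the `YYYY`/`MM`/`DD` tokens shown above are needed, the expected value for a `utc_today` cell can be computed as below. A real runner would implement the full XTL `TEXT()` format grammar; the token table here is a deliberate simplification:

```python
from datetime import datetime, timezone

# Minimal XTL TEXT() date-token -> strftime mapping, covering only the
# formats used in this sketch. Not the full TEXT() grammar.
_TOKENS = {"YYYY": "%Y", "MM": "%m", "DD": "%d"}

def utc_today_expected(runner_start, fmt):
    """Expected cell value for a utc_today assertion.

    Uses the UTC calendar date of the single runner-start timestamp,
    rendered with the declared format string.
    """
    strftime_fmt = fmt
    for token, repl in _TOKENS.items():
        strftime_fmt = strftime_fmt.replace(token, repl)
    return runner_start.astimezone(timezone.utc).strftime(strftime_fmt)
```

Because every dynamic fixture in a run shares one runner-start timestamp, all `utc_today` cells in a report agree even if the run crosses midnight.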
## Comparison stages
The conformance protocol has two comparison stages:

- Stage 1: cell-value comparison. The runner compares worksheet names and non-auxiliary cell values after loading `.xlsx` files through a spreadsheet library. This stage intentionally ignores styles, merges, page setup, embedded media, formulas beyond cached values, and package structure. It is sufficient for the XTL 0.1 bootstrap corpus while canonical OOXML comparison is being specified and implemented.
- Stage 2: canonical OOXML comparison. The runner compares generated `.xlsx` files after canonicalizing their OOXML packages. This is the target for full static-output conformance because it can catch layout, style, merge, sheet structure, and package regressions that Stage 1 cannot see.
Error fixtures and dynamic fixtures are not workbook-output comparisons. They keep their `expected_error` and `expected_dynamic` pass/fail rules regardless of comparison stage.
Reports SHOULD identify the comparison stage used for each run. An
implementation MUST NOT claim Stage 2 conformance from a Stage 1-only run.
Static-output fixtures MAY declare `comparison_stage` in `meta.yaml`. A runner MUST skip a fixture whose declared comparison stage is greater than the runner's active stage.
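The stage-gating rule is a one-line comparison; `should_skip_for_stage` is an illustrative name:

```python
def should_skip_for_stage(meta, active_stage):
    """True when a static-output fixture must be skipped at the active stage.

    comparison_stage defaults to 1 when meta.yaml does not declare it.
    """
    declared = meta.get("comparison_stage", 1)
    return declared > active_stage
```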
## Stage 2 output comparison
Comparison is performed on canonicalized OOXML. The minimum canonicalization rules:

- Files within the zip MUST be compared by content, not by zip metadata (timestamps, compression, entry order, or compression level).
- Package part names MUST match after canonicalization. Missing or extra workbook parts are differences unless a later ADR marks the part volatile.
- XML files MUST be compared after parsing and re-serializing with deterministic namespace declarations, attribute order, quote style, and empty-element representation.
- XML element order MUST be preserved unless a later ADR explicitly marks a specific element collection as unordered. Relationship files are ordered package data, not sets, until such a rule exists.
- The following fields are stripped before comparison (they reflect generator metadata, not content):
  - `cp:lastModifiedBy`, `dc:creator`, `dcterms:created`, `dcterms:modified`
  - Any `<calcPr>` `calcId` attribute (Excel calc engine version)
  - Generated sheet ids and sheet part filenames when they can be resolved through workbook relationships and sheet names
  - Default page setup values that ExcelJS may add or omit (`copies="1"`, `firstPageNumber="1"`, `useFirstPageNumber="1"`)
- Insignificant whitespace within text runs is preserved (it can be semantically meaningful).
- Cell `r` (reference) attributes MUST match exactly; cell ordering within `<row>` MUST match.
- Binary package parts, such as images, MUST be compared by exact bytes.
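As one concrete example of the stripping rules, removing the volatile `calcId` attribute might look like the sketch below. This is illustrative only; a full canonicalizer also strips the listed core-properties fields and re-serializes with the deterministic rules above:

```python
import xml.etree.ElementTree as ET

def strip_calc_id(xml_text):
    """Drop the volatile <calcPr> calcId attribute before comparison.

    Matches calcPr by local name so any namespace prefix binding works.
    Other attributes on calcPr are preserved.
    """
    root = ET.fromstring(xml_text)
    for el in root.iter():
        if el.tag.rsplit("}", 1)[-1] == "calcPr":
            el.attrib.pop("calcId", None)
    return ET.tostring(root, encoding="unicode")
```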
The JS reference runner includes a Stage 2 canonicalizer for conformance comparison. It is intentionally scoped to the OOXML produced by supported XTL fixtures plus the normalization rules above; it is not a general-purpose XML canonicalization library. In particular, it does not claim full XML C14N support, DTD/entity processing, semantic namespace rewriting, or application-specific unordered-collection rules beyond those explicitly listed here. Fixtures that need additional OOXML equivalence rules should update this protocol first.
## Known canonicalization gaps
These cases are NOT normalized by the current canonicalizer. They are treated as differences if they ever appear. See ADR-0006 amendment for the rationale.
- Default attribute equivalence. A boolean attribute omitted vs. emitted as its OOXML default value (e.g., `applyFont="0"`) is treated as a difference.
- Color hex case. `rgb="FF000000"` and `rgb="ff000000"` compare as different strings.
- Namespace prefix bindings. Different prefixes bound to the same namespace URI are not unified.
When a cross-writer fixture exposes one of these gaps as a genuinely volatile difference (not a content difference dressed up as one), the protocol and reference canonicalizer should be extended together. Implementations MUST NOT silently relax these rules locally.
## Runner CLI conventions
Implementations should expose a runner with this minimal interface:
```
<runner> [--fixture-dir=<path>] [--filter=<tag>] [--spec-version=<x.y>] [--comparison-stage=1|2] [--report=json|text]
```
Implementations that provide a Stage 2 canonicalizer SHOULD also expose a debugging command that prints canonical package content in deterministic part order:
```
<runner> canonicalize <input.xlsx> [--part=<canonical-part-name>]
```
When `--part` is omitted, the command SHOULD emit a JSON object keyed by canonical package part name. When `--part` is present, it SHOULD emit only that canonical part's content.
JSON report format:

```json
{
  "implementation": "xl3-js",
  "version": "0.1.0-alpha.0",
  "spec_version": "0.1",
  "comparison_stage": 1,
  "results": [
    {
      "fixture": "001-basic-substitution",
      "status": "pass",
      "duration_ms": 12
    },
    {
      "fixture": "007-aggregate-sum",
      "status": "fail",
      "duration_ms": 8,
      "diff": "cell B5: expected 1234, got 1234.0"
    }
  ],
  "summary": {
    "total": 42,
    "passed": 40,
    "failed": 1,
    "skipped": 1
  }
}
```
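A report consumer can cross-check the summary block against the results list. `summary_consistent` is a hypothetical helper, not part of the report format:

```python
from collections import Counter

def summary_consistent(report):
    """True when the report's summary counts agree with its results list."""
    counts = Counter(r["status"] for r in report["results"])
    s = report["summary"]
    return (s["total"] == len(report["results"])
            and s["passed"] == counts.get("pass", 0)
            and s["failed"] == counts.get("fail", 0)
            and s["skipped"] == counts.get("skip", 0))
```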
## Reporting conformance
An implementation reports its conformance level by linking to a public conformance run. The expected form:
> xl3-py 0.2.0 — XTL 0.1 conformance: 38/42 (passes filter, repeat, aggregate; fails image-clone, _config-pattern-match, two date-edge cases)
The repo's `IMPLEMENTATIONS.md` lists known implementations and their conformance levels.