entviz Spec Change Log¶
This file records what changed in each version of the entviz specification. The current algorithm is defined in spec.md.
Earlier versions (v1–v3) are archived in the project's git history (browse the docs/v1…docs/v5 folders, or earlier docs/index.md, at the corresponding release commits on GitHub).
What's new in v11¶
v11 adds first-class handling for DIDs (W3C Decentralized Identifiers) and the closely-related URNs (RFC 8141). The prior treatment recognized only a single-segment DID method-specific-id, tokenized every DID as base64url, stripped the method without binding it, surfaced the DID URL as a suffix, and could not parse a #fragment; URNs were not recognized at all (they fell to the UTF-8 fallback as one opaque blob). v11 replaces this with a single, principled generic path shared by both schemes. No other input type changes; all non-DID/URN corpus vectors are unaffected, and new DID + URN vectors are added to the corpus (so this is a breaking change only for DID/URN inputs, which had no deployed certified users).
- Method-specific-id may contain
:; the DID URL is dropped as a free annotation. Per the W3C DID Core ABNF,:is a legal segment separator inside the method-specific-id, so the body now ends only at the first/,?, or#. Everything from that point — path, query (including ION's?-ion-initial-state=…), and#fragment— is dereferencing context that varies independently of the identifier and is dropped (not entered into the core, not surfaced as a suffix), exactly like SWHID;…qualifiers. This unbreaks multi-segment DIDs (did:web:example.com:user:alice,did:webvh:<scid>:domain,did:ethr:0x89:0x…) and fragment-bearing DID URLs, which previously fell through to the UTF-8 fallback. See the new Decentralized Identifiers subsection in the normalization step. - The method name is identity and binds the fingerprint by prefix-fold. The same method-specific-id under a different method denotes a different DID (the swap test), so the
did:<method>:literal is kept as the prefix and the fingerprint hash input becomesprefix ‖ core, exactly as for SWHID/gitoid. Fixes the prior bug where the method was stripped and not bound, sodid:web:Xanddid:key:Xcollided in every fingerprint-driven channel. - The method-specific-id is the core, kept verbatim and case-preserved; tokenized as base64url. Internal separators, a leading multibase selector, a network/chain identifier, and any self-certifying hash all stay in the core — bound (via
prefix ‖ core) because the fingerprint hashes text, not decoded bytes. DID cores are not case-folded (DIDs are case-sensitive per DID Core; the consequence is fail-safe — case differences only ever produce a false negative). Tokenization is uniformly base64url: every non-hex alphabet in this spec already tokenizes on the same 4-character/24-bit boundary, so a per-method alphabet would change only the nucleus-color hint channel, never boundaries, the fingerprint, or the read-aloud text. The core is not percent-decoded (no parser ahead of the hash). - Label. A DID renders with no type and
did:<method>:as its self-describing prefix — the top strip readsdid:<method>:...(the no-type rule already used by SWHID/gitoid). - URNs ride the same path (RFC 8141). A URN (
urn:<NID>:<NSS>) is the same shape as a DID — NID ↔ method, NSS ↔ method-specific-id, the r-/q-/f-components (?+/?=/#) ↔ the DID URL — so it gets the same generic handling: NID bound by prefix-fold, NSS kept verbatim and base64url-tokenized, components dropped (RFC 8141 says they are not part of URN equivalence),urn:<nid>:...no-type label. Two differences from a DID: the NSS keeps/and ends only at?or#(not/); and per RFC 8141 theurn:<nid>:prefix is lowercased (NID case-insensitive) while the NSS case is preserved.urn:uuid:,urn:oid:,urn:isbn:etc. are not per-namespace special-cased. See the Uniform Resource Names subsection. - No per-method/per-namespace special-casing (deferred). The generic path binds a DID's or URN's full identity and is encoding-agnostic at the token boundary, so v11 visualizes every well-formed DID/URN — recognized or not — by one path, with no per-method table. Two additive, non-breaking refinements were deliberately deferred: reusing the Ethereum EIP-55 reject for
did:ethr/did:erc725, and dropping the long-form initial-state ofdid:ion/did:prism. See the Decentralized Identifiers subsection andthis.i:d1dm3th0.
Ports. Per the multi-impl roadmap, this change must propagate to the four sister-repo ports (entviz-js, -rs, -java, -go) once the Python reference and DID corpus vectors land. Each must reproduce Tier A for the new DID vectors. (Tracked as tick 5yau.)
What's new in v10¶
v10 ("casual avalanche") closes a measured perception gap. Over 100,000 one-character-neighbour pairs, ≈24% of the background-unchanged cases were casually colour-identical at baseline — concentrated in dense full-grid inputs (a random UUID vs. its one-character neighbour: ~61% within that quarter). v10 injects fingerprint signal into colour as a few rare, discordant singletons the eye catches pre-attentively, taking that quarter to ~0.33% colour-miss. These add casual salience only — the careful-comparison channels (surround pattern, colour bar, ellipse, blank positions, quartile marks) are unchanged, and this is explicitly not a collision-resistance claim. See the Casual avalanche (v10) section.
- Fingerprint-sourced surround edge colour on three cells. Three cells override the nearest-palette edge-colour rule and take their edge colour directly from the fingerprint: grid position 0 (top-left, the first-fixation cell) and the 1st/2nd quartile-ftok cells. Each uses
edge_palette[q & 0b11], whereqis that cell's own used-ftok quant. The nucleus background colour is unchanged (still entropy-derived and lossless); every other filled cell keeps the nearest-weighted_rgb_distancerule. The override is skipped when the cell is blank or its quartile ftok is null, and applies once if position 0 is also a quartile cell. Kept to ≤ 3 cells so the discordant hues stay pre-attentive singletons rather than confetti. See the Fingerprint-edge cells (v10) rule in the cell-rendering step. - Fingerprint blank fill (was
fill = none). A blank cell's rounded "pill" now fills its interior from the fingerprint: enumerating the filled blanks in cell-index order, the j-th takesedge_palette[digest[32 + j] & 0b11](digest bytes 32..59, disjoint from the ellipse's 60..63). The pill outline is retained (it preserves the "gap" reading and keeps a blank distinct from a nucleus). See the Blank fill (v10) rule. - Hybrid map-blank fill. If the map-bearing blank is the only blank, it is filled from the fingerprint like any filled blank — the common small-input case (LEI, small hex, 18-char base36) where casual-avalanche colour is most needed; otherwise it keeps the v9 white/gold anchor fill while its sibling blanks carry fingerprint colour, and the sole-blank map markers may be recoloured to luminance contrast. See the Map rendering rule.
- Conformance surface. The three fingerprint-edge cells' edge colours and the blank-pill fills are fingerprint-driven and part of the matched render model and Tier-B raster (edge colour is still exposed via
data-edge-color). No new hash is introduced — all of it reuses the primary fingerprint digest.
What's new in v9¶
- Crockford base32 middle cells (was hex). For >512-bit inputs, the 4 middle ("fingerprint") cells now render the second, domain-separated digest as 5 lowercase Crockford base32 characters per cell (the 24-bit big-endian value
second[3i]·2¹⁶ + second[3i+1]·2⁸ + second[3i+2], high-order zero-padded, alphabet0123456789abcdefghjkmnpqrstvwxyz— noi/l/o/u), replacing v6's 6 lowercase hex. It stays injective (32⁵ = 2²⁵ ≥ 2²⁴), so the ≈2⁹⁶ partial-preimage barrier (4 cells × 24 bits) is unchanged. 5 characters is the provable floor for "injective on 24 bits, single letter-case, homoglyph-clean": 4 characters would need a ≥ 64-symbol (6-bit) alphabet, forcing mixed case or punctuation (re-importing read-aloud "cap" syllables + a homoglyph surface), and base58 can't reach 24 bits in 4 chars (58⁴ < 2²⁴). The existing per-cell size rule auto-yields4/5 = 0.80×(10 pt at the 12 pt reference, up from hex's 0.75×). The "this is a digest, not your data" cue is improved: Crockford differs from a hex input by both alphabet and rendered size (v6 hex was weakest precisely on a hex input).DOMAIN_TAG = "entviz/fingerprint-middle/v6\0"is unchanged (itsv6is the construction version, not the spec version). Breaking change to >512-bit renders; accepted (no deployed users yet). See the Large-input handling subsection, the cell-text rendered-size rule, andthis.i:cr0ckmid/this.i:d1scr3t3. - Color-bar band order decoupled from height. The bands' vertical order is now set by each 2-bit pattern's first-appearance order while scanning the digest's 256 disjoint 2-bit slices (tie-break by pattern value,
00<01<10<11), independent of band heights (heights still eachcount^4in descending magnitude — only the vertical sequence changes). Through v8 the order was descending count =argsort(heights), carrying no information beyond the heights; decoupling adds a few reliable discrete bits (read the letters top to bottom) at zero glance and zero CVD cost, and does not affect the match task.data-color-bar-ranknow reflects the decoupled first-appearance order. See the color-bar step andthis.i:b4rm4rks/this.i:d1scr3t3. - Two color-bar markers (new discrete channel, always present). A small filled circle rides each gutter — left and right — in one of
K = clamp(floor(bar_height / 12px), 4, 16)equal fixed slots (independent of bands): left at slotsecond[12] mod K, right atsecond[13] mod K. Identity is carried by side, not shape — an earlier draft used a square (left) and a triangle (right), but shape discrimination was unreliable at the bar's scale (on a dark band the black halo vanishes, leaving a too-small inner glyph), and the left/right gutters already distinguish the two. The driving digestsecond = SHA-512(DOMAIN_TAG ‖ core)is now computed for every input (one extra SHA-512), so short inputs gain this channel too — closing the coverage hole where the blank-cell map vanishes on an exactly-filled grid.bar_widthis unchanged (markers live in ~4 px gutters inside the existing bar); left/right placement means the two can never overlap. Markers are drawn opaque — white fill + ~0.75 px black halo, notmix-blend-mode/difference compositing — for cross-rasterizer portability (the same reason the blank-cell map avoids blend modes, F-A6) and so a band-straddling marker stays visible (white core on the dark side of the cut, black halo on the light side); the contrast is pure lightness (CVD/grayscale-safe). Painted after the bands and letters, before the final gray border. Honest framing: tripwire-tier (~6 independent hard bits), not the security backbone. See the color-bar step andthis.i:b4rm4rks/this.i:d1scr3t3. - Conformance surface for the markers. The render model's color bar field gains the two marker slots and
K; the SVG profile adds required attributesdata-bar-marker-left,data-bar-marker-right(each the integer slot index) anddata-bar-slots(the integerK), and the normative paint-order list records the markers within the color-bar layer (after the band letters). No new hash is introduced — the markers reuse the existing second digest /DOMAIN_TAG. See the SVG profile, render model, and paint-order sections.
What's new in v8¶
The v8 release (PR #11, "blank-map-and-determinism") collects four conformance and accessibility fixes from the 2026-06-08 review milestone, and re-freezes the golden corpus.
- Deterministic snowflake detection — no wall clock (SPEC-F1). Snowflake classification previously read the current time, so an identical boundary decimal could render as
snowflakeat one moment andhexat another — a violation of the determinism MUST. v8 classifies a 17–20-digit decimal as a snowflake iff its value fits in a signed 64-bit integer (sign bit clear,< 2⁶³) — a property of the bit pattern, not of the date — and pins the epoch, bounds, and predicate in the spec. The comparison is identical either way; only the type label and tokenization differ. See the decimal-alphabet note in the tokenization step andthis.i:sn0wfl4k. - Blank-map marker positions as
(row,col)attributes (SPEC-F2). The two map markers had emitted booleandata-blank-map-min/max="true", leaving a Tier-A checker to reverse-engineer the marked cells from pixel geometry. v8 carries each marker's position as a literal"row,col"string indata-blank-map-min(min) anddata-blank-map-max(max), and flags the map-bearing blank withdata-cell-blank-map. See the SVG profile and the blank-cell step. - base32 disproof path canonicalizes to UPPER (SPEC-F3). A bare
base32fragment matched by alphabet-disproof was lowercased, but the spec mandates base32 → UPPER (RFC 4648); the wrong case changesSHA-512(core text)and therefore every channel, so a disproof-matched fragment diverged from the same value parsed by a specific base32 parser. v8 canonicalizes the disproof path to UPPER for per-alphabet consistency. See the alphabet-detection-by-disproof step. - Blank-map max marker is a shape, not just a hue (PSY-F1). The blank-cell map — the landmark a habituated reader checks first — distinguished max from min by colour alone (red vs blue dots), which collapse to near-equal grays under achromatopsia (ΔL* ≈ 7.8), so a reader could see that two cells were marked but not which was which. v8 gives the max marker a distinct shape, a red plus, while the min stays a blue dot, so the distinction survives total colour blindness; colour is retained as a redundant cue. See the blank-cell step and its rationale.
What's new in v7¶
- The fingerprint hashes canonical normalized text, not decoded bytes (now explicit). Earlier spec wording ("SHA-512 of the normalized entropy bytes") was ambiguous; v7 makes it normative (MUST hash the UTF-8 of the normalized core text; MUST NOT decode to raw bytes first). A consequence, now stated as an explicit non-goal: cross-encoding invariance — the same 32 bytes as hex vs base64, or one CID in two multibases, render as different entvizes. This is deliberate and fail-safe: it keeps every channel agreeing on identity (the text channel is verbatim and can't be made encoding-invariant), removes a decoder-malleability collision surface (distinct strings decoding to identical bytes), and keeps the multi-implementation fingerprint byte-reproducible. No behavioral change to the renderer — this ratifies and documents existing behavior. See the new spec subsection Why the fingerprint hashes text, not decoded bytes,
threat-model.md, andthis.i:h4shtext. - Identity-bearing prefixes bind the fingerprint (and, where possible, the cells). Each part of the input is classified presentation / identity / annotation (the swap test: hold the body fixed; if another legal prefix could sit there and change the value with the body unchanged, it's identity; if the body would have to re-encode, it's presentation — this cleanly separates multibase from multicodec). Identity material binds the fingerprint by one of two mechanisms: (a) kept in the core when it shares the body's alphabet and is contiguous, so it is rendered in the cells and hashed — used by the CESR derivation code (cell 0 now shows the code) and the LEI LOU issuer code (the core is now LOU + "00" + entity, not just the entity body); or (b) prefix-fold (
SHA-512(prefix ‖ core)) when it is a different alphabet from the body — used by the SWHID/gitoid object-type and hash-algorithm (without which a SWHID and a gitoid over the same git hash would collide). Multibase is presentation (already stripped); a CID's multicodec is identity but already lives inside the base32 core, so it was already bound. Affects CESR, LEI, SWHID, and gitoid entvizes; all presentation-prefix types (Ethereum, etc.) render identically to v6. Seethis.i:s3mpr3fx. - CESR code-table corrections. The blinding factor is the lowercase code
a(44 chars); it had been mis-entered as capitalZ, which is actually Tag11 (12 chars) — so reala…blinding factors failed to parse and a bogus 44-charZ…string would have been accepted. The FN-DSA (FIPS 206) post-quantum codesb/c/d/eand1AAQ/1AARwere added, and1AAB/1AAJrelabeled "pub/enc key" to match the spec table. The gallery KERI/CESR taxonomy was corrected:Bis a non-transferable AID, the usual transferable AID is the self-addressingE(Blake3-256 of the inception event), and a bareDis a verification key — not "the transferable AID". Authoritative source: the CESR specification master code table. - Conformance formalization (editorial — no output change). The spec gained RFC 2119 requirements language (a Notation and requirements language section; load-bearing instructions re-voiced to MUST/SHOULD, rationale marked Note/Rationale as non-normative) and a normative Conformance section defining: the abstract render model an implementation must compute; the three conformance tiers (A semantic model, B canonical raster as the visual authority, C browser smoke) and conformance levels; the equivalence relation (which SVG-serialization differences a checker ignores — namespace, attribute order, numeric formatting, salted ids, grouping); the SVG profile (required
data-*attributes,viewBox, and the normative paint order that fixes layering/occlusion); and the error conditions every implementation MUST enforce (EIP-55 reject, note-sanitization reject, out-of-range parameters). This adds no algorithm and changes no rendered pixel; it backs the cross-language conformance suite (compliance/) and the Rust/TypeScript implementations. Seethis.i:c0nf0rm1/c0mpsu1t.
What's new in v6¶
- Long-input middle = the fingerprint, not body slices. For >512-bit inputs, v5 filled the 4 middle text cells with body slices sampled at fingerprint-derived offsets. That made the middle text differ between inputs only probabilistically — a low-entropy/structured body could render identical middle cells for two different inputs, so a screen-reader / read-aloud comparison (which can't see the gestalt) could miss a difference. v6 fills the middle from a second, domain-separated fingerprint —
SHA-512("entviz/fingerprint-middle/v6\0" ‖ core)— rendered as hex (6 chars = 24 bits per cell), so the middle text is injective and guaranteed to avalanche on any input change for every alphabet, and is independent of the primary fingerprint that drives the gestalt. Head and tail stay real entropy (recognition + verification); matching the 4 displayed fingerprint tokens is an injective ≈2⁹⁶ partial preimage, independent of matching the gestalt. The middle nuclei are painted with the entviz background color (neutral, framed with a gold/white border) since they no longer carry entropy in their bg. Blank placement for long inputs also now uses the same median/quartile shift as short inputs — v5's two fixed separator blanks are gone — so a long input's blank layout varies per input instead of being identical for every long input. See the "Large-input handling" subsection in the spec. - Wider color bar. The color bar's width doubles from
box_heighttobar_width = 2·box_height, so the per-band color letters (w/g/r/b/k) render legibly. The bar width is now a named geometry term. The letters render at the cell-text size, bottom-anchored within each band. See the geometry and color-bar steps in the spec. - Redesigned blank-cell marker. v5's white-disc-with-clock-hands marker is replaced. Every blank cell carries a black-outlined rounded rectangle; the first blank cell additionally becomes a map — a miniature scale model of the grid, filled white (or gold on a white-background entviz), with a red dot at the maxftok cell's position and a blue dot at the minftok cell's. This conveys positions (not directions) and uses no
mix-blend-mode, so it renders identically in browsers and non-browser rasterizers (closing adversarial finding F-A6 for this channel). See the blank-cell step in the spec. - Clamped ellipse overlay. The ellipse's
rx/rybounds change from[nucleus_height, d_far − cell_width]to[0.22·d_far, 0.58·d_far], so the overlay covers a noticeable but partial share of the grid on every grid size — never an invisible sliver, never swamping the grid. See the ellipse-overlay step in the spec andreviews/ellipse-audit-2026-06-02.md. - Refreshed font fallback chain. The monospace fallback chain is reordered and extended to cover iOS and Android explicitly (not just Linux/Windows/macOS). See the font-family note in the spec.
What's new in v5¶
The v5 changes are concentrated in three places:
- Head + tail + fingerprint-selected middle slices for long inputs. v4 truncated >512-bit inputs to head-256 + tail-256 only; an attacker who could match those 64 bytes had a byte-identical text channel and only had to grind the fingerprint-driven gestalt channels. v5 reserves four of the 22 cells for middle slices sampled at byte offsets derived from the fingerprint, so two long inputs that share their head and tail still differ in the text channel unless the attacker can also match four specific 3-byte windows inside the body — and those windows move as a function of the full input. See the tokenization step in the algorithm and the "Large-input handling" subsection in the spec.
- Loud truncation marker. The quiet
^…$marker on the top label strip is replaced by a bold redfingerprint ofprefix (read as "fingerprint of \<the type>" once the type label follows it). The marker now reads as a warning rather than a footnote. The original entropy byte length is conveyed by the type-label parenthetical (e.g.hex(200),b64(119)) immediately after the marker, so it is not repeated inside the marker itself. See the label-strips step in the spec. - Color-bar letters. Each color-bar band now carries a centered lowercase letter naming its color (
w,g,r,b,k). This gives the color bar a verbal label that survives complete color blindness, monochrome display, and CSS color filtering, and makes spot-check verification ("what's the top band?") possible without trusting hue discrimination. See the color-bar step in the spec.
The v4 base — described in the main spec — is otherwise unchanged from the v4 spec. Implementations that supported v4 need only make the three changes above to support v5.
What v4 changed¶
v4 kept v2's fingerprint, large-input handling skeleton, and overall structure, and kept v3's color bar skew, color bar frame, ellipse overlay rework, and hex font fitting. The substantive v4 changes are concentrated in the edge channel and a couple of geometry/layout choices that follow from it:
- The v3 edge channel — 6 edge rects per cell, each filled with a cubist or polygon shape using a per-edge XOR rotation through a 4-color edge palette — is replaced by a 24-box surround per cell. Bit i of the ftok's quant (LSB = bit 0) controls whether box i is filled or empty. The surround tiles the entire region around the nucleus with no corner rects.
- Each cell has a single edge color chosen as the palette entry (one of the 4 non-bg colors) perceptually closest to that cell's nucleus background. Per-edge color rotation,
color_shift, andshape_shiftare gone. - Nucleus geometry decouples from font size in the vertical direction:
nucleus_width = 3·font_size_px(unchanged) butnucleus_height = 1.25·font_size_px(was equal to font_size_px). The 25% vertical extra makes room for the glyph descenders of monospace fonts whose bounding boxes extend below the em-box (most of them); v3 had this descender-protrusion latent but it was masked by sparsely-filled edge shapes. With v4's densely solid surround, descenders protruding into the surround region became visible. - Cell aspect ratio is therefore 3:2 (not v3's 2:1). cell_width = nucleus_width + 2·box_width = 10·box_width = 3.75·font_size_px; cell_height = nucleus_height + 2·box_height = 4·box_height = 2.5·font_size_px. Surround boxes are no longer square:
box_width = nucleus_width/8 = 0.375·font_size_px(derived from the horizontal tiling — 10 top-row boxes span nucleus_width + 2·box_width);box_height = nucleus_height/2 = 0.625·font_size_px(derived from the vertical tiling — 2 side-column boxes stack to nucleus_height). - The shape count summary (SCS) is removed entirely; there are no shapes left to count. The bounding rect contracts by (nucleus_height + GM) on the bottom.
- The color bar's data source changes: v3 tallied per-edge color usage; v4 tallies the four 2-bit patterns (00, 01, 10, 11) across the 256 disjoint 2-bit slices of the SHA-512 digest. The count⁴ skew, descending sort, and rendering geometry are unchanged.
- A small SVG-portability fix for the ellipse overlay: the clipPath id is salted with the fingerprint and grid dimensions so that multiple entvizes embedded in the same HTML document do not collide on a shared id (which silently makes the browser resolve every
url(#…)to the first matching id document-wide).