[spec] SKEIN Mesh — Emoji Address Encoding

spiritengine · 2026-06-01 13:43 · status: open


Provenance
content hash: sha256::86e79689b8db8493737cb0dc25b18cae8b3ecc3921d1d2ff6f7b1252b5ebe1e6
signature: UNSIGNED — Sigstore signing not yet wired into publish


SKEIN Mesh — Emoji Address Encoding

SKEIN addresses are text-canonical (alias::myproject::sha256::<digest>). On top of that text form sits an optional, fully reversible emoji encoding — a compact, pasteable representation of an address that still resolves to exactly the same content. It is a layer over the canonical text, not a separate addressing scheme.

What it encodes, and why it resolves

Each emoji in the 1024-symbol alphabet carries 10 bits, so a five-emoji folio identity carries 50 bits, while a full content digest is 256 bits. The emoji form therefore cannot carry the whole digest. Instead it encodes the station completely and the folio as a 50-bit short-hash prefix.
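The bit budget can be illustrated with a minimal Python sketch. It assumes the 50-bit prefix is the most significant 50 bits of the digest, read big-endian, and that each emoji corresponds to a 10-bit index into the alphabet. This section does not state the bit order, and the function names are hypothetical.

```python
BITS_PER_EMOJI = 10   # log2(1024)
IDENTITY_LEN = 5      # five emoji give 5 * 10 = 50 bits
DIGEST_BITS = 256


def identity_indices(digest: bytes) -> list[int]:
    """Split the top 50 bits of a 256-bit digest into five 10-bit indices.

    Assumption: the prefix is the leading bits of the digest, big-endian.
    """
    if len(digest) * 8 != DIGEST_BITS:
        raise ValueError("expected a 256-bit digest")
    prefix_bits = BITS_PER_EMOJI * IDENTITY_LEN
    prefix = int.from_bytes(digest, "big") >> (DIGEST_BITS - prefix_bits)
    mask = (1 << BITS_PER_EMOJI) - 1
    return [
        (prefix >> (BITS_PER_EMOJI * (IDENTITY_LEN - 1 - i))) & mask
        for i in range(IDENTITY_LEN)
    ]


def prefix_from_indices(indices: list[int]) -> int:
    """Reassemble the 50-bit prefix from five alphabet indices."""
    value = 0
    for idx in indices:
        value = (value << BITS_PER_EMOJI) | idx
    return value
```

The remaining 206 bits of the digest are not represented at all, which is why the full digest has to come from resolution rather than from the emoji.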

Resolution works by:

  1. Decoding the station exactly (the station is always fully encoded).
  2. Asking that station to resolve the 50-bit prefix to a full folio.
  3. Recovering the full 256-bit digest from the resolved folio — never from the emoji itself.

Because the station is always encoded in full, resolution always reaches the right station; only the folio prefix is expanded there.
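The resolution steps above can be sketched as follows. This is an illustration under stated assumptions, not the station's actual interface: the station is modelled as a plain collection of full digests, the prefix is assumed to be the leading 50 bits, and `prefix_of`, `resolve_prefix`, and `AmbiguousPrefixError` are hypothetical names.

```python
import hashlib
from typing import Iterable

PREFIX_BITS = 50


class AmbiguousPrefixError(Exception):
    """More than one folio in the station shares the 50-bit prefix."""


def prefix_of(digest: bytes) -> int:
    # Assumption: the short hash is the leading 50 bits of the digest.
    return int.from_bytes(digest, "big") >> (len(digest) * 8 - PREFIX_BITS)


def resolve_prefix(station_digests: Iterable[bytes], prefix: int) -> bytes:
    """Expand a 50-bit prefix to the full digest held by the station."""
    matches = [d for d in station_digests if prefix_of(d) == prefix]
    if not matches:
        raise KeyError("no folio in this station matches the prefix")
    if len(matches) > 1:
        raise AmbiguousPrefixError("use the full-hash text address instead")
    return matches[0]


# Usage: the full digest comes back from the station, not from the emoji.
station = {hashlib.sha256(b"folio-a").digest(),
           hashlib.sha256(b"folio-b").digest()}
target = hashlib.sha256(b"folio-a").digest()
full_digest = resolve_prefix(station, prefix_of(target))
```

A real station would presumably use an index keyed by prefix rather than a linear scan; the scan keeps the sketch short.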

Alphabet

The alphabet is 1024 single-codepoint emoji drawn from Unicode 9.0 or earlier, with no zero-width joiners, modifiers, or variation selectors, chosen to be visually distinct and culturally neutral. 1024 is near the practical ceiling for clean single-codepoint emoji. Larger alphabets would require joiner sequences, which break codepoint parsing and degrade on some platforms. They would not help anyway, since station encoding is bottlenecked at two characters per emoji regardless of alphabet size.
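The structural constraints on the alphabet can be checked mechanically. The sketch below is a hypothetical helper, not part of the specification. It covers only the single-codepoint, joiner, modifier, and variation-selector rules plus the size and uniqueness of the set. It does not verify that a codepoint is an emoji, that it dates from Unicode 9.0 or earlier, or that it is visually distinct, all of which still need curation.

```python
from typing import Sequence

ZWJ = 0x200D                                    # zero-width joiner
VARIATION_SELECTORS = range(0xFE00, 0xFE10)     # U+FE00..U+FE0F
SKIN_TONE_MODIFIERS = range(0x1F3FB, 0x1F400)   # U+1F3FB..U+1F3FF
ALPHABET_SIZE = 1024


def is_clean_candidate(symbol: str) -> bool:
    """True if symbol is one codepoint and not a joiner, selector, or modifier."""
    if len(symbol) != 1:
        return False
    cp = ord(symbol)
    return (cp != ZWJ
            and cp not in VARIATION_SELECTORS
            and cp not in SKIN_TONE_MODIFIERS)


def validate_alphabet(alphabet: Sequence[str]) -> bool:
    """Check size, uniqueness, and the per-symbol structural rules."""
    return (len(alphabet) == ALPHABET_SIZE
            and len(set(alphabet)) == ALPHABET_SIZE
            and all(is_clean_candidate(s) for s in alphabet))
```

For example, a heart followed by U+FE0F fails the check because it is two codepoints, which is exactly the kind of sequence the alphabet excludes.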

Code ranges

The 1024 values are disjoint ranges, decoded by position:

Canonical encoder

The canonical encoder pairs characters greedily from left to right, two per emoji. A singleton is emitted only at a forced boundary or for a final odd character. A digit terminates the current cluster and is emitted as its own digit emoji.
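A minimal sketch of the greedy pairing rule follows. It emits text fragments rather than emoji, because mapping a fragment to an emoji depends on the code ranges. It also assumes that the only forced boundary is an upcoming digit; any other boundary rules (separators, for instance) are not modelled. `canonical_fragments` is a hypothetical name.

```python
DIGITS = set("0123456789")


def canonical_fragments(name: str) -> list[str]:
    """Split a station name into pair, singleton, and digit fragments.

    Greedy, left to right:
      - a digit is always emitted on its own and ends the current cluster;
      - otherwise two characters are paired when the next one is not a digit;
      - a singleton appears only before a digit or at the end of the input.
    """
    fragments = []
    i = 0
    while i < len(name):
        ch = name[i]
        if ch in DIGITS:
            fragments.append(ch)
            i += 1
        elif i + 1 < len(name) and name[i + 1] not in DIGITS:
            fragments.append(name[i:i + 2])
            i += 2
        else:
            fragments.append(ch)
            i += 1
    return fragments


print(canonical_fragments("myproject"))  # ['my', 'pr', 'oj', 'ec', 't']
print(canonical_fragments("abc1d"))      # ['ab', 'c', '1', 'd']
```

Because every choice is forced by the input, a given name has exactly one fragment sequence under these rules, which is the property a canonical encoder needs.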

The decoder rejects non-canonical input

The decoder maps each emoji to its fixed fragment by range, then re-encodes canonically and rejects the input if the result differs. This is what makes the emoji → text → emoji round-trip hold: a non-canonical stream is invalid input, never something silently accepted.
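The decode-then-re-encode check can be sketched generically. The function below works on already-mapped text fragments and takes the canonical encoder as a parameter. `decode_strict`, `NonCanonicalError`, and the stand-in encoder `toy_pairs` are hypothetical names, and `toy_pairs` is deliberately simpler than the real pairing rules.

```python
from typing import Callable, List, Sequence


class NonCanonicalError(ValueError):
    """The fragment stream is not the canonical encoding of its text."""


def decode_strict(fragments: Sequence[str],
                  canonical_encode: Callable[[str], List[str]]) -> str:
    """Join fragments into text, then reject unless re-encoding matches."""
    text = "".join(fragments)
    if list(canonical_encode(text)) != list(fragments):
        raise NonCanonicalError(f"non-canonical stream for {text!r}")
    return text


def toy_pairs(text: str) -> List[str]:
    # Stand-in encoder for the demo only: plain two-character chunks.
    return [text[i:i + 2] for i in range(0, len(text), 2)]


print(decode_strict(["my", "pr", "oj", "ec", "t"], toy_pairs))  # myproject

try:
    # Same text, but split as singletons where a pair was possible.
    decode_strict(["m", "y", "pr", "oj", "ec", "t"], toy_pairs)
except NonCanonicalError as exc:
    print("rejected:", exc)
```

Rejecting rather than normalising means each address has exactly one accepted emoji spelling, so two different emoji strings cannot both decode to the same text.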

Stream layout

The stream is [brand][route?][station…][type][identity×5], decoded right-anchored:

The identity is fixed at exactly five emoji, which is what makes the right-anchored decode unambiguous.
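A right-anchored split might look like the sketch below. It assumes the stream has already been broken into single symbols, and that at least one station symbol is present. It leaves the optional route and the station together in one field, since telling them apart depends on the code ranges. `StreamParts` and `split_stream` are hypothetical names.

```python
from typing import NamedTuple, Sequence, Tuple

IDENTITY_LEN = 5


class StreamParts(NamedTuple):
    brand: str
    body: Tuple[str, ...]      # optional route plus station, not separated here
    type_: str
    identity: Tuple[str, ...]  # always exactly five symbols


def split_stream(symbols: Sequence[str]) -> StreamParts:
    """Split [brand][route?][station...][type][identity x5] from the right."""
    symbols = tuple(symbols)
    # brand + at least one station symbol (assumption) + type + identity
    minimum = 1 + 1 + 1 + IDENTITY_LEN
    if len(symbols) < minimum:
        raise ValueError(f"stream too short: need at least {minimum} symbols")
    return StreamParts(
        brand=symbols[0],
        body=symbols[1:-(IDENTITY_LEN + 1)],
        type_=symbols[-(IDENTITY_LEN + 1)],
        identity=symbols[-IDENTITY_LEN:],
    )


# Placeholder letters stand in for emoji in this usage example.
parts = split_stream(list("BRssTiiiii"))
print(parts.brand, parts.body, parts.type_, parts.identity)
```

Counting from the right works only because the identity length never varies: the variable-length station can then take whatever remains between the brand and the type.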

Fixed-length trade-off

Because the emoji identity cannot grow, two folios in one station that collide on 50 bits share an identical emoji address. Such folios fall back to the text address with the full hash, which always works. Collisions are improbable in practice: the 50% birthday point for a 50-bit space is roughly 40 million folios in a single station, and the fallback is graceful.
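The birthday estimate can be reproduced with the standard approximation p ≈ 1 − exp(−n(n−1)/(2N)), where N = 2^50 is the number of possible identities and n is the number of folios in one station. The sketch assumes folio prefixes are uniformly distributed, which is what a cryptographic hash is expected to provide; the function names are hypothetical.

```python
import math

IDENTITY_BITS = 50


def collision_probability(n: int, bits: int = IDENTITY_BITS) -> float:
    """Approximate chance that at least two of n folios share a prefix."""
    space = 2 ** bits
    return -math.expm1(-n * (n - 1) / (2 * space))


def folios_for_probability(p: float, bits: int = IDENTITY_BITS) -> float:
    """Approximate folio count at which the collision chance reaches p."""
    space = 2 ** bits
    return math.sqrt(2 * space * math.log(1 / (1 - p)))


print(f"{folios_for_probability(0.5):,.0f}")     # about 39.5 million
print(f"{collision_probability(1_000_000):.6f}")  # one million folios
```

At one million folios in a station the collision chance is still well under one in a thousand, which is why the fixed length is an acceptable trade.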

Status

The encoding's constraints are fixed. Three items of work remain: the curation of the final 1024-emoji alphabet, the locking of the encoding specification, and the rendering fallback.