Codecs & canonical content

The codec seam

Prism Core is pure. It defines a container — a tree of path-addressed prims holding typed values, wired by connections — and a single seam, the codec, for translating that container to and from foreign formats. The native .prisma text and .prism binary live in the core; every other format is a codec, implemented in a separate library so the core never learns a file format.

A codec is bidirectional, and both directions are fallible:

#include <optional>
#include <string>
#include <string_view>

// Document is Prism Core's container type.
struct Codec {
  std::string name() const;
  std::optional<std::string> encode(const Document&) const;    // Document → bytes, or nullopt
  std::optional<Document> decode(std::string_view) const;      // bytes → Document, or nullopt
};

encode may decline a Document it cannot represent: asking a plain-text codec to write an arbitrary scene returns nothing rather than garbage. That fallibility is what lets the CLI dispatch blindly over a registry of codecs and still fail cleanly.
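As a rough illustration of that clean failure, here is a minimal sketch of dispatching over a registry by codec name. Everything in it is hypothetical: `Document` is a stand-in with an `is_scene` flag, `AnyCodec` is a type-erased wrapper that only mirrors the name/encode/decode shape shown above, and `encode_with` is not a Prism function. How the real CLI selects a codec is not described here.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Stand-in for Prism's Document (assumption: the real type is a prim tree).
struct Document {
  std::string text;
  bool is_scene = false;  // true for content plain text cannot express
};

// Hypothetical type-erased codec with the same three operations.
struct AnyCodec {
  std::string name;
  std::function<std::optional<std::string>(const Document&)> encode;
  std::function<std::optional<Document>(std::string_view)> decode;
};

// Toy plain-text codec: declines scenes instead of emitting garbage.
AnyCodec make_plain_text_codec() {
  return AnyCodec{
      "text",
      [](const Document& doc) -> std::optional<std::string> {
        if (doc.is_scene) return std::nullopt;
        return doc.text;
      },
      [](std::string_view bytes) -> std::optional<Document> {
        return Document{std::string(bytes), false};
      }};
}

// Look the codec up by name and let it decide. A missing codec and a
// declined Document both surface as nullopt, so the caller has one
// failure path to report.
std::optional<std::string> encode_with(const std::vector<AnyCodec>& registry,
                                       std::string_view name,
                                       const Document& doc) {
  for (const AnyCodec& codec : registry) {
    if (codec.name == name) return codec.encode(doc);
  }
  return std::nullopt;
}
```

The caller never needs to know why a codec said no; it checks the optional and reports the failure.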

Canonical content

The interesting part is what the codecs aim at. A codec does not translate a file straight into ad-hoc prims; it targets a canonical model for a kind of content, and that model is shared by every codec of its kind. The Markdown, HTML and plain-text codecs all read and write the same document model. An image codec for PNG and one for OpenEXR would target the same image model.

That shared target is the whole point. It is why a .png and an .exr would land in the same structure inside Prism, and why Prism can sit in the middle of a conversion as a neutral form:

prism convert notes.md   notes.html     # markdown → document model → html
prism convert data.json  data.prisma    # json → a navigable Prism document

Convert notes.md to notes.html and the two files parse to the identical document model — same headings, same lists, same links — because both codecs speak it. There is no per-pair adapter; there is one model and a codec on each side.
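The "one model, a codec on each side" idea can be sketched in a few lines. This is a toy under stated assumptions, not Prism's implementation: `DocModel` is a hypothetical two-block model (headings and paragraphs only), `decode_markdown` and `encode_html` are deliberately tiny, and `convert` is an illustrative helper name.

```cpp
#include <optional>
#include <sstream>
#include <string>
#include <string_view>
#include <vector>

// Toy document model: a flat list of blocks (hypothetical, far smaller
// than a real canonical document model).
struct Block {
  enum class Kind { Heading, Paragraph } kind;
  std::string text;
};
using DocModel = std::vector<Block>;

// Toy Markdown reader: "# x" is a heading, any other non-empty line
// is a paragraph.
std::optional<DocModel> decode_markdown(std::string_view bytes) {
  DocModel model;
  std::istringstream in{std::string(bytes)};
  std::string line;
  while (std::getline(in, line)) {
    if (line.empty()) continue;
    if (line.rfind("# ", 0) == 0) {
      model.push_back({Block::Kind::Heading, line.substr(2)});
    } else {
      model.push_back({Block::Kind::Paragraph, line});
    }
  }
  return model;
}

// Toy HTML writer over the same model (no escaping; illustration only).
std::optional<std::string> encode_html(const DocModel& model) {
  std::string out;
  for (const Block& block : model) {
    const char* tag = block.kind == Block::Kind::Heading ? "h1" : "p";
    out += "<" + std::string(tag) + ">" + block.text + "</" + tag + ">\n";
  }
  return out;
}

// Conversion is decode on one side and encode on the other. Nothing in
// this function knows which pair of formats is involved.
template <typename Decode, typename Encode>
std::optional<std::string> convert(std::string_view bytes, Decode decode,
                                   Encode encode) {
  auto model = decode(bytes);
  if (!model) return std::nullopt;
  return encode(*model);
}
```

Adding a third format to this sketch means writing one reader and one writer against `DocModel`; `convert` stays unchanged.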

The content kinds

Each kind of content has one canonical model, and Prism's native shape is the scene/structured kind:

Document — headings, paragraphs, lists, links, emphasis, code, tables, blockquotes. Markdown, HTML, plain text.
Vector — a structural tree of shapes and groups with their attributes. SVG.
Image — typed channels: floating-point, multilayer, arbitrary named channels (the cryptomatte-style AOV case). PNG, EXR.
Scene/structured — Prism's native form, and the JSON mapping: objects and arrays become path-addressed prims so every node is reachable and editable.
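To make "objects and arrays become path-addressed prims" concrete, here is a small sketch that flattens a nested tree into path/value pairs. The `Node` type, the slash-separated path syntax and the use of array indices as path segments are all assumptions made for illustration; Prism's actual path grammar and prim layout are not specified in this section.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical parsed-JSON node: object members and array items are both
// stored as named children (array items use their index as the name).
struct Node {
  std::string name;            // object key, or array index as text
  std::string value;           // meaningful on leaves only
  std::vector<Node> children;  // empty on leaves
};

// Walk the tree and record one path per leaf. In this sketch an empty
// object or array is indistinguishable from a leaf.
void flatten(const Node& node, const std::string& parent,
             std::map<std::string, std::string>& out) {
  const std::string path = parent + "/" + node.name;
  if (node.children.empty()) {
    out[path] = node.value;
    return;
  }
  for (const Node& child : node.children) flatten(child, path, out);
}

// Flatten everything under an unnamed root.
std::map<std::string, std::string> to_paths(const Node& root) {
  std::map<std::string, std::string> out;
  for (const Node& child : root.children) flatten(child, "", out);
  return out;
}
```

Under these assumptions, `{"user": {"name": "Ada", "tags": ["a", "b"]}}` yields the paths `/user/name`, `/user/tags/0` and `/user/tags/1`, each of which can be looked up or edited on its own.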

Animation is not a separate kind. It is the time-sample axis cutting across all of them: an animated GIF is the image kind plus time; a moving camera is the scene kind plus time.

These do not need four unrelated schemas, because they reduce to one substrate. An image channel, a vertex position, a vector control point and a heading's text are all typed array values; layers and AOVs are just named properties — the same mechanism that carries a material parameter. That is the deep reason the canonical models can be shared at all.
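One way to picture that single substrate is a value type that is always a typed array, carried by named properties that may also hold time samples. The sketch below is purely illustrative: `Value`, `Property`, `Prim` and `element_count` are hypothetical names, and the choice of three element types is an assumption, not Prism's actual type list.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <variant>
#include <vector>

// Hypothetical value type: every payload is a typed array.
using Value = std::variant<std::vector<float>, std::vector<int>,
                           std::vector<std::string>>;

// Hypothetical property: a default value plus optional time samples.
struct Property {
  Value value;
  std::map<double, Value> samples;  // empty when the property is not animated
};

// A prim is a bag of named properties; an image channel, a layer and a
// material parameter would all live here under different names.
struct Prim {
  std::map<std::string, Property> properties;
};

// Number of elements in a value, whatever its element type.
std::size_t element_count(const Value& value) {
  return std::visit([](const auto& array) { return array.size(); }, value);
}
```

In this picture an image channel named `R` and a heading's `text` differ only in element type and property name, and animation is the `samples` map filling in on either of them.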

A lossless contract

Every codec is held to a stated contract: a within-format round-trip is the identity — convert doc.md → doc.prisma → doc.md and you get your file back — and cross-format degradation is defined and tabulated, never silent. A construct a target format cannot express degrades in a documented way (an HTML-only island, say, becomes a verbatim raw block on the way to Markdown), so you always know what a conversion will and will not preserve.
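The within-format half of the contract is easy to state as a check: decoding and re-encoding must reproduce the input byte for byte. The sketch below shows such a check against a toy line codec. `round_trips`, `decode_lines` and `encode_lines` are hypothetical names, and the toy codec only loosely imitates the idea of plain text "as lines"; it is not the shipping codec.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

using Lines = std::vector<std::string>;

// Split on '\n' and keep the final segment even when it is empty, so a
// trailing newline (or its absence) survives the round trip.
std::optional<Lines> decode_lines(std::string_view bytes) {
  Lines lines;
  std::size_t start = 0;
  while (true) {
    const std::size_t end = bytes.find('\n', start);
    if (end == std::string_view::npos) {
      lines.emplace_back(bytes.substr(start));
      return lines;
    }
    lines.emplace_back(bytes.substr(start, end - start));
    start = end + 1;
  }
}

// Join with '\n' between segments: the exact inverse of decode_lines.
std::optional<std::string> encode_lines(const Lines& lines) {
  std::string out;
  for (std::size_t i = 0; i < lines.size(); ++i) {
    if (i > 0) out += '\n';
    out += lines[i];
  }
  return out;
}

// The contract as a predicate: encode(decode(bytes)) == bytes.
template <typename Decode, typename Encode>
bool round_trips(std::string_view bytes, Decode decode, Encode encode) {
  auto doc = decode(bytes);
  if (!doc) return false;
  auto again = encode(*doc);
  return again && *again == bytes;
}
```

The detail worth noticing is the trailing newline: a reader that drops the last empty segment would turn `"a\nb"` and `"a\nb\n"` into the same model and could no longer satisfy the identity for both.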

The currently shipping codecs cover JSON, Markdown, HTML, SVG, plain text (as lines) and arbitrary blobs (as bytes) — enough that nothing you hand the CLI is ever stranded, and enough to build these docs entirely out of converted Markdown.
