Codecs
A Document is Kinogaki Core's one native shape: a tree of path-addressed Elements holding typed Values, wired by connections. The native .prisma text and .prism binary serialize that shape directly. Every other format is a codec — a bidirectional translator between foreign bytes and a Document. The codec layer is how a Markdown note, an HTML page, a JSON tree, or an SVG drawing become a Document you can read, query, edit, and emit back out in any format that can represent it.
This is also the layer the rest of Kinogaki renders through. The issue tracker and these docs are authored in Markdown, normalized into the document model, and emitted as the HTML page you are reading. The pipeline is decode → Document → encode, and the Document in the middle is plain, diff-clean, and inspectable.
Orienting example
Read a Markdown file, look at it as a Document, write it back out as HTML:
#include "kinogaki/Codecs.h"
using namespace kinogaki;
Document doc;
doc.load("notes.md"); // Markdown → Document (Codec::Auto picks it from .md)
std::string prisma = doc.toString(); // the document model, as .prisma ASCII
doc.save("notes.html", Codec::Html); // emit the same model as HTML

import kinogaki as kg
doc = kg.Document()
doc.load("notes.md") # Markdown → Document (codec from the extension)
prisma = doc.to_string() # the document model, as .prisma ASCII
doc.save("notes.html", kg.Codec.HTML) # emit the same model as HTML
Markdown and HTML target the same document model, so loading through one codec and saving through another is a real conversion with no per-pair adapter. There is one model and a codec on each side.
The Codec selector
Codec is the one value you pass to every read or write. It names the format. Auto resolves from a file's extension on load/save, or sniffs the native bytes on decode.
enum class Codec { Auto, Prism, PrismBinary, Json, Markdown, Html, Svg, Text, Blob };

class Codec(enum.IntEnum):
    AUTO = 0
    PRISM = 1 # .prisma ASCII
    PRISM_BINARY = 2 # .prism binary
    JSON = 3
    MARKDOWN = 4
    HTML = 5
    SVG = 6
    TEXT = 7
    BLOB = 8
The integer values match kinogaki::Codec and the C ABI, so the Python IntEnum is the same selector across the wall.
| Codec | Format | Target model | Notes |
| --- | --- | --- | --- |
| Auto | by extension / sniff | n/a | the default for load and save |
| Prism | .prisma ASCII | native | the default for toString |
| PrismBinary | .prism binary | native | the only codec that honors compress |
| Json | JSON | scene/structured | encode renders any Document |
| Markdown | Markdown | document | CommonMark common-core subset |
| Html | HTML | document | superset of the Markdown subset |
| Svg | SVG | vector | structural element tree, not a blob |
| Text | plain text | document (text) | lossless lines, byte-exact round-trip |
| Blob | arbitrary bytes | document (blob) | lossless bytes, byte-exact round-trip |
codecName, codecByName, and codecForPath map between a Codec and its name/extension:
const char* codecName(Codec codec); // "markdown", "html", …
std::optional<Codec> codecByName(std::string_view name); // "markdown"/"md"/"json"/… → Codec
Codec codecForPath(std::string_view path); // by file extension; Blob fallback
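The extension lookup can be pictured as a small table with a Blob fallback. The sketch below is illustrative only, not the library's implementation: the Codec values mirror the enum above, the .md, .markdown, .html, .htm, .svg, .prisma, and .prism entries come from this page, while the .json and .txt entries and the case-insensitive match are assumptions. The name codec_for_path is hypothetical.

```python
import enum
import os


class Codec(enum.IntEnum):
    """Mirror of the selector values listed above."""
    AUTO = 0
    PRISM = 1
    PRISM_BINARY = 2
    JSON = 3
    MARKDOWN = 4
    HTML = 5
    SVG = 6
    TEXT = 7
    BLOB = 8


# Extension table. ".json" and ".txt" are assumed; the rest appear on this page.
_BY_EXTENSION = {
    ".prisma": Codec.PRISM,
    ".prism": Codec.PRISM_BINARY,
    ".json": Codec.JSON,
    ".md": Codec.MARKDOWN,
    ".markdown": Codec.MARKDOWN,
    ".html": Codec.HTML,
    ".htm": Codec.HTML,
    ".svg": Codec.SVG,
    ".txt": Codec.TEXT,
}


def codec_for_path(path):
    """Hypothetical stand-in for codecForPath: extension lookup, Blob fallback."""
    extension = os.path.splitext(path)[1].lower()  # lowercasing is an assumption
    return _BY_EXTENSION.get(extension, Codec.BLOB)
```

Under these assumptions an unknown or missing extension never fails; it resolves to the byte-exact Blob wrapper.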
Encode and decode
The two in-memory primitives convert bytes to and from a Document. Both directions are fallible.
std::optional<Document> decode(std::string_view bytes, Codec codec, ParseError* err = nullptr);
std::optional<std::string> encode(const Document& doc, Codec codec, bool compress = false);
decode normalizes foreign bytes (or native .prisma/.prism) into a Document. On malformed input it returns nullopt and fills err with a located diagnosis — codecs fail closed, they never hand back a half-parsed Document. encode renders a Document as the codec's bytes, and may decline: asking a text codec to write an arbitrary scene returns nullopt rather than garbage. That declining is what lets a tool dispatch blindly over the codec set and still fail cleanly. compress applies to PrismBinary only.
def decode(data: "str | bytes", codec: Codec) -> Document # raises PrismError if it can't parse
def encode(doc: Document, codec: Codec, *, compress=False) -> "str | bytes" # raises if it declines
In Python the fallibility surfaces as a raised PrismError instead of nullopt. Text codecs return str; PRISM_BINARY and BLOB return bytes.
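To illustrate why a declining encode makes blind dispatch safe, here is a minimal self-contained sketch. Everything in it is a stand-in: PrismError is redefined locally so the block runs without kinogaki, toy_encode is a hypothetical encoder working on a plain dict, and the codec names are plain strings rather than the real Codec enum.

```python
import json


class PrismError(Exception):
    """Local stand-in for kinogaki's PrismError so the sketch runs on its own."""


def toy_encode(doc, codec):
    """Hypothetical encoder: JSON always succeeds, SVG needs an svg root."""
    if codec == "json":
        return json.dumps(doc)
    if codec == "svg":
        if doc.get("root") != "svg":
            raise PrismError("svg: document has no /svg root")
        return "<svg/>"
    raise PrismError(codec + ": unknown codec")


def encode_everywhere(doc, codecs):
    """Try every codec; record the output, or None where the codec declined."""
    results = {}
    for codec in codecs:
        try:
            results[codec] = toy_encode(doc, codec)
        except PrismError:
            results[codec] = None  # a decline is an answer, not a crash
    return results


results = encode_everywhere({"root": "document"}, ["json", "svg"])
```

Here the JSON entry holds the rendered string and the SVG entry is None, so the caller can tell which formats can represent the document without inspecting it first.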
The Document methods
The everyday surface is the methods on a Document — load, loadString, toString, save. They wrap decode/encode with file I/O and an Auto-by-extension default. load/loadString replace the document's contents.
bool load(const std::string& path, Codec codec = Codec::Auto); // Auto → codec by extension
bool loadString(std::string_view bytes, Codec codec);
std::string toString(Codec codec = Codec::Prism) const; // .prisma ASCII by default; any codec otherwise
bool save(const std::string& path, Codec codec = Codec::Auto) const; // Auto → format by extension

def load(self, path: str, codec: Codec = Codec.AUTO) -> bool # replaces contents; False on failure
def load_string(self, data: "str | bytes", codec: Codec) -> bool # replaces contents; False on failure
def to_string(self, codec: Codec = Codec.PRISM) -> "str | bytes" # raises if the codec declines
def save(self, path: str, codec: Codec = Codec.AUTO) -> bool # False on write failure or decline
load and save return false (Python: False) on a read/parse/write failure or a declined encode. toString raises in Python if the codec declines the document; in C++ a declining codec yields an empty string.
A worked example: Markdown → Document → HTML
Start with a Markdown note:
# Notes
A paragraph with *emphasis*.
load lands it on the document model, a tree of typed Elements. A document's blocks and inline runs are ordered body content, so they are nameless Elements (/document/[0], [1], …): their order is their identity, and the file reads as the structure itself.
Document doc;
doc.load("notes.md"); // Markdown → Document
std::string prisma = doc.toString(); // the document model, as .prisma ASCII

doc = kg.Document()
doc.load("notes.md")
prisma = doc.to_string()
The Document, written as .prisma:
#prisma 3.0
def document "document" {
    def heading {
        int32 level = 1
        def text {
            str text = "Notes"
        }
    }
    def paragraph {
        def text {
            str text = "A paragraph with "
        }
        def emphasis {
            def text {
                str text = "emphasis"
            }
        }
        def text {
            str text = "."
        }
    }
}
Save that same Document as HTML. The HTML codec walks the shared model straight through:
doc.save("notes.html", Codec::Html);

doc.save("notes.html", kg.Codec.HTML)

<h1>Notes</h1>
<p>A paragraph with <em>emphasis</em>.</p>
The heading, paragraph, emphasis, and text Elements are the model both codecs speak: load through one, save through the other.
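The idea of one model with a codec on each side can be sketched without the library. The block below is a toy, not Kinogaki's encoder: the model is a hypothetical nested-dict stand-in for the Elements shown above, it handles only the four kinds in this example, and the Markdown emitter does no escaping.

```python
import html


def text(value):
    """Build a text leaf in the toy model."""
    return {"type": "text", "text": value, "children": []}


# The worked example's document, as plain dicts (a stand-in for Elements).
DOC = {"type": "document", "children": [
    {"type": "heading", "level": 1, "children": [text("Notes")]},
    {"type": "paragraph", "children": [
        text("A paragraph with "),
        {"type": "emphasis", "children": [text("emphasis")]},
        text("."),
    ]},
]}


def to_html(node):
    """Walk the model and emit HTML."""
    kind = node["type"]
    if kind == "text":
        return html.escape(node["text"])
    if kind == "document":
        return "\n".join(to_html(child) for child in node["children"])
    inner = "".join(to_html(child) for child in node["children"])
    if kind == "heading":
        return "<h{0}>{1}</h{0}>".format(node["level"], inner)
    if kind == "paragraph":
        return "<p>" + inner + "</p>"
    if kind == "emphasis":
        return "<em>" + inner + "</em>"
    raise ValueError("not modelled in this sketch: " + kind)


def to_markdown(node):
    """Walk the same model and emit Markdown (no escaping in this sketch)."""
    kind = node["type"]
    if kind == "text":
        return node["text"]
    if kind == "document":
        return "\n\n".join(to_markdown(child) for child in node["children"])
    inner = "".join(to_markdown(child) for child in node["children"])
    if kind == "heading":
        return "#" * node["level"] + " " + inner
    if kind == "paragraph":
        return inner
    if kind == "emphasis":
        return "*" + inner + "*"
    raise ValueError("not modelled in this sketch: " + kind)
```

Both emitters read the same tree and neither knows the other exists, which is the property that removes the need for a per-pair adapter.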
The content vocabulary
A codec does not invent ad-hoc Elements per file. It targets a canonical model for a kind of content, and that model is shared by every codec of its kind. The document model below is shared by the Markdown and HTML codecs, which is exactly why md → Document → html works.
The document is rooted at /document. Its blocks are ordered anonymous children, and a block's inline content is likewise ordered anonymous children, so the tree carries the meaning with no "0"/"1" noise.
Block kinds:
- document: the root.
- heading: int level (1–6); optional str id (used by the docs for anchors). Children are inline.
- paragraph: a run of inline content.
- list: bool ordered; children are items.
- item: a list entry; children are blocks (lists nest).
- codeblock: str language, str text. A verbatim fenced block.
- blockquote: children are blocks.
- thematicBreak: a horizontal rule, no children.
Inline kinds:
- text: str text. The leaf that carries characters.
- emphasis: *em* / <em>; children are inline.
- strong: **strong** / <strong>; children are inline.
- code: str text. Inline code span.
- link: str href; children are inline (the link text).
- image: str src, str alt.
- linebreak: a hard break, no children.
HTML is a superset of the Markdown subset, so any model from a Markdown document renders to clean HTML. The HTML codec also carries the document-v1 constructs these docs use beyond CommonMark core: heading.id (<h* id>), definitionList/term/definition (<dl>/<dt>/<dd>), figure/caption (<figure>/<figcaption>), note (<div class="note">), and a foreign <svg> island kept verbatim as a single rawHtml block — the HTML-only escape hatch. Unknown tags are flattened, their text kept. Markdown constructs outside the subset (tables, raw HTML, reference links) are not modelled.
The other kinds
Each kind of content has one canonical model; Prism's native shape is the scene/structured kind.
- Scene/structured (Json): JSON maps hierarchically. An object becomes an element of type object (each scalar member a property, each object/array member a child element); an array becomes an element of type array keyed by index; numbers → float64, strings → str, bools → bool, null → an empty str[]. Rooted at /root. encode always succeeds: it renders any Document as JSON.
- Vector (Svg): a structural element tree, not a blob. Each SVG element becomes an Element whose type is its tag (svg, g, rect, path, text, …), attributes become string properties (viewBox, d, cx, …), children become ordered children, and character data becomes a #text element. Rooted at /svg; encode declines a Document with no /svg root.
- Lossless wrappers (Text, Blob): a plain-text file becomes a /document of type text holding lines : str[] split on \n; any bytes become a /document of type blob holding data : u8[]. Both round-trip byte-for-byte. encode declines a Document that is not a document of that kind.
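The JSON mapping rules above can be sketched in a few lines. This is an illustration under stated assumptions, not the codec itself: the dict layout with "type", "properties", and "children" keys is hypothetical, and storing an array's scalar items as index-keyed properties is an assumption, since the text only says an array is keyed by index.

```python
import json


def scalar(value):
    """Map a JSON scalar to a (type, value) pair following the rules above."""
    if value is None:
        return ("str[]", [])             # null becomes an empty str[]
    if isinstance(value, bool):          # test bool first: bool is an int subclass
        return ("bool", value)
    if isinstance(value, (int, float)):
        return ("float64", float(value))
    if isinstance(value, str):
        return ("str", value)
    raise TypeError("not a JSON scalar: %r" % (value,))


def to_element(value):
    """Turn a parsed JSON object or array into a toy element tree."""
    if isinstance(value, dict):
        kind, items = "object", list(value.items())
    elif isinstance(value, list):
        kind, items = "array", [(str(i), v) for i, v in enumerate(value)]
    else:
        raise TypeError("root must be an object or an array")
    element = {"type": kind, "properties": {}, "children": {}}
    for key, member in items:
        if isinstance(member, (dict, list)):
            element["children"][key] = to_element(member)   # nested container
        else:
            element["properties"][key] = scalar(member)     # typed scalar
    return element


root = to_element(json.loads(
    '{"name": "cube", "size": 2, "visible": true, "tags": ["a", null]}'
))
```

In this toy tree, size lands as a float64 property on the root while tags becomes a child element of type array.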
All kinds reduce to the same substrate, which is what lets one Document hold any of them: an image channel, a vertex position, a vector control point, and a heading's text are all typed array Values, and layers and AOVs are named properties, the same mechanism that carries a material parameter.
Bundle
A Document is already a path-addressed hierarchy, which is exactly what a directory tree is — so a folder of files is a Document. Bundle is the filesystem convention on top, letting a whole site or project pack into one Document and ship as one compressed .prism.
- a folder element is a directory;
- a file is a typed content element whose name property holds the real filename, and whose type selects how it is stored: a document or svg file keeps a structural child subtree (the decoded content: live, diffable, linkable); any other file is a blob holding raw bytes in a data u8[] (byte-exact for a PNG or a stylesheet).
Filenames are data, not path segments: a path segment is [A-Za-z0-9_] and dotted names do not survive the binary crate, so each element carries a safe segment plus its name. The render-as rule keeps storage and presentation separate: a file's type is the storage truth, the name's extension is the materialise target. A document named page.html materialises as HTML; the same content named page.md materialises as Markdown.
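One way to picture the split between a safe segment and the real filename is the sketch below. The sanitising scheme (replace every character outside [A-Za-z0-9_] with an underscore) is an assumption chosen for illustration; the page does not say how Bundle derives segments, and this version does not resolve collisions such as a.b versus a_b. The names safe_segment and file_element are hypothetical.

```python
import re


def safe_segment(filename):
    """Derive a path segment limited to [A-Za-z0-9_] (assumed scheme)."""
    segment = re.sub(r"[^A-Za-z0-9_]", "_", filename)
    return segment or "_"  # never return an empty segment


def file_element(filename, kind):
    """Toy file element: the segment addresses it, the name keeps the truth."""
    return {"segment": safe_segment(filename), "type": kind, "name": filename}


index = file_element("index.html", "document")
```

The element is addressed by index_html, while the name property still holds index.html for materialising.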
#include "kinogaki/codecs/Bundle.h"
using namespace kinogaki::codecs;
Document site;
Path root = bundle::root();
Path pages = bundle::addFolder(site, root, "pages");
bundle::addFile(site, pages, "index.html", htmlBytes); // kind chosen from the extension
bundle::addFile(site, root, "style.css", cssBytes); // → an opaque blob
std::optional<std::string> html = bundle::materializeFile(site, /* the index path */);
addFile picks the kind from the extension (html/htm/md/markdown → a document decoded via the matching codec; svg → a structural svg; anything else → a blob). materializeFile reads a file element back to bytes, honoring the render-as rule by choosing the codec from the file's name extension. The functions are filesystem-free and pure — they operate on a Document, and a tool walks the real directory and calls them.
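The two extension decisions described above, storage kind on the way in and output codec on the way out, can be sketched as two small functions. These are hypothetical helpers, not Bundle's API: kinds and codecs are plain strings here, and raising on a document whose name has no document extension is an assumption about a case the text does not cover.

```python
import os

HTML_EXTENSIONS = {"html", "htm"}
MARKDOWN_EXTENSIONS = {"md", "markdown"}


def extension_of(filename):
    """Lowercased extension without the dot ('' when there is none)."""
    return os.path.splitext(filename)[1].lstrip(".").lower()


def kind_for_name(filename):
    """Storage kind chosen from the extension, as addFile is described to do."""
    ext = extension_of(filename)
    if ext in HTML_EXTENSIONS or ext in MARKDOWN_EXTENSIONS:
        return "document"
    if ext == "svg":
        return "svg"
    return "blob"


def materialise_codec(kind, filename):
    """Render-as rule: the type is storage truth, the name picks the output."""
    if kind == "blob":
        return "blob"
    if kind == "svg":
        return "svg"
    ext = extension_of(filename)
    if ext in HTML_EXTENSIONS:
        return "html"
    if ext in MARKDOWN_EXTENSIONS:
        return "markdown"
    raise ValueError("no document codec for name: " + filename)  # assumed behaviour
```

The same stored document therefore materialises as HTML under the name page.html and as Markdown under page.md, without its type changing.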
A lossless contract
Every codec is held to a stated contract. A within-format round-trip is the identity: doc.md → doc.prisma → doc.md returns your file. Cross-format degradation is defined and tabulated — a construct a target format cannot express degrades in a documented way (an HTML-only island becomes a verbatim raw block on the way to Markdown), so you always know what a conversion preserves. The shipping codecs cover JSON, Markdown, HTML, SVG, plain text, and arbitrary blobs — enough that anything you hand the tooling converts cleanly, and enough to build these docs entirely out of converted Markdown.
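The within-format identity is easy to check mechanically. The sketch below uses toy versions of the Text and Blob wrappers described under "The other kinds", written as plain functions over dicts; they are stand-ins for illustration, not the shipping codecs, and the text variant works on str rather than raw bytes.

```python
def text_decode(data):
    """Toy Text wrapper: lines split on the newline character."""
    return {"type": "text", "lines": data.split("\n")}


def text_encode(doc):
    """Rejoin the lines; decline (None) if this is not a text document."""
    if doc.get("type") != "text":
        return None
    return "\n".join(doc["lines"])


def blob_decode(data):
    """Toy Blob wrapper: every byte kept as an integer."""
    return {"type": "blob", "data": list(data)}


def blob_encode(doc):
    """Rebuild the bytes; decline (None) if this is not a blob document."""
    if doc.get("type") != "blob":
        return None
    return bytes(doc["data"])


# Round-trip samples, including the awkward cases: empty input, a trailing
# newline, blank lines, and a stray carriage return.
TEXT_SAMPLES = ["", "one line", "trailing newline\n", "a\n\nb\r\nc"]
BLOB_SAMPLES = [b"", b"\x00\xff\x10", bytes(range(256))]

text_ok = all(text_encode(text_decode(s)) == s for s in TEXT_SAMPLES)
blob_ok = all(blob_encode(blob_decode(b)) == b for b in BLOB_SAMPLES)
```

Splitting on a separator and joining on the same separator is an exact inverse, which is why the trailing-newline and empty-file cases survive unchanged.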