The Protocol That Reproduced Its Own Failure

I was sitting in front of a 3,000-line conversation log, trying to find the exact moment I’d decided to restructure the entire site taxonomy. I knew the decision was in there. I’d been working through it for hours the night before, and somewhere around line 1,800 something clicked. The taxonomy had to organize by what the visitor needs, not by what I built. That was the turn. By morning the thread was closed and the turn was gone.

This kept happening. The notes survived fine. The decisions themselves survived too, mostly. I could usually reconstruct what I’d decided. What disappeared was the moment the understanding shifted. The texture of realizing something, the specific sequence of thoughts that led to it, the thing that made the decision feel right instead of arbitrary. That’s what evaporated when the context window closed.

So I built SavePoint. A markup format for marking, inline, the exact point where understanding shifted. Drop it in the middle of dense thinking. Come back later. Find your way in.

v1.0 was YAML frontmatter. A category field, a timestamp, and then freeform narrative content underneath. Write as much as you need to capture the moment.

It was exactly wrong.

The freeform content drifted into journaling. Instead of marking the inflection point, I was writing reflective summaries about what the session had covered. Paragraphs of context. Thoughtful, well-organized paragraphs that described the general territory of the thinking without preserving the specific thing that changed. The protocol I built to capture crystallized thought was flattening the very thing it was supposed to hold.

I didn’t catch it immediately. The savepoints looked good. They were readable, organized, thorough. I only noticed the failure when I tried to use them months later. I went back looking for a specific decision about how evaluation lenses should interact with each other. The savepoint for that session gave me a paragraph about “exploring multi-lens evaluation approaches.” Completely useless. The paragraph told me what I’d been thinking about. It told me nothing about what I’d realized.

This is the failure mode I now call semantic flattening. The information is present but the meaning is gone. A description of the territory instead of the coordinates. Every detail technically accurate and completely unhelpful for finding the thing you actually need.

v2.0 tried to fix it with structure. Triple-pipe attribute blocks, explicit fields for category and function and timestamp, then a <Savepoint> content block that was still open-ended. You could still write paragraphs. That was still the problem. The content block invited narrative, and narrative is what flattens.

The real fix came from a different instinct entirely. I’d spent years writing CSS selectors and HTML attributes. Self-closing tags. Atomic. Parseable. One line of content, forced precision. v3.0 applied that thinking to thought preservation.

<Savepoint
  protocol_version:3.0
  category:system_logic
  function:declaration
  timestamp:2025-04-08T15:43:00Z
  # Recursive structures should replace version snapshots wherever drift is likely.
/>

One line after the # marker. One. If you can’t say it in one line, you haven’t crystallized it yet. The constraint forced the precision that the open-ended format had prevented. The thing about a single line is that you have to know what you mean before you write it. Paragraphs let you figure it out as you go. One line demands that the figuring-out already happened.

Three versions to get from “preserve the moment” to a syntax that actually preserved it. And the core problem across all three was the same problem I keep finding in governance work: the gap between what a system is supposed to do and what it actually incentivizes.

v1.0 was supposed to capture the turning point. What it actually incentivized was reflection. Write as much as you want, in whatever form feels natural. Of course it drifted into journaling. Journaling is what feels natural. The protocol was optimized for the comfort of the person writing the savepoint, when it should have been optimized for the person (or the model) retrieving it later.

That distinction, capture comfort versus retrieval utility, turned out to be the entire design problem. Every version after v1.0 moved further from what feels good to write and closer to what actually works when you come back looking for something. The one-line constraint feels restrictive in the moment. You want to write more. You want to explain. That impulse is the exact thing the protocol has to resist, because the explanation is what dilutes the mark.

v3.1 added project and keywords fields after the archive grew past 50,000 documents and finding “the savepoint about plate boundaries” meant scanning everything. v3.2 added a context field after I found a savepoint that got me to the right neighborhood but couldn’t reconstruct what had actually been decided there. Each revision responded to a specific retrieval failure. The protocol is shaped by retrieval, not capture. Every field earned its place by failing without it.

I think about v1.0 a lot. A protocol designed to prevent information loss that was itself losing information. The system designed to detect the problem had the problem. It took three versions and months of failed retrievals to see it clearly, because the savepoints looked competent. They were well-written. They covered the right topics. They just didn’t do the one thing they existed to do.

The version that works is the one that constrains the maker the most. One line. Forced precision. No room to drift into comfortable narrative. The protocol works because it’s harder to write, and it’s harder to write because it demands that you already know what shifted before you mark it. Everything useful about the protocol flows from that single constraint.