Tokenization: When the System Becomes Teachable
Tacit knowledge stays tacit until you break it into named, discrete, teachable units. The joinery repo has 60+ codified skills. Each one is a tokenized piece of methodology.
I was trying to explain to a Claude session how I evaluate a piece of copy. The specific sequence: what I look at first, what I check second, what passes and what fails and why. I’d done this evaluation hundreds of times. I could do it in my head in about thirty seconds. But explaining it took twenty minutes of circling, backtracking, and restating, because the knowledge was compressed into a single gesture that I’d never decomposed.
That session produced the grip test. A named, discrete, five-point evaluation that checks whether copy lands with a stranger in the first three seconds. Once it had a name and a structure, I could hand it to any agent, any collaborator, any future version of myself, and they could run it without needing the twenty-minute explanation. The knowledge was the same. The accessibility changed completely.
This is what I mean by tokenization. Taking a piece of methodology that lives as tacit skill (the kind of thing a practitioner can do but can’t easily articulate) and breaking it into a named, bounded, teachable unit. A token.
The term comes from how language models process text. A tokenizer breaks continuous language into discrete chunks that the model can operate on. The chunks aren’t meaningful by themselves. They become meaningful through their relationships and their sequence. I’m using the term the same way, applied to methodology instead of language. You take a continuous skill (evaluating copy, assessing brand fidelity, reading a room’s energy) and you break it into discrete operations that can be named, taught, and composed with other operations.
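To make the borrowed term concrete, here's a toy sketch of that operation in Python. It's a stand-in, not how a production subword tokenizer actually works: the point is the shape of the move, continuous input in, discrete named units out.

```python
import re

def tokenize(text: str) -> list[str]:
    """Break continuous text into discrete word and punctuation tokens.

    A deliberately simple illustration: real model tokenizers use learned
    subword vocabularies, but the structural move is the same.
    """
    return re.findall(r"\w+|[^\w\s]", text)

tokens = tokenize("Evaluate the copy, then check the claim.")
# ['Evaluate', 'the', 'copy', ',', 'then', 'check', 'the', 'claim', '.']
```

None of those chunks means much alone; the sentence's meaning lives in their sequence, which is exactly the claim being made about methodology.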
The joinery repository has over 60 codified skills right now. Each one is a tokenized piece of methodology. Some are coordinators that orchestrate multiple operations. Some are atomics that do one specific thing. The grip test is an atomic. The steward coordinator dispatches multiple atomics in sequence. The lens array runs several evaluation frameworks in parallel and surfaces where they agree and disagree.
Here’s what I didn’t expect: the act of tokenization changes the methodology.
When I decomposed the grip test from an intuitive gesture into a five-point checklist, I discovered that step three (checking whether the copy makes a specific claim versus a generic assertion) was doing twice the work of the other steps. It was actually two checks compressed into one. I’d been running both checks simultaneously in my head and experiencing them as a single operation. The tokenization revealed the hidden joint.
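Here's roughly what finding that hidden joint looks like in code, with toy heuristics standing in for the real grip-test criteria. The compressed step only becomes inspectable once each half has its own name.

```python
def makes_a_claim(copy: str) -> bool:
    """First hidden check: does the copy assert anything at all?
    Keyword match is an illustrative stand-in for the real judgment."""
    return any(word in copy.lower() for word in ("will", "cuts", "saves", "doubles"))

def claim_is_specific(copy: str) -> bool:
    """Second hidden check: is the assertion concrete rather than generic?
    Digit detection is, again, only a placeholder."""
    return any(ch.isdigit() for ch in copy)

def step_three(copy: str) -> bool:
    """The original compressed step: two checks run as a single gesture."""
    return makes_a_claim(copy) and claim_is_specific(copy)
```

Decomposed, the two checks can fail independently, so the verdict now says *which* half failed instead of just "didn't land."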
This happens consistently. Every time I take a tacit skill and break it into explicit steps, I find compressions, redundancies, and gaps. The tacit version works because my brain handles the ambiguity. The tokenized version has to be precise enough for a different processor (a collaborator, an AI agent, a future session with no memory of the current one) to execute. That precision requirement forces the methodology to get better.
There’s a direct parallel to atomic design in component architecture. Brad Frost’s model breaks interfaces into atoms, molecules, organisms, templates, and pages. Each atom (a button, an input, a label) is a discrete unit with a defined behavior. You compose atoms into molecules, molecules into organisms, and so on. The power isn’t in any individual atom. It’s in the composability. Because each piece has a clear boundary and a defined interface, you can rearrange them without breaking things.
Skill atomics work the same way. The grip test is an atom. The voice protocol is an atom. The knowledge traversal skill is an atom. The steward coordinator is an organism: it composes multiple atoms into a sequence that produces a complete evaluation. I can swap out atoms. I can add new ones. I can rearrange the sequence. The system is modular because each piece was tokenized with clean boundaries.
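A minimal sketch of that modularity, assuming each atomic honors one narrow interface (copy in, verdict out). The names mirror the skills above; the bodies are invented stand-ins, not the real implementations.

```python
from typing import Callable

# The shared boundary every atom honors: copy in, verdict out.
Atomic = Callable[[str], bool]

def grip_test(copy: str) -> bool:
    return len(copy.split()) <= 25       # placeholder criterion

def voice_protocol(copy: str) -> bool:
    return not copy.isupper()            # placeholder criterion

def knowledge_traversal(copy: str) -> bool:
    return bool(copy.strip())            # placeholder criterion

def steward(copy: str, atomics: list[Atomic]) -> dict[str, bool]:
    """The organism: compose atomics into a sequence, collecting each verdict."""
    return {atomic.__name__: atomic(copy) for atomic in atomics}

# Because every atom shares one boundary, swapping or reordering them
# never requires touching the coordinator:
full = steward("Ship the draft today.", [grip_test, voice_protocol, knowledge_traversal])
lite = steward("Ship the draft today.", [voice_protocol, grip_test])
```

The design choice doing the work here is the uniform interface: the coordinator never knows which atoms it's running, which is what makes them rearrangeable.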
The opposite of tokenization is the master craftsman problem. A master craftsman can produce extraordinary work, but the knowledge lives in their hands, their eye, their accumulated intuition. When they retire or leave, the knowledge goes with them. Apprenticeship is the traditional solution: years of proximity, watching, absorbing. But apprenticeship doesn’t scale, and it doesn’t transfer to contexts the master never encountered.
Tokenization is the structural alternative. You can’t tokenize everything (the master’s taste, their sense of when to break the rules, their ability to see the whole while working the part) but you can tokenize far more than most practitioners realize. The 60+ skills in joinery represent methodology that used to live entirely in my head. Now it lives in files that any session can load, any collaborator can run, and any future project can reference.
The threshold moment is when the system becomes teachable. Before tokenization, the methodology is personal. It works for me because I built it and I carry the full context. After tokenization, the methodology is transferable. Someone (or something) without my context can execute the individual operations and produce results that are recognizably governed by the same principles.
I’m not claiming the tokenized version is as good as the tacit version. It isn’t. The tacit version has nuance, flexibility, and contextual sensitivity that the tokens can’t fully capture. But the tokenized version is deployable at scale, consistent across sessions, and improvable through explicit revision. I can update a skill file and every future execution reflects the update. I can’t update my intuition that cleanly.
The real test is whether the tokenized system produces output that the tacit practitioner recognizes as their own. When I run the full evaluation stack (steward coordinator dispatching the grip test, the voice protocol, the lens array, the knowledge traversal) against a piece of copy, does the result match what I would have concluded through intuition alone?
Most of the time, yes. Sometimes the system catches things I would have missed. Occasionally it misses things I would have caught. The gap between the two is where the next tokenization happens. Every mismatch reveals a piece of tacit knowledge that hasn’t been decomposed yet.
That’s the ongoing work. The methodology gets more teachable every time a tacit operation becomes an explicit token. The system doesn’t replace the practitioner. It makes the practitioner’s methodology available to more contexts than one human can physically occupy.