TOKEN CRUNCH
2026-05-09 · 3 min read

What Token Crunch Is and Why It Exists

Most AI engineering content is written for people who want to feel informed, not for people who need to ship. Token Crunch exists for the second group.

Token Crunch · fast signal

Most of what gets written about AI engineering is written for the wrong person. It’s written for someone who wants to feel up to date — who wants to know that the new frontier model exists and that other people are excited about it. That’s not useless, but it’s not engineering.

Token Crunch is written for the person who has to build something. The senior engineer who’s been told the team is adding LLMs to the product and now has to figure out what that means in practice. The person who read three Reddit threads about a new model, found four contradictory benchmarks, and still doesn’t know whether it’s worth the inference cost. The on-call engineer who wants to understand failure modes before seeing them at 3am.

What we actually do

Every article here starts with a claim. Not a vibe, not an impression — a falsifiable statement about how something works. Before any prose gets written, the pipeline runs experiments to find out whether the claim holds. API calls, code runs, research queries against the published literature. If the experiment refutes the claim, the article is rewritten around the refutation, not the original hypothesis.
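
As a rough illustration of that claim-first flow, here is a minimal Python sketch. It is not the actual pipeline code: the names (Experiment, Verdict, judge, article_thesis) are hypothetical, and two rules in it are assumptions made for the example, namely that a single refuting experiment outweighs any number of supporting ones, and that a claim with no decisive evidence is parked as inconclusive.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Verdict(Enum):
    SUPPORTED = "supported"
    REFUTED = "refuted"
    INCONCLUSIVE = "inconclusive"  # assumption: not described in the article


@dataclass(frozen=True)
class Experiment:
    # Hypothetical record of one experiment: an API call, a code run,
    # or a research query against the published literature.
    kind: str
    supports_claim: Optional[bool]  # None means the run showed nothing either way


def judge(experiments: List[Experiment]) -> Verdict:
    """Assumed rule: any refuting run wins; otherwise one supporting run is enough."""
    outcomes = [e.supports_claim for e in experiments]
    if any(o is False for o in outcomes):
        return Verdict.REFUTED
    if any(o is True for o in outcomes):
        return Verdict.SUPPORTED
    return Verdict.INCONCLUSIVE


def article_thesis(claim: str, experiments: List[Experiment]) -> str:
    """Pick what the article is written around, given the evidence."""
    verdict = judge(experiments)
    if verdict is Verdict.REFUTED:
        # The article is built around the refutation, not the original hypothesis.
        return f"Refuted: {claim}"
    if verdict is Verdict.SUPPORTED:
        return f"Supported: {claim}"
    return f"Needs more experiments: {claim}"
```

The point of the sketch is the ordering: the thesis is an output of the experiments, so no prose can be written before judge has something to return.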

The writing is done by an automated editorial pipeline — Temporal workflows, Gemini models, evidence-backed drafts — reviewed and approved by me before anything goes live. I’m not hiding that. The audience for this publication builds AI systems for a living; they would see through anything that pretended otherwise. The pipeline’s rigour is part of the value proposition: numbers in these articles trace to an experiment result or a cited source, not to a model’s confident confabulation.

Who I am and why I built this

My name is Vinzenz Feenstra. I’ve been building systems professionally since 2005 — antivirus update pipelines at AVG, foreign exchange infrastructure at Barclays, and most recently the in-place upgrade tooling that ships with every supported RHEL release (Leapp — I hold the most commits to that framework because I invented it). Since February 2026 I’ve been a Senior Staff AI Engineer for LLM Applications, which means I spend my days in exactly the environment this publication is aimed at.

I built Token Crunch because I got tired of the same pattern on Reddit and YouTube: everyone immediately jumping to the latest frontier model, expensive and overhyped, while the interesting engineering work was happening with smaller models that nobody bothered to benchmark seriously. I want to run those benchmarks. I want to form opinions that are grounded in evidence rather than launch-day excitement. And I want to give other engineers something concrete to point to when the hype cycle spins up again — which it will, soon, about something else.

The thesis is simple: amplify the signal, mute the noise.

What to expect

Practical essays on engineering tradeoffs. Benchmarks with honest methodology and reproducible setups. Teardowns of agentic tooling — what it claims, what it ships. Field reports from systems under real load. The stuff that matters when you’re trying to make a decision, not when you’re trying to sound informed at a standup.

If you build with LLMs for a living, this is for you. The newsletter goes out when there’s something worth saying.


Find me on LinkedIn, X (@evilissimo), or GitHub (vinzenz).
