txtfold: surfacing signal in large inputs for LLMs
We've been living through the free-lunch era of LLM usage. Three levers are going to define what comes after.
I made a tool called txtfold. It takes a large file (think log files, big JSON dumps, anything with a lot of repetition) and surfaces the interesting bits in a form suited for LLM consumption. It's written in Rust, with a CLI and Python and JS/TS bindings, and it's open source.
I think we’re facing a pivotal moment. We’ve all been enjoying frontier model access at prices that don’t reflect the underlying cost structure, because VC money was paying the difference. That’s about to change¹. The capex is real, the providers are burning money to acquire us as users, and that can’t continue indefinitely.
Because of that, I think we are all going to be learning how to use LLMs more effectively. I can see three levers:
Routing. We are starting to see this, but as of right now, model selection is more or less a config option. The big-model-for-everything pattern is wasteful: a subtask that parses a git commit message and pipes it back to an orchestrator does not need a frontier model. Smaller models aren't just cheaper, either; they're faster and have lower latency variance, which compounds in agentic loops where you're chaining many calls. I expect high-granularity routing, between model sizes, between providers, and between hosted and local, to become a natural part of any coding tool.
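To make that concrete, here is a minimal routing sketch. Everything in it is hypothetical (the `Task` fields, the tier names, the predicates); it only shows the cheapest-tier-first shape I expect such routers to take:

```python
from dataclasses import dataclass

@dataclass
class Task:
    prompt: str
    needs_reasoning: bool
    max_output_tokens: int

# Cheapest tier first; fall through to bigger models only when necessary.
MODEL_TIERS = [
    ("local-small",     lambda t: not t.needs_reasoning and t.max_output_tokens < 256),
    ("hosted-medium",   lambda t: not t.needs_reasoning),
    ("hosted-frontier", lambda t: True),  # catch-all
]

def route(task: Task) -> str:
    for model, accepts in MODEL_TIERS:
        if accepts(task):
            return model
    return MODEL_TIERS[-1][0]

# The commit-message-parsing subtask never touches a frontier model:
print(route(Task("parse this git commit message", False, 64)))  # local-small
```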
Caching. My intuition right now is that by far the most tokens are wasted by re-scanning the same parts of the code base when starting new tasks. I know that Claude Code does some caching internally, but I am thinking along the lines of “manual caching”: adding English prose in the form of comments and markdown files everywhere. I go into some detail about this in “The Robot is not a Junior Developer. It’s a Senior Developer caught in Groundhog Day.” You can see how I do this in practice in the root AGENTS.md file. It’s also why I recently upgraded Kanel to support generating markdown documentation of a PostgreSQL database, so the canonical description of the schema exists once, in a stable form.
Trimming. Tools like rtk help reduce the number of tokens spent on output from tool calls. Speaking like a caveman can apparently help, though I haven’t tried that myself. And this is the lever txtfold pulls: reducing large bodies of text to something a model can actually work with.
When I started creating txtfold, I had a vision of compression in mind. I'd hit two situations that motivated it: a pile of large log files, and a 40MB JSON file. I knew that they would contain mostly repetitive data, with a few interesting outliers. I also knew that they were too large for the context window of any LLM, so even though I could just point a model at them, I didn’t trust what it would actually do.
That led me to thinking about something like RLE or LZW, but producing text. Such algorithms are, after all, designed exactly for contracting repetition, and they do it without losing anything. This turned out to be a dead end: the output was still very verbose, and the losslessness wasn't buying anything useful.
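To see why, here is a toy illustration in plain Python (nothing from txtfold): line-level run-length encoding collapses exact consecutive duplicates, but the near-duplicates that dominate real logs don't compress at all.

```python
from itertools import groupby

log = [
    "GET /health 200",
    "GET /health 200",
    "GET /health 200",
    "GET /api/user/17 200",
    "GET /api/user/42 200",
    "GET /api/user/99 200",
]

# Line-level RLE: collapse exact consecutive duplicates into "N× line".
encoded = [f"{len(list(group))}× {line}" for line, group in groupby(log)]
print("\n".join(encoded))
# 3× GET /health 200
# 1× GET /api/user/17 200   <- varying ids defeat exact matching
# 1× GET /api/user/42 200
# 1× GET /api/user/99 200
```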
So I went the other way: lossy but deterministic reduction. The important bit here is that there is no AI inside txtfold trying to assess what is important and what is not; it’s all purely algorithmic. Instead, you can let your LLM play with the tool and its configuration options until you get a useful summary out. Once you have such a config (a “recipe”), you can apply it repeatedly to new text from the same source and get consistent results.
That determinism matters. An LLM-based summarizer is a black box you have to re-trust on every input. A txtfold recipe is a thing you can read, version, and reason about.
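Here is a toy sketch of the idea. To be clear, this is not txtfold's actual algorithm, recipe format, or API; all the names are invented for illustration. The point is only that a fixed recipe plus the same input always yields the same summary:

```python
import re
from collections import Counter

# A "recipe": normalization rules plus a cap on how much to surface.
RECIPE = {
    "normalize": [(r"\d+", "<N>")],  # fold numeric noise into a placeholder
    "keep_top": 3,                   # surface only the most common shapes
}

def fold(lines: list[str], recipe: dict) -> list[str]:
    counts = Counter()
    for line in lines:
        for pattern, repl in recipe["normalize"]:
            line = re.sub(pattern, repl, line)
        counts[line] += 1
    # Deterministic: same input + same recipe = same summary, every time.
    return [f"{n}× {shape}" for shape, n in counts.most_common(recipe["keep_top"])]

log = [
    "GET /api/user/17 200",
    "GET /api/user/42 200",
    "GET /health 200",
    "ERROR timeout after 3000ms",
]
print("\n".join(fold(log, RECIPE)))
# 2× GET /api/user/<N> <N>
# 1× GET /health <N>
# 1× ERROR timeout after <N>ms
```

Normalizing away the numeric noise is what lets near-duplicate lines collapse, which is exactly where the lossless approach above fell short.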
txtfold is a small bet on one of the three levers. The other two need their own tools, many of which already exist. If you're building in this space, I'd love to hear about it.
https://kristiandupont.github.io/txtfold/
¹ I should be cautious with such predictions. 20 years ago, I asserted that a concurrency wave was about to hit the industry. If you had told me then that in 2026 my CPU's core count would be not in the giga- or even kilo-range, but barely above single digits, I would never have believed you! Anyhoo…


