Anthropic is investigating two cache bugs in its Claude Code tool after a Redditor apparently reverse-engineered the standalone binary and traced them to massive hidden spikes in token usage.
The OP, skibidi-toaleta-2137, spent days poking at the 228MB ELF file with Ghidra, a MITM proxy, and radare2. He found that the flaws can quietly turn a normal conversation into something that costs 10 to 20 times more than expected. For some paid subscribers on Max or Pro plans, that already means hitting session limits after just a handful of messages.
The first bug reportedly sits inside Anthropic’s custom Bun fork. It does a string replacement on every API request looking for a special billing sentinel. If your chat history happens to mention anything billing-related, the replacement can hit the wrong spot and break the cache prefix. The result is a full rebuild instead of a cheap cache read. The second bug is simpler and hits every single time. Using the --resume flag forces a complete cache miss on the entire conversation history. Only the system prompt gets cached. Everything else gets rebuilt from scratch.
The poster laid out the exact root causes, complete with GitHub issue links (1,2) and even a verification script you can run yourself. He also posted clear workarounds. For the standalone binary bug, switching to the npx version of the package avoids the replacement entirely. The resume issue has no easy fix short of downgrading to an older build and losing newer features.
When this discovery blew up on X after Alex Volkov made a post about it, product lead Lydia Hallie posted that the team is actively investigating why users are hitting usage limits faster than expected. She called it the top priority and promised more details soon.
Another staffer, Thariq, replied directly in the comments of Volkov’s post and said they are digging in, though he noted prompt cache bugs can be subtle.
This follows the complaints we covered earlier. Earlier, we reported how Claude Max subscribers were watching their quotas drain with no clear explanation. Then, a few days later, Anthropic offered an explanation tied to peak-hour demand. The new technical analysis suggests the real culprit may have been these cache issues all along.
Plenty of users in the Reddit comments said the findings finally make sense of their own weird spikes. One called it “a pretty bold business strategy” to roll out changes with zero changelog. Others are already talking about canceling or switching to rival tools like Codex.
Anthropic has not confirmed the bugs yet. For now, the company is still gathering reports and looking at the data. The Redditor’s post at least gives everyone a way to test it themselves and a few temporary ways to keep costs down while the team works.


