Anthropic is investigating two cache bugs in its Claude Code tool after a Redditor apparently reverse-engineered the standalone binary and traced massive hidden spikes in token usage back to them.

The original poster, skibidi-toaleta-2137, spent days poking at the 228 MB ELF binary with Ghidra, a MITM proxy, and radare2. According to the post, the flaws can quietly make a normal conversation cost 10 to 20 times more than expected. For some paid subscribers on Max or Pro plans, that already means hitting session limits after just a handful of messages.

The first bug reportedly sits inside Anthropic’s custom Bun fork. It does a string replacement on every API request looking for a special billing sentinel. If your chat history happens to mention anything billing-related, the replacement can hit the wrong spot and break the cache prefix. The result is a full rebuild instead of a cheap cache read.

The second bug is simpler and hits every single time. Using the --resume flag forces a complete cache miss on the entire conversation history. Only the system prompt gets cached. Everything else gets rebuilt from scratch.
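To see why a single misplaced replacement is so expensive, here is a minimal Python sketch of prefix-based caching. It is a toy model, not Anthropic's code: the `<SENTINEL>` string, the per-request values, the 1,000-token block size, and the two price multipliers are all assumptions chosen for illustration. The only idea taken from the report is that a cache hit requires the start of a request to match the previous one exactly.

```python
from typing import List

# Assumed price multipliers relative to a normal input token (illustrative only).
CACHE_READ_MULT = 0.1    # tokens served from the cache
CACHE_WRITE_MULT = 1.25  # tokens that must be processed and cached again


def shared_prefix_len(previous: List[str], current: List[str]) -> int:
    """Count the leading blocks that are identical in both requests."""
    count = 0
    for old, new in zip(previous, current):
        if old != new:
            break
        count += 1
    return count


def relative_cost(previous: List[str], current: List[str],
                  tokens_per_block: int = 1000) -> float:
    """Cost of `current` in units of one uncached input token."""
    hits = shared_prefix_len(previous, current)
    misses = len(current) - hits
    return tokens_per_block * (hits * CACHE_READ_MULT + misses * CACHE_WRITE_MULT)


def replace_first(blocks: List[str], sentinel: str, value: str) -> List[str]:
    """Hypothetical model of the bug: rewrite the first sentinel match anywhere."""
    out = list(blocks)
    for i, block in enumerate(out):
        if sentinel in block:
            out[i] = block.replace(sentinel, value, 1)
            break
    return out


# A made-up conversation whose history happens to contain the sentinel text.
history = [
    "system prompt",
    "user: what does <SENTINEL> mean in my logs?",
    "assistant: it looks like a placeholder",
    "user: thanks, next question",
]

# Healthy case: the new request only appends one block to the previous one.
healthy_prev, healthy_cur = history[:3], history

# Buggy case: each request stamps a different value into the history block.
buggy_prev = replace_first(history[:3], "<SENTINEL>", "req-1")
buggy_cur = replace_first(history, "<SENTINEL>", "req-2")

print(shared_prefix_len(healthy_prev, healthy_cur))  # 3 blocks reused
print(shared_prefix_len(buggy_prev, buggy_cur))      # 1 block reused
print(relative_cost(buggy_prev, buggy_cur) / relative_cost(healthy_prev, healthy_cur))
```

In this toy model the gap widens as the conversation grows, because every block after the first mismatch is billed at the write rate instead of the read rate. The reported --resume behavior corresponds to the same situation with only the system prompt block matching.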

The poster laid out what he believes are the exact root causes, complete with links to two GitHub issues and a verification script you can run yourself. He also posted clear workarounds. For the standalone binary bug, switching to the npx version of the package avoids the replacement entirely. The resume issue has no easy fix short of downgrading to an older build and losing newer features.

After Alex Volkov's post about the discovery took off on X, product lead Lydia Hallie said the team is actively investigating why users are hitting usage limits faster than expected. She called it the top priority and promised more details soon.

Another staffer, Thariq, replied directly in the comments of Volkov’s post and said they are digging in, though he noted prompt cache bugs can be subtle.

This follows complaints we covered earlier, when Claude Max subscribers were watching their quotas drain with no clear explanation. A few days later, Anthropic offered an explanation tied to peak-hour demand. The new technical analysis suggests the real culprit may have been these cache issues all along.

Plenty of users in the Reddit comments said the findings finally make sense of their own weird spikes. One called it “a pretty bold business strategy” to roll out changes with zero changelog. Others are already talking about canceling or switching to rival tools like Codex.

Anthropic has not confirmed the bugs yet. For now, the company is still gathering reports and looking at the data. The Redditor’s post at least gives everyone a way to test it themselves and a few temporary ways to keep costs down while the team works.

Dwayne Cubbins