fer's brain dump

Fer’s Cliff: the LLM equilibrium and its economic dead end

February 15, 2026
Filed under ai

I’ve been building a bot. A high-frequency trading bot for Polymarket, to be specific. I have no background in quantitative finance. I’m a software engineer who got curious about prediction markets and decided the right thing to do was to throw code at it.

Anyway, while working on position sizing I needed something Kelly-criterion-shaped but adapted to the specific dynamics of prediction markets: binary outcomes, yes, but with time-decaying liquidity, order book depth, and the fact that these markets resolve to a hard 0 or 1 in a way the textbook Kelly setup doesn’t really account for. So I started iterating on a variant. I didn’t read a paper and implement it. I just thought about what I needed, sketched something out, bounced it off Gemini, adjusted, bounced again.
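For context, and without giving anything away: the textbook Kelly for a binary contract is basically a one-liner, and the variant starts from something shaped like it. Here’s a minimal sketch of that baseline (the `scale` knob is just standard fractional Kelly, nothing to do with my adjustments):

```python
def kelly_fraction(p: float, q: float, scale: float = 0.25) -> float:
    """Textbook Kelly sizing for a binary contract priced at q that pays 1 on YES.

    p:     your estimated probability of YES
    q:     current market price of the YES contract (0 < q < 1)
    scale: fractional-Kelly multiplier to tame variance (0.25 = quarter Kelly)

    Maximizing E[log wealth] for this payoff gives f* = (p - q) / (1 - q).
    Returns the fraction of bankroll to put on YES, or 0 if there's no edge.
    """
    if not 0.0 < q < 1.0:
        raise ValueError("price must be strictly between 0 and 1")
    edge = p - q
    if edge <= 0:
        return 0.0
    return scale * edge / (1.0 - q)


# Market prices YES at 0.40, you believe it's 0.55:
# full Kelly says 25% of bankroll, quarter Kelly sizes it to 6.25%.
print(kelly_fraction(p=0.55, q=0.40))  # 0.0625
```

None of the interesting parts live in that function; they live in how you bend it around liquidity, depth, and time to resolution.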

At some point Gemini goes: “this is essentially a [some very specific quant term I won’t name because I don’t want to give away the strategy]”, and describes it as an advanced technique. Cool, I think. Let me go read about it.

I google it. Nothing. The term doesn’t exist. Not on arXiv, not on QuantStackExchange, not in any textbook I can find. Gemini had hallucinated a name for something that, to the best of my understanding, is actually a sound approach. I’ve stress-tested the logic, the math checks out, and it does what I need it to do. But it has no name because it doesn’t exist in the literature. Or maybe it does under some other name and Gemini couldn’t find the mapping either. Who knows.

This is the part that got me thinking.

Where does this end?

I’m not a quant. I couldn’t have derived this from first principles sitting in a library with a stack of Shreve and Hull. But I also didn’t just ask an LLM to do it for me. What happened was a conversation. I had the intuition about what the market needed, the LLM had the mathematical vocabulary to help me formalize it, and between the two of us we landed on something that works. The LLM held my hand through the math, and I held its hand through the domain-specific reasoning it couldn’t do on its own.

And that’s when it clicked: there’s an equilibrium here.

The LLM is useful to me up to the point where I can still understand and own what it produces. If it generated some incomprehensible stochastic calculus proof that happened to be correct, I couldn’t verify it, I couldn’t debug it, I couldn’t adapt it when the market changes. It would be useless to me even if it were perfect. The value isn’t in the LLM’s output, it’s in the back-and-forth, the part where I understand enough to steer and it knows enough to formalize.

The equilibrium is when the LLM holds your hand as much as you hold its.

Past that point, making the model smarter doesn’t help me. A model that can derive novel results in algebraic topology is impressive, but if I can’t follow the reasoning, I can’t take ownership, and if I can’t take ownership, I can’t ship anything with it. It’s not a tool anymore, it’s an oracle, and oracles are worthless if you can’t verify what they say.

This isn’t a new shape. In organizational learning, Cohen and Levinthal called it absorptive capacity: your ability to use new knowledge is bounded by your existing knowledge. Vygotsky had the zone of proximal development: the range where a more capable partner can scaffold you, beyond which their help becomes noise. A recent study calls the LLM-specific version the “verification bottleneck”: people lean on AI most for hard tasks (73.9%) but their accuracy on complex problems drops to 47.8%. The harder the problem, the more you need the AI, and the less you can tell if it’s right. Ethan Mollick has described the jagged frontier of AI capability, where bottlenecks mean even superhuman AI can’t substitute for humans. But all of these describe the shape of the equilibrium. None of them talk about what it costs.

Fer’s Cliff

I’m calling this Fer’s Cliff because I can and because it captures the shape of the thing better than “plateau” or “diminishing returns” does. The returns don’t diminish gracefully. They fall off a cliff.

Think about it from the AI company’s perspective. Training these models costs absurd amounts of money. Each generation needs to justify itself economically. But if the value of the model is capped by the user’s ability to keep up with it, then there’s a point where billions in additional training compute produce approximately zero additional revenue. The cost curve keeps climbing. The value curve flatlines. That’s the cliff.

Every user has their own equilibrium point. Mine is apparently somewhere around “can derive novel-ish quantitative strategies with hand-holding.” Yours might be higher or lower. But for each person, there is one, and past it the model’s extra capability is wasted compute. Aggregate all those individual equilibria across your entire user base, and you get a demand curve that hits a wall. Not a slope. A wall.
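To make that aggregation concrete, here’s a toy sketch with entirely made-up numbers: give each user a personal capability cap, let them extract value from the model only up to that cap, and watch what total value does as raw capability keeps climbing. Nothing here is real data; it’s just the shape of the argument.

```python
import random

random.seed(0)

# Hypothetical user base: each user has a personal equilibrium point (a "cap")
# beyond which extra model capability is worthless to them.
user_caps = [max(0.0, random.gauss(50, 15)) for _ in range(10_000)]

def aggregate_value(capability: float) -> float:
    """Total value extracted across all users at a given model capability level."""
    return sum(min(capability, cap) for cap in user_caps)

for capability in (25, 50, 75, 100, 150, 200):
    print(f"capability={capability:>3}  aggregate value={aggregate_value(capability):,.0f}")

# The printed curve climbs steeply up to where most caps sit (~50), crawls
# through the tail of power users, and is flat well before 150: every point
# of capability past the last user's cap buys exactly zero extra value,
# while the cost of producing it keeps climbing.
```

The exact numbers are irrelevant; the shape is the point. Past the last user’s equilibrium, the value curve is a horizontal line no matter how steep the cost curve gets.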

Someone out there might need a model that can go back and forth about string theory at a research level. I find string theory fascinating, genuinely. But I can’t follow it, and there’s no value I can extract from an LLM that can. More importantly for the AI company: there’s no money in it. The number of people who need an LLM companion for string theory research is tiny, and they need something so specialized that “make the general model bigger” probably isn’t the right approach anyway.

Sure, tooling will appear. Better IDEs, better visualization, better ways to bridge the gap between what the model produces and what you can digest. That pushes the cliff a bit further out. But it doesn’t remove it. Or you take the other approach: skip the human entirely. Give the AI agent broad permissions, let it send emails and browse the web and schedule your calendar and act on your behalf. OpenClaw tried exactly this, and within weeks researchers found it exfiltrating data through malicious third-party skills and running with a one-click remote code execution vulnerability across tens of thousands of exposed instances. Turns out removing the human from the loop doesn’t solve the equilibrium problem. It just replaces “I can’t verify this math” with “I can’t verify what this agent is doing with my credentials.” Either way, you’re past the cliff. You’re spending engineering effort to move a wall that exists because of a fundamental constraint on human cognition. That’s an arms race you lose.

A man who found his equilibrium point.

The AI companies are betting that if they make the model smart enough, the applications will follow. But Fer’s Cliff says the applications are bounded by the users, not the models. Past a certain capability threshold, you’re not selling a better product. You’re selling capability that no one can use, to justify infrastructure that no one can afford, to hit benchmarks that no one cares about.

The models will keep getting better for a while. But I think we’re closer to the cliff than most people expect. Not because the technology can’t advance, but because the returns stop making sense. The LLM that’s useful is the one you can dance with, not the one that dances circles around you.