Chapter 2: The Economic Reprieve and the Jevons Paradox
The radical disruption of the global AI market in early 2025 was driven by an economic anomaly so profound that it shattered Silicon Valley's established financial models. To understand why 125 million people integrated DeepSeek into their daily lives within months, one must look past the interface and into the balance sheets. The "DeepSeek Lock-In Effect" begins with a collapse in the cost of intelligence—a technical achievement that turned a luxury computational resource into a ubiquitous commodity.
This chapter argues that DeepSeek’s radical cost efficiency served as the primary catalyst for mass cognitive lock-in. By training the V3 model for approximately $5.5 million—a mere 1/18th of the estimated cost of OpenAI’s GPT-4—DeepSeek didn't just compete on price; it triggered a Jevons Paradox. This economic phenomenon posits that as a resource becomes more efficient to use, total consumption of that resource increases rather than decreases. By lowering the financial and computational barriers to entry to near-zero, DeepSeek pulled an entire global cohort of users into the AI ecosystem who had previously been priced out, creating a "first-mover" advantage rooted in affordability.
The Math of the Anomaly: $5.5 Million vs. The Billions
In the years leading up to 2025, the narrative of AI development was one of "compute-intensive" dominance. Leading firms like OpenAI, Google, and Anthropic operated under the assumption that frontier-level performance required exponentially increasing budgets for hardware and electricity. Training a flagship Large Language Model (LLM) like GPT-4 was widely understood to cost upwards of $100 million, while the infrastructure required to support its successor was projected into the billions.
DeepSeek’s V3 model effectively ended this era of escalating extravagance. The technical report for DeepSeek-V3 revealed a training cost of roughly $5.57 million. When placed alongside the $100 million-plus budgets of its Western peers, this figure represents a fundamental change in the physics of AI production.
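The headline ratio is worth checking directly. Taking the reported ~$5.57 million V3 training cost against the ~$100 million figure widely cited for GPT-4, a two-line calculation recovers the "1/18th" claim (a rounded, back-of-the-envelope comparison, not an audited accounting):

```python
# Checking the chapter's headline ratio: V3's reported ~$5.57M training
# cost against the ~$100M figure widely cited for GPT-4.
V3_COST = 5.57e6      # USD, from the DeepSeek-V3 technical report
GPT4_COST = 100e6     # USD, the commonly cited lower bound for GPT-4

print(f"GPT-4 cost roughly {GPT4_COST / V3_COST:.0f}x more to train")  # ~18x
```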
This efficiency was not achieved through a single "silver bullet" but through a series of architectural optimizations designed to squeeze maximum intelligence out of limited hardware. The two primary pillars of this efficiency are the Mixture-of-Experts (MoE) architecture and Multi-head Latent Attention (MLA).
The Mixture-of-Experts (MoE) architecture is a strategy for managing a model's vast parameter count without requiring the full weight of those parameters for every single calculation. While DeepSeek-V3 boasts a staggering 671 billion total parameters, it is designed to be "sparse." During any given inference—the moment the AI generates a response to a user—it activates only about 37 billion parameters, or roughly 5.5% of its total capacity.
In practical terms, this means that while the model has the "knowledge base" of a 671B parameter giant, it only incurs the computational "electricity bill" of a much smaller model. This sparsity allows DeepSeek to serve millions of users simultaneously at a fraction of the hardware overhead required by "dense" models, where every parameter must be activated for every word generated.
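The arithmetic behind this sparsity claim can be checked in a few lines. The parameter counts are the ones cited above; the linear cost-per-token model is a simplification (real serving cost also depends on memory bandwidth, batching, and hardware utilization):

```python
# Back-of-the-envelope comparison of dense vs. sparse (MoE) inference
# cost, using the parameter counts cited in this chapter.
TOTAL_PARAMS = 671e9    # DeepSeek-V3 total parameters
ACTIVE_PARAMS = 37e9    # parameters activated per generated token

active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"Active per token: {active_fraction:.1%}")   # ~5.5%

# Roughly, compute per token scales with active parameters (~2 FLOPs per
# parameter per token), so the MoE model pays the "electricity bill" of a
# 37B dense model while retaining 671B parameters of knowledge.
flops_per_token_moe = 2 * ACTIVE_PARAMS
flops_per_token_dense = 2 * TOTAL_PARAMS
print(f"Compute saving vs. an equally large dense model: "
      f"{flops_per_token_dense / flops_per_token_moe:.0f}x")
```

Note the coincidence: the ~18x compute saving per token is the same order of magnitude as the 18x training-cost gap with GPT-4.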
The second pillar, Multi-head Latent Attention (MLA), addresses the "memory wall" that often slows down AI performance. Traditional LLMs require massive amounts of Key-Value (KV) cache memory to keep track of the context of a conversation. As conversations get longer, this memory requirement grows, leading to slower responses and higher serving costs. MLA compresses the keys and values into a compact latent vector, shrinking the KV cache by 80% to 95%. By reducing the data that needs to be moved across the chip during processing, DeepSeek achieved a throughput—the speed at which it can deliver text—that made high-speed reasoning accessible to the average smartphone user in regions with limited bandwidth.
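A minimal sketch shows why the KV cache dominates long conversations and what an 80–95% reduction buys. The layer and head counts below are hypothetical round numbers chosen for illustration, not DeepSeek-V3's actual configuration:

```python
# Toy model of KV-cache growth under standard multi-head attention.
# Configuration values are illustrative, NOT DeepSeek-V3's real ones.
def kv_cache_bytes(seq_len, n_layers=60, n_heads=128, head_dim=128,
                   bytes_per_value=2):
    # Standard attention stores one key and one value vector per head,
    # per layer, per token (2 bytes each in fp16/bf16) -- hence the
    # leading factor of 2 and the product over the remaining dimensions.
    return 2 * n_layers * n_heads * head_dim * seq_len * bytes_per_value

full = kv_cache_bytes(seq_len=32_000)       # a long conversation
compressed = full * (1 - 0.90)              # mid-range of the cited 80-95%

print(f"Uncompressed KV cache: {full / 1e9:.0f} GB")
print(f"With latent compression: {compressed / 1e9:.1f} GB")
```

Under these toy numbers the uncompressed cache runs to over a hundred gigabytes for a single long conversation, which is exactly the "memory wall" MLA is designed to knock down.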
These are not merely technical trivia. They are the economic foundations of the lock-in effect. Because the cost of "producing" a token of thought was so low, DeepSeek could offer its services for free or at a price point that Western competitors found unsustainable. This led to a "reprieve" from the scaling laws that had previously suggested AI would remain an elite, expensive tool for the foreseeable future.
The Jevons Paradox: Why Efficiency Created a Surge
In 1865, economist William Stanley Jevons observed a strange trend in the English coal industry. Improvements in steam engine efficiency, which allowed engines to do more work with less coal, did not lead to a decrease in coal consumption. Instead, because coal-driven power was now cheaper and more productive, more industries adopted it, leading to a massive spike in total coal demand.
In January 2025, Microsoft CEO Satya Nadella explicitly applied this concept to the DeepSeek moment. Writing on LinkedIn, Nadella noted, “As AI gets more efficient and accessible, we will see its use skyrocket, turning it into a commodity we just can’t get enough of.” This was a crucial admission from the head of the company most invested in the high-cost OpenAI ecosystem. Nadella recognized that DeepSeek had changed AI from a "scarce high-value resource" into a "commoditized utility."
The Jevons Paradox explains why the market didn't just shift existing AI users from ChatGPT to DeepSeek. Instead, the total market expanded. The radical drop in cost made it feasible for a university student in Jakarta, a small business owner in Nairobi, or a junior developer in Bangalore to use frontier-level AI for every minor task. When the cost of a query drops toward zero, the number of things we find "worth asking" an AI increases exponentially.
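This dynamic can be captured in a toy constant-elasticity demand model: if demand for AI queries is elastic (elasticity greater than 1), cutting the price per query increases total spending on queries, which is the Jevons effect in miniature. All numbers here are illustrative assumptions, not market data:

```python
# Toy Jevons Paradox: constant-elasticity demand for AI queries.
# With elasticity > 1, a falling price RAISES total expenditure.
def queries_demanded(price, baseline_q=1.0, baseline_p=1.0, elasticity=1.5):
    # Constant-elasticity demand curve: Q = Q0 * (p / p0) ** (-elasticity)
    return baseline_q * (price / baseline_p) ** (-elasticity)

for price in (1.0, 0.1, 0.01):   # price per query falls 100x
    q = queries_demanded(price)
    print(f"price={price:>5}: queries={q:>8.1f}, total spend={price * q:.2f}")
```

As the loop shows, a 100x price cut produces far more than a 100x increase in queries under these assumptions, so total spending rises even as each query gets cheaper—Jevons's coal observation restated in tokens.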
Research from State Street’s 2025 analysis of the AI market confirmed this expansion. The study found that DeepSeek’s affordability unlocked AI for "thin-margin industries"—sectors like local logistics, basic customer service, and even entry-level education—that previously couldn't afford Western AI tools. These users didn't just "try" DeepSeek; they embedded it into their operations because, for the first time, the math of AI actually worked for them.
The Bruegel policy brief further emphasized this "technological and economic reprieve." For years, European and Global South stakeholders had worried that the "pre-training" race would create a permanent digital divide, where only a handful of trillion-dollar American and Chinese firms could afford to build intelligent systems. DeepSeek proved that high-level reasoning could be achieved through "post-training efficiency" rather than just computational "brute force." This made AI more accessible, sure, but it also meant that DeepSeek would become the "default" AI experience for the next billion users.
The Accessibility Moat
The concept of a "moat" in business usually refers to a competitive advantage that protects a company from rivals. While OpenAI and Google were building moats out of proprietary data and huge hardware setups, DeepSeek built its moat on accessibility.
In many parts of the developing world, the barrier to AI adoption wasn't just the $20-a-month subscription price, though that alone represents a significant percentage of median income in many countries. It was also the infrastructure needed to actually run the thing. DeepSeek's MLA architecture meant that the model was snappy and responsive even on mid-range hardware and inconsistent internet connections.
By making the model "open-weight," DeepSeek also allowed local developers to host the model on their own servers. This removed the "latency tax" of sending data to servers in Virginia or California. When an Indonesian startup can run a model as capable as GPT-4 on its own local cloud for a fraction of the cost of an API call to the U.S., the choice is not about brand loyalty; it is about survival.
As these startups and even government agencies (like the Indonesian government, which announced plans for local, DeepSeek-powered infrastructure in late 2025) build their systems on this tech, they're not just grabbing a tool. They are constructing a digital environment where every piece of software is tailored to DeepSeek’s specific logic and performance characteristics. This is the structural layer of lock-in: when your entire business infrastructure is optimized for a $5.5 million model’s efficiency, switching to a $100 million model feels like an act of financial sabotage.
Acknowledging the Counter-Argument: Is Cheap Always Better?
Those who argue against the "DeepSeek as a commodity" idea often point to something called the "Proprietary Preference" phenomenon. Analysis from the Peterson Institute for International Economics (PIIE) in early 2026 suggested that high-value corporate users in the West still lean toward closed-source, proprietary models like those from OpenAI or Anthropic. The argument goes like this: for a huge Fortune 500 company, a 90% cost cut isn't as important as the security, reliability, and familiar "brand-name" support they get from a big Western player.
Mert Demirer’s research highlighted that 90% of large enterprise users prioritize reliability over cost. This suggests that the "Lock-In Effect" might have a ceiling. If the world’s wealthiest companies stay with Western models, does DeepSeek’s dominance among the masses actually matter?
The answer lies in the demographic data discussed in Chapter 1. While the Fortune 500 might be slow to move, the workforce of 2030 is currently composed of the 18-to-24-year-olds who are using DeepSeek because it is free, fast, and accessible. In economics, the "bottom of the pyramid" often dictates the future of the "top." Just as the cheap, accessible internet of the 1990s eventually forced every "high-value" legacy business to adapt to its protocols, the commoditized intelligence of DeepSeek is creating a bottom-up pressure that will eventually reach the enterprise level.
Furthermore, the "cost vs. quality" gap is closing. Western models may still hold a slight edge for high-stakes medical or legal work, but DeepSeek-V3 and later versions have shown that for the vast majority of everyday tasks—coding help, summarizing documents, drafting creative content—the "cheap" model is functionally as good as the "expensive" one. If a tool is 10 times cheaper and 98% as good, the economic rationalism of the Jevons Paradox predicts a total market takeover.
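The "10x cheaper, 98% as good" claim implies a simple routing policy that enterprises actually use: send every task to the cheap model first and fall back to the expensive one only on failure. A rough blended-cost calculation (with illustrative figures taken from the claim above, not benchmark data) shows why this is so hard to resist:

```python
# Blended cost of a cheap-first routing policy, using the chapter's
# illustrative "10x cheaper, 98% as good" figures.
CHEAP_COST, EXPENSIVE_COST = 0.10, 1.00   # relative price per task
CHEAP_SUCCESS_RATE = 0.98                 # fraction of tasks cheap model handles

# Expected cost per completed task: always pay the cheap model, and pay
# the expensive model only for the 2% of tasks the cheap one fails.
blended = CHEAP_COST + (1 - CHEAP_SUCCESS_RATE) * EXPENSIVE_COST

print(f"Blended cost: {blended:.2f} vs. expensive-only: {EXPENSIVE_COST:.2f}")
print(f"Saving: {1 - blended / EXPENSIVE_COST:.0%}")
```

Under these assumptions, the fallback policy cuts per-task cost by nearly 90% while losing nothing in final task completion, which is why "slightly worse but much cheaper" wins in aggregate.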
Breaking the Scaling Myth
The most significant psychological impact of DeepSeek’s $5.5 million training cost was the destruction of the "Scaling Myth." For three years, the AI industry operated under the belief that the only way to get smarter models was to add more GPUs. This belief created a massive speculative bubble in hardware stocks and energy markets.
DeepSeek proved that extracting more "intelligence per dollar" matters far more than posting a high total count of FLOPs (floating-point operations). A ScienceDirect study from 2025 noted that DeepSeek's release caused a sharp negative reaction in GPU provider stock prices. It wasn't because investors suddenly believed chips were unnecessary; it was because they realized the days of simply throwing hardware at the problem were over.
This shift in the industry's direction is a key part of the lock-in. When the entire world of AI development moves toward copying DeepSeek’s MoE and MLA architectures to save money, they are effectively adopting the DeepSeek standard. Even when users think they are using a "different" model in the future, if that model is built on the architectural efficiencies DeepSeek pioneered, they are still operating within the cognitive and technical parameters DeepSeek established.
The Foundation of Habit
The economic "reprieve" described by Bruegel and the Jevons-driven demand explosion described by Nadella were necessary precursors. Without the radical cost drop, there would be no mass adoption. Without mass adoption, there would be no universal habit formation.
DeepSeek's brilliance was in understanding that the future of AI wouldn't be decided in a top-secret lab with a ten-billion-dollar budget. It would be decided on the screens of people with five bucks in their pocket and a burning curiosity. By lowering the entry price to zero, DeepSeek became the "first AI" for a hundred million people.
In the world of technology, being "first" is often more important than being "best." The first tool a person learns to use becomes their mental model for how all such tools should work. For a developer in Mumbai who picked DeepSeek-Coder because it was the only option with a totally free tier and a fast API, the "DeepSeek way" of coding is now practically their second language.
By the time Western companies finally dropped their prices to compete—releasing "mini" and "flash" versions of their main models in late 2025—it was too late. The users had already spent months training their own brains to respond to DeepSeek’s specific logic. This brings us to the core of the psychological argument. Economics got the users in the door; the way the machine thinks is what locked them inside.
Summary and Transition
This chapter has established that DeepSeek’s disruption was, at its core, an economic revolution. The $5.5 million training cost and the sparse Mixture-of-Experts architecture collapsed the price of intelligence. This triggered the Jevons Paradox, pulling a massive global cohort of users into an ecosystem that was suddenly, for the first time, truly "too cheap to ignore."
We have seen how this "accessibility moat" allowed DeepSeek to capture the Global South and the developer layer, creating an infrastructure that is now structurally dependent on DeepSeek’s pricing and performance. This mass adoption, driven by efficiency, provided the raw material for the next stage of the lock-in: the human brain.
As users flooded into the DeepSeek ecosystem, they didn't just use the tool; they learned to talk to it. They adjusted their sentences, their requests, and their problem-solving styles to match the quirks of the R1 and V3 models. They became "DeepSeek-literate."
In the next chapter, we'll move from the world of economics to the world of how we think. We'll dive into how this "skill-based habit of use" creates a kind of mental rut that's almost impossible to get out of. We will look at why "unlearning" the DeepSeek interaction style is a more significant barrier to competition than any subscription fee, and how "stealth friction" ensures that once a user is calibrated to a specific AI's reasoning, they are effectively locked in for life.
Chapter 2 Sources
- Cost and technical architecture: https://pmc.ncbi.nlm.nih.gov/articles/PMC11898396/, https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2025.1576992/full, https://cap.csail.mit.edu/research/deepseek-what-you-need-know
- Jevons Paradox applications: https://fortune.com/2025/01/27/microsoft-ceo-satya-nadella-deepseek-optimism-jevons-paradox/, https://www.npr.org/sections/planet-money/2025/02/04/g-s1-46018/ai-deepseek-economics-jevons-paradox, https://www.statestreet.com/us/en/insights/deepseek-disruption-ai-advancement, https://www.bruegel.org/policy-brief/how-deepseek-has-changed-artificial-intelligence-and-what-it-means-europe
- Market disruption analysis: https://www.sciencedirect.com/science/article/pii/S154461232500707X, https://www.piie.com/blogs/realtime-economics/2026/how-ai-boom-shrugged-deepseek-shock-and-keeps-gaining-steam
- Counterargument data: https://www.piie.com/blogs/realtime-economics/2026/how-ai-boom-shrugged-deepseek-shock-and-keeps-gaining-steam, https://www.nencmediagroup.com/deepseek-reshapes-ai-markets-in-2026-efficiency-wave-reprices-chips-cloud-spend-and-compliance/