This week, Chinese AI lab DeepSeek released R1, a reasoning-focused large language model that’s turning heads across the AI community. Not because reasoning models are new — OpenAI’s o1 has been available since September — but because of what DeepSeek has achieved and how they’ve released it. R1 matches or exceeds o1-preview on most major benchmarks, the full model weights are available under an MIT license, and the technical paper suggests it was trained at a fraction of the cost that US labs typically spend.
I’ve spent the past two days reading the paper, testing the model, and talking to colleagues about the implications. This is one of those releases that deserves more than a headline.
What Makes R1 Different
DeepSeek R1 is a reasoning model, meaning it’s designed to “think through” complex problems step by step before producing an answer. This is the same paradigm that OpenAI introduced with o1 — the model generates a chain of thought, exploring different approaches and self-correcting before arriving at a final answer. The difference is that OpenAI keeps o1’s chain of thought hidden and the model proprietary. DeepSeek shows you the reasoning and gives you the weights.
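In the open weights, R1's reasoning trace is reportedly delimited by `<think>...</think>` tags ahead of the final answer. A minimal sketch of separating the two on the client side, assuming that tag convention (some serving stacks strip or rename the tags, so treat this as illustrative):

```python
import re

def split_reasoning(raw: str) -> tuple[str, str]:
    """Separate an R1-style chain of thought from the final answer.

    Assumes reasoning is wrapped in <think>...</think> before the
    answer; returns ("", answer) if no reasoning block is present.
    """
    match = re.search(r"<think>(.*?)</think>", raw, re.DOTALL)
    if not match:
        return "", raw.strip()
    reasoning = match.group(1).strip()
    answer = raw[match.end():].strip()
    return reasoning, answer

sample = "<think>2 + 2: add the units digits, giving 4.</think>The answer is 4."
reasoning, answer = split_reasoning(sample)
print(answer)  # The answer is 4.
```

This makes it easy to log or display the reasoning separately from the answer, which is exactly what o1's hidden chain of thought prevents.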
The benchmark numbers are impressive. R1 scores competitively with o1 on math (AIME, MATH-500), coding (Codeforces, SWE-Bench), and general reasoning tasks. On some benchmarks, it outperforms o1-preview. These aren’t cherry-picked results — the scores are broadly strong across categories.
But what’s technically fascinating is the training approach. The paper describes a variant, called R1-Zero, trained with pure reinforcement learning on the base model, without any supervised fine-tuning first. This pure-RL approach led to emergent reasoning behaviors: the model learned to decompose problems, verify intermediate steps, and re-examine its assumptions, all from the reward signal alone. For the final R1, they added a small “cold start” supervised fine-tuning stage before RL, then used the RL-trained model to generate synthetic data for a more polished version.
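The paper describes the reward signal as rule-based rather than learned: an accuracy reward (does the final answer match a verifiable ground truth, as in math or code tasks?) plus a format reward (is the reasoning wrapped in the expected tags?). A toy sketch of that idea, with illustrative weights and checks that are my own assumptions, not the paper's exact values:

```python
import re

def rule_based_reward(completion: str, ground_truth: str) -> float:
    """Toy sketch of a rule-based RL reward in the style the R1 paper describes.

    Combines a format reward (reasoning inside <think>...</think>) with an
    accuracy reward (final answer matches a verifiable ground truth).
    Weights here are illustrative placeholders.
    """
    reward = 0.0
    # Format reward: reasoning must appear inside <think>...</think>.
    if re.search(r"<think>.*?</think>", completion, re.DOTALL):
        reward += 0.5
    # Accuracy reward: strip the reasoning block and compare what remains
    # against the known answer.
    answer = re.sub(r"<think>.*?</think>", "", completion, flags=re.DOTALL).strip()
    if answer == ground_truth.strip():
        reward += 1.0
    return reward

print(rule_based_reward("<think>3 * 4 = 12</think>12", "12"))  # 1.5
```

The appeal of rule-based rewards is that they are cheap to compute and hard to game, which is part of why pure RL could scale far enough for reasoning behaviors to emerge.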
The paper is refreshingly detailed. While OpenAI’s o1 system card was notably sparse on technical details, DeepSeek provides enough information to understand and potentially reproduce their approach. This transparency is valuable for the research community regardless of what you think about the geopolitics.
The Cost Question
Perhaps the most provocative aspect of DeepSeek’s work is the claimed training cost. While exact figures aren’t published in the paper, estimates based on their described compute setup suggest R1 was trained for roughly $5-6 million — a fraction of the hundreds of millions that frontier labs in the US reportedly spend on their latest models.
There are important caveats here. DeepSeek builds on their existing V3 base model, so the total investment is higher than just the R1 training run. They may benefit from lower labor costs. And comparing training costs across organizations is notoriously difficult because of different accounting practices and infrastructure setups.
Still, even accounting for these factors, the efficiency is remarkable. DeepSeek reportedly used around 2,000 NVIDIA H800 GPUs (the China-export-compliant variant of the H100) for their training runs. If they’re achieving frontier-competitive results with this setup, it challenges the narrative that you need 100,000+ H100 clusters and billions in investment to build competitive AI models.
This has implications for the entire AI industry. If the scaling laws are less about brute-force compute and more about clever training approaches, the moat around well-funded US labs is narrower than many assumed. And for smaller companies and research labs, it means competitive AI development might be more accessible than the current “compute is everything” narrative suggests.
The Open-Source Impact
R1 is released under the MIT license — the most permissive common open-source license. You can use it commercially, modify it, distribute it, and build products on top of it; the only real obligation is preserving the copyright and license notice. DeepSeek also released six distilled versions, ranging from 1.5B to 70B parameters, built by distilling R1’s reasoning capabilities into smaller Qwen- and Llama-based models.
The distilled models are particularly useful. The 32B distilled version performs remarkably well relative to its size — it outperforms o1-mini on several benchmarks while being small enough to run on consumer hardware or a single cloud GPU. For developers who want reasoning capabilities in their applications without the cost of running a 671B parameter model, these distilled versions are immediately practical.
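Back-of-envelope arithmetic shows why the 32B distill fits on consumer hardware: weight memory scales with parameter count times bits per weight. A quick sketch (ignoring KV cache and activation overhead, which add more in practice):

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate GPU memory needed just for model weights.

    Ignores KV cache and activation memory, which add several GB
    on top of this in real deployments.
    """
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

for bits in (16, 8, 4):
    print(f"32B model at {bits}-bit: ~{weight_memory_gb(32, bits):.0f} GB")
# 16-bit: ~64 GB, 8-bit: ~32 GB, 4-bit: ~16 GB
```

At 4-bit quantization the weights come in around 16 GB, which is why a single 24 GB consumer GPU can host the 32B distill, while the full 671B model remains firmly in multi-GPU datacenter territory.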
This release enriches the open-source AI ecosystem significantly. We now have open reasoning models that are genuinely competitive with the best proprietary offerings. Combined with Meta’s Llama, Mistral’s models, and others, the open-source AI stack is reaching a point where you can build sophisticated AI applications entirely on open models.
The Geopolitical Dimension
I’d be remiss not to acknowledge the elephant in the room. DeepSeek is a Chinese company. Last week I wrote about the Biden administration’s AI Diffusion Rule, which aims to control China’s access to advanced AI chips. And here’s a Chinese lab producing frontier-competitive models using export-restricted hardware variants.
This creates an uncomfortable tension in the US policy narrative. If Chinese labs can match US model capabilities despite chip restrictions, the strategic value of those restrictions becomes less clear. The counterargument is that restrictions slow progress and prevent access to the very best hardware, which may matter at the true frontier. But R1 suggests the gap, if it exists, is narrower than many policymakers assumed.
For developers and engineers, the geopolitics is mostly noise. What matters is whether the model is good, whether you can use it, and whether it’s safe and reliable for your use case. On those practical dimensions, R1 delivers.
My Take
I’ve been building software for three decades, and I’ve learned to be skeptical of “X killer” claims. But DeepSeek R1 is genuinely significant. Not because it “kills” o1 — both are excellent models with different trade-offs — but because it demonstrates that the frontier of AI isn’t a walled garden.
The fact that a fully open-source model can compete with the best proprietary reasoning models fundamentally changes the conversation about AI strategy. If you’re an enterprise evaluating AI vendors, you now have a credible open-source reasoning option. If you’re a startup, you can build on R1 without API costs or vendor lock-in. If you’re a researcher, you can study and improve upon a frontier reasoning model instead of just probing it through an API.
We’re still early in understanding what R1 means for the industry. But sitting here today, testing a locally-running reasoning model that rivals the best in the world, available under an MIT license — this feels like one of those moments that shifts the landscape. The open-source AI movement just got a very powerful new data point in its favor.
