Is DeepSeek a Cheater? That's How We Met!
I was going to write this article about the AI industry structure. But at its core, this blog is about strategizing your career and understanding markets for the benefit of your wealth.
One of the biggest lessons I’ve learned, which I write about in MegaWealth: Careers, is that there are critical moments in life. In those moments, even if you are heads down working hard, you need to pause and figure out what is going on.
The DeepSeek release is one of those moments. Let's take a pause and analyze this together.
Back in the day... when I was 19, I was dating a horse rider with Olympic ambitions. He was 21. While we were dating, I found out he was sleeping with someone else.
I took some time to process. We hadn't seen each other in a few weeks. He didn't know I knew.
I asked him out for dinner, intending to break up with him. On the drive there, he said he realized how much I meant to him and wanted to get married. He didn’t say this, but I knew he needed to marry someone to make the Olympic citizenship cutoff and he’d known me the longest. Instantly, without thinking, I blurted out: "Ha!" Not my most tactful moment.
Later, he married the woman he was sleeping with when we were dating. And, after that, she was surprised to find out that he was cheating on her. I felt sad for her. At the same time, I thought: "That's how you met!"
And that brings us to the DeepSeek update.
1. Why is DeepSeek so much more efficient than our Silicon Valley darlings?
Instead of starting from scratch to train its model, DeepSeek got OpenAI's ChatGPT to train it by asking ChatGPT a bunch of questions and learning from the answers. This is called distillation. It's like having a teacher teach you physics rather than rediscovering physics yourself. Far faster and more efficient.
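For the curious, here is a toy sketch of that idea in Python. Everything in it is an assumption made for illustration: `ask_teacher` is a hypothetical stand-in for querying a large model, and the "student" simply memorizes answers instead of being a real neural network. It is not DeepSeek's actual pipeline, only the shape of the process: ask the teacher questions, record the answers, train the student on those pairs.

```python
# Toy illustration of distillation: a "student" learns from a "teacher's"
# answers instead of learning from raw data. All names are hypothetical.

def ask_teacher(question: str) -> str:
    """Hypothetical stand-in for querying a large, expensive model."""
    known = {
        "What is 2 + 2?": "4",
        "What is the capital of France?": "Paris",
    }
    return known.get(question, "I don't know")


def build_distillation_set(questions):
    """Ask the teacher each question and record (question, answer) pairs."""
    return [(q, ask_teacher(q)) for q in questions]


class Student:
    """A deliberately trivial student that memorizes the teacher's answers.

    A real student would be a smaller model fine-tuned on the pairs.
    """

    def __init__(self):
        self.memory = {}

    def train(self, pairs):
        for question, answer in pairs:
            self.memory[question] = answer

    def answer(self, question: str) -> str:
        return self.memory.get(question, "I don't know")


questions = ["What is 2 + 2?", "What is the capital of France?"]
pairs = build_distillation_set(questions)

student = Student()
student.train(pairs)
print(student.answer("What is the capital of France?"))
```

The student never sees the original training material. It only sees what the teacher said, which is why this approach is so much cheaper than starting from scratch.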
Here's where it gets a bit "Ironic"... like "that's how you met", sleeping with another woman's man, and later being surprised when he cheats on you, ironic.
OpenAI is upset. They feel used. Distillation is against their terms of service. This is the same OpenAI that scraped the web to have its AI "learn" from copyrighted material (see the New York Times lawsuit, among others).
NVIDIA doesn't benefit from its customers creating a new model by picking up where another one left off. It wants everyone to start from scratch so they have to buy more GPUs. But, once again, that's how you met.
One of the best pieces of advice my first billionaire boss gave me was that building a historical perspective, especially early in my career, would be a huge advantage.
The innovation NVIDIA made in graphics cards was the exact same innovation DeepSeek just made in AI models. At the time, Intel was running computer graphics through its CPU, or central processing unit, and video games were taking up a bunch of compute cycles, driving increased demand for Intel's most expensive chips.
NVIDIA figured out that at any given moment, in any given game, the player is only looking at one angle, one view, of one room.
Furthermore, once the game view is loaded, the only relevant bits are the ones that are changing. So why not only process those relevant bits and not the entire scene? This led to the birth of NVIDIA and the GPU, the graphics processing unit. The GPU took the video game processing tasks away from Intel's CPU and reduced demand for Intel's most expensive chips.
So when NVIDIA stock drops and there's a panic that DeepSeek cheated by using distillation and finding efficiencies, instead of starting from scratch, I am tempted, once again, to think: "That's how you met!"
2. Did DeepSeek only cost $5.6 million to build?
The build-from-scratch cost is likely to be much higher. But the companies behind these models can quote costs however they want.
Think of it like the cost of owning a car. Lease vs buy vs payment vs maintenance. If you bought for cash (or the Chinese govt loaned you the car), then your costs in a particular month might just be gas.
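Here is a small Python sketch of the car analogy applied to a model. Every number in it is hypothetical and chosen only to show the mechanics; none of them are DeepSeek's real figures. The "gas only" view counts just the final training run, while the "full ownership" view also counts earlier experiments and staff.

```python
# Hypothetical numbers only. They illustrate how the same project can be
# quoted at very different costs depending on what is counted.

RATE_PER_GPU_HOUR = 2.0             # assumed dollars per GPU-hour
FINAL_RUN_GPU_HOURS = 2_500_000     # assumed GPU-hours for the final run
EXPERIMENT_GPU_HOURS = 10_000_000   # assumed GPU-hours for earlier experiments
STAFF_COST = 50_000_000             # assumed salaries, in dollars


def final_run_only() -> float:
    """The 'just gas this month' view: only the last training run."""
    return FINAL_RUN_GPU_HOURS * RATE_PER_GPU_HOUR


def full_project() -> float:
    """The 'total cost of ownership' view: experiments, final run, and staff."""
    experiments = EXPERIMENT_GPU_HOURS * RATE_PER_GPU_HOUR
    return experiments + final_run_only() + STAFF_COST


print(f"Final run only: ${final_run_only():,.0f}")
print(f"Full project:   ${full_project():,.0f}")
```

With these made-up inputs, the headline number is $5 million while the full project comes to $75 million. Both are "the cost", depending on who is doing the quoting.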
The rumors floating to the top currently seem to be that the Chinese got 10K NVIDIA H100s (the same ones OpenAI is training on) just before imports were shut off, and that DeepSeek got access to them.
The timing also doesn't seem like a stretch. Trump comes into office, slaps more tariffs on China, and China announces an LLM that beats all of ours, wiping out hundreds of billions of dollars of market cap in a single day. They are motivated to report the costs in the way that generates the biggest headlines.
3. How much more efficient is DeepSeek?
DeepSeek isn't just a distillation of ChatGPT. As you would expect from the latest model release, it improved nearly every step of its training pipeline, from data loading to parallelization strategies to memory optimization.
We already established that the cost numbers aren't apples to apples, because DeepSeek's figure likely only included the final training run. What you can compare is the cost of running these models, i.e. deployment costs. At high volumes, DeepSeek saves 80% vs ChatGPT.
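If you want to run this kind of comparison yourself, the arithmetic is simple. The Python sketch below uses hypothetical per-token prices and a hypothetical monthly volume, picked only so the result lands on the 80% figure above. Swap in the current published prices from each provider to get a real answer.

```python
# Hypothetical prices and volume, chosen to reproduce an 80% savings figure.
# Replace them with each provider's current published pricing.

BASELINE_PRICE = 10.00          # assumed dollars per million tokens
ALTERNATIVE_PRICE = 2.00        # assumed dollars per million tokens
TOKENS_PER_MONTH = 5_000_000_000  # assumed monthly usage


def monthly_cost(tokens: int, price_per_million: float) -> float:
    """Monthly bill for a given token volume and per-million-token price."""
    return tokens / 1_000_000 * price_per_million


def savings_pct(baseline: float, alternative: float) -> float:
    """Percentage saved by switching from the baseline to the alternative."""
    return 100 * (baseline - alternative) / baseline


baseline = monthly_cost(TOKENS_PER_MONTH, BASELINE_PRICE)
alternative = monthly_cost(TOKENS_PER_MONTH, ALTERNATIVE_PRICE)

print(f"Baseline:    ${baseline:,.0f} per month")
print(f"Alternative: ${alternative:,.0f} per month")
print(f"Savings:     {savings_pct(baseline, alternative):.0f}%")
```

Because both bills scale with the same token volume, the percentage saved depends only on the ratio of the two prices. The dollar amount saved is what grows with volume.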
DeepSeek just proved creating new cutting-edge models using distillation is possible, at far lower costs than the billions of dollars Anthropic and OpenAI have raised. DeepSeek is the best-performing model on multiple metrics, not a cheap lower-quality copy.
This increases the pool of their future competitors, lowers barriers to entry, and should increase innovation.
4. What are the implications of DeepSeek being Open Source?
DeepSeek is open source, i.e. it put its code and model on GitHub (though not fully open source, in that it didn't share its training data). This pressures OpenAI to be true to its name, because OpenAI stopped publishing its models several years ago for "safety" reasons.
Because it is open source, with the right hardware, you can download the code from GitHub and use it locally. I have seen videos of users running these models on $2,000 of hardware. Both Microsoft Azure and Amazon Web Services are already offering DeepSeek's models to their customers.
5. What does DeepSeek mean for OpenAI?
OpenAI responded by lowering pricing and increasing free services. But how do they reasonably respond to a model that is more than 90% cheaper, especially after OpenAI raised $6 billion (at a $157 billion valuation) mere months ago?
Microsoft is OpenAI's largest investor, but it's clear they are separate entities. OpenAI prioritizes using Microsoft Azure to process its data at favorable rates, but OpenAI can use other cloud providers, as Azure was unable to keep up with demand.
Microsoft offering DeepSeek on Azure is a natural competitive response to DeepSeek being #1 on Apple's App Store, above ChatGPT, being free to download off GitHub, and being offered by Microsoft's largest cloud competitor, Amazon Web Services. But it's a huge blow to OpenAI.
6. What are the risks of downloading the DeepSeek App?
Yes, it's #1 on Apple's App Store. But the terms of service, just like TikTok's, allow your data to go back to servers in China. Whether to accept that is up to you. Because the two apps are similar, I will continue to use ChatGPT, knowing that many US-based companies will be creating apps using the DeepSeek code.
We get up in arms about data privacy, creating laws and regulations, but then we toss restraint to the wind for a viral video app or to be on the latest trend.
7. What does DeepSeek mean for AI going forward?
My brilliant friend Dr. Radhika Dirks, founder of multiple AI companies, once said to me that it made no sense that each AI model started from scratch, taking in data as if no AI model had ever learned anything in the past. With DeepSeek, those days are over.
There will be a time for starting a model from scratch.
But distillation, aka having current models teach the next models, will play a much bigger role going forward.
This means we should see:
- Rapid cost declines for AI use.
- Even faster advancement in AI, as each model builds on the best.
- More mistakes (fondly called hallucinations) potentially carried forward from one model to the next, and
- An acceleration of specific vertical applications being built on these AI models as well as case-specific AI models.
If you enjoyed this analysis and want to learn how to do it yourself, check out my book MegaWealth: Investing where I share how to think like a professional investor.
My next article will dive into what the AI industry may look like going forward and how AI may impact existing industries.
Useful references: