DeepSeek's surprisingly inexpensive AI model challenges industry giants. Despite claims of a mere $6 million training cost for DeepSeek V3, a closer look reveals a far more substantial investment.
DeepSeek's self-introduction: "Hi, I was created so you can ask anything and get an answer that might even surprise you," highlights its ambition. This AI has significantly impacted the market, notably causing a major NVIDIA stock drop.
Image: ensigame.com
DeepSeek V3's innovative architecture is key to its performance:
- Multi-token Prediction (MTP): Predicts multiple words simultaneously, boosting accuracy and speed.
- Mixture of Experts (MoE): Employs 256 neural networks, activating eight for each token, accelerating training and improving performance.
- Multi-head Latent Attention (MLA): Repeatedly extracts key details, minimizing information loss and enhancing nuance understanding.
Image: ensigame.com
However, SemiAnalysis revealed DeepSeek's extensive infrastructure: approximately 50,000 Nvidia Hopper GPUs (including H800, H100, and H20 units) spread across multiple data centers. The total server investment is estimated at $1.6 billion, with operational costs reaching $944 million.
DeepSeek, a subsidiary of High-Flyer, owns its data centers, offering control and faster innovation implementation. Its self-funded status enhances agility. High salaries (over $1.3 million annually for some researchers) attract top Chinese talent.
Image: ensigame.com
The $6 million training cost claim is misleading, representing only pre-training GPU usage, excluding research, refinement, data processing, and infrastructure. DeepSeek's total AI investment surpasses $500 million. Its lean structure facilitates efficient innovation.
Image: ensigame.com
DeepSeek's success showcases the potential of a well-funded independent AI company. However, its "budget-friendly" narrative is exaggerated; billions in investment, technological advancements, and a strong team are crucial factors. Despite this, DeepSeek's costs still significantly undercut competitors (e.g., $5 million for R1 versus $100 million for ChatGPT4o).