From Computing Power Competition to Algorithm Innovation: The New Paradigm of AI Led by DeepSeek

IOBC Capital
2025-03-25 11:34:32

Author: BadBot, IOBC Capital

Just last night, DeepSeek released a V3 update on Hugging Face: DeepSeek-V3-0324, a 685-billion-parameter model with significantly improved coding, UI design, and reasoning capabilities.

At the recently concluded GTC 2025 conference, Jensen Huang spoke highly of DeepSeek and pushed back on the market's earlier read that its efficient models would shrink demand for Nvidia chips, arguing that future computing demand will only grow, not shrink.

As a star product of algorithm breakthroughs, what is DeepSeek's relationship with Nvidia's computing power supply? I would like to first discuss the significance of computing power and algorithms for industry development.

The Symbiotic Evolution of Computing Power and Algorithms

In the field of AI, the enhancement of computing power provides a foundation for running more complex algorithms, enabling models to process larger amounts of data and learn more complex patterns; while the optimization of algorithms allows for more efficient utilization of computing power, improving the efficiency of computational resource usage.

The symbiotic relationship between computing power and algorithms is reshaping the AI industry landscape:

  • Technological Route Differentiation: Companies like OpenAI pursue the construction of ultra-large computing clusters, while DeepSeek focuses on optimizing algorithm efficiency, forming different technical schools.

  • Industrial Chain Restructuring: Nvidia has become the dominant player in AI computing power through the CUDA ecosystem, while cloud service providers lower deployment thresholds through elastic computing services.

  • Resource Allocation Adjustment: Enterprises seek a balance between investment in hardware infrastructure and the development of efficient algorithms.

  • Rise of Open Source Communities: Open-source models like DeepSeek and LLaMA enable the sharing of algorithm innovations and computing power optimization results, accelerating technological iteration and diffusion.

DeepSeek's Technological Innovations

The explosive popularity of DeepSeek is undoubtedly linked to its technological innovations, which I will explain in simple terms for better understanding.

Model Architecture Optimization

DeepSeek adopts a combination architecture of Transformer + MOE (Mixture of Experts) and introduces a Multi-Head Latent Attention (MLA) mechanism. This architecture is like a super team, where the Transformer handles routine tasks, while the MOE acts as a group of experts, each with their own area of expertise. When faced with specific problems, the most skilled expert handles it, greatly improving the model's efficiency and accuracy. The MLA mechanism allows the model to flexibly focus on different important details when processing information, further enhancing the model's performance.
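The routing idea behind MoE can be sketched in a few lines. This is a toy illustration of top-k expert selection, not DeepSeek's actual implementation; the gate matrix, expert count, and dimensions here are all made up for demonstration.

```python
import numpy as np

def moe_layer(x, gate_w, experts, top_k=2):
    """Toy Mixture-of-Experts routing: a gate scores every expert for the
    input token, but only the top-k experts actually run (sparse compute)."""
    scores = x @ gate_w                      # gating logits, one per expert
    top = np.argsort(scores)[-top_k:]        # indices of the k best experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                 # softmax over the chosen experts
    # Only the selected experts are evaluated; the rest cost nothing.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
x = rng.standard_normal(8)                   # one token's hidden state
gate_w = rng.standard_normal((8, 4))         # gate scoring 4 experts
experts = [lambda v, W=rng.standard_normal((8, 8)): v @ W for _ in range(4)]
y = moe_layer(x, gate_w, experts)
print(y.shape)  # (8,)
```

With 4 experts and top_k=2, only half of the expert weights are touched per token; in a large model with hundreds of experts the savings are far greater.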

Training Method Innovation

DeepSeek has proposed an FP8 mixed precision training framework. This framework acts like an intelligent resource allocator, dynamically selecting the appropriate computational precision based on the needs of different stages during training. When high-precision calculations are needed, it uses higher precision to ensure the model's accuracy; when lower precision is acceptable, it reduces precision to save computational resources, increase training speed, and reduce memory usage.
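The core trade-off of mixed precision can be shown with a toy example. NumPy has no 8-bit float type, so float16 stands in for FP8 here; this is an illustration of the principle (compute in low precision, keep results and master weights in float32), not DeepSeek's actual framework.

```python
import numpy as np

def mixed_precision_matmul(a, b, low=np.float16):
    """Sketch of mixed-precision arithmetic: run the expensive matmul in a
    low-precision format, then store the result back in float32.
    (float16 stands in for FP8, which NumPy does not support.)"""
    return (a.astype(low) @ b.astype(low)).astype(np.float32)

# Master copies stay in float32; only the compute-heavy step is cast down.
rng = np.random.default_rng(1)
a = rng.standard_normal((64, 64)).astype(np.float32)
b = rng.standard_normal((64, 64)).astype(np.float32)
exact = a @ b
approx = mixed_precision_matmul(a, b)
print(float(np.max(np.abs(exact - approx))))  # small rounding error
```

The rounding error is tiny, but on real hardware the low-precision path roughly halves memory traffic (and FP8 halves it again versus FP16), which is where the speed and memory savings come from.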

Inference Efficiency Improvement

During the inference phase, DeepSeek introduces Multi-Token Prediction (MTP). Traditional decoding predicts one token per step; MTP predicts several tokens at once, significantly speeding up inference while also reducing its cost.
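The difference in call counts can be sketched with a toy decoder. The "model" below is a made-up stand-in that just emits a counter, and real multi-token schemes typically add a verification step that this sketch omits; the point is only that n tokens need roughly n/k model calls instead of n.

```python
def generate_one_at_a_time(model, prompt, n):
    """Classic autoregressive decoding: one model call per output token."""
    seq = list(prompt)
    for _ in range(n):
        seq.append(model(seq)[0])            # take a single next token
    return seq

def generate_multi_token(model, prompt, n, k=4):
    """Multi-token prediction sketch: each model call proposes k tokens,
    so n tokens need about n/k calls instead of n."""
    seq = list(prompt)
    while len(seq) - len(prompt) < n:
        seq.extend(model(seq)[:k])           # accept k predicted tokens at once
    return seq[:len(prompt) + n]

# Hypothetical "model": continues the sequence with an incrementing counter.
toy_model = lambda seq: [seq[-1] + i + 1 for i in range(4)]
print(generate_multi_token(toy_model, [0], 8))  # [0, 1, 2, 3, 4, 5, 6, 7, 8]
```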

Breakthrough in Reinforcement Learning Algorithms

DeepSeek's reinforcement learning algorithm GRPO (Group Relative Policy Optimization) streamlines the training process. Reinforcement learning acts like a coach, guiding the model toward better behavior through rewards and penalties. Traditional algorithms such as PPO require a separate value (critic) model and can consume substantial computational resources; GRPO instead scores each sampled answer against the rest of its group, improving performance while cutting unnecessary computation and striking a balance between performance and cost.
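The core trick can be shown in a few lines: the advantage of each answer is computed from its own group's statistics, so no critic network is needed. The reward values below are made up for illustration, and a full GRPO update (policy-ratio clipping, KL penalty) is omitted.

```python
import statistics

def group_relative_advantages(rewards):
    """Core idea of GRPO: sample a group of answers to the same prompt,
    then score each one relative to the group's mean and spread.
    This replaces the value/critic network that PPO would need."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # avoid division by zero on ties
    return [(r - mean) / std for r in rewards]

# Four sampled answers to one prompt, scored by a reward model (toy values):
adv = group_relative_advantages([0.2, 0.9, 0.4, 0.5])
print(adv)  # above-average answers get positive advantage, below-average negative
```

Because the baseline comes from the group itself, the memory and compute that a critic model would cost simply disappear from the training loop.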

These innovations are not isolated technical points but form a complete technical system that reduces computing power requirements across the entire chain from training to inference. Ordinary consumer-grade graphics cards can now run powerful AI models, significantly lowering the threshold for AI applications, allowing more developers and enterprises to participate in AI innovation.


Impact on Nvidia

Many people believe that DeepSeek has bypassed the CUDA layer, thus freeing itself from dependence on Nvidia. In reality, DeepSeek directly optimizes algorithms through Nvidia's PTX (Parallel Thread Execution) layer. PTX is an intermediate representation language between high-level CUDA code and actual GPU instructions. By operating at this level, DeepSeek can achieve more refined performance tuning.

The impact on Nvidia is twofold: on one hand, DeepSeek is actually more deeply bound to Nvidia's hardware and CUDA ecosystem, and the lowering of AI application thresholds may expand the overall market size; on the other hand, DeepSeek's algorithm optimization may change the market's demand structure for high-end chips, with some AI models that originally required GPUs like the H100 now potentially running efficiently on A100 or even consumer-grade graphics cards.

Significance for China's AI Industry

DeepSeek's algorithm optimization provides a technological breakthrough path for China's AI industry. In the context of high-end chip constraints, the idea of "software compensating for hardware" alleviates dependence on top imported chips.

Upstream, efficient algorithms reduce the pressure on computing power demands, allowing computing service providers to extend hardware usage cycles through software optimization and improve return on investment. Downstream, optimized open-source models lower the threshold for AI application development. Many small and medium-sized enterprises can develop competitive applications based on the DeepSeek model without needing large amounts of computing resources, leading to the emergence of more vertical AI solution offerings.

Far-reaching Impact on Web3 + AI

Decentralized AI Infrastructure

DeepSeek's algorithm optimization provides new momentum for Web3 AI infrastructure. The innovative architecture, efficient algorithms, and lower computing power requirements make decentralized AI inference possible. The MoE architecture is naturally suitable for distributed deployment, where different nodes can hold different expert networks without requiring a single node to store the complete model, significantly reducing the storage and computational requirements of a single node, thus enhancing the model's flexibility and efficiency.
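The placement idea can be sketched as a routing table: each node hosts a subset of experts, and a router only needs to know which node holds which expert. The placement table and node names below are hypothetical; a real decentralized deployment would also need a gossip/discovery protocol, which this sketch omits.

```python
def route_to_node(expert_id, node_experts):
    """Sketch of decentralized MoE serving: look up which node hosts a
    given expert. No node needs to store the complete model."""
    for node, experts in node_experts.items():
        if expert_id in experts:
            return node
    raise KeyError(f"no node hosts expert {expert_id}")

# Hypothetical placement: six experts sharded across three nodes.
placement = {"node-a": {0, 1}, "node-b": {2, 3}, "node-c": {4, 5}}
print(route_to_node(3, placement))  # node-b
```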

The FP8 training framework further reduces the demand for high-end computing resources, allowing more computing resources to join the node network. This not only lowers the threshold for participating in decentralized AI computation but also enhances the overall computational capacity and efficiency of the network.

Multi-Agent System

  • Intelligent trading strategy optimization: agents that analyze real-time market data, predict short-term price moves, execute on-chain trades, and audit the results can cooperate to pursue higher returns for users.

  • Automated execution of smart contracts: agents that monitor contracts, trigger execution, and verify outcomes work together to automate more complex business logic.

  • Personalized portfolio management: AI finds the best staking or liquidity-provision opportunities for users in real time, based on their risk preferences, investment goals, and financial situation.

"We can only see a short distance ahead, but we can see plenty there that needs to be done." DeepSeek is seeking breakthroughs through algorithm innovation under computing power constraints, paving a differentiated development path for China's AI industry. By lowering application thresholds, promoting the integration of Web3 and AI, reducing dependence on high-end chips, and empowering financial innovation, it is reshaping the digital economy landscape. The future of AI is no longer just a competition of computing power, but a competition in the joint optimization of computing power and algorithms. In this new race, innovators like DeepSeek are redefining the rules of the game with Chinese wisdom.
