DeepSeek launches NSA for ultra-fast long-context training and inference

2025-02-18 16:37:45

ChainCatcher reports, citing Jin10, that DeepSeek has launched NSA (Native Sparse Attention).

DeepSeek describes NSA as a hardware-aligned, natively trainable sparse attention mechanism for ultra-fast long-context training and inference. According to the company, the design is optimized for modern hardware, which speeds up inference and reduces pre-training costs without compromising performance.

DeepSeek adds that on general benchmarks, long-context tasks, and instruction-based reasoning, NSA performs on par with or better than full attention models.
