DeepSeek-V3.2: Bridging Open LLM Efficiency and Excellence

The document introduces DeepSeek-V3.2, a new open Large Language Model designed to close the performance gap between open-source and proprietary systems by focusing on efficiency and advanced reasoning. Its core architectural contribution is the DeepSeek Sparse Attention (DSA) mechanism, which significantly lowers computational complexity in long-context scenarios compared to standard attention. DeepSeek-V3.2 also relies on a scalable Reinforcement Learning framework and a novel pipeline for generating large-scale agentic data, substantially improving its tool-use and generalization capabilities. The standard DeepSeek-V3.2 achieves performance comparable to GPT-5 on reasoning tasks and notably narrows the capability gap on complex agentic benchmarks. Crucially, the experimental variant, DeepSeek-V3.2-Speciale, demonstrates state-of-the-art results, including gold-medal performance at IMO 2025 and IOI 2025, setting a new milestone for open-source LLMs.
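
The efficiency claim behind DSA is that each query attends only to a small selected subset of keys rather than the full context, cutting per-query attention cost from O(L) to O(k) for context length L and selection budget k. The sketch below is a generic top-k sparse attention illustration of that idea, not the actual DSA kernel: the function name `topk_sparse_attention`, the `top_k=64` budget, and the use of the full score matrix for selection are illustrative assumptions (a real implementation would use a cheap indexer and never materialize the dense L×L scores).

```python
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, top_k=64):
    """Illustrative top-k sparse attention (not the DSA kernel).

    Each query attends only to its top_k highest-scoring keys, so the
    softmax/value aggregation touches O(L * top_k) entries instead of
    O(L^2). Shapes: q, k, v are (L, d) for a single head.
    """
    L, d = q.shape
    scale = d ** -0.5
    # Dense scores shown for clarity; a production kernel would select
    # keys with a lightweight indexer instead of computing all of them.
    scores = (q @ k.T) * scale                         # (L, L)
    # Causal mask: a query may not attend to future positions.
    causal = torch.triu(torch.ones(L, L, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(causal, float("-inf"))
    # Keep only the top_k keys per query; mask out everything else.
    k_eff = min(top_k, L)
    topk_vals, topk_idx = scores.topk(k_eff, dim=-1)   # (L, k_eff)
    sparse = torch.full_like(scores, float("-inf"))
    sparse.scatter_(-1, topk_idx, topk_vals)
    attn = F.softmax(sparse, dim=-1)                   # zeros outside top-k
    return attn @ v                                    # (L, d)

# Usage: 1,024 tokens, 64-dim head, each query attends to at most 64 keys.
L, d = 1024, 64
q, k, v = (torch.randn(L, d) for _ in range(3))
out = topk_sparse_attention(q, k, v, top_k=64)
print(out.shape)  # torch.Size([1024, 64])
```

In long-context settings the savings come from the aggregation step: with L = 128K and k = 64, each query mixes 64 value vectors instead of up to 128K, which is the kind of reduction the summary attributes to DSA.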