DeepSeek V3
Provider: DeepSeek

A highly efficient Mixture-of-Experts (MoE) model with 671B total parameters (37B activated per token), leveraging Multi-head Latent Attention (MLA) for fast inference and supporting a 128K-token context window. It offers advanced chain-of-thought support and API compatibility with V2.
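
Since the API is compatible with V2 and follows the OpenAI chat-completions format, a call to the model might look like the minimal sketch below. The base URL, model identifier, and environment-variable name are assumptions; verify them against the provider's documentation.

```python
# Minimal sketch: querying DeepSeek V3 via its OpenAI-compatible
# chat-completions API. The base URL, model name, and env var below
# are assumptions -- confirm them in the provider docs.
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],  # assumed env var name
    base_url="https://api.deepseek.com",     # assumed API endpoint
)

response = client.chat.completions.create(
    model="deepseek-chat",  # assumed identifier for DeepSeek V3
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain MoE routing in one sentence."},
    ],
)

print(response.choices[0].message.content)
```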