DeepSeek’s new AI model appears to be one of the best ‘open’ challengers yet

DeepSeek V3 is a powerful 'open' AI model from the Chinese lab DeepSeek, with 671 billion parameters and benchmark results that rival those of top proprietary models.

DeepSeek has released DeepSeek V3, a powerful 'open' AI model with 671 billion parameters, trained on 14.8 trillion tokens. It outperforms models like Meta's Llama 3.1 405B and OpenAI's GPT-4o on coding tasks. The model was developed on a limited budget, trained in roughly two months on Nvidia H800 GPUs. Its responses on politically sensitive topics are constrained by regulatory requirements in China.

DeepSeek's new model, DeepSeek V3, has emerged as a leading 'open' AI model, with 671 billion parameters and a 14.8-trillion-token training dataset. It outperforms competitors from Meta and OpenAI on a range of benchmarks. The model is designed to handle diverse tasks such as coding and translating, and it performed especially well on programming challenges hosted on Codeforces.
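For readers who want to try those coding abilities themselves, here is a minimal sketch of querying the model through DeepSeek's OpenAI-compatible chat API. The endpoint URL and the "deepseek-chat" model name are assumptions drawn from DeepSeek's public documentation, not details from this article, and the API key is a placeholder.

```python
# Minimal sketch: ask DeepSeek V3 a coding question via its
# OpenAI-compatible API (endpoint and model name are assumptions).
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",      # hypothetical placeholder key
    base_url="https://api.deepseek.com",  # assumed OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="deepseek-chat",  # assumed alias pointing at DeepSeek V3
    messages=[
        {
            "role": "user",
            "content": "Write a Python function that reverses a linked list.",
        },
    ],
)

print(response.choices[0].message.content)
```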

Despite a modest training budget of about $5.5 million and a cluster of just 2,048 Nvidia H800 GPUs running for roughly two months, the model achieves frontier-grade performance. That efficiency comes with a catch, however: DeepSeek V3's sheer size means it demands high-end hardware to run efficiently. DeepSeek claims the model processes around 60 tokens per second, faster than its predecessor, DeepSeek V2.
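The reported budget holds up to back-of-the-envelope arithmetic. The dollar figure below assumes roughly 57 days of training ("two months") and a rental rate of about $2 per H800 GPU-hour; the rate is an assumption, not a number from this article.

```latex
% Sanity check of the reported ~$5.5M training cost (assumed $2/GPU-hour).
2048 \text{ GPUs} \times 57 \text{ days} \times 24 \tfrac{\text{h}}{\text{day}}
  \approx 2.8\text{M GPU-hours}
\qquad
2.8\text{M GPU-hours} \times \$2/\text{GPU-hour} \approx \$5.6\text{M}
```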

Because DeepSeek is a Chinese company, DeepSeek V3's responses on political topics are constrained to align with "core socialist values," a reflection of the government's regulatory grip on the country's tech firms. The lab is backed by High-Flyer Capital Management, a quantitative hedge fund that funds its development of advanced AI and positions DeepSeek's open releases against closed-source rivals such as OpenAI.