DeepSeek's new AI continues to cause a stir in the tech world with its outstanding performance. Photo: SCMP.
DeepSeek has officially introduced DeepSeek V3-0324, the latest version in its V3 series of large language models (LLMs).
Like its predecessors, the model is released free of charge as open source on the Hugging Face platform, and it brings significant improvements over earlier versions, especially in reasoning and programming.
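Because the weights are published on Hugging Face, the model can in principle be loaded with the standard transformers library. The sketch below is illustrative only: it assumes the public repo id "deepseek-ai/DeepSeek-V3-0324" and the usual AutoModel API, and in practice the full model is far too large for a single consumer GPU, so real use typically relies on multi-GPU setups or hosted endpoints.

```python
# Minimal sketch of loading the open-source weights with Hugging Face transformers.
# Assumes the repo id "deepseek-ai/DeepSeek-V3-0324"; hardware requirements are substantial.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-V3-0324"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",       # keep the dtype stored in the checkpoint
    device_map="auto",        # shard the weights across available devices
    trust_remote_code=True,   # the repo ships custom model code alongside the weights
)

prompt = "Write a Python function that checks whether a number is prime."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```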
Specifically, according to OpenRouter, DeepSeek V3-0324 is built on a Mixture of Experts (MoE) architecture, an approach widely used in several Chinese AI models, and has 685 billion parameters.
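The practical point of an MoE design is that, although the model stores hundreds of billions of parameters, a router activates only a few small "expert" networks per token, so the compute per token is a fraction of the total size. The toy layer below is a generic illustration of that routing idea, not DeepSeek's actual implementation; all names and sizes are made up for the example.

```python
# Illustrative toy Mixture of Experts (MoE) layer: a router picks the top-k experts
# for each token, so only a small subset of parameters is used per token.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    def __init__(self, dim=64, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, num_experts)  # scores every expert for each token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):                               # x: (tokens, dim)
        scores = self.router(x)                         # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # keep only the top-k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e in range(len(self.experts)):
                mask = idx[:, k] == e                   # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * self.experts[e](x[mask])
        return out

tokens = torch.randn(10, 64)
print(ToyMoELayer()(tokens).shape)  # torch.Size([10, 64])
```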
According to early evaluations, the model shows impressive performance across a variety of tasks. Meanwhile, a Reddit post indicates that DeepSeek V3-0324 has caught up with Anthropic's Claude Sonnet 3.7 in a code-generation test.
Sources also note that DeepSeek V3-0324 can generate long code snippets without errors. Analytics Vidhya tested the model and reported that it produced 700 lines of code smoothly.
On X, demonstrations of DeepSeek V3-0324 also generated considerable buzz. Deepanshu Sharma's account posted a video showing the model smoothly generating a complete website of more than 800 lines of code.
DeepSeek became the most notable Chinese AI company in December 2024 when it released DeepSeek-V3, a model that achieved performance comparable to GPT-4o but used a fraction of the computational resources.
Not long after, DeepSeek followed up with its DeepSeek-R1 reasoning model. According to TechCrunch , R1 outperformed OpenAI’s o1 on benchmarks like AIME, MATH-500, and SWE-bench Verified.
At the same time, the reported $5.6 million cost of the final training run for DeepSeek's model is also striking compared with the hundreds of millions of dollars that leading US companies spend to train theirs.
Source: https://znews.vn/at-chu-bai-moi-cua-deepseek-lo-dien-post1540831.html