From OpenAI CEO Sam Altman to Google Brain founder Andrew Ng, some of the world's most prominent AI figures have praised DeepSeek's open-source approach after the Chinese startup launched two cutting-edge AI models.

The Hangzhou-based company stunned the global AI industry with its open-source reasoning model R1.

Released on January 20, the model shows comparable performance to closed-source models from OpenAI – the developer of ChatGPT – but the training costs are said to be much lower.

The AI chatbot developed by DeepSeek has been downloaded millions of times worldwide. Photo: WSJ

DeepSeek V3, the company's foundational large language model, was released in late December 2024 and cost just $5.5 million to train, according to DeepSeek.

The company's announcement raised questions about whether tech companies were overspending on graphics chips (GPUs) for AI training, leading to a sell-off in related tech stocks.

Last week, in an “Ask Me Anything” session on Reddit, Altman conceded that OpenAI had been on the wrong side of the open-source debate and needed to find a different open-source strategy.

The company has always taken a closed approach, keeping details such as specific training methods and the energy costs of its models secret.

“That said, not everyone at OpenAI shares this view,” and “it’s not our highest priority right now,” the OpenAI CEO admitted.

Andrew Ng, founder of Google Brain and former chief scientist at Baidu, said products from DeepSeek and its Chinese peers show that China is quickly catching up with the US in AI.

“When ChatGPT launched in November 2022, the US was significantly ahead of China in generative AI… but in fact, that gap has been rapidly eroding over the past two years,” he wrote on X. “With models from China like Qwen, Kimi, InternVL, and DeepSeek, China is clearly closing the gap, and in areas like video generation, there have been times when China has appeared to be ahead.”

The Qwen model was developed by Alibaba, while Kimi and InternVL are products of the startup Moonshot AI and the Shanghai AI Lab, respectively.

If the US continues to block open source, China will dominate this part of the supply chain, and many businesses will eventually adopt models that reflect Chinese values more than American ones, Ng said.

A number of US companies are looking to integrate DeepSeek’s model into their products. For example, users of Nvidia’s NIM service have been able to access the R1 model since last week, while Microsoft is also supporting R1 on its Azure cloud and GitHub. Amazon allows customers to build applications using R1 through AWS.

However, some experts believe that DeepSeek's success should not be overstated. Meta's chief AI scientist Yann LeCun said the idea that "China will surpass the US in AI" because of DeepSeek is wrong.

Rather, “open source models are surpassing proprietary models,” he wrote on Threads.

DeepSeek – a startup spun out of founder Liang Wenfeng’s hedge fund High-Flyer in May 2023 – still faces skepticism about its actual costs and AI model training methods.

Fudan University computer science professor Zheng Xiaoqing pointed out that, according to the startup's own technical report, the stated cost of training DeepSeek V3 does not include costs related to testing and research.

DeepSeek's success comes from “technical optimization,” he said, so it is unlikely to have a major impact on chip procurement or shipments.

(According to SCMP)