Jack Ma's Ant Group enters China's low-cost AI market. Photo: Bloomberg.
Ant Group, the company backed by billionaire Jack Ma, is developing a technique that could cut its AI training costs by about 20% by using semiconductor chips sourced from China, according to Bloomberg.
The company trains its AI on chips purchased from Alibaba and Huawei, combined with the Mixture of Experts (MoE) machine learning method also used in DeepSeek R1.
Despite the lower costs, Ant Group says its results are comparable to those achieved by AI companies using powerful Nvidia chips such as the H800.
For its latest AI models, the company now relies mainly on alternatives, including chips from AMD and Chinese manufacturers.
Using high-performance hardware, Ant Group spent 6.35 million yuan (about $880,000) to train a model on 1 trillion tokens; with its optimizations, that figure dropped to 5.1 million yuan. Tokens are the units of text a model consumes during training in order to learn about the world and produce useful responses.
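As a quick sanity check on those figures (purely illustrative, using only the numbers reported above), the drop from 6.35 million to 5.1 million yuan works out to roughly the 20% saving cited:

```python
# Verify that the reported training costs imply the ~20% saving
# cited in the article (all figures come from the article itself).
baseline_yuan = 6_350_000   # cost to train on 1 trillion tokens, high-performance hardware
optimized_yuan = 5_100_000  # cost after Ant's optimizations

saving = (baseline_yuan - optimized_yuan) / baseline_yuan
print(f"{saving:.1%}")  # -> 19.7%, i.e. roughly the reported 20%
```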
This marks Ant's entry into the accelerating AI race between China and the US, which intensified after DeepSeek showed that capable models can be trained for far less than the billions of dollars spent by OpenAI or Google.
Nvidia's H800, while not the most advanced chip, is still a powerful processor and is banned from export to China by the US, so Chinese companies are scrambling to find alternatives to keep up with the competition.
Ant Group previously released research suggesting its models sometimes outperformed Meta Platforms' on certain benchmarks. If the results hold up, the models would mark a major step forward for Chinese AI, achieved at significantly reduced development costs.
This achievement is due to the MoE machine learning method popularized by DeepSeek, which increases efficiency and reduces computational costs. The approach divides a model into specialized sub-networks, or "experts," and activates only the small subset needed to handle each task rather than the entire network.
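For intuition, here is a minimal sketch of MoE-style top-k routing in Python. This is not Ant's or DeepSeek's actual architecture: the expert count, dimensions, and gating scheme below are illustrative assumptions chosen for brevity.

```python
# Minimal sketch of Mixture-of-Experts (MoE) routing -- the general
# technique named in the article, not any company's real implementation.
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 8   # total expert sub-networks held by the model (assumed)
TOP_K = 2         # experts actually activated per token (assumed)
DIM = 16          # toy hidden dimension

# Each "expert" is a small feed-forward weight matrix.
experts = [rng.normal(size=(DIM, DIM)) for _ in range(NUM_EXPERTS)]
# The gate scores every expert for a given token.
gate_w = rng.normal(size=(DIM, NUM_EXPERTS))

def moe_layer(token: np.ndarray) -> np.ndarray:
    """Route one token through only TOP_K of NUM_EXPERTS experts."""
    scores = token @ gate_w                # one score per expert
    top = np.argsort(scores)[-TOP_K:]      # indices of the best-scoring experts
    weights = np.exp(scores[top])          # softmax over the selected experts
    weights /= weights.sum()
    # Only the selected experts run; the other 6 of 8 stay idle,
    # which is where the compute savings come from.
    return sum(w * (token @ experts[i]) for w, i in zip(weights, top))

out = moe_layer(rng.normal(size=DIM))
print(out.shape)  # (16,)
```

The efficiency gain comes from the gate: each token activates only TOP_K of the NUM_EXPERTS expert networks, so most of the model's parameters sit idle on any given forward pass.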
However, training MoE models still requires high-performance chips like Nvidia’s graphics processing units (GPUs). As the title of Ant’s research paper, “Developing MoE models without high-end GPUs,” suggests, the company is trying to break that barrier.
This goes against Nvidia’s strategy, with CEO Jensen Huang arguing that computing demand will continue to grow even as more efficient models like DeepSeek R1 emerge.
He believes companies will need more powerful chips to continue to grow revenue, rather than cheaper ones to cut costs. So Nvidia is sticking to its strategy of developing GPUs with more processing cores, transistors, and memory capacity.
Meanwhile, Ant plans to build on its recently developed large language models, Ling-Plus and Ling-Lite, to provide AI solutions for industries such as healthcare and finance.
The company acquired Chinese online platform Haodf.com in 2025 to boost its artificial intelligence services in the healthcare sector, and also owns AI life assistant app Zhixiaobao and AI financial consulting service Maxiaocai.
In the paper, Ant said Ling-Lite outperformed one of Meta's Llama models on a key measure of English understanding.
Both Ling-Lite and Ling-Plus outperformed DeepSeek's equivalent models on Chinese language tests.
The Ling models have also been made publicly available. Ling-Lite has 16.8 billion parameters and Ling-Plus has 290 billion, a size considered quite large in language modeling, though still far below the reported 1,800 billion of OpenAI's GPT-4.5 and the 671 billion of DeepSeek R1.
However, Ant has encountered stability challenges during training. The company said that even small changes to the hardware or the model's structure could cause instability, including sudden spikes in the model's error rate.
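The article does not describe how Ant addressed these spikes. One generic safeguard used in large-scale training is to monitor the loss for sudden jumps; the sketch below is purely illustrative, and the class name, window size, and threshold are assumptions rather than Ant's method.

```python
# Illustrative loss-spike monitor of the kind commonly used when
# training is unstable. Thresholds and window size are assumptions.
from collections import deque

class LossSpikeMonitor:
    """Flag training steps whose loss jumps well above the recent average."""
    def __init__(self, window: int = 100, ratio: float = 1.5):
        self.history = deque(maxlen=window)
        self.ratio = ratio

    def check(self, loss: float) -> bool:
        # A step is a spike if its loss exceeds ratio * recent mean.
        spike = bool(self.history) and loss > self.ratio * (sum(self.history) / len(self.history))
        self.history.append(loss)
        return spike

monitor = LossSpikeMonitor()
for step, loss in enumerate([2.1, 2.0, 1.9, 4.8, 1.8]):
    if monitor.check(loss):
        print(f"step {step}: loss spike ({loss})")  # -> step 3: loss spike (4.8)
```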
Source: https://znews.vn/cong-ty-cua-jack-ma-lai-gay-chu-y-post1540514.html