The AI (artificial intelligence) model s1, created by American researchers, reportedly cost under $50 in rented compute to train, yet is said to offer reasoning capabilities comparable to OpenAI's much more expensive o1 model. Its appearance follows the striking success of DeepSeek, which has caused a stir in Silicon Valley in recent days.
The 'Cheap AI' War Is Getting Livelier Since the Emergence of DeepSeek
The team has made s1 publicly available on GitHub, along with the code and data used to build it. A paper published last week explains how the model was developed and highlights the techniques the team used. Rather than building a new reasoning model from scratch, the team took an existing language model and fine-tuned it, distilling reasoning capabilities from Google’s Gemini 2.0 Flash Thinking Experimental model.
AI's training cost is 'under $50'
Training the s1 model took just 30 minutes on 16 Nvidia H100 GPUs. Although each GPU costs around $25,000 to buy, renting the necessary compute from a cloud service cost less than $50. The team also discovered a useful trick: instructing the model to “wait” before giving its final answer improved its reasoning and produced better solutions.
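As a rough check on those figures, the arithmetic can be sketched in a few lines of Python. This uses only the numbers quoted above; the function and variable names are illustrative, and the article does not state the actual hourly rental rate, so the result is only the upper bound implied by a $50 budget.

```python
# Back-of-the-envelope check using only figures quoted in the article.
# All names here are illustrative; the real cloud pricing is not given.

NUM_GPUS = 16                    # Nvidia H100 GPUs used for training
TRAINING_HOURS = 0.5             # 30 minutes
BUDGET_USD = 50.0                # reported upper bound on rental cost
GPU_PURCHASE_PRICE_USD = 25_000  # approximate price of one GPU


def gpu_hours(num_gpus: int, hours: float) -> float:
    """Total GPU-hours consumed by a run."""
    return num_gpus * hours


def max_hourly_rate(budget_usd: float, total_gpu_hours: float) -> float:
    """Highest per-GPU hourly rate that still fits within the budget."""
    return budget_usd / total_gpu_hours


total_gpu_hours = gpu_hours(NUM_GPUS, TRAINING_HOURS)
implied_rate = max_hourly_rate(BUDGET_USD, total_gpu_hours)
purchase_cost = NUM_GPUS * GPU_PURCHASE_PRICE_USD

print(f"GPU-hours used: {total_gpu_hours}")
print(f"Implied ceiling on rental rate: ${implied_rate:.2f} per GPU-hour")
print(f"Cost of buying the same GPUs: ${purchase_cost:,}")
```

In other words, the run used 8 GPU-hours, so staying under $50 implies a rental rate below $6.25 per GPU-hour, against roughly $400,000 to buy the same hardware outright.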
While s1 has achieved remarkable results at a low cost, there are concerns about the model’s scalability. Relying on Google’s model as a “teacher” raises questions about whether s1 can compete with today’s leading AI models. Google will likely be keeping a close eye on the situation, especially in light of the ongoing dispute between OpenAI and DeepSeek.
Source: https://thanhnien.vn/my-tao-ra-mo-hinh-ai-sieu-re-hoat-dong-tuong-tu-gpt-o1-185250207182535164.htm