Researchers at Stanford University and the University of Washington created a reasoning AI model for under $50 in cloud compute credits.
On tests of math and coding ability, the model, named s1, performs on par with state-of-the-art reasoning models such as OpenAI's o1 and DeepSeek's R1.
Notably, s1 is open source: the model, along with the data and code used to train it, is available in a GitHub repository for anyone to access.
The team behind s1 said they started from an off-the-shelf base model, then refined it through "distillation", a process of extracting the "reasoning" capability from another AI model by training on its answers.
Specifically, s1 was distilled from Google's Gemini 2.0 Flash Thinking Experimental model, roughly the same approach researchers at UC Berkeley used to build their own reasoning model for about $450.
The researchers behind s1 sought the simplest approach to achieving strong reasoning performance and "test-time scaling", that is, letting an AI model think longer before it answers a question.
These were among the breakthroughs behind OpenAI's o1, which DeepSeek and other AI labs have tried to replicate through various techniques.
The s1 paper shows that reasoning models can be distilled with a fairly small dataset through a process called supervised fine-tuning (SFT), in which an AI model is explicitly instructed to mimic certain behaviors in a dataset.
SFT is generally cheaper than the large-scale reinforcement learning approach that DeepSeek used to train the R1 model.
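To make that concrete, below is a minimal sketch of SFT-style distillation, assuming a Hugging Face causal language model: the student is trained with ordinary next-token cross-entropy to reproduce a teacher's reasoning trace and final answer. The model name and the single training example are placeholders, not the s1 team's actual setup.

```python
# Minimal sketch of distillation via supervised fine-tuning (SFT):
# the student model learns to reproduce a teacher's full response
# (reasoning trace + answer) with standard next-token cross-entropy.
# The model name and example below are illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-0.5B-Instruct"  # small stand-in; s1 used a larger Qwen base model
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# One (question, teacher reasoning + answer) pair; s1 used ~1,000 of these.
prompt = "Question: What is 17 * 24?\n"
teacher_output = "Thinking: 17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408.\nAnswer: 408"

# Tokenize prompt + teacher output together, then mask the prompt
# tokens so the loss covers only the teacher's tokens -- the
# behavior we want the student to imitate.
prompt_ids = tok(prompt, return_tensors="pt").input_ids
full_ids = tok(prompt + teacher_output, return_tensors="pt").input_ids
labels = full_ids.clone()
labels[:, : prompt_ids.shape[1]] = -100  # -100 is ignored by the loss

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
loss = model(input_ids=full_ids, labels=labels).loss
loss.backward()
optimizer.step()
```

In a real run this step would iterate over the whole dataset for a few epochs; the point is that no reinforcement learning machinery is involved, only supervised imitation of the teacher's outputs.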
Google provides free access to Gemini 2.0 Flash Thinking Experimental via its Google AI Studio platform, albeit with daily rate limits.
However, Google's terms prohibit reverse engineering its models to develop services that compete with the company's AI products.
s1 is based on a small, freely downloadable AI model from Qwen, an AI lab owned by Alibaba. To train s1, the researchers built a dataset of just 1,000 carefully curated questions, paired with answers and the "thinking" process behind each answer, drawn from Google's Gemini 2.0 Flash Thinking Experimental.
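For illustration, one record in such a dataset might look like the following; the field names and file name are hypothetical, not the schema the s1 team actually published.

```python
# Hypothetical shape of one training record: a question plus the
# teacher's (Gemini's) reasoning trace and final answer. Field names
# are illustrative, not the s1 repository's actual schema.
import json

record = {
    "question": "How many primes are there below 30?",
    "teacher_thinking": "List them: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29. That is 10 primes.",
    "teacher_answer": "10",
}

with open("s1_distillation_data.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(record, ensure_ascii=False) + "\n")
```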
Training took less than 30 minutes on 16 Nvidia H100 GPUs, yet still produced strong results on several AI benchmarks. The necessary compute could be rented for only about $20, said Niklas Muennighoff, a Stanford researcher who worked on the project.
The researchers also used a neat trick to get s1 to double-check its work and stretch out its "thinking time": appending the word "wait" to the model's reasoning process, which prompted it to keep reasoning and arrive at more accurate answers.
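A rough sketch of that trick, again assuming a Hugging Face causal language model, is shown below: instead of letting the model move on to its answer, "Wait" is appended to the generated text so it keeps reasoning. The model name, prompt format, and fixed generation lengths are stand-ins, not the s1 implementation, which cuts generation off at an end-of-thinking delimiter before appending the word.

```python
# Sketch of the "wait" trick: whenever the model would stop thinking,
# append "Wait" and resume generation, nudging it to re-check its work.
# Model name and prompt format are placeholders, not the s1 setup.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-0.5B-Instruct"  # stand-in for the actual s1 model
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

text = "Question: Is 1,001 a prime number?\nThinking:"
for _ in range(2):  # force two extra rounds of "thinking"
    ids = tok(text, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=200, do_sample=False)
    text = tok.decode(out[0], skip_special_tokens=True)
    text += "\nWait"  # block the answer; make the model reconsider

# Final pass: let the model finish its reasoning and give an answer.
ids = tok(text, return_tensors="pt").input_ids
out = model.generate(ids, max_new_tokens=200, do_sample=False)
print(tok.decode(out[0], skip_special_tokens=True))
```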
In 2025, Meta, Google, and Microsoft plan to invest hundreds of billions of dollars in AI infrastructure, part of which will go toward training next-generation AI models. Investment at that scale may still be necessary to push the frontier of AI innovation.
Distillation has proven to be a good way to cheaply recreate the capabilities of existing AI models, but it does not produce new models that are dramatically better than what is available today.
(According to TechCrunch)
Source: https://vietnamnet.vn/he-lo-bi-mat-tao-ra-mo-hinh-ai-ly-luan-sieu-re-chua-den-2-trieu-dong-2369052.html