On August 21, VinBigdata announced that it had successfully built a large Vietnamese language model, laying a foundation for mastering generative AI technology.
Large language models (LLMs) are models trained with deep learning techniques on huge text or image datasets. These models can understand knowledge, generate text, and perform various natural language processing tasks. They are considered the key to developing generative AI, a technology capable of producing new content and ideas in many different forms (text, images, audio, etc.).
Having built a large Vietnamese language model, VinBigdata will integrate the technology to turn VinBase (a comprehensive multi-cognitive artificial intelligence platform) into a generative AI platform in Vietnam, while providing solutions based on this technology such as generative AI chatbots, callbots, and a new generation of the ViVi virtual assistant. The technology makes machine communication more natural and helps users search for and synthesize information faster and more simply than before.
Professor Vu Ha Van, Scientific Director of VinBigdata, said that a number of large corporations around the world have successfully researched and launched products based on large language models, such as OpenAI with ChatGPT and Google with Bard. In Vietnam, VinBigdata has received investment from Vingroup to build a large Vietnamese language model. According to Mr. Van, this model focuses on solving three core problems: improving accuracy, reducing infrastructure costs, and ensuring security.
"Instead of needing about 175 billion parameters like ChatGPT, VinBigdata can create a large language model with several billion parameters that is still able to generate highly authentic text, focusing on Vietnamese data and Vietnamese knowledge," the company's leader said.
Mastering the technology and developing a large Vietnamese language model in-house from the first steps is considered a step forward that helps VinBigdata bring generative AI into its ecosystem of products and services on the market. The company has begun applying the new technology to the VinBase KB product line (VinBase Knowledge Base Portal). The product can retrieve information and automatically generate answers based on information collected from extremely large datasets in the knowledge system.
In December this year, the Vingroup member company is expected to launch two main product lines: VinBase 2.0 and the ViGPT application. VinBase 2.0 is a multi-cognitive generative AI platform with solutions serving businesses and government agencies. The ViGPT application, meanwhile, is introduced by the company as the "Vietnamese version of ChatGPT" and will be open for the community to access and test. With ViGPT, users can ask questions and get answers on topics specific to Vietnam (regulations, legal documents) or on local information (history, literature, scenic spots, local specialties).
VinBigdata Joint Stock Company has a database system of up to 3,500 terabytes. The system holds hundreds of thousands of hours of voice data, along with images and other information, that have been cleaned, processed, classified, and used for AI training. The company's R&D infrastructure includes dozens of Nvidia DGX A100 server clusters. It also has a team of Vietnamese professors, scientists, and technology experts from all over the world.
Hoai Phuong