Baidu Baike, the Chinese equivalent of Wikipedia, recently updated its robots.txt file (the file that tells search-engine crawlers which web addresses they may access) and completely blocked Googlebot and Bingbot from indexing the platform's content.
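To illustrate how such a block works, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules below are hypothetical, of the kind Baidu Baike reportedly adopted: they deny Googlebot and Bingbot everything while leaving other crawlers unaffected.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules (not Baidu Baike's actual file):
# block Googlebot and Bingbot entirely, allow all other crawlers.
rules = """\
User-agent: Googlebot
Disallow: /

User-agent: Bingbot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Googlebot and Bingbot are denied every path; other agents are allowed.
print(parser.can_fetch("Googlebot", "/item/some-article"))  # False
print(parser.can_fetch("Bingbot", "/item/some-article"))    # False
print(parser.can_fetch("OtherBot", "/item/some-article"))   # True
```

Note that robots.txt is advisory: well-behaved crawlers such as Googlebot and Bingbot honor it, but the file itself does not technically prevent access.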
The move shows Baidu is trying to protect its online assets amid growing demand for big data to develop artificial intelligence (AI) models and applications.
Following Baidu Baike's robots.txt update, an SCMP investigation found that many entries from the platform still appeared in Google and Bing search results, possibly served from previously archived copies of the content.
More than two years after OpenAI launched ChatGPT, many of the world's major AI developers have been signing deals with content publishers to secure high-quality material for their generative AI projects.
OpenAI signed a deal with Time magazine in June giving it access to the magazine's entire archive, spanning more than 100 years.
Cao Phong (according to SCMP)
Source: https://www.congluan.vn/baidu-chan-google-va-bing-thu-thap-noi-dung-truc-tuyen-post309081.html