Baidu Baike, a Chinese-language service similar to Wikipedia, recently updated its robots.txt file, which tells search engine crawlers which web addresses they may access, to completely block Googlebot and Bingbot from indexing the platform's content.
The move shows Baidu's efforts to protect its online assets amid growing demand for big data to develop artificial intelligence (AI) models and applications.
Following Baidu Baike's robots.txt update, an investigation by the South China Morning Post (SCMP) found that many entries from the platform were still appearing in Google and Bing search results, likely drawn from older content that had been archived before the change.
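To illustrate how a robots.txt file can shut out specific crawlers, here is a minimal sketch using Python's standard `urllib.robotparser` module. The rules below are an assumption made for illustration; the article does not quote the actual contents of Baidu Baike's file. The example URL path and the helper name `check_crawlers` are likewise hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, NOT the actual Baidu Baike file.
# Googlebot and Bingbot are denied everything; all other crawlers are allowed.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: Bingbot
Disallow: /

User-agent: *
Allow: /
"""


def check_crawlers(robots_txt, url, agents):
    """Return a dict mapping each crawler name to whether it may fetch url."""
    parser = RobotFileParser()
    # Parse the rules from a string, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# The path in this URL is a made-up example.
result = check_crawlers(
    ROBOTS_TXT,
    "https://baike.baidu.com/item/example",
    ["Googlebot", "Bingbot", "Baiduspider"],
)
print(result)
```

Under these assumed rules, Googlebot and Bingbot are refused while Baidu's own crawler, Baiduspider, falls through to the catch-all rule and is allowed. Note that robots.txt only governs future crawling by compliant bots; it does not remove pages a search engine has already stored, which is consistent with older entries still showing up in results.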
More than two years after OpenAI launched ChatGPT, many of the world's major AI developers are signing deals with publishers to access high-quality content for their generative AI (GenAI) projects.
OpenAI signed a deal with Time magazine in June to access the magazine's entire archive, which spans more than 100 years.
Cao Phong (according to SCMP)
Source: https://www.congluan.vn/baidu-chan-google-va-bing-thu-thap-noi-dung-truc-tuyen-post309081.html