AI developers rely on collecting large amounts of data from a variety of sources to create large language models. These models are the technology behind chatbots like OpenAI's ChatGPT and its rival, Anthropic's Claude.
Anthropic was founded by a group of former OpenAI researchers with the promise of developing “responsible” AI systems.
However, Matt Barrie, CEO of Freelancer.com, has accused the San Francisco-based company of aggressively scraping data from the freelance jobs portal, which receives millions of visitors a day.
Anthropic has created some of the world's most advanced chatbots, rivaling OpenAI's ChatGPT. Photo: Jakub Porzycki
According to data shared with the Financial Times, Freelancer.com received 3.5 million visits from a web “crawler” linked to Anthropic within four hours.
Barrie added that traffic from these bots continued to increase even after Freelancer.com attempted to deny the requests, using standard web protocols to instruct the crawler. Barrie then decided to block traffic from Anthropic's internet addresses altogether.
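The two defenses described here, asking a crawler to stay away through standard web protocols and then blocking its addresses outright, can be sketched roughly as follows. This is a minimal illustration, not Freelancer.com's actual setup: it assumes the "standard web protocol" is a robots.txt file, and the crawler name `ExampleBot`, the helper functions, and the blocked address range are all hypothetical placeholders.

```python
import ipaddress
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: "ExampleBot" is a placeholder crawler name,
# not a token taken from the article.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Allow: /
"""

# Hypothetical blocklist using a range reserved for documentation (RFC 5737).
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def robots_allows(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def ip_is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in BLOCKED_NETWORKS)


if __name__ == "__main__":
    print(robots_allows(ROBOTS_TXT, "ExampleBot", "https://example.com/jobs"))
    print(ip_is_blocked("203.0.113.7"))
```

A robots.txt rule is only a request that a well-behaved crawler chooses to honor, which is why a server-side address block is the usual fallback when the traffic does not stop.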
Kyle Wiens, CEO of iFixit.com, said his electronics repair site received 1 million visits from Anthropic bots in 24 hours. “We had a lot of alerts (due to the high traffic),” he said.
Wiens said iFixit's terms of service prohibit using their data for machine learning purposes. "My first message to Anthropic is: if you're using this data to train your model, that's illegal. My second message is: that's not polite internet behavior," he said.
Data collection isn't new, but it has increased dramatically in the past two years due to the AI race. That has created new costs for websites.
Social network X's move to automatically collect user data to train chatbots may violate European privacy rules. Photo: Reuters
Europe's data protection watchdog is investigating social network X's decision to allow user data to be automatically fed to artificial intelligence startup xAI.
Specifically, on July 26, X users discovered that they had been opted in by default to having their posts and interactions with the Grok chatbot used to "train and refine" xAI's systems.
The change was made without users' explicit consent to share their data. The setting cannot be changed in X's mobile app, only in the desktop version.
Privacy experts have questioned whether X’s move violates the rules of the EU’s General Data Protection Regulation, which requires companies that collect or use personal data to first obtain an individual’s consent and disclose why they are doing so. If Ireland’s regulator opens an investigation, X could face fines or penalties.
Last month, Meta paused its plans to train AI on data from its Facebook and Instagram platforms in Europe after receiving a request from Ireland's Data Protection Commission (DPC) over GDPR compliance concerns. Meta said this marked “a setback for European innovation and competition in AI development.”
Ngoc Anh (according to FT)
Source: https://www.congluan.vn/hang-loat-cong-ty-ai-bi-cao-buoc-thu-thap-du-lieu-trai-phep-post305394.html