
Social media giants must address data scraping risks

A joint declaration signed by regulators from a dozen international privacy watchdogs, including the UK's ICO, Canada's OPC and Hong Kong's PCPD, has urged major social media platforms to protect users' public posts from scraping, warning that they face legal liability in most markets if they fail to do so.

“In most jurisdictions, personal information that is ‘publicly available’, ‘publicly accessible’ or ‘public in nature’ on the Internet is subject to privacy and data protection laws,” they write. “Therefore, individuals and companies that collect such personal information are responsible for ensuring that they comply with these and other applicable laws. However, social media companies and operators of other websites that host publicly accessible personal information (SMCs and other websites) also have data protection obligations with respect to third-party scraping of data from their online services. These obligations will generally apply to personal information whether it is publicly accessible or not. Bulk scraping of personal data may constitute a reportable data breach in many jurisdictions.”

The statement was also signed by privacy regulators in Australia, Switzerland, Norway, New Zealand, Colombia, Jersey, Morocco, Argentina and Mexico, all members of the Global Privacy Assembly's international enforcement cooperation working group. Its timing coincides with the current hype around generative AI models, which typically require large amounts of data for training and could encourage more entities to scrape the Internet to acquire data sets and jump on the generative AI bandwagon.

High-profile examples of such systems, such as OpenAI's large language model ChatGPT, have relied (at least in part) on data posted online to train their systems, and a class-action lawsuit filed against the US company in June, which CNN Business reported on, alleges that it secretly scraped “massive amounts of personal data from the Internet.”

Among the privacy risks highlighted by the regulators are the use of scraped data for targeted cyber attacks such as social engineering and phishing; for identity fraud; and for tracking, profiling and surveillance of people, such as using the data to populate facial recognition databases and provide unauthorized access to authorities. That last example is a clear nod to Clearview AI, which has faced a series of enforcement actions by international regulators, including several across the EU, over its use of scraped data to power a facial recognition tool it sold to authorities and other users.

They also warn that scraped data may be used for unauthorized political or intelligence-gathering purposes, including by foreign governments or intelligence agencies, as well as to generate unwanted direct marketing or spam.

The regulators do not directly cite training AI models as one of these “key” privacy risks, but generative AI tools that have been trained on people's data without their knowledge or consent could be repurposed for several of the malicious uses they do cite, including impersonating people for targeted cyber attacks, identity fraud, or monitoring and surveillance.

In addition to making the statement public, the regulators note that a copy was sent directly to YouTube's parent company, Alphabet; ByteDance, parent of TikTok; Meta (owner of Instagram, Facebook and Threads); Microsoft (LinkedIn); Sina Corp (Weibo); and X (aka the platform formerly known as Twitter), so the major global social media platforms are clearly front and center as international watchdogs consider the privacy risks posed by data scraping.

Of course, some platforms have already had major scraping-related data scandals, such as the 2018 Cambridge Analytica data misuse scandal that hit Facebook after a developer on its platform was able to extract data from millions of users without their knowledge or consent as a result of the lax permissions the company applied; or the $275 million General Data Protection Regulation (GDPR) fine imposed on Facebook last year over a scraping incident that affected 530 million users and was attributed to insecure product design. The latter incident is also the subject of a lawsuit by an Irish digital rights group challenging the DPA's conclusion that there was no security breach.

While the regulators' joint statement contains a clear signal that platforms need to be proactive in protecting user information from scraping, there is no correspondingly clear warning that failing to act to protect people's data will lead to enforcement measures, which risks somewhat diluting the impact of the declaration.

Instead, the watchdogs urge platforms to “carefully consider the legality of different types of scraping in the jurisdictions that apply to them and implement measures to protect against unlawful scraping.”

“Techniques for extracting value from publicly accessible data are constantly emerging and evolving. Data security is a dynamic responsibility and vigilance is paramount,” they further write. “Since no safeguard will adequately protect against all potential privacy harms associated with scraping, SMCs and other websites should implement multi-layered technical and procedural controls to mitigate the risks.”

Measures recommended in the letter to limit the risk of user data being scraped include designating internal teams or roles focused on scraping risks; “rate capping” the number of visits per hour or day from a single account to other accounts' profiles, and limiting access if unusual activity is detected; and monitoring how quickly and aggressively a new account starts looking up other users, taking action in response to abnormal activity.

They also suggest that platforms take steps to detect scrapers by identifying patterns in bot activity, such as having systems in place to detect suspicious activity from individual IP addresses.

Taking steps to detect bots, such as deploying CAPTCHAs and blocking IP addresses where scraping activity is identified, is another recommendation, although bots can solve CAPTCHAs, so that advice may already be outdated.
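One crude way to flag the IP-level anomalies the letter describes is to compare each source IP's request volume against the typical per-IP volume in a log window. A minimal sketch, where the function name and the ratio threshold are illustrative assumptions rather than anything from the letter:

```python
from collections import Counter

def suspicious_ips(request_log: list[str], ratio: float = 10.0) -> set[str]:
    """Flag IPs whose request count far exceeds the median per-IP count.

    request_log is a list of source IP addresses, one entry per request.
    """
    counts = Counter(request_log)
    if not counts:
        return set()
    # Median per-IP request volume serves as the "typical" baseline.
    typical = sorted(counts.values())[len(counts) // 2]
    return {ip for ip, n in counts.items() if n >= ratio * max(typical, 1)}
```

Production systems would use richer signals (user agents, request timing, navigation patterns) than raw volume, but a baseline-versus-outlier comparison like this is the usual starting point.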

Other recommended measures are for platforms to take appropriate legal action against scrapers, such as sending ‘cease and desist’ letters; demanding deletion of scraped information; obtaining confirmation of deletion; and taking other legal action to enforce terms and conditions that prohibit scraping.

Platforms may also be required to notify affected individuals and privacy regulators under existing data breach laws, these watchdogs warn.

The social media giants that were sent a copy of the letter are encouraged to respond within a month with comments demonstrating how they will meet the regulators' expectations.

People are asked to "think long term"

The letter also includes some advice for people to take steps to help protect themselves against the risks of scraping, including suggesting that web users pay attention to the platforms' privacy policies; think carefully about what they choose to share online; and make use of any settings that allow them to control the visibility of their posts.

“Ultimately, we encourage people to think long term,” they add. “How would a person feel years later about the information they share today? While SMCs and other websites may offer tools to remove or hide information, that same information can live forever on the web if it has been indexed or scraped and then shared.”

The letter also urges people who are concerned that their data has been “illegally or improperly” extracted to contact the platform or website in question and, if they do not receive a satisfactory response, suggests they file a complaint with the corresponding data protection authority. Regulators are therefore encouraging users to be more vigilant about scraping, which could ultimately lead to an increase in investigations and controls in this area.

The dozen international regulators signing the joint statement all come from markets outside the European Union. But, as noted above, EU data protection regulators are already active on scraping risks through measures taken under the bloc's GDPR.

They are also closely monitoring developments in generative AI services, so the concerns raised in the letter seem broadly aligned with issues already on the radar of the bloc's data protection authorities.

Notably, Italy's privacy watchdog slapped ChatGPT with a local stop-processing order earlier this year, causing a brief suspension of the service while OpenAI scrambled to add disclosures and controls. Google's Bard AI chatbot launched later in the EU than in other regions after its lead EU privacy regulator, in Ireland, raised similar concerns. EU DPAs are meanwhile coordinating on how best to apply local data protection rules to these novel AI chatbots, including on the crucial question of the legality, under the GDPR framework, of the data processing used to train the models. Decisions on the fundamental legality of tools like ChatGPT therefore remain pending in the EU.

Earlier this year, France's DPA, the CNIL, also noted that protection against scraping will be a key element of the AI action plan it announced in May.
