235 million TikTok, YouTube and Instagram accounts compromised in web scraping blunder

Security researchers have discovered a publicly accessible database that contains sensitive information from 235 million users of Instagram, TikTok and YouTube.

The Comparitech report claims that the database – which contains a wealth of information, including users’ names and contact information – belongs to Deep Social, an organisation that gathers personal data from social media accounts and sells it to marketers.

When the researchers tried to contact the organisation, they learned that it was no longer in business and were directed instead to a Hong Kong-based company called Social Data.

Although Social Data denied having any connection to Deep Social, it did acknowledge the breach and was able to secure the exposed database with a password.

Social media scraping

The database was the result of extensive web scraping – an automated technique that gathers and sorts information from websites.

Organisations might perform web scraping to search for financial data to conduct market research, or to collect site data before a website migration.
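To illustrate what "gathers and sorts" means in practice, here is a minimal sketch of a scraper for the market research case. It is not how Deep Social or any other organisation collected its data. The page content, the `price` class name and the `scrape_prices` helper are all hypothetical, and the sketch parses a hard-coded snippet rather than downloading a live page.

```python
from html.parser import HTMLParser


class PriceScraper(HTMLParser):
    """Collects the text of every element whose class attribute is 'price'."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        if dict(attrs).get("class") == "price":
            self._in_price = True

    def handle_endtag(self, tag):
        # Any closing tag ends the current price element in this simple sketch
        self._in_price = False

    def handle_data(self, data):
        if self._in_price and data.strip():
            self.prices.append(data.strip())


# Hypothetical page content; a real scraper would fetch this over HTTP
SAMPLE_PAGE = """
<ul>
  <li><span class="name">Widget</span> <span class="price">4.99</span></li>
  <li><span class="name">Gadget</span> <span class="price">12.50</span></li>
</ul>
"""


def scrape_prices(html):
    """Gather every price on the page, then sort them in ascending order."""
    parser = PriceScraper()
    parser.feed(html)
    parser.close()
    return sorted(float(p) for p in parser.prices)


print(scrape_prices(SAMPLE_PAGE))
```

The same pattern, pointed at profile pages instead of price lists and repeated across millions of URLs, is what turns individually visible details into a single searchable database.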

It is legal to web-scrape personal data, but many social media sites prohibit the practice, due in part to the extensive amounts of information they collect.

Indeed, Facebook and Instagram banned Deep Social from their marketing APIs in 2018, and threatened legal action if the organisation continued to scrape data from their users’ profiles.

Deep Social later announced it would be reducing its operations and has since shut down completely.


It’s unclear how Social Data had access to the database, given that it denied any relationship with the now-defunct Deep Social.

A spokesperson for the organisation defended the practice of web scraping, telling Comparitech, “all of the data is available freely to ANYONE with Internet access. I would appreciate it if you could ensure that this is made clear.”

They added: “Anyone could phish or contact any person that indicates telephone and email on his social network profile description in the same way even without the existence of the database.”

This may be technically true, but we’d argue that it doesn’t help the organisation’s cause. After all, no good defence against accusations of privacy violations should begin by comparing what you do to cyber crime.

And although the information is publicly accessible, users wouldn’t expect it to be collated and used for marketing purposes – and even if they did, the organisation has a legal obligation to keep the data secure.

That means preventing it from being exposed or misused as well as from being hacked, something that Social Data’s spokesperson appears to have overlooked.

“Please, note that the negative connotation that the data has been hacked implies that the information was obtained surreptitiously. This is simply not true,” they wrote.

However, what happened here – with the organisation leaving the information available online – is just as damaging, because bad actors are constantly looking for misconfigurations that can give them access to data without the need to launch an attack.

Comparitech says it doesn’t know how long the data was exposed before it discovered the database on 1 August, but its research has found that criminals can find and attack unsecured databases within hours of their being exposed.
