Anonymising data is ‘not enough to protect privacy’ as a person’s identity can easily be pieced together from bits of information, study warns
- Under GDPR rules, organisations can only sell personal data by ‘anonymising’ it
- This means stripping it of identifiable details, such as name and email address
- However, machine-learning could be used to reverse this by third-party buyers
Data privacy laws requiring the anonymisation of a person’s data are failing to prevent identification, as the wealth of information snippets available can be assembled like a jigsaw to uncover someone’s true identity.
These little data nuggets allow a wider picture to be pieced together from information such as postcode, gender and date of birth, and can often reveal a person’s name.
A study has warned that despite heightened privacy laws in the wake of GDPR, rolled out following the Cambridge Analytica scandal last year, people are still exposed.
Companies now often sell anonymised data to third parties for a variety of uses, including for analytics and reviewing audience participation.
Risk: Although companies such as Facebook are forced to strip personal information from any data they share, researchers from Imperial College London showed machine-learning could be used to reverse this process by third-party buyers – even with incomplete datasets (stock)
That is done by stripping the data of identifying characteristics like names and email addresses, so that individuals cannot, in theory, be identified.
After this process, the data is no longer subject to data protection regulations, so it can be freely used and sold.
But researchers from Imperial College London and the University of Louvain in Belgium showed machine-learning could be used to reverse this process.
They created an online computer tool that could correctly ‘re-identify’ 99.98 per cent of Americans in any available ‘anonymised’ dataset by using just 15 characteristics, including age, gender, and marital status.
Study first author Dr Luc Rocher, of UC Louvain, said: ‘While there might be a lot of people who are in their thirties, male, and living in New York City, far fewer of them were also born on January 5, are driving a red sports car, and live with two kids – both girls – and one dog.’
The tool first asked a user to put in the first part of their postcode, gender, and date of birth and estimated the probability they could be identified from an ‘anonymous’ dataset.
The estimate dramatically increased as the user gave more personal details such as marital status, number of vehicles, house ownership status and employment status.
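The jigsaw effect described above can be sketched in a few lines of Python. This is not the researchers’ actual model (the study used a more sophisticated statistical approach), and the dataset below is entirely synthetic with made-up attribute names; it simply illustrates how the fraction of people who are uniquely identifiable climbs as more quasi-identifiers are combined.

```python
import random

random.seed(0)

# Hypothetical synthetic "anonymised" dataset: no names or emails,
# just demographic attributes like those mentioned in the article.
ATTRIBUTES = ["postcode_prefix", "gender", "birth_year", "marital_status", "num_vehicles"]

def random_record():
    return {
        "postcode_prefix": random.choice(["SW7", "E1", "N1", "W2"]),
        "gender": random.choice(["M", "F"]),
        "birth_year": random.randint(1950, 2000),
        "marital_status": random.choice(["single", "married", "divorced"]),
        "num_vehicles": random.randint(0, 3),
    }

dataset = [random_record() for _ in range(5000)]

def unique_fraction(records, attrs):
    """Fraction of records whose combination of `attrs` values is unique,
    i.e. individuals an attacker could single out exactly."""
    counts = {}
    for r in records:
        key = tuple(r[a] for a in attrs)
        counts[key] = counts.get(key, 0) + 1
    unique = sum(1 for r in records
                 if counts[tuple(r[a] for a in attrs)] == 1)
    return unique / len(records)

# Uniqueness climbs as more attributes are combined -- the "jigsaw" effect.
for k in range(1, len(ATTRIBUTES) + 1):
    frac = unique_fraction(dataset, ATTRIBUTES[:k])
    print(f"{k} attribute(s): {frac:.1%} of records are unique")
```

With one attribute almost nobody is unique; with all five, a large share of the synthetic population can be singled out, mirroring how the online tool’s probability estimate rose as users supplied more details.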
Not enough? The EU General Data Protection Regulation (GDPR) was rolled out in the wake of the Facebook and Cambridge Analytica scandal, early last year, but may be insufficient (stock)
Study senior author Dr Yves-Alexandre de Montjoye, of Imperial’s Department of Computing, said: ‘This is pretty standard information for companies to ask for.
‘Although they are bound by GDPR guidelines, they’re free to sell the data to anyone once it’s anonymised. Our research shows just how easily – and how accurately – individuals can be traced once this happens.’
‘Companies and governments have downplayed the risk of re-identification by arguing that the datasets they sell are always incomplete.
‘Our findings contradict this and demonstrate that an attacker could easily and accurately estimate the likelihood that the record they found belongs to the person they are looking for.’
The findings were published in the journal Nature Communications.