The collision between Big Data and privacy law
Now available at the Social Science Research Network. First published in the Australian Journal of Telecommunications and the Digital Economy.
Abstract
We live in an age where billionaires are self-made on the back of the most intangible of assets – the information they have about us. The digital economy is awash with data. It’s a new and endlessly re-useable raw material, increasingly left behind by ordinary people going about their lives online. Many information businesses proceed on the basis that raw data is up for grabs; if an entrepreneur is clever enough to find a new vein of it, they can feel entitled to tap it in any way they like. However, some tacit assumptions underpinning today’s digital business models are naive. Conventional data protection laws, older than the Internet, limit how Personal Information is allowed to flow. These laws turn out to be surprisingly powerful in the face of ‘Big Data’ and the ‘Internet of Things’. On the other hand, orthodox privacy management was not framed for new Personal Information being synthesised tomorrow from raw data collected today. This paper seeks to bridge a conceptual gap between data analytics and privacy, and sets out extended Privacy Principles to better deal with Big Data.
Extract
Introduction
‘Big Data’ is a broad term capturing the extraction of knowledge and insights from unstructured data. While data processing and analysis is as old as computing, the term ‘Big Data’ has recently attained special meaning, thanks to the vast rivers of raw data that course unseen through the digital economy, and the propensity for entrepreneurs to tap that resource for their own profit, or to build new analytic tools for enterprises. Big Data represents one of the biggest challenges to privacy and data protection society has seen. Never before has so much Personal Information been available so freely to so many.
Big Data promises vast benefits for a great many stakeholders (Michael & Miller 2013: 22-24) but the benefits may be jeopardized by the excesses of a few overly zealous businesses. Some online business models are propelled by a naive assumption that data in the ‘public domain’ is up for grabs. Many think the law has not kept pace with technology, but technologists often underestimate the strength of conventional data protection laws and regulations. In particular, technology neutral privacy principles are largely blind to the methods of collection, and barely distinguish between directly and indirectly collected data. As a consequence, the extraction of Personal Information from raw data constitutes an act of collection and as such is subject to longstanding privacy statutes. Privacy laws such as that of Australia don’t even use the words ‘public’ and ‘private’ to qualify the data flows concerned (Privacy Act 1988).
On the other hand, orthodox privacy policies and static data usage agreements do not cater for the way Personal Information can be synthesised tomorrow from raw data collected today. Privacy management must evolve to become more dynamic, instead of being preoccupied with unwieldy policy documents and simplistic technical notices about cookies.
Thus the fit between Big Data and data privacy standards is complex and sometimes surprising. While existing laws are not to be underestimated, there is a need for data privacy principles to be extended, to help individuals remain abreast of what’s being done with information about them, and to foster transparency regarding the new ways for personal information to be generated.
Conclusion: Making Big Data privacy real
A Big Data dashboard like the one described [in this paper] could serve several parallel purposes in aid of progressive privacy principles. It could reveal dynamically to users what PII can be collected about them through Big Data; it could engage users in a fair and transparent exchange of value-for-PII transaction; and it could enable dynamic consent where users are able to opt in to Big Data processes, and opt out and in again, over time, as their understanding of the PII bargain evolves.
Big Data holds big promises, for the benefit of many. There are grand plans for population-wide electronic health records, new personalised financial services that leverage massive retail databases, and electricity grid management systems that draw on real-time consumption data from smart meters in homes, to extend the life of aging ‘poles and wires’ while reducing greenhouse gas emissions. The value to individuals and operators alike of these programs is amplified as computing power grows, new algorithms are researched, and more and more data sets are joined together. Likewise, the privacy risks are compounded. The potential value of Personal Information in the modern Big Data landscape cannot be represented in a static business model, and neither can the privacy pros and cons be captured in a fixed policy document. New user interfaces and visualisations like a ‘Big Data dashboard’ are needed to bring dynamic extensions to traditional privacy principles, and help people appreciate and intelligently negotiate the insights that can be extracted about them from the raw material that is data.