The European Court of Justice recently ruled on the so-called "Right to be Forgotten" granting members of the public limited rights to request that search engines like Google suppress links to Personal Information under some circumstances. The decision has been roundly criticised by technologists and by American libertarians -- acting out the now familiar ritualised privacy arguments around human rights, freedom of speech, free market forces and freedom to innovate (and hence the bad pun in the title of this article). Surprisingly even some privacy advocates like Jules Polonetsky (quoted in The New Yorker) has a problem with the ECJ judgement because he seems to think it's extremist.
Of the various objections, the one I want to answer here is that search engines should not have to censor "facts" retrieved from the "public domain".
On September 30, I am participating in a live panel discussion of the Right To Be Forgotten, hosted by the IEEE; you can register here and download a video recording of the session later.
Update: recording now available here.
In an address on August 18, the European Union's Justice Commissioner Martine Reicherts made the following points about the Right to be Forgotten (RTBF):
- "[The European Court of Justice] said that individuals have the right to ask companies operating search engines to remove links with personal information about them -- under certain conditions. This applies when information is inaccurate, for example, or inadequate, irrelevant, outdated or excessive for the purposes of data processing. The Court explicitly ruled that the right to be forgotten is not absolute, but that it will always need to be balanced against other fundamental rights, such as the freedom of expression and the freedom of the media -- which, by the way, are not absolute rights either".
In the current (September 29, 2014) issue of New Yorker, senior legal analyst Jeffrey Toobin looks at RTBF in the article "The Solace of Oblivion". It's a balanced review of a complex issue, which acknowledges the transatlantic polarization of privacy rights and freedom of speech.
Toobin interviewed Kent Walker, Google's general counsel. Walker said Google likes to think of itself as a "card catalogue": "We don't create the information. We make it accessible. A decision like [the ECJ's], which makes us decide what goes inside the card catalogue, forces us into a role we don't want."
But there's a great deal more to search than Walker lets on.
Google certainly does create fresh Personal Information, and in stupendous quantities. Their search engine is the bedrock of a hundred billion dollar business, founded on a mission to "organize the world's information". Google search is an incredible machine, the result of one of the world's biggest ever and ongoing software R&D projects. Few of us now can imagine life without Internet search and instant access to limitless information that would otherwise be utterly invisible. Search really is magic – just as Arthur C. Clarke said any sufficiently advanced technology would be.
On its face therefore, no search result is a passive reproduction of data from a "public domain". Google makes the public domain public.
But while search is free, it is hyper profitable, for the whole point of it is to underpin a gigantic advertising business. The search engine might not create the raw facts and figures in response to our queries, but it covertly creates and collects symbiotic metadata, complicating the picture. Google monitors our search histories, interests, reactions and habits, as well as details of the devices we're using, when and where and even how we are using them, all in order to divine our deep predilections. These insights are then provided in various ways to Google's paying customers (advertisers) and are also fed back into the search engine, to continuously tune it. The things we see courtesy of Google are shaped not only by their page ranking metrics but also by the company's knowledge of our preferences (which it forms by watching us across the whole portfolio of search, Gmail, maps, YouTube, and the Google+ social network). When we search for something, Google tries to predict what we really want to know.
In the modern vernacular, Google hacks the public domain.
The collection and monetization of personal metadata is inextricably linked to the machinery of search. The information Google serves up to us is shaped and transformed to such an extent, in the service of Google's business objectives, that it should be regarded as synthetic and therefore the responsibility of the company. Their search algorithms are famously secret, putting them beyond peer review; nevertheless, there is a whole body of academic work now on the subtle and untoward influences that Google exerts as it filters and shapes the version of reality it thinks we need to see.
Some objections to the RTBF ruling see it as censorship, or meddling with the "truth". But what exactly is the state of the truth that Google purportedly serves up? Search results are influenced by many arbitrary factors of Google's choosing; we don't know what those factors are, but they are dictated by Google's business interests. So in principle, why is an individual's interests in having some influence over search results any less worthy than Google's? The "right to be forgotten" is an unfortunate misnomer: it is really more of a 'limited right to have search results filtered differently'.
If Google's machinery reveals Personal Information that was hitherto impossible to find, then why shouldn't it at least participate in protecting the interests of the people affected? I don't deny that modern technology and hyper-connectivity creates new challenges for the law, and that traditional notions of privacy may be shifting. But it's not a step-change, and in the meantime, we need to tread carefully. There are as many unintended consequences and problems in the new technology as there are in the established laws. The powerful owners of benefactors of these technologies should accept some responsibility for the privacy impacts. With its talents and resources, Google could rise to the challenge of better managing privacy, instead of pleading that it's not their problem.
Another week, another security collaboration launch!
"Simply Secure" calls itself “a small but growing organization [with] expertise in usability research, design, software development, and product management". Their mission has to do with improving the security functions that built-in so badly in most software today. Simply Secure is backed by Google and Dropbox, and supported by a diverse advisory board.
It's early days (actually early day, singular) so it might be churlish to point out that Simply Secure's strategic messaging is a little uneven ... except that the words being used to describe it shed light on the clarity of the thinking.
My first exposure to Simply Secure came last night, when I read an article in the Guardian by Cory Doctorow (who is one of their advisers). Doctorow places enormous emphasis on privacy; the word “privacy" outnumbers “security" 16 to three in the body of his column. Another admittedly shorter report about the launch by The Next Web doesn't mention privacy at all. And then there's the Simply Secure blog post, which cites privacy a great deal but every single time in conjunction with security, as in “security and privacy". That repeated phrasing conveys, to me at least, some discomfort. As I say, it's early days and the team is doubtless sorting out how to weigh and progress these closely related objectives.
But I hope they do it quickly. On the face of it, Simply Secure might only scratch the surface of privacy.
Doctorow's Guardian article is mostly concerned with encryption and the terrible implementations that have plagued us since the dawn of the Internet. It's definitely important that we improve here – and radically. If the Simply Secure initiative does nothing but make encryption easier to integrate into commodity software, that would be a great thing. I'm all for it. But it won't necessarily or even probably lead to better privacy, because privacy is about restraint not secrecy or anonymity.
As we go about our lives, we actually want to be known by others, but we want those who know us to be restrained in what they do with the knowledge they have about us. Privacy is the protection you need when your affairs are not secret.
I know Doctorow knows this – I've seen his terrific little speech on the steps on Comic-Con about PRISM. So I'm confused by his focus on cryptography.
How far does encryption get us? If we're using social networks, or if we're shopping and opting in to loyalty programs or selected targeted marketing, or if we're sharing our medical records with relatives, medicos, hospitals and researchers, then encryption becomes moot. We need mechanisms to restrain what the receivers of our personal information do with it. We all know the business model at work behind “free" online services; using encryption to protect privacy in social networking for instance would be like using an armoured van to deliver your valuables to Bernie Madoff.
Another limitation of user-centric or user-managed encryption has to do with Big Data. A great deal of personal information about us is created and collected unseen behind our backs, by sensors, and by analytics processes than manage to work out who we are by linking disparate data streams together. How could SS ameliorate those sorts of problems? If the SS vision includes encryption at rest as well as in transit, then how will the user control or even see all the secondary uses of their encrypted personal information?
There's a combativeness in Doctorow's explanation of Simply Secure and his tweets from yesterday on the topic. His aim is expressly to thwart the surveillance state, which in his view includes a symbiosis (if not conspiracy) between government and internet companies, where the former gets their dirty work done by the latter. I'm sure he and I both find that abhorrent in equal measure. But I argue the proper response to these egregious behaviours is political not technological (and political in the broad sense; I love that Snowden talks as much about accountability, legal processes, transparency and research as he does about encryption). If you think the government is exploiting the exploiters, then DIY encryption is a pretty narrow counter-measure. This is not the sort of society we want to live in, so let's work to change the establishment, rather than try to take it on in a crypto shoot-out.
Yes security technology is important but it's not nearly as important for privacy as the Rule of Law. Data privacy regimes instil restraint. The majority of businesses come to know that they are not at liberty to over-collect personal information, nor to re-use personal information unexpectedly and without consent. A minority of organisations flout data privacy principles, for example by slyly refining raw data into valuable personal knowledge, exploiting the trust citizens and users put in them. Some of these outfits flourish in the United States – the Canary Islands of privacy. Worldwide, the policing of privacy is patchy indeed, yet there have been spectacular legal victories in Europe and elsewhere against the excessive practices of really big companies like Facebook with their biometric data mining of photo albums, and Google's drift net-like harvesting of traffic from unencrypted Wi-Fi networks.
Pragmatically, I'm afraid encryption is such a fragile privacy measure. Once secrecy is penetrated, we need regulations to stem exploitation of our personal information.
By all means, let's improve cryptographic engineering and I wish the Simply Secure initiative all the best. So long as they don't call security privacy.
I have a new academic paper due to be published in October, in the Australian Journal of Telecommunications and the Digital Economy. Here is an extract.
Update: see Telecommunications Society members page.
The collision between Big Data and privacy law
We live in an age where billionaires are self-made on the back of the most intangible of assets – the information they have about us. The digital economy is awash with data. It's a new and endlessly re-useable raw material, increasingly left behind by ordinary people going about their lives online. Many information businesses proceed on the basis that raw data is up for grabs; if an entrepreneur is clever enough to find a new vein of it, they can feel entitled to tap it in any way they like. However, some tacit assumptions underpinning today's digital business models are naive. Conventional data protection laws, older than the Internet, limit how Personal Information is allowed to flow. These laws turn out to be surprisingly powerful in the face of 'Big Data' and the 'Internet of Things'. On the other hand, orthodox privacy management was not framed for new Personal Information being synthesised tomorrow from raw data collected today. This paper seeks to bridge a conceptual gap between data analytics and privacy, and sets out extended Privacy Principles to better deal with Big Data.
'Big Data' is a broad term capturing the extraction of knowledge and insights from unstructured data. While data processing and analysis is as old as computing, the term 'Big Data' has recently attained special meaning, thanks to the vast rivers of raw data that course unseen through the digital economy, and the propensity for entrepreneurs to tap that resource for their own profit, or to build new analytic tools for enterprises. Big Data represents one of the biggest challenges to privacy and data protection society has seen. Never before has so much Personal Information been available so freely to so many.
Big Data promises vast benefits for a great many stakeholders (Michael & Miller 2013: 22-24) but the benefits may be jeopardized by the excesses of a few overly zealous businesses. Some online business models are propelled by a naive assumption that data in the 'public domain' is up for grabs. Many think the law has not kept pace with technology, but technologists often underestimate the strength of conventional data protection laws and regulations. In particular, technology neutral privacy principles are largely blind to the methods of collection, and barely distinguish between directly and indirectly collected data. As a consequence, the extraction of Personal Information from raw data constitutes an act of collection and as such is subject to longstanding privacy statutes. Privacy laws such as that of Australia don't even use the words 'public' and 'private' to qualify the data flows concerned (Privacy Act 1988).
On the other hand, orthodox privacy policies and static data usage agreements do not cater for the way Personal Information can be synthesised tomorrow from raw data collected today. Privacy management must evolve to become more dynamic, instead of being preoccupied with unwieldy policy documents and simplistic technical notices about cookies.
Thus the fit between Big Data and data privacy standards is complex and sometimes surprising. While existing laws are not to be underestimated, there is a need for data privacy principles to be extended, to help individuals remain abreast of what's being done with information about them, and to foster transparency regarding the new ways for personal information to be generated.
Conclusion: Making Big Data privacy real
A Big Data dashboard like the one described could serve several parallel purposes in aid of progressive privacy principles. It could reveal dynamically to users what PII can be collected about them through Big Data; it could engage users in a fair and transparent exchange of value-for-PII transaction; and it could enable dynamic consent where users are able to opt in to Big Data processes, and opt out and in again, over time, as their understanding of the PII bargain evolves.
Big Data holds big promises, for the benefit of many. There are grand plans for population-wide electronic health records, new personalised financial services that leverage massive retail databases, and electricity grid management systems that draw on real-time consumption data from smart meters in homes, to extend the life of aging 'poles and wires' while reducing greenhouse gas emissions. The value to individuals and operators alike of these programs is amplified as computing power grows, new algorithms are researched, and more and more data sets are joined together. Likewise, the privacy risks are compounded. The potential value of Personal Information in the modern Big Data landscape cannot be represented in a static business model, and neither can the privacy pros and cons be captured in a fixed policy document. New user interfaces and visualisations like a 'Big Data dashboard' are needed to bring dynamic extensions to traditional privacy principles, and help people appreciate and intelligently negotiate the insights that can be extracted about them from the raw material that is data.
A Social Media Week Sydney event #SMWSydney
Law Lounge, Sydney University Law School
New Law School Building
Eastern Ave, Camperdown
Fri, Sep 26 - 10:00 AM - 11:30 AM
How can you navigate privacy fact and fiction, without the geeks and lawyers boring each other to death?
It's often said that technology has outpaced privacy law. Many digital businesses seem empowered by this brash belief. And so they proceed with apparent impunity to collect and monetise as much Personal Information as they can get their hands on.
But it's a myth!
Some of the biggest corporations in the world, including Google and Facebook, have been forcefully brought to book by privacy regulations. So, we have to ask ourselves:
- what does privacy law really mean for social media in Australia?
- is privacy "good for business"?
- is privacy "not a technology issue"?
- how can digital businesses navigate fact & fiction, without their geeks and lawyers boring each other to death?
In this Social Media Week Master Class I will:
- unpack what's "creepy" about certain online practices
- show how to rate data privacy issues objectively
- analyse classic misadventures with geolocation, facial recognition, and predicting when shoppers are pregnant
- critique photo tagging and crowd-sourced surveillance
- explain why Snapchat is worth more than three billion dollars
- analyse the regulatory implications of Big Data, Biometrics, Wearables and The Internet of Things.
We couldn't have timed this Master Class better, coming two weeks after the announcement of the Apple Watch, which will figure prominently in the class!
So please come along, for a fun and in-depth a look at social media, digital technology, the law, and decency.
About the presenter
Steve Wilson is a technologist, who stumbled into privacy 12 years ago. He rejected those well meaning slogans (like "Privacy Is Good For Business!") and instead dug into the relationships between information technology and information privacy. Now he researches and develops design patterns to help sort out privacy, alongside all the other competing requirements of security, cost, usability and revenue. His latest publications include:
- "Big Privacy: The new standard for Big Data Privacy" from Constellation Research, and
- "The collision between Big Data and privacy law" due out in October in the Australian Journal of Telecommunications and the Digital Economy.
You can be forgiven if the FIDO Alliance is not on your radar screen. It was launched barely 18 months ago, to help solve the "password crisis" online, but it's already proven to be one of most influential security bodies yet.
The typical Internet user has dozens of accounts and passwords. Not only are they a pain in the arse, poor password practices are increasingly implicated in fraud and terrible misadventures like the recent "iCloud Hack" which exposed celebrities' personal details.
With so many of our assets, our business and our daily lives happening in cyberspace, we desperately need better ways to prove who we are online – and even more importantly, prove what we entitled to do there.
The FIDO Alliance is a new consortium of identity management vendors, product companies and service providers working on strong authentication standards. FIDO’s vision is to tap the powers of smart devices – smart phones today and wearables tomorrow – to log users on to online services more securely and more conveniently.
FIDO was founded by Lenovo, PayPal, and security technology companies AGNITiO, Nok Nok Labs and Validity Sensors, and launched in February 2013. Since then the Alliance has grown to over 130 members. Two new authentication standards have been published for peer review, half a dozen companies showcased FIDO-Ready solutions at the 2014 Consumer Electronic Show (CES) in Las Vegas, and PayPal has released its ground-breaking pay-by-fingerprint app for the Samsung Galaxy S5.
The FIDO Alliance includes technology heavyweights like Google, Lenovo, Microsoft and Samsung; payments giants Discover, MasterCard, PayPal and Visa; financial services companies such as Aetna, Bank of America and Goldman Sachs; and e-commerce players like Netflix and Salesforce.com. There are also a couple of dozen biometrics vendors, many leading Identity and Access Management (IDAM) solutions and services, and almost every cell phone SIM and smartcard supplier in the world.
I have been watching FIDO since its inception and reporting on it for Constellation Research. The third update in my series of research reports on FIDO is now available and can be downloaded here. The report looks in depth at what the Alliance has to offer vendors and end user communities, its critical success factors, and how and why this body is poised to shake up authentication like never before.
Update 22 September 2014
Last week, Apple suddenly went from silent to expansive on privacy, and the thrust of my blog straight after the Apple Watch announcement is now wrong. Apple posted a letter from CEO Tim Cook at www.apple.com/privacy along with a document that sets outs how "We’ve built privacy into the things you use every day".
The paper is very interesting. It's a sophisticated and balanced account of policy, business strategy and technology elements that go to create privacy. Apple highlights that they:
- forswear the exploitation of customer data
- do not scan content or messages
- do not let their small "iAd" business take data from other Apple departments
- require certain privacy protective practices on the part of their health app developers.
They have also provided quite decent information about how Siri and health data is handled.
Apple's stated privacy posture is all about respect and self-restraint. Setting out these principles and commitments is a very welcome development indeed. I congratulate them.
Today Apple launched their much anticipated wrist watch, described by CEO Tim Cook as "the most personal device they have ever developed". He got that right!
Rather more than a watch, it's a sort of guardian angel. The Apple Watch has Siri built-in, along with new haptic sensors and buzzers, a heartbeat monitor, accelerometer, and naturally the GPS and Wi-Fi geolocation capability to track your speed and position throughout the day. So they say "Apple Watch is an all-day fitness tracker and a highly advanced sports watch in a single device".
The Apple Watch will be a paragon of digital disruption. To understand and master disruption today requires the coordination of mobility, Big Data, the cloud and user interfaces. These cannot be treated as isolated technologies, so when a company like Apple controls them all, at scale, real transformation follows.
Thus Apple is one of the few businesses that can make promises like this: "Over time, Apple Watch gets to know you the way a good personal trainer would". In this we hear echoes of the smarts that power Siri, and we are reminded that amid the novel intimacy we have with these devices, many serious privacy problems have yet to be resolved.
The Apple Event today was a play in four acts:
Act I: the iPhone 6 release;
Act II: Apple Pay launch;
Act III: the Apple Watch announcement;
Act IV: U2 played live and released their new album free on iTunes!
It was fascinating to watch the thematic differences across these stanzas. With Apple Pay, they stressed security and privacy; we were told about the Secure Element, the way card numbers are replaced by random numbers (tokenization), and an architecture where Apple cannot see how much you spend nor where you spend it. On the other hand, when it came to the Apple Watch and its integrated health sensors, privacy wasn't mentioned, not at all. We are left to deduce that aggregating personal health data at Apple's servers is a part of a broader plan.
With Siri, Apple sadly fails all these tests.See Update 22 September 2014 above.
It's been left to journalists to try and find out what Apple does with the information it mines from Siri. Wired magazine discovered eventually that Apple retains masked Siri voice recordings for six months; it then purportedly de-identifies them and keeps them for a further 18 months, for research. Yet even these explanations don't touch on the extracted contents of the communications, nor the metadata, like the trends and correlations that go to Siri's learning. If the purpose of Siri is ostensibly to automate the operation of the iPhone and its apps, then Apple should be refrain from using the by-products of Siri's voice processing for anything else.
But we just don't know what they do, and Apple imposes no self-restraint.See Update 22 September 2014 above.
We should hope for radically greater transparency with the Apple Watch and its health apps. Most of the watch's data processing and analytics will be carried out in the cloud. So Apple will come to hold detailed records of its users' exercise regimes, their performance figures, trend data and correlations. These are health records. Inevitably, health applications will take in other medical data, like food diaries entered by users, statistics imported from other databases, and detailed measurements from Internet-connected scales, blood pressure monitors and even medical devices. Apple will see what we're doing to improve our health, day by day, year on year. They will come to know more about what's making us healthy and what's not than we do ourselves.
Now, the potential benefits from this sort of personal technology to self-managed care and preventative medicine are enormous. But so are the data management and privacy obligations.
Within the US, Apple will doubtless be taking steps to avoid falling under the stringent HIPAA regulations, yet in the rest of the world, a more subtle but far-reaching problem looms. Many broad based data privacy regimes forbid the collection of health information without consent. And the laws of the European Union, Australia, New Zealand and elsewhere are generally technology neutral. This means that data collected directly from patients or doctors, and fresh data collected by way of automated algorithms are treated essentially the same way. So when a sophisticated health management app running in the cloud somewhere mines all that exercise and lifestyle data, and starts to make inferences about health and wellbeing, great care needs to be taken that the indiviuals concerned know what's going on in advance, and have given their informed consent.
It ought to be possible to expressly opt in to Big Data processes when you can understand the pros and cons and the net benefits, and to later opt out, and opt back in again, as the benefit equation shifts over time. But even visualising the products of Big Data is hard; I believe graphical user interfaces (GUIs) to allow people to comprehend and actively control the process will be one of the great software design problems of our age.
Apple are obviously preeminent in GUI and user experience innovation. You would think if anyone can create the novel yet intuitive interfaces desperately needed to control Big Data PII, Apple can. But first they will have to embrace their responsibilities for the increasingly intimate details they are helping themselves to. If the Apple Watch is "the most personal device they've ever designed" then let's see privacy and data protection commitments to match.
Constellation's Connected Enterprise (CCE) is an immersive innovation summit for senior business leaders. The theme of this year’s Connected Enterprise is Dominate Digital Disruption. There will be over 200 other early adopters at CCE, at Half Moon Bay outside San Francisco, to discover and share how digital businesses can realise their brand promises, transform their business models, increase revenues, reduce costs, and improve compliance.
CCE is a three day executive retreat, comprising more or less continuous keynotes from visionaries and interactive best practices panels. There are deep one-on-one interviews with market makers, and numerous new enterprise technology demos. And there's the the Constellation SuperNova Awards Gala Dinner.
Register before September 30 to take advantage of early bird pricing. Use code BBLG14 for VIP privileges throughout the event.
See you there!
Posted in Constellation Research
The problem of identity takeover
The root cause of much identity theft and fraud today is the sad fact that customer reference numbers and personal identifiers are so easy to copy. Simple numerical data like bank account numbers and health IDs can be stolen from many different sources, and replayed in bogus trans-actions.
Our personal data nowadays is leaking more or less constantly, through breached databases, websites, online forms, call centres and so on, to such an extent that customer reference numbers on their own are no longer reliable. Privacy consequentially suffers because customers are required to assert their identity through circumstantial evidence, like name and address, birth date, mother’s maiden name and other pseudo secrets. All this data in turn is liable to be stolen and used against us, leading to spiralling identity fraud.
To restore the reliability of personal identifiers, we need to know their pedigree. We need to know that a presented number is genuine, that it originated from a trusted authority, it’s been stored safely by its owner, and it’s been presented with the owner’s consent.
A practical response to ID theft
Several recent breaches of government registers leave citizens vulnerable to ID theft. In Korea, the national identity card system was attacked and it seems that all Korean's citizen IDs will have to be re-issued. In the US, Social Security Numbers are often stolen and used tin fraudulent identifications; recently, SSNs of 800,000 Post Office employees appear to have been stolen along with other personal records.
We could protect people against having their stolen identifiers used behind their backs. It shouldn't be necessary to re-issue every Korean's ID. And changes could be made to improve the relibaility of identification data, without dramatically changing the backend processes. That is, if a Relying Party has always used SSN fpor instance as part of its identification regime, they could continue to do so, if only the actual Social Security Numbers being received were reliable!
The trick is to be able to tell "original" ID numbers from "copies". But what does "original" even mean in the digital world? A more precise term for what we really want is pedigree. What we need is to be able to present numerical data in such a way that the receiver may be sure of its pedigree; that is, know that the data were originally issued by an authoritative body, that the data has been kept safe, and that each presentation of the data has occured under the owner's control.
These objectives can be met with the help of smart cryptographic technologies which today are built into most smart phones and smartcards, and which are finally being properly exploited by initiatives like the FIDO Alliance.
"Notarising" personal data in chip devices
There are ways of issuing personal data to a smart chip device that prevent those data from being stolen, copied and claimed by anyone else. One way to do so is to encapsulate and notarise personal data in a unique digital certificate issued to a chip. Today, a great many personal devices routinely embody cryptographically suitable chips for this purpose, including smart phones, SIM cards, “Secure Elements”, smartcards and many wearable computers.
Consider an individual named Smith to whom Organisation A has issued a unique customer reference number N. If N is saved in ordinary computer memory or something like a magnetic stripe card, then it has no pedigree. Once the number N is presented by the cardholder in a transaction, it looks like any other number. To better safeguard N in a chip device, it can be sealed into a digital certificate, as follows:
1. generate a fresh private-public key pair inside Smith’s chip
2. export the public key
3. create a digital certificate around the public key, with an attribute corresponding to N
4. have the certificate signed by (or on behalf of) organisation A.
The result of coordinating these processes and technologies is a logical triangle that inextricably binds cardholder Smith to their reference number N and to a specific personally controlled device. The certificate signed by organisation A attests to Smith’s ownership of both N and a particular key unique to the device. Keys generated inside the chip are retained internally, never divulged to outsiders. It is impossible to copy the private key to another device, so the triangle cannot be cloned, reproduced or counterfeited.
Note that this technique lies at the core of the EMV "Chip-and-PIN" system where the smart payment card digitally signs cardholder and transaction data, rendering it immune to replay, before sending it to the merchant terminal. See also my 2012 paper Calling for a uniform approach to card fraud, offline and on. Now we should generalise notarised personal data and digitally signed transactions beyond Card-Present payments into as much online business as possible.
Restoring privacy and consumer control
When Smith wants to present their personal number in an electronic transaction, instead of simply copying N out of memory (at which point it would lose its pedigree), Smith’s transaction software digitally signs the transaction using the certificate containing N. With standard security software, any third party can then verify that the transaction originated from a genuine chip holding the unique key certified by A as matching the number N.
Note that N doesn’t have to be a customer number or numeric identifier; it could be any personal data, such as a biometric template or a package of medical information like an allergy alert.
The capability to manage multiple key pairs and certificates, and to sign transactions with a nominated private key, is increasingly built into smart devices today. By narrowing down what you need to know about someone to a precise customer reference number or similar personal data item, we will reduce identity theft and fraud while radically improving privacy. This sort of privacy enhancing technology is the key to a safe Internet of Things, and fortunately now is widely available.
Addressing ID theft
Perhaps the best thing governments could do immediately is to adopt smartcards and equivalent smart phone apps for holding and presenting ID numbers. The US government has actually come close to such a plan many times. Chip-based Social Security Cards and Medicare Cards have been proppsed before, without relaising their full potential. For these devices would best be used as above to hold a citizen's identifiers and present them cryptographically, without vulnerability to ID theft and takeover. We wouldn't have to re-issue compormised SSNs; we would instead switch from manual presentation of these numbers to automatic online presentation, with a chip card or smart phone app conveying the data through digitally signatures.
Updated from original post January 2013.
I have come to believe that a systemic conceptual shortfall affects typical technologists’ thinking about privacy. It may be that engineers tend to take literally the well-meaning slogan that “privacy is not a technology issue”. And I say this in all seriousness. We are forever sugar coating privacy, urging that "privacy is good for business". It's naive. There are plenty of extremes where - sadly - some businesses do very well ignoring privacy. In the mainstream, many organization struggle to resolve privacy with other competing demands, like security, usability, cost and time to market.
I believe the best thing we can do for privacy systemically is to treat it like another one of the many often conflicting requirements faced by designers and engineers, and improve the tools they have to resolve the right balance. This is what engineers do.
Online, we’re talking about data privacy, or data protection, but systems designers bring to work a spectrum of personal outlooks about privacy in the human sphere. Yet what matters is the precise wording of data privacy law, like Australia’s Privacy Act. To illustrate the difference, here’s the sort of experience I’ve had time and time again.
During the course of conducting a PIA in 2011, I spent time with the development team working on a new government database. These were good, senior people, with sophisticated understanding of information architecture, and they’d received in-house privacy training. But they harboured restrictive views about privacy. An important clue was the way they habitually referred to “private” information rather than Personal Information (or equivalently, Personally Identifiable Information, PII). After explaining that Personal Information is the operable term in Australian legislation, and reviewing its definition as essentially any information about an identifiable person, we found that the team had not appreciated the extent of the PII in their system. They had overlooked that most of their audit logs collect PII, albeit indirectly and automatically, and that information about clients in their register provided by third parties was also PII (despite it being intuitively ‘less private’ by virtue of originating from others).
I attributed these blind spots to the developers’ loose framing of “private” information. Online and in privacy law alike, things are very crisp. The definition of PII as any data relating to an individual whose identity is readily apparent sets a low bar, embracing a great many data classes and, by extension, informatics processes. It might be counter-intuitive that PII originating from so many places (even the public domain) falls under privacy regulations, yet the definition of PII is clear cut and readily factored into systems analysis. After getting that, the team engaged in the PIA with fresh energy, and we found and rectified several privacy risks that had gone unnoticed.
Here are some more of the recurring misconceptions I’ve noticed over the past decade:
- “Personal” Information is sometimes taken to mean especially delicate information such as payment card details, rather than any information pertaining to an identifiable individual; see also this exchange with US data breach analyst Jake Kouns over the Epsilon incident in 2011 in which tens of millions of user addresses were taken from a bulk email house;
- the act of collecting PII is sometimes regarded only in relation to direct collection from the individual concerned; technologists can overlook that PII provided by a third party to a data custodian is nevertheless being collected by the custodian; likewise technologists may not appreciate that generating PII internally, through event logging for instance, also represent collection.
These instances and others show that many ICT practitioners suffer important gaps in their understanding. Security professionals in particular may be forgiven for thinking that most legislated Privacy Principles are legal technicalities irrelevant to them, for generally only one of the principles in any given set is overtly about security. Yet every one of the privacy principles in any data protection regime are impacted by information technology and security practices; see Mapping Privacy requirements onto the IT function, Privacy Law & Policy Reporter, v10.1 & 10.2, 2003. I believe the gaps in the privacy knowledge of ICT practitioners are not random but are systemic, probably resulting from privacy training for non-privacy professionals not being properly integrated with their particular world views.
To properly deal with data privacy, ICT practitioners need to have privacy framed in a way that leads to objective design requirements. Luckily there already exist several unifying frameworks for systematising the work of development teams. One tool that resonates strongly with data privacy practice is the Threat & Risk Assessment (TRA).
A TRA is for analysing infosec requirements and is widely practiced in the public and private sectors in Australia. There are a number of standards that guide the conduct of TRAs, such as ISO 31000. A TRA is used to systematically catalogue all foreseeable adverse events that threaten an organisation’s information assets, identify candidate security controls to mitigate those threats, and prioritise the deployment of controls to bring all risks down to an acceptable level. The TRA process delivers real world management decisions, understanding that non zero risks are ever present, and that no organisation has an unlimited security budget.
The TRA exercise is readily extensible to help Privacy by Design. A TRA can expressly incorporate privacy as an aspect of information assets worth protecting, alongside the conventional security qualities of confidentiality, integrity and availability ("C.I.A.").
A crucial subtlety here is that privacy is not the same as confidentiality, yet they are frequently conflated. A fuller understanding of privacy leads designers to consider the Collection, Use, Disclosure and Access & Correction principles, over and above confidentiality when they analyse information assets. The table above illustrates how privacy related factors can be accounted for alongside “C.I.A.”. In another blog post I discuss the selection of controls to mitigate privacy threats, within a unified TRA framework.
And in this post I look at how the definitional uncertainties in privacy and the unfolding identifiability of PII should not cause security professionals much anxiety - because they're trained to deal with uncertainties and likelihoods.
We continue to actively research the closer integration of security and privacy practices.
Have you heard the news? "Privacy is dead!"
The message is urgent. It's often shouted in prominent headlines, with an implied challenge. The new masters of the digital universe urge the masses: C'mon, get with the program! Innovate! Don't be so precious! Don't you grok that Information Wants To Be Free? Old fashioned privacy is holding us back!
The stark choice posited between privacy and digital liberation is rarely examined with much intellectual rigor. Often, "privacy is dead" is just a tired fatalistic response to the latest breach or eye-popping digital development, like facial recognition, or a smartphone's location monitoring. In fact, those who earnestly assert that privacy is over are almost always trying to sell us something, be it sneakers, or a political ideology, or a wanton digital business model.
Is it really too late for privacy? Is the "genie out of the bottle"? Even if we accepted the ridiculous premise that privacy is at odds with progress, no it's not too late, for a couple of reasons. Firstly, the pessimism (or barely disguised commercial opportunism) generally confuses secrecy for privacy. And secondly, frankly, we aint seen nothin yet!
Technology certainly has laid us bare. Behavioral modeling, facial recognition, Big Data mining, natural language processing and so on have given corporations X-Ray vision into our digital lives. While exhibitionism has been cultivated and normalised by the informopolists, even the most guarded social network users may be defiled by data prospectors who, without consent, upload their contact lists, pore over their photo albums, and mine their shopping histories.
So yes, a great deal about us has leaked out into what some see as an infinitely extended neo-public domain. And yet we can be public and retain our privacy at the same time. Just as we have for centuries of civilised life.
It's true that privacy is a slippery concept. The leading privacy scholar Daniel Solove once observed that "Privacy is a concept in disarray. Nobody can articulate what it means."
Some people seem defeated by privacy's definitional difficulties, yet information privacy is simply framed, and corresponding data protection laws are elegant and readily understood.
Information privacy is basically a state where those who know us are restrained in they do with the knowledge they have about us. Privacy is about respect, and protecting individuals against exploitation. It is not about secrecy or even anonymity. There are few cases where ordinary people really want to be anonymous. We actually want businesses to know - within limits - who we are, where we are, what we've done and what we like ... but we want them to respect what they know, to not share it with others, and to not take advantage of it in unexpected ways. Privacy means that organisations behave as though it's a privilege to know us. Privacy can involve businesses and governments giving up a little bit of power.
Many have come to see privacy as literally a battleground. The grassroots Cryptoparty movement came together around the heady belief that privacy means hiding from the establishment. Cryptoparties teach participants how to use Tor and PGP, and they spread a message of resistance. They take inspiration from the Arab Spring where encryption has of course been vital for the security of protestors and organisers. One Cryptoparty I attended in Sydney opened with tributes from Anonymous, and a number of recorded talks by activists who ranged across a spectrum of political issues like censorship, copyright, national security and Occupy.
I appreciate where they're coming from, for the establishment has always overplayed its security hand, and run roughshod over privacy. Even traditionally moderate Western countries have governments charging like china shop bulls into web filtering and ISP data retention, all in the name of a poorly characterised terrorist threat. When governments show little sympathy for netizenship, and absolutely no understanding of how the web works, it's unsurprising that sections of society take up digital arms in response.
Yet going underground with encryption is a limited privacy stratagem, because do-it-yourself encryption is incompatible with the majority of our digital dealings. The most nefarious and least controlled privacy offences are committed not by government but by Internet companies, large and small. To engage fairly and squarely with businesses, consumers need privacy protections, comparable to the safeguards against unscrupulous merchants we enjoy, uncontroversially, in traditional commerce. There should be reasonable limitations on how our Personally Identifiable Information (PII) is used by all the services we deal with. We need department stores to refrain from extracting health information from our shopping habits, merchants to not use our credit card numbers as customer reference numbers, shopping malls to not track patrons by their mobile phones, and online social networks to not x-ray our photo albums by biometric face recognition.
Encrypting everything we do would only put it beyond reach of the companies we obviously want to deal with. Look for instance at how the cryptoparties are organised. Some cryptoparties manage their bookings via the US event organiser Eventbrite to which attendants have to send a few personal details. So ironically, when registering for a cryptoparty, you can not use encryption!
The central issue is this: going out in public does not neutralise privacy. It never did in the physical world and it shouldn't be the case in cyberspace either. Modern society has long rested on balanced consumer protection regulations to curb the occasional excesses of business and government. Therefore we ought not to respond to online privacy invasions as if the digital economy is a new Wild West. We should not have to hide away if privacy is agreed to mean respecting the PII of customers, users and citizens, and restraining what data custodians do with that precious resource.
We're still in the early days of the social web, and the information innovation has really only just begun. There is incredible value to be extracted from mining the underground rivers of data coursing unseen through cyberspace, and refining that raw material into Personal Information.
Look at what the data prospectors and processors have managed to do already.
It's difficult to overstate the value of facial recognition to businesses like Facebook when they have just one asset: knowledge about their members and users. Combined with image analysis and content addressable graphical memory, facial recognition lets social media companies work out what we're doing, when, where and with whom. I call it piracy. Billions of everyday images have been uploaded over many years by users for ostensiby personal purposes, without any clue that technology would energe to convert those pictures into a commercial resource.
Third party services like Facedeals are starting to emerge, using Facebook's photo resources for commercial facial recognition in public. And the most recent facial recognition entrepreneurs like Name Tag App boast of scraping images from any "public" photo databases they can find. But as we shall see below, in many parts of the world there are restrictions on leveraging public-facing databases, because there is a legal difference between anonymous data and identified information.
- Some of the richest stores of raw customer data are aggregated in retailer databases. The UK department store Tesco for example is
said to hold more data about British citizens than the government does. For years of course data analysts have combed through shopping history for marketing insights, but their predictive powers are growing rapidly. An infamous example is Target's covert development of methods to identify customers who are pregnant based on their buying habits. Some Big Data practitioners seem so enamoured with their ability to extract secrets from apparently mundane data, they overlook that PII collected indirectly by algorithm is subject to privacy law just as if it was collected directly by questionnaire. Retailers need to remember this as they prepare to exploit their massive loyalty databases into new financial services ventures.
- And looking ahead, Google Glass in the privacy stakes will probably surpass both Siri and facial recognition. If actions speak louder than words, imagine the value to Google of seeing through Glass exactly what we do in real time. Digital companies wanting to know our minds won't need us to expressly "like" anything anymore; they'll be able to tell our preferences from our unexpurgated behaviours.
The surprising power of data protection regulations
There's a widespread belief that technology has outstripped privacy law, yet it turns out technology neutral data privacy law copes well with most digital developments. OECD privacy principles (enacted in over 100 countries) and the US FIPPs (Fair Information Practice Principles) require that companies be transarent about what PII they collect and why, limit the ways in which PII is used for unrelated purposes.
Privacy advocates can take heart from several cases where existing privacy regulations have proven effective against some of the informopolies' trespasses. And technologists and cynics who think privacy is hopeless should heed the lessons.
- Google StreetView cars, while they drive up and down photographing the world, also collect Wi-Fi hub coordinates for use in geo-location services. In 2010 it was discovered that the StreetView software was also collecting unencrypted Wi-Fi network traffic, some of which contained Personal Information like user names and even passwords. Privacy Commissioners in Australia, Japan, Korea, the Netherlands and elsewhere found Google was in breach of their data protection laws. Google explained that the collection was inadverrtant, apologized, and destroyed all the wireless traffic that had been gathered.
The nature of this privacy offence has confused some commentators and technologists. Some argue that Wi-Fi data in the public domain is not private, and "by definition" (so they like to say) categorically could not be private. Accordingly some believed Google was within its rights to do whatever it liked with such found data. But that reasoning fails to grasp the technicality that Data Protection laws in Europe, Australia and elsewhere do not essentially distinguish “public” from "private". In fact the word “private” doesn’t even appear in Australia’s “Privacy Act”. If data is identifiable, then privacy rights generally attach to it irrespective of how it is collected.
- Facebook photo tagging was ruled unlawful by European privacy regulators in mid 2012, on the grounds it represents a collection of PII (by the operation of the biometric matching algorithm) without consent. By late 2012 Facebook was forced to shut down facial recognition and tag suggestions in the EU. This was quite a show of force over one of the most powerful companies of the digital age. More recently Facebook has started to re-introduce photo tagging, prompting the German privacy regulator to reaffirm that this use of biometrics is counter to their privacy laws.
It's never too late
So, is it really too late for privacy? Outside the United States at least, established privacy doctrine and consumer protections have taken technocrats by surprise. They have found, perhaps counter intuitively, that they are not as free as they thought to exploit all personal data that comes their way.
Privacy is not threatened so much by technology as it is by sloppy thinking and, I'm afraid, by wishful thinking on the part of some vested interests. Privacy and anonymity, on close reflection, are not the same thing, and we shouldn't want them to be! It's clearly important to be known by others in a civilised society, and it's equally important that those who do know us, are reasonably restrained in how they use that knowledge.