Lockstep

Mobile: +61 (0) 414 488 851
Email: swilson@lockstep.com.au

The Rite To Be Forgotten

The European Court of Justice recently ruled on the so-called "Right to be Forgotten", granting members of the public limited rights to request that search engines like Google suppress links to Personal Information under some circumstances. The decision has been roundly criticised by technologists and by American libertarians -- acting out the now familiar ritualised privacy arguments around human rights, freedom of speech, free market forces and freedom to innovate (and hence the bad pun in the title of this article). Surprisingly, even some privacy advocates, like Jules Polonetsky (quoted in The New Yorker), have a problem with the ECJ judgement because they seem to think it's extremist.

Of the various objections, the one I want to answer here is that search engines should not have to censor "facts" retrieved from the "public domain".

On September 30, I am participating in a live panel discussion of the Right To Be Forgotten, hosted by the IEEE; you can register here and download a video recording of the session later.

Update: recording now available here.

In an address on August 18, the European Union's Justice Commissioner Martine Reicherts made the following points about the Right to be Forgotten (RTBF):

      • "[The European Court of Justice] said that individuals have the right to ask companies operating search engines to remove links with personal information about them -- under certain conditions. This applies when information is inaccurate, for example, or inadequate, irrelevant, outdated or excessive for the purposes of data processing. The Court explicitly ruled that the right to be forgotten is not absolute, but that it will always need to be balanced against other fundamental rights, such as the freedom of expression and the freedom of the media -- which, by the way, are not absolute rights either".

In the current (September 29, 2014) issue of The New Yorker, senior legal analyst Jeffrey Toobin looks at RTBF in the article "The Solace of Oblivion". It's a balanced review of a complex issue, which acknowledges the transatlantic polarization of privacy rights and freedom of speech.

Toobin interviewed Kent Walker, Google's general counsel. Walker said Google likes to think of itself as a "card catalogue": "We don't create the information. We make it accessible. A decision like [the ECJ's], which makes us decide what goes inside the card catalogue, forces us into a role we don't want."

But there's a great deal more to search than Walker lets on.

Google certainly does create fresh Personal Information, and in stupendous quantities. Their search engine is the bedrock of a hundred-billion-dollar business, founded on a mission to "organize the world's information". Google search is an incredible machine, the result of one of the world's biggest and longest-running software R&D projects. Few of us can now imagine life without Internet search and instant access to limitless information that would otherwise be utterly invisible. Search really is magic – just as Arthur C. Clarke said any sufficiently advanced technology would be.

On its face therefore, no search result is a passive reproduction of data from a "public domain". Google makes the public domain public.

But while search is free, it is hyper-profitable, for the whole point of it is to underpin a gigantic advertising business. The search engine might not create the raw facts and figures in response to our queries, but it covertly creates and collects symbiotic metadata, complicating the picture. Google monitors our search histories, interests, reactions and habits, as well as details of the devices we're using, when and where and even how we are using them, all in order to divine our deep predilections. These insights are then provided in various ways to Google's paying customers (advertisers) and are also fed back into the search engine, to continuously tune it. The things we see courtesy of Google are shaped not only by their page ranking metrics but also by the company's knowledge of our preferences (which it forms by watching us across the whole portfolio of search, Gmail, Maps, YouTube, and the Google+ social network). When we search for something, Google tries to predict what we really want to know.

In the modern vernacular, Google hacks the public domain.

The collection and monetization of personal metadata is inextricably linked to the machinery of search. The information Google serves up to us is shaped and transformed to such an extent, in the service of Google's business objectives, that it should be regarded as synthetic and therefore the responsibility of the company. Their search algorithms are famously secret, putting them beyond peer review; nevertheless, there is a whole body of academic work now on the subtle and untoward influences that Google exerts as it filters and shapes the version of reality it thinks we need to see.

Some objections to the RTBF ruling see it as censorship, or meddling with the "truth". But what exactly is the state of the truth that Google purportedly serves up? Search results are influenced by many arbitrary factors of Google's choosing; we don't know what those factors are, but they are dictated by Google's business interests. So in principle, why are an individual's interests in having some influence over search results any less worthy than Google's? The "right to be forgotten" is an unfortunate misnomer: it is really more of a 'limited right to have search results filtered differently'.

If Google's machinery reveals Personal Information that was hitherto impossible to find, then why shouldn't it at least participate in protecting the interests of the people affected? I don't deny that modern technology and hyper-connectivity create new challenges for the law, and that traditional notions of privacy may be shifting. But it's not a step-change, and in the meantime, we need to tread carefully. There are as many unintended consequences and problems in the new technology as there are in the established laws. The powerful owners and beneficiaries of these technologies should accept some responsibility for the privacy impacts. With its talents and resources, Google could rise to the challenge of better managing privacy, instead of pleading that it's not their problem.

Posted in RTBF, Privacy, Internet, Social Media

Simply Secure is not simply private

Another week, another security collaboration launch!

"Simply Secure" calls itself “a small but growing organization [with] expertise in usability research, design, software development, and product management". Their mission has to do with improving the security functions that built-in so badly in most software today. Simply Secure is backed by Google and Dropbox, and supported by a diverse advisory board.

It's early days (actually early day, singular) so it might be churlish to point out that Simply Secure's strategic messaging is a little uneven ... except that the words being used to describe it shed light on the clarity of the thinking.

My first exposure to Simply Secure came last night, when I read an article in the Guardian by Cory Doctorow (who is one of their advisers). Doctorow places enormous emphasis on privacy; the word "privacy" outnumbers "security" 16 to three in the body of his column. Another, admittedly shorter, report about the launch by The Next Web doesn't mention privacy at all. And then there's the Simply Secure blog post, which cites privacy a great deal, but every single time in conjunction with security, as in "security and privacy". That repeated phrasing conveys, to me at least, some discomfort. As I say, it's early days and the team is doubtless sorting out how to weigh and progress these closely related objectives.

But I hope they do it quickly. On the face of it, Simply Secure might only scratch the surface of privacy.

Doctorow's Guardian article is mostly concerned with encryption and the terrible implementations that have plagued us since the dawn of the Internet. It's definitely important that we improve here – and radically. If the Simply Secure initiative does nothing but make encryption easier to integrate into commodity software, that would be a great thing. I'm all for it. But it won't necessarily or even probably lead to better privacy, because privacy is about restraint, not secrecy or anonymity. As we go about our lives, we actually want to be known by others, but we want those who know us to be restrained in what they do with the knowledge they have about us. Privacy is the protection you need when your affairs are not secret.

I know Doctorow knows this – I've seen his terrific little speech on the steps at Comic-Con about PRISM. So I'm confused by his focus on cryptography.

How far does encryption get us? If we're using social networks, or if we're shopping and opting in to loyalty programs or selected targeted marketing, or if we're sharing our medical records with relatives, medicos, hospitals and researchers, then encryption becomes moot. We need mechanisms to restrain what the receivers of our personal information do with it. We all know the business model at work behind “free" online services; using encryption to protect privacy in social networking for instance would be like using an armoured van to deliver your valuables to Bernie Madoff.

Another limitation of user-centric or user-managed encryption has to do with Big Data. A great deal of personal information about us is created and collected unseen behind our backs, by sensors, and by analytics processes that manage to work out who we are by linking disparate data streams together. How could Simply Secure ameliorate those sorts of problems? If the Simply Secure vision includes encryption at rest as well as in transit, then how will the user control or even see all the secondary uses of their encrypted personal information?

There's a combativeness in Doctorow's explanation of Simply Secure and in his tweets from yesterday on the topic. His aim is expressly to thwart the surveillance state, which in his view includes a symbiosis (if not conspiracy) between government and internet companies, where the former gets its dirty work done by the latter. I'm sure he and I both find that abhorrent in equal measure. But I argue the proper response to these egregious behaviours is political, not technological (and political in the broad sense; I love that Snowden talks as much about accountability, legal processes, transparency and research as he does about encryption). If you think the government is exploiting the exploiters, then DIY encryption is a pretty narrow counter-measure. This is not the sort of society we want to live in, so let's work to change the establishment, rather than try to take it on in a crypto shoot-out.

Yes, security technology is important, but it's not nearly as important for privacy as the Rule of Law. Data privacy regimes instil restraint. The majority of businesses come to know that they are not at liberty to over-collect personal information, nor to re-use personal information unexpectedly and without consent. A minority of organisations flout data privacy principles, for example by slyly refining raw data into valuable personal knowledge, exploiting the trust citizens and users put in them. Some of these outfits flourish in the United States – the Canary Islands of privacy. Worldwide, the policing of privacy is patchy indeed, yet there have been spectacular legal victories in Europe and elsewhere against the excessive practices of really big companies like Facebook with their biometric data mining of photo albums, and Google's driftnet-like harvesting of traffic from unencrypted Wi-Fi networks.

Pragmatically, I'm afraid encryption is such a fragile privacy measure. Once secrecy is penetrated, we need regulations to stem exploitation of our personal information.

By all means, let's improve cryptographic engineering and I wish the Simply Secure initiative all the best. So long as they don't call security privacy.

Posted in Security, Privacy, Language, Big Data

New Paper Coming: The collision between Big Data and privacy law

The collision between Big Data and privacy law

Now available at the Social Science Research Network. First published in the Australian Journal of Telecommunications and the Digital Economy.

Abstract

We live in an age where billionaires are self-made on the back of the most intangible of assets – the information they have about us. The digital economy is awash with data. It's a new and endlessly re-usable raw material, increasingly left behind by ordinary people going about their lives online. Many information businesses proceed on the basis that raw data is up for grabs; if an entrepreneur is clever enough to find a new vein of it, they can feel entitled to tap it in any way they like. However, some tacit assumptions underpinning today's digital business models are naive. Conventional data protection laws, older than the Internet, limit how Personal Information is allowed to flow. These laws turn out to be surprisingly powerful in the face of 'Big Data' and the 'Internet of Things'. On the other hand, orthodox privacy management was not framed for new Personal Information being synthesised tomorrow from raw data collected today. This paper seeks to bridge a conceptual gap between data analytics and privacy, and sets out extended Privacy Principles to better deal with Big Data.

Extract

Introduction

'Big Data' is a broad term capturing the extraction of knowledge and insights from unstructured data. While data processing and analysis is as old as computing, the term 'Big Data' has recently attained special meaning, thanks to the vast rivers of raw data that course unseen through the digital economy, and the propensity for entrepreneurs to tap that resource for their own profit, or to build new analytic tools for enterprises. Big Data represents one of the biggest challenges to privacy and data protection society has seen. Never before has so much Personal Information been available so freely to so many.

Big Data promises vast benefits for a great many stakeholders (Michael & Miller 2013: 22-24) but the benefits may be jeopardized by the excesses of a few overly zealous businesses. Some online business models are propelled by a naive assumption that data in the 'public domain' is up for grabs. Many think the law has not kept pace with technology, but technologists often underestimate the strength of conventional data protection laws and regulations. In particular, technology neutral privacy principles are largely blind to the methods of collection, and barely distinguish between directly and indirectly collected data. As a consequence, the extraction of Personal Information from raw data constitutes an act of collection and as such is subject to longstanding privacy statutes. Privacy laws such as that of Australia don't even use the words 'public' and 'private' to qualify the data flows concerned (Privacy Act 1988).

On the other hand, orthodox privacy policies and static data usage agreements do not cater for the way Personal Information can be synthesised tomorrow from raw data collected today. Privacy management must evolve to become more dynamic, instead of being preoccupied with unwieldy policy documents and simplistic technical notices about cookies.

Thus the fit between Big Data and data privacy standards is complex and sometimes surprising. While existing laws are not to be underestimated, there is a need for data privacy principles to be extended, to help individuals remain abreast of what's being done with information about them, and to foster transparency regarding the new ways for personal information to be generated.

Conclusion: Making Big Data privacy real

A Big Data dashboard like the one described [in this paper] could serve several parallel purposes in aid of progressive privacy principles. It could reveal dynamically to users what PII can be collected about them through Big Data; it could engage users in a fair and transparent exchange of value for PII; and it could enable dynamic consent, where users are able to opt in to Big Data processes, and opt out and in again, over time, as their understanding of the PII bargain evolves.

Big Data holds big promises, for the benefit of many. There are grand plans for population-wide electronic health records, new personalised financial services that leverage massive retail databases, and electricity grid management systems that draw on real-time consumption data from smart meters in homes, to extend the life of aging 'poles and wires' while reducing greenhouse gas emissions. The value to individuals and operators alike of these programs is amplified as computing power grows, new algorithms are researched, and more and more data sets are joined together. Likewise, the privacy risks are compounded. The potential value of Personal Information in the modern Big Data landscape cannot be represented in a static business model, and neither can the privacy pros and cons be captured in a fixed policy document. New user interfaces and visualisations like a 'Big Data dashboard' are needed to bring dynamic extensions to traditional privacy principles, and help people appreciate and intelligently negotiate the insights that can be extracted about them from the raw material that is data.

Posted in Privacy, Big Data

Schrodinger's Privacy: A Master Class

Master Class: How to Protect Your Customer's Digital Identity and Personal Data

A Social Media Week Sydney event #SMWSydney
Law Lounge, Sydney University Law School
New Law School Building
Eastern Ave, Camperdown
Fri, Sep 26 - 10:00 AM - 11:30 AM

How can you navigate privacy fact and fiction, without the geeks and lawyers boring each other to death?

It's often said that technology has outpaced privacy law. Many digital businesses seem empowered by this brash belief. And so they proceed with apparent impunity to collect and monetise as much Personal Information as they can get their hands on.

But it's a myth!

Some of the biggest corporations in the world, including Google and Facebook, have been forcefully brought to book by privacy regulations. So, we have to ask ourselves:

  • what does privacy law really mean for social media in Australia?
  • is privacy "good for business"?
  • is privacy "not a technology issue"?
  • how can digital businesses navigate fact & fiction, without their geeks and lawyers boring each other to death?

In this Social Media Week Master Class I will:

  • unpack what's "creepy" about certain online practices
  • show how to rate data privacy issues objectively
  • analyse classic misadventures with geolocation, facial recognition, and predicting when shoppers are pregnant
  • critique photo tagging and crowd-sourced surveillance
  • explain why Snapchat is worth more than three billion dollars
  • analyse the regulatory implications of Big Data, Biometrics, Wearables and The Internet of Things.

We couldn't have timed this Master Class better, coming two weeks after the announcement of the Apple Watch, which will figure prominently in the class!

So please come along, for a fun and in-depth look at social media, digital technology, the law, and decency.

Register here.

About the presenter

Steve Wilson is a technologist who stumbled into privacy 12 years ago. He rejected those well-meaning slogans (like "Privacy Is Good For Business!") and instead dug into the relationships between information technology and information privacy. Now he researches and develops design patterns to help sort out privacy, alongside all the other competing requirements of security, cost, usability and revenue. His latest publications include:

  • "The collision between Big Data and privacy law" due out in October in the Australian Journal of Telecommunications and the Digital Economy.

Posted in Social Networking, Social Media, Privacy, Internet, Biometrics, Big Data

FIDO Alliance - Update

You can be forgiven if the FIDO Alliance is not on your radar screen. It was launched barely 18 months ago, to help solve the "password crisis" online, but it's already proven to be one of the most influential security bodies yet.

The typical Internet user has dozens of accounts and passwords. Not only are they a pain in the arse, poor password practices are increasingly implicated in fraud and terrible misadventures like the recent "iCloud Hack" which exposed celebrities' personal details.

With so many of our assets, our business and our daily lives happening in cyberspace, we desperately need better ways to prove who we are online – and even more importantly, prove what we are entitled to do there.

The FIDO Alliance is a new consortium of identity management vendors, product companies and service providers working on strong authentication standards. FIDO’s vision is to tap the powers of smart devices – smart phones today and wearables tomorrow – to log users on to online services more securely and more conveniently.

FIDO was founded by Lenovo, PayPal, and security technology companies AGNITiO, Nok Nok Labs and Validity Sensors, and launched in February 2013. Since then the Alliance has grown to over 130 members. Two new authentication standards have been published for peer review, half a dozen companies showcased FIDO-Ready solutions at the 2014 Consumer Electronics Show (CES) in Las Vegas, and PayPal has released its ground-breaking pay-by-fingerprint app for the Samsung Galaxy S5.

The FIDO Alliance includes technology heavyweights like Google, Lenovo, Microsoft and Samsung; payments giants Discover, MasterCard, PayPal and Visa; financial services companies such as Aetna, Bank of America and Goldman Sachs; and e-commerce players like Netflix and Salesforce.com. There are also a couple of dozen biometrics vendors, many leading Identity and Access Management (IDAM) solutions and services, and almost every cell phone SIM and smartcard supplier in the world.

I have been watching FIDO since its inception and reporting on it for Constellation Research. The third update in my series of research reports on FIDO is now available and can be downloaded here. The report looks in depth at what the Alliance has to offer vendors and end user communities, its critical success factors, and how and why this body is poised to shake up authentication like never before.

Posted in Security, Identity, FIDO Alliance, Constellation Research, Smartcards

Privacy watch

Update 22 September 2014

Last week, Apple suddenly went from silent to expansive on privacy, and the thrust of my blog straight after the Apple Watch announcement is now wrong. Apple posted a letter from CEO Tim Cook at www.apple.com/privacy along with a document that sets out how "We’ve built privacy into the things you use every day".

The paper is very interesting. It's a sophisticated and balanced account of policy, business strategy and technology elements that go to create privacy. Apple highlights that they:

  • forswear the exploitation of customer data
  • do not scan content or messages
  • do not let their small "iAd" business take data from other Apple departments
  • require certain privacy protective practices on the part of their health app developers.

They have also provided quite decent information about how Siri and health data are handled.

Apple's stated privacy posture is all about respect and self-restraint. Setting out these principles and commitments is a very welcome development indeed. I congratulate them.

Today Apple launched their much anticipated wrist watch, described by CEO Tim Cook as "the most personal device they have ever developed". He got that right!

Rather more than a watch, it's a sort of guardian angel. The Apple Watch has Siri built-in, along with new haptic sensors and buzzers, a heartbeat monitor, accelerometer, and naturally the GPS and Wi-Fi geolocation capability to track your speed and position throughout the day. So they say "Apple Watch is an all-day fitness tracker and a highly advanced sports watch in a single device".

Apple Watch

The Apple Watch will be a paragon of digital disruption. To understand and master disruption today requires the coordination of mobility, Big Data, the cloud and user interfaces. These cannot be treated as isolated technologies, so when a company like Apple controls them all, at scale, real transformation follows.

Thus Apple is one of the few businesses that can make promises like this: "Over time, Apple Watch gets to know you the way a good personal trainer would". In this we hear echoes of the smarts that power Siri, and we are reminded that amid the novel intimacy we have with these devices, many serious privacy problems have yet to be resolved.

The Apple Event today was a play in four acts:
Act I: the iPhone 6 release;
Act II: Apple Pay launch;
Act III: the Apple Watch announcement;
Act IV: U2 played live and released their new album free on iTunes!

It was fascinating to watch the thematic differences across these acts. With Apple Pay, they stressed security and privacy; we were told about the Secure Element, the way card numbers are replaced by random numbers (tokenization), and an architecture where Apple cannot see how much you spend nor where you spend it. On the other hand, when it came to the Apple Watch and its integrated health sensors, privacy wasn't mentioned, not at all. We are left to deduce that aggregating personal health data at Apple's servers is a part of a broader plan.
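
As an aside, the tokenization idea is simple to sketch. The toy example below is my own illustration of the general technique, not Apple Pay's actual architecture; the class and method names are hypothetical. The point is that the real card number stays in a vault held by the token service, and only a random surrogate circulates with the transaction.

```python
# Toy illustration of payment tokenization in general -- not Apple Pay's
# actual implementation. Names (TokenVault, tokenize, detokenize) are hypothetical.
import secrets

class TokenVault:
    """Held by the token service; merchants and intermediaries never see it."""

    def __init__(self):
        self._vault = {}  # token -> real card number (PAN)

    def tokenize(self, pan: str) -> str:
        token = secrets.token_hex(8)  # random surrogate, no mathematical link to the PAN
        self._vault[token] = pan
        return token

    def detokenize(self, token: str) -> str:
        return self._vault[token]     # only the token service can map back to the PAN

vault = TokenVault()
token = vault.tokenize("4111111111111111")
# A merchant's records hold only `token`; a breach of those records reveals no card number.
```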

The cornerstones of data privacy include Collection Limitation, Use Limitation (or "Purpose Specification") and Openness. Custodians of our Personally Identifiable Information (PII) should refrain from collecting and retaining PII they don't really need; they should specify what they do with PII and restrict unrelated secondary usage; and they should tell people what they're doing, generally in a Privacy Policy. With Siri, Apple sadly fails all these tests. (See the Update of 22 September 2014, above.)

The Apple Privacy Policy is altogether silent on Siri. The document details the sorts of information collected through its overt business processes like registration, sales and support, but it says nothing about the voice recordings and transcripts of Siri communications. Neither does the Siri FAQ mention what is done with all that data. It's quite an omission, seeing that when you dictate an SMS or an email to Siri, Apple retains a copy of communications that are normally out of bounds for your telecomms carrier.

It's been left to journalists to try and find out what Apple does with the information it mines from Siri. Wired magazine discovered eventually that Apple retains masked Siri voice recordings for six months; it then purportedly de-identifies them and keeps them for a further 18 months, for research. Yet even these explanations don't touch on the extracted contents of the communications, nor the metadata, like the trends and correlations that go to Siri's learning. If the purpose of Siri is ostensibly to automate the operation of the iPhone and its apps, then Apple should refrain from using the by-products of Siri's voice processing for anything else. But we just don't know what they do, and Apple imposes no self-restraint. (See the Update of 22 September 2014, above.)

We should hope for radically greater transparency with the Apple Watch and its health apps. Most of the watch's data processing and analytics will be carried out in the cloud. So Apple will come to hold detailed records of its users' exercise regimes, their performance figures, trend data and correlations. These are health records. Inevitably, health applications will take in other medical data, like food diaries entered by users, statistics imported from other databases, and detailed measurements from Internet-connected scales, blood pressure monitors and even medical devices. Apple will see what we're doing to improve our health, day by day, year on year. They will come to know more about what's making us healthy and what's not than we do ourselves.

Apple Watch Activity App

Now, the potential benefits from this sort of personal technology to self-managed care and preventative medicine are enormous. But so are the data management and privacy obligations.

Within the US, Apple will doubtless be taking steps to avoid falling under the stringent HIPAA regulations, yet in the rest of the world, a more subtle but far-reaching problem looms. Many broad-based data privacy regimes forbid the collection of health information without consent. And the laws of the European Union, Australia, New Zealand and elsewhere are generally technology neutral. This means that data collected directly from patients or doctors, and fresh data collected by way of automated algorithms, are treated essentially the same way. So when a sophisticated health management app running in the cloud somewhere mines all that exercise and lifestyle data, and starts to make inferences about health and wellbeing, great care needs to be taken that the individuals concerned know what's going on in advance, and have given their informed consent.

One of the deep privacy challenges in Big Data is that data miners don't know what they're going to find. Even with the best will in the world, a company can struggle to say in its Privacy Policy what PII it expects to extract (and thus collect) in future from the raw data it collects today. At Constellation Research we've been fleshing out a new sort of compact between businesses and individuals that seeks to keep users abreast of developments in data analytics, and promises to provide people with proper control of personal Big Data results.

It ought to be possible to expressly opt in to Big Data processes when you can understand the pros and cons and the net benefits, and to later opt out, and opt back in again, as the benefit equation shifts over time. But even visualising the products of Big Data is hard; I believe graphical user interfaces (GUIs) to allow people to comprehend and actively control the process will be one of the great software design problems of our age.
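
As a sketch of what such dynamic consent might look like underneath a GUI, consider a simple ledger in which the most recent opt-in or opt-out decision for each Big Data process wins. This is my own illustration under stated assumptions, not any vendor's design; all names are hypothetical.

```python
# Minimal sketch of a dynamic consent ledger -- an illustration only.
from __future__ import annotations
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ConsentEvent:
    process: str        # e.g. "activity-trend-analysis" (hypothetical process name)
    opted_in: bool      # True = opt in, False = opt out
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

@dataclass
class ConsentLedger:
    events: list[ConsentEvent] = field(default_factory=list)

    def record(self, process: str, opted_in: bool) -> None:
        self.events.append(ConsentEvent(process, opted_in))

    def is_permitted(self, process: str) -> bool:
        """The most recent decision for a process wins; the default is opted out."""
        for event in reversed(self.events):
            if event.process == process:
                return event.opted_in
        return False

ledger = ConsentLedger()
ledger.record("activity-trend-analysis", True)    # user opts in
ledger.record("activity-trend-analysis", False)   # later changes her mind
assert not ledger.is_permitted("activity-trend-analysis")
```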

Apple are obviously preeminent in GUI and user experience innovation. You would think if anyone can create the novel yet intuitive interfaces desperately needed to control Big Data PII, Apple can. But first they will have to embrace their responsibilities for the increasingly intimate details they are helping themselves to. If the Apple Watch is "the most personal device they've ever designed" then let's see privacy and data protection commitments to match.

Posted in Privacy, e-health, Constellation Research, Cloud, Big Data

Constellation Connected Enterprise 2014

Constellation's Connected Enterprise (CCE) is an immersive innovation summit for senior business leaders. The theme of this year’s Connected Enterprise is Dominate Digital Disruption. There will be over 200 other early adopters at CCE, at Half Moon Bay outside San Francisco, to discover and share how digital businesses can realise their brand promises, transform their business models, increase revenues, reduce costs, and improve compliance.

CCE is a three-day executive retreat, comprising more or less continuous keynotes from visionaries and interactive best practices panels. There are deep one-on-one interviews with market makers, and numerous new enterprise technology demos. And there's the Constellation SuperNova Awards Gala Dinner.

Register before September 30 to take advantage of early bird pricing. Use code BBLG14 for VIP privileges throughout the event.

See you there!

Posted in Constellation Research

Safeguarding the pedigree of personal attributes

The problem of identity takeover

The root cause of much identity theft and fraud today is the sad fact that customer reference numbers, personal identifiers and attributes generally are so easy to copy and replay without permission and without detection. Simple numerical attributes like bank account numbers and health IDs can be stolen from many different sources, and replayed with impunity in bogus transactions.

Our personal data nowadays is leaking more or less constantly, through breached databases, websites, online forms, call centres and so on, to such an extent that customer reference numbers on their own are no longer reliable. Privacy consequently suffers because customers are required to assert their identity through circumstantial evidence, like name and address, birth date, mother’s maiden name and other pseudo-secrets. All this data in turn is liable to be stolen and used against us, leading to spiraling identity fraud.

To restore the reliability of personal attribute data, we need to know their pedigree. We need to know that a presented data item is genuine, that it originated from a trusted authority, it’s been stored safely by its owner, and it’s been presented with the owner’s consent. If confidence in single attributes can be restored then we can step back from all the auxiliary proof-of-identity needed for routine transactions, and thus curb identity theft.

A practical response to ID theft

Several recent breaches of government registers leave citizens vulnerable to ID theft. In Korea, the national identity card system was attacked and it seems that all Korean citizens' IDs will have to be re-issued. In the US, Social Security Numbers are often stolen and used in fraudulent identifications; recently, SSNs of 800,000 Post Office employees appear to have been stolen along with other personal records.

Update 14 June 2015: Now last week we got news of a hugely worse breach of US SSNs (not to mention deep personal records) of four million federal US government employees, when the Office of Personnel Management was hacked.

We could protect people against having their stolen identifiers used behind their backs. It shouldn't actually be necessary to re-issue every Korean's ID. Nor should it matter that US SSNs aren't usually replaceable. And great improvements may be made to the reliability of identification data presented online without dramatically changing Relying Parties' back-end processes. If for instance a service provider has always used SSN as part of its identification regime, they could continue to do so, if only the actual Social Security Numbers being received were known to be reliable.

The trick is to be able to tell "original" ID numbers from "copies". But what does "original" mean in the digital world? A more precise term for what we really want is pedigree. What we need is to be able to present attribute data in such a way that the receiver may be sure of their pedigree; that is, know that the attributes were originally issued by an authoritative body to the person presenting or claiming them, and that each presentation of an attribute has occurred under the owner's control.

These objectives can be met with the help of smart cryptographic technologies which today are built into most smart phones and smartcards, and which are finally being properly exploited by initiatives like the FIDO Alliance.


"Notarising" attributes in chip devices

There are ways of issuing attributes to a smart chip device that prevent them from being stolen, copied and claimed by anyone else. One way to do so is to encapsulate and notarise attributes in a unique digital certificate issued to a chip. Today, a great many personal devices routinely embody cryptographically suitable chips for this purpose, including smart phones, SIM cards, "Secure Elements", smartcards and many wearable computers.

Consider an individual named Smith to whom Organisation A has issued a unique attribute N (which could be as simple as a customer reference number). If N is saved in ordinary computer memory or something like a magnetic stripe card, then it has no pedigree. Once the number N is presented by the cardholder in a transaction, it has the same properties as any other number. To better safeguard N in a chip device, it can be sealed into a digital certificate, as follows:

1. generate a fresh private-public key pair inside Smith’s chip
2. export the public key
3. create a digital certificate around the public key, with an attribute corresponding to N
4. have the certificate signed by (or on behalf of) organisation A.
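
As a rough illustration of steps 1 to 4 (a sketch using the Python 'cryptography' library, not a description of any particular chip vendor's implementation, and with a made-up attribute value), attribute N can be carried in the certificate's subject and the certificate signed with a key standing in for Organisation A's. In a real deployment, the key pair would be generated and retained inside Smith's chip.

```python
# Illustrative sketch of sealing attribute N into a certificate -- not a
# production design. The attribute value and organisation names are made up.
import datetime
from cryptography import x509
from cryptography.x509.oid import NameOID
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

# Step 1: generate a fresh key pair (in practice, inside Smith's chip;
# the private key never leaves the device).
device_private_key = ec.generate_private_key(ec.SECP256R1())

# Step 2: export only the public key.
device_public_key = device_private_key.public_key()

# Step 3: build a certificate around the public key, carrying attribute N
# (here a customer reference number placed in the subject's serialNumber field).
subject = x509.Name([x509.NameAttribute(NameOID.SERIAL_NUMBER, "CRN-12345678")])
issuer = x509.Name([x509.NameAttribute(NameOID.ORGANIZATION_NAME, "Organisation A")])
issuer_key = ec.generate_private_key(ec.SECP256R1())  # stands in for Organisation A's signing key

# Step 4: have the certificate signed by (or on behalf of) Organisation A.
now = datetime.datetime.utcnow()
cert = (
    x509.CertificateBuilder()
    .subject_name(subject)
    .issuer_name(issuer)
    .public_key(device_public_key)
    .serial_number(x509.random_serial_number())
    .not_valid_before(now)
    .not_valid_after(now + datetime.timedelta(days=365))
    .sign(issuer_key, hashes.SHA256())
)
```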

Pedigree diagram

The result of coordinating these processes and technologies is a logical triangle that inextricably binds cardholder Smith to her attribute N and to a specific personally controlled device. The certificate signed by Organisation A attests to both Smith’s attribute value N and Smith's control of a particular device. Keys generated inside the chip are retained internally, never divulged to outsiders. It is not possible to copy the private key to another device, so the logical triangle cannot be reproduced or counterfeited.

Note that this technique is at the heart of the EMV "Chip-and-PIN" system where the smart payment card digitally signs cardholder and transaction data, rendering it immune to replay, before sending it to the merchant terminal. See also my 2012 paper Calling for a uniform approach to card fraud, offline and on. Now we should generalise notarised personal data and digitally signed transactions beyond Card-Present payments into as much online business as possible.

Restoring privacy and consumer control

When Smith wants to present her attribute N in an electronic transaction, instead of simply copying N out of memory (at which point it would lose its pedigree), Smith’s app digitally signs the transaction using the certificate containing N. With standard security software, anyone else can then verify that the transaction originated from a genuine device under Smith's control, with an attribute certified by A. And above all, this assurance is reliably made without needing to name Smith or reveal anything about her other than the attribute of interest.
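
Continuing the illustrative sketch above, presentation then amounts to signing the transaction with the device-held private key, and verification amounts to checking that signature against the public key in the certificate carrying attribute N (a relying party would first verify Organisation A's signature over the certificate itself). The transaction payload here is hypothetical.

```python
# Continues the sketch above: sign a transaction with the device key and
# verify it against the certificate carrying attribute N. Illustration only.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

transaction = b'{"action": "claim-service", "amount": 100}'  # hypothetical payload

# Smith's app signs with the private key held inside the chip.
signature = device_private_key.sign(transaction, ec.ECDSA(hashes.SHA256()))

# The relying party, having checked Organisation A's signature on the certificate,
# extracts the certified public key (and attribute N) and verifies the transaction.
try:
    cert.public_key().verify(signature, transaction, ec.ECDSA(hashes.SHA256()))
    print("Transaction came from the device certified for attribute N")
except InvalidSignature:
    print("Signature invalid: reject the transaction")
```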

Note that N doesn't have to be a customer number or numeric identifier; it could be any personal data, such as a biometric template, or a package of medical information like an allergy alert, or an isolated (and anonymous) property of the user, such as her age.

The capability to manage multiple key pairs and certificates, and to sign transactions with a nominated private key, is increasingly built into smart devices today. By narrowing down what you need to know about someone to a precise attribute or personal data item, we will reduce identity theft and fraud while radically improving privacy. This sort of privacy enhancing technology is the key to a safe Internet of Things, and it is now widely available.

Addressing ID theft

Perhaps the best thing governments could do immediately is to adopt smartcards and equivalent smart phone apps for holding and presenting such attributes as official ID numbers. The US government has actually come close to such a plan many times; chip-based Social Security Cards and Medicare Cards have been proposed before, without realising their full potential. These devices would best be used as above to hold a citizen's identifiers and present them cryptographically, without vulnerability to ID theft and takeover. We wouldn't have to re-issue compromised SSNs; we would instead switch from manual presentation of these numbers to automatic online presentation, with a chip card or smart phone app conveying the data through digital signatures.

Posted in Smartcards, Security, PKI, Payments, Identity, Fraud, Biometrics