Mobile: +61 (0) 414 488 851
Email: swilson@lockstep.com.au

An identity glut on the Internet of Things

The identerati sometimes refer to the challenge of “binding carbon to silicon”. That’s a poetic way of describing how the field of Identity and Access Management (IDAM) is concerned with associating carbon-based life forms (as geeks fondly refer to people) with computers (or silicon chips).

To securely bind users’ identities or attributes to their computerised activities is indeed a technical challenge. In most conventional IDAM systems, there is only circumstantial evidence of who did what and when, in the form of access logs and audit trails, most of which can be tampered with or counterfeited by a sufficiently determined fraudster. To create a lasting, tamper-resistant impression of what people do online requires some sophisticated technology (in particular, digital signatures created using hardware-based cryptography).

On the other hand, working out looser associations between people and computers is the stock-in-trade of social networking operators and Big Data analysts. So many signals are emitted as a side effect of routine information processing today that even the shyest of users may be uncovered by third parties with sufficient analytics know-how and access to data.

So privacy is in peril. For the past two years, big data breaches have only got bigger: witness the losses at Target (110 million), eBay (145 million), Home Depot (109 million records) and JPMorgan Chase (83 million) to name a few. Breaches have got deeper, too. Most notably, in June 2015 the U.S. federal government’s Office of Personnel Management (OPM) revealed it had been hacked, with the loss of detailed background profiles on 15 million past and present employees.

I see a terrible systemic weakness in the standard practice of information security. Look at the OPM breach: what was going on that led to application forms for employees dating back 15 years remaining in a database accessible from the Internet? What was the real need for this availability? Instead of relying on firewalls and access policies to protect valuable data from attack, enterprises need to review which data needs to be online at all.

We urgently need to reduce the exposed attack surface of our information assets. But in the information age, the default has become to make data as available as possible. This liberality is driven both by the convenience of having all possible data on hand, just in case it might be handy one day, and by the plummeting cost of mass storage. But it's also the result of a technocratic culture that knows "knowledge is power," and gorges on data.

In communications theory, Metcalfe’s Law states that the value of a network is proportional to the square of the number of devices that are connected. This is an objective mathematical reality, but technocrats have transformed it into a moral imperative. Many think it axiomatic that good things come automatically from inter-connection and information sharing; that is, the more connection the better. Openness is an unexamined rallying call for both technology and society. “Publicness” advocate Jeff Jarvis wrote (admittedly provocatively) that: “The more public society is, the safer it is”. And so a sort of forced promiscuity is shaping up as the norm on the Internet of Things. We can call it "superconnectivity", with a nod to the special state of matter where electrical resistance drops to zero.
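To put rough numbers on the square law: among n devices there are n(n-1)/2 possible pairwise links, a count that grows roughly as n squared. The short Python sketch below is purely illustrative; the function name is my own, and counting possible links is only a proxy for "value", not a measure of it.

```python
def pairwise_links(n: int) -> int:
    """Number of distinct device-to-device links possible among n devices."""
    return n * (n - 1) // 2


# Each tenfold increase in devices gives roughly a hundredfold increase in links.
for n in (10, 100, 1000):
    print(n, pairwise_links(n))
```

Every one of those links is also a potential path for personal data, which is why the arithmetic cuts both ways.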

In thinking about privacy on the IoT, a key question is this: how much of the data emitted from Internet-enabled devices will actually be personal data? If great care is not taken in the design of these systems, the unfortunate answer will be most of it.

[Figure: Data flows in Internet of Cars (from "Rationing Identity in IoT", CISID15 handouts)]
[Figure: Imposing order IoT PII flows (from "Rationing Identity in IoT", CISID15 handouts)]

My latest investigation into IoT privacy uses the example of the Internet connected motor car. "Rationing Identity on the Internet of Things" will be released soon by Constellation Research.

And don't forget Constellation's annual innovation summit, Connected Enterprise at Half Moon Bay outside San Francisco, November 4th-6th. Early bird registration closes soon.

Posted in Security, Privacy, Cloud, Big Data

A letter on Free Speech and the Right to be Forgotten

An unpublished letter to New Yorker magazine, August 2015.

Kelefa Sanneh ("The Hell You Say", Aug 10 & 17) poses a question close to the heart of society’s analog-to-digital conversion: What is speech?

Internet policy makers worldwide are struggling with a recent European Court of Justice decision which grants some rights to individuals to have search engines like Google block results that are inaccurate, irrelevant or out of date. Colloquially known as the "Right To Be Forgotten" (RTBF), the ruling has raised the ire of many Americans in particular, who typically frame it as yet another attack on free speech. Better defined as a right to be de-listed, RTBF makes search providers consider the impact on individuals of search algorithms, alongside their commercial interests. For there should be no doubt – search is very big business. Google and its competitors use search to get to know people, so they can sell better advertising.

Search results are categorically not the sort of text which contributes to "democratic deliberation". Free speech may be many things but surely not the mechanical by-products of advertising processes. To protect search results as such mocks the First Amendment.


Some of my other RTBF thoughts:

Posted in Privacy, Internet, Culture, Big Data

Good, better, BlackBerry

In the latest course of a 15-month security feast, BlackBerry has announced it is acquiring mobile device management (MDM) provider Good Technology. The deal is said to be definitive, for US$425 million in cash.

As BlackBerry boldly re-positions itself as a managed service play in the Internet of Things, adding an established MDM capability to its portfolio will bolster its claim -- which still surprises many -- to be handset neutral. But the Good buy is much more than that. It has to be seen in the context of John Chen's drive for cross-sector security and privacy infrastructure for the IoT.

As I reported from the recent BlackBerry Security Summit in New York, the company has knitted together a comprehensive IoT security fabric. Look at how they paint their security platform:

[Figure: BBY Security Platform In Action]

And see how Good will slip neatly into the Platform Services column. It's the latest in what is now a $575 million investment in non-organic security growth (following purchases of Secusmart, WatchDox, Movirtu and AtHoc).

According to BlackBerry,

    • Good will bring complementary capabilities and technologies to BlackBerry, including secure applications and containerization that protects end user privacy. With Good, BlackBerry will expand its ability to offer cross-platform EMM solutions that are critical in a world with varying deployment models such as bring-your-own-device (BYOD); corporate owned, personally enabled (COPE); as well as environments with multiple user interfaces and operating systems. Good has expertise in multi-OS management with 64 percent of activations from iOS devices, followed by a broad Android and Windows customer base.(1) This experience combined with BlackBerry’s strength in BlackBerry 10 and Android management – including Samsung KNOX-enabled devices – will provide customers with increased choice for securely deploying any leading operating system in their organization.


The strategic acquisition of Good Technology will also give the Identity-as-a-Service sector a big kick. IDaaS has become a crowded space with at least ten vendors (CA, Centrify, IBM, Microsoft, Okta, OneLogin, Ping, SailPoint, Salesforce, VMware) competing strongly around a pretty well settled set of features and functions. BlackBerry themselves launched an IDaaS a few months ago. At the Security Summit, I asked their COO Marty Beard what is going to distinguish their offering in such a tight market, and he said, simply, mobility. Presto!

But IDaaS is set to pivot. We all know that mobility is now the locus of security, and we've seen VMware parlay its AirWatch investment into a competitive new cloud identity service. This must be more than a catch-up play with so many entrenched IDaaS vendors.

Here's the thing. I foresee identity actually disappearing from the user experience, which more and more will just be about the apps. I discussed this development in a really fun "Identity Innovators" video interview recorded with Ping at the recent Cloud Identity Summit. For identity to become seamless with the mobile application UX, we need two things. Firstly, federation protocols so that different pieces of software can hand over attributes and authentication signals to one another, and these are all in place now. But secondly we also need fully automated mobile device management as a service, and that's where Good truly fits with the growing BlackBerry platform.

Now stay tuned for new research coming soon via Constellation on the Internet of Things, identity, privacy and software reliability.

See also The State of Identity Management in 2015.

Posted in Security, Identity, Federated Identity, Constellation Research, Big Data

BlackBerry Security Summit 2015

On July 23, BlackBerry hosted its second annual Security Summit, once again in New York City. As with last year’s event, this was a relatively intimate gathering of analysts and IT journalists, brought together for the lowdown on BlackBerry’s security and privacy vision.

By his own account, CEO John Chen has met plenty of scepticism over his diverse and, some say, chaotic product and services portfolio. And yet it’s beginning to make sense. There is a strong credible thread running through Chen’s initiatives. It all has to do with the Internet of Things.

Disclosure: I traveled to the BlackBerry Security Summit as a guest of BlackBerry, which covered my transport and accommodation.

The Growth Continues

In 2014, John Chen opened the show with the announcement he was buying the German voice encryption firm Secusmart. That acquisition appears to have gone well for all concerned; they say nobody has left the new organisation in the 12 months since. News of BlackBerry’s latest purchase - of crisis communications platform AtHoc - broke a few days before this year’s Summit, and it was only the most recent addition to the family. In the past 12 months, BlackBerry has been busy spending $150M on inorganic growth, picking up:

  • Secusmart - voice & message encryption (announced at the inaugural Security Summit 2014)
  • Movirtu - innovative virtual SIM solutions for holding multiple cell phone numbers on one chip
  • WatchDox - document security and rights management, for “data centric privacy”, and
  • AtHoc (announced but not yet complete; see more details below).

    Chen has also overseen an additional $100M expenditure in the same timeframe on organic security expansion (over and above baseline product development). Amongst other things BlackBerry has:

  • "rekindled" Certicom, a specialist cryptography outfit acquired back in 2009 for its unique IP in elliptic curve encryption, and spun out a new managed PKI service.
  • And it has created its own Enterprise Identity-as-a-Service (IDaaS) solution. From what I saw at the Summit, BlackBerry is playing catch-up in cloud based IDAM but they do have an edge in mobility over the specialist identity vendors in what is now a crowded identity services marketplace.

    The Growth Explained - Secure Mobile Communications

    Executives from different business units and different technology horizontals all organised their presentations around what is now a comprehensive security product and services matrix. It looks like this (before adding AtHoc):

    [Figure: BBY Security Platform In Action]

    BlackBerry is striving to lead in Secure Mobile Communications. In that context the highlights of the Security Summit for mine were as follows.

    The Internet of Things

    BlackBerry’s special play is in the Internet of Things. It’s the consistent theme that runs through all their security investments, because as COO Marty Beard says, IoT involves a lot more than machine-to-machine communications. It’s more about how to extract meaningful data from unbelievable numbers of devices, with security and privacy. That is, IoT for BlackBerry is really a security-as-a-service play.

    Chief Security Officer David Kleidermacher repeatedly stressed the looming challenge of “how to patch and upgrade devices at scale”.

      • MyPOV: Functional upgrades for smart devices will of course be part and parcel of IoT, but at the same time, we need to work much harder to significantly reduce the need for reactive security patches. I foresee an angry consumer revolt if things that never were computers start to behave and fail like computers. A radically higher standard of quality and reliability is required. Just look at the Jeep Uconnect debacle, where it appears Chrysler eventually thought better of foisting a patch on car owners and instead opted for a much more expensive vehicle recall. It was BlackBerry’s commitment to ultra high reliability software that really caught my attention at the 2014 Security Summit, and it convinces me they grasp what’s going to be required to make ubiquitous computing properly seamless.

    Refreshingly, COO Beard preferred to talk about economic value of the IoT, rather than the bazillions of devices we are all getting a little jaded about. He said the IoT would bring about $4 trillion of required technology within a decade, and that the global economic impact could be $11 trillion.

    BlackBerry’s real time operating system QNX is in 50 million cars today.


    AtHoc is a secure crisis communications service, with its roots in the first responder environment. It’s used by three million U.S. government workers today, and the company is now pushing into healthcare.

    Founder and CEO Guy Miasnik explained that emergency communications involves more than just outbound alerts to people dealing with disasters. Critical to crisis management is the secure inbound collection of info from remote users. AtHoc is also not just about data transmission (as important as that is) but it works also at the application layer, enabling sophisticated workflow management. This allows procedures for example to be defined for certain events, guiding sets of users and devices through expected responses, escalating issues if things don’t get done as expected.


    We heard more about BlackBerry’s collaboration with Oxford University on the Centre for High Assurance Computing Excellence, first announced in April at the RSA Conference. CHACE is concerned with a range of fundamental topics, including formal methods for verifying program correctness (an objective that resonates with BlackBerry’s secure operating system division QNX) and new security certification methodologies, with technical approaches based on the Common Criteria of ISO 15408 but with more agile administration to reduce that standard’s overhead and infamous rigidity.

    CSO Kleidermacher announced that CHACE will work with the Diabetes Technology Society on a new healthcare security standards initiative. The need for improved medical device security was brought home vividly by an enthralling live demonstration of hacking a hospital drug infusion pump. These vulnerabilities have been exposed before at hacker conferences but BlackBerry’s demo was especially clear and informative, and crafted for a non-technical executive audience.

      • MyPOV: The message needs to be broadcast loud and clear: there are life-critical machines in widespread use, built on commercial computing platforms, without any careful thought for security. It’s a shameful and intolerable situation.


    I was impressed by BlackBerry’s privacy line. It's broader and more sophisticated than that of most security companies, going way beyond the obvious matters of encryption and VPNs. In particular, the firm champions identity plurality. For instance, WorkLife by BlackBerry, powered by Movirtu technology, realizes multiple identities on a single phone. BlackBerry is promoting this capability in the health sector especially, where there is rarely a clean separation of work and life for professionals. Chen said he wants to “separate work and private life”.

    The health sector in general is one of the company’s two biggest business development priorities (the other being automotive). In addition to sophisticated telephony like virtual SIMs, they plan to extend AtHoc into healthcare messaging, and have tasked the CHACE think-tank with medical device security. These actions complement BlackBerry’s fine words about privacy.


    So BlackBerry’s acquisition plan has gelled. It now has perhaps the best secure real time OS for smart devices, a hardened device-independent Mobile Device Management backbone, new data-centric privacy and rights management technology, remote certificate management, and multi-layered emergency communications services that can be diffused into mission-critical rules-based e-health settings and, eventually, automated M2M messaging. It’s a powerful portfolio that makes strong sense in the Internet of Things.

    BlackBerry says IoT is 'much more than device-to-device'. It’s more important to be able to manage secure data being ejected from ubiquitous devices in enormous volumes, and to service those things – and their users – seamlessly. For BlackBerry, the Internet of Things is really all about the service.

    Posted in Software engineering, Security, Privacy, PKI, e-health, Constellation Research, Cloud, Big Data

  • Apply for a SuperNova Award - Recognising leaders in digital business

    Every year the Constellation SuperNova Awards recognise eight individuals for their leadership in digital business. Nominate yourself or someone you know by August 7, 2015.

    The SuperNova Awards honour leaders that demonstrate excellence in the application and adoption of new and emerging technologies. In its fifth year, the SuperNova Awards program will recognise eight individuals who demonstrate true leadership in digital business through their application of new and emerging technologies. Constellation Research is searching for leaders and corporate teams who have innovatively applied disruptive technologies to their businesses, to adapt to the rapidly-changing digital business environment. Special emphasis will be given to projects that seek to redefine how the enterprise uses technology on a large scale.

    We’re searching for the boldest, most transformative technology projects out there. Apply for a SuperNova Award by filling out the application here: http://www.constellationr.com/node/3137/apply

    SuperNova Award Categories

    • Consumerization of IT & The New C-Suite - The Enterprise embraces consumer tech, and perfects it.
    • Data to Decisions - Using data to make informed business decisions.
    • Digital Marketing Transformation - Put away that megaphone. Marketing in the digital age requires a new approach.
    • Future of Work - The processes and technologies addressing the rapidly shifting work paradigm.
    • Matrix Commerce - Commerce responds to changing realities from the supply chain to the storefront.
    • Next Generation Customer Experience - Customers in the digital age demand seamless service throughout all lifecycle stages and across all channels.
    • Safety and Privacy - Not 'security'. Safety and Privacy is the art and science of protecting information assets, including your most important assets: your people.
    • Technology Optimization & Innovation - Innovative methods to balance innovation and budget requirements.

    Five reasons to apply for a SuperNova Award

    • Exposure to the SuperNova Award judges, comprised of the top influencers in enterprise technology
    • Case study highlighting the achievements of the winners written by Constellation analysts
    • Complimentary admission to the SuperNova Award Gala Dinner and Constellation's Connected Enterprise for all finalists November 4-6, 2015 (NB: lodging and travel not included)
    • One year unlimited access to Constellation's research library
    • Winners featured on Constellation's blog and weekly newsletter.

    Learn more about the SuperNova Awards.

    What to expect when applying for a SuperNova Award. Tips and sample application.

    Posted in Constellation Research, Cloud, Big Data

    Digital Disruption - Melbourne

    Ray Wang tells us now that writing a book and launching a company are incredibly fulfilling things to do - but ideally, not at the same time. He thought it would take a year to write "Disrupting Digital Business", but since it overlapped with building Constellation Research, it took three! But at the same time, his book is all the richer for that experience.

    Ray is on a world-wide book tour (tweeting under the hash tag #cxotour). I was thrilled to participate in the Melbourne leg last week. We convened a dinner at Melbourne restaurant The Deck and were joined by a good cross section of Australian private and public sector businesses. There were current and recent executives from Energy Australia, Rio Tinto, the Victorian Government and Australia Post among others, plus the founders of several exciting local start-ups. And we were lucky to have special guests Brian Katz and Ben Robbins - two renowned mobility gurus.

    The format for all the launch events has one or two topical short speeches from Constellation analysts and Associates, and a fireside chat by Ray. In Melbourne, we were joined by two of Australia's deep digital economy experts, Gavin Heaton and Joanne Jacobs. Gavin got us going on the night, surveying the importance of innovation, and the double edged opportunities and threats of digital disruption.

    Then Ray spoke off-the-cuff about his book, summarising years of technology research and analysis, and a great many cases of business disruption, old and new. Ray has an encyclopedic grasp of tech-driven successes and failures going back decades, yet his presentations are always up-to-the-minute and full of practical can-do calls to action. He's hugely engaging, and having him on a small stage for a change lets him have a real conversation with the audience.

    Speaking with no notes and PowerPoint-free, Ray ranged across all sorts of disruptions in all sorts of sectors, including:

    • Sony's double cassette Walkman (which Ray argues playfully was their "last innovation")
    • Coca Cola going digital, and the speculative "ten cent sip"
    • the real lesson of the iPhone: geeks spend time arguing about whether Apple's technology is original or appropriated, when the point is their phone disrupted 20 or more other business models
    • the contrasting Boeing 787 Dreamliner and Airbus A380 mega jumbo - radically different ways to maximise the one thing that matters to airlines: dollars per passenger-mile, and
    • Uber, which observers don't always fully comprehend as a rich mix of mobility, cloud and Big Data.

    And I closed the scheduled part of the evening with a provocation on privacy. I asked the group to think about what it means to call any online business practice "creepy". Have community norms and standards really changed in the move online? What's worse: government surveillance for political ends, or private sector surveillance for profit? If we pay for free online services with our personal information, do regular consumers understand the bargain? And if cynics have been asking "Is Privacy Dead?" for over 100 years, doesn't it mean the question is purely rhetorical? Who amongst us truly wants privacy to be over?!

    The discussion quickly attained a life of its own - muscular, but civilized. And it provided ample proof that whatever you think about privacy, it is complicated and surprising, and definitely disruptive! (For people who want to dig further into the paradoxes of modern digital privacy, Ray and I recently recorded a nice long chat about it).

    Here are some of the Digital Disruption tour dates coming up:


    Posted in Social Media, Privacy, Internet, Constellation Research, Cloud, Big Data

    You can de-identify but you can't hide

    Acknowledgement: Daniel Barth-Jones kindly engaged with me after this blog was initially published, and pointed out several significant factual errors, for which I am grateful.

    In 2014, the New York City Taxi & Limousine Commission (TLC) released a large "anonymised" dataset containing 173 million taxi rides taken in 2013. Soon after, software developer Vijay Pandurangan managed to undo the hashed taxi registration numbers. Subsequently, privacy researcher Anthony Tockar went on to combine public photos of celebrities getting in or out of cabs, to recreate their trips. See Anna Johnston's analysis here.

    This re-identification demonstration has been used by some to bolster a general claim that anonymity online is increasingly impossible.

    On the other hand, medical research advocates like Columbia University epidemiologist Daniel Barth-Jones argue that the practice of de-identification can be robust and should not be dismissed as impractical on the basis of demonstrations such as this. The identifiability of celebrities in these sorts of datasets is a statistical anomaly, reasons Barth-Jones, and should not be used to frighten regular people out of participating in medical research on anonymised data. He wrote in a blog that:

      • "However, it would hopefully be clear that examining a miniscule proportion of cases from a population of 173 million rides couldn’t possibly form any meaningful basis of evidence for broad assertions about the risks that taxi-riders might face from such a data release (at least with the taxi medallion/license data removed as will now be the practice for FOIL request data)."

    As a health researcher, Barth-Jones is understandably worried that re-identification of small proportions of special cases is being used to exaggerate the risks to ordinary people. He says that the HIPAA de-identification protocols if properly applied leave no significant risk of re-id. But even if that's the case, HIPAA processes are not applied to data across the board. The TLC data was described as "de-identified" and the fact that any people at all (even stand-out celebrities) could be re-identified from data does create a broad basis for concern - "de-identified" is not what it seems. Barth-Jones stresses that in the TLC case, the de-identification was fatally flawed [technically: it's no use hashing data like registration numbers with limited value ranges because the hashed values can be reversed by brute force] but my point is this: who among us can tell the difference between poorly de-identified and "properly" de-identified?
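    For readers who want to see why hashing a small value range offers so little protection, here is a minimal Python sketch. It rests on assumptions of my own for illustration: an unsalted MD5 hash, and a hypothetical identifier format of digit-letter-digit-digit (for example "7B42"). It is not a reconstruction of the method actually used on the TLC data.

```python
import hashlib
import itertools
import string


def hash_id(identifier: str) -> str:
    """Unsalted MD5 of an identifier (assumed scheme, for illustration only)."""
    return hashlib.md5(identifier.encode("ascii")).hexdigest()


def build_lookup() -> dict:
    """Hash every possible identifier in the hypothetical digit-letter-digit-digit format."""
    table = {}
    for d1, letter, d2, d3 in itertools.product(
        string.digits, string.ascii_uppercase, string.digits, string.digits
    ):
        identifier = f"{d1}{letter}{d2}{d3}"
        table[hash_id(identifier)] = identifier
    return table


lookup = build_lookup()  # only 10 * 26 * 10 * 10 = 26,000 candidates to try

# A "de-identified" value as it might appear in a released dataset.
released_value = hash_id("7B42")

# Reversing it is a dictionary lookup, not a cryptographic attack.
recovered = lookup[released_value]
print(len(lookup), recovered)
```

    The weakness is not in the hash function but in the tiny input space: any fixed, unkeyed transformation of a few tens of thousands of possible values can be tabulated in well under a second.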

    And how long can "properly de-identified" last? What does it mean to say casually that only a "minuscule proportion" of data can be re-identified? In this case, the re-identification of celebrities was helped by the fact lots of photos of them are readily available on social media, yet there are so many photos in the public domain now, regular people are going to become easier to identify.

    But my purpose here is not to play what-if games, and I know Daniel advocates statistically rigorous measures of identifiability. We agree on that -- in fact, over the years, we have agreed on most things. The point I am trying to make in this blog post is that, just as nobody should exaggerate the risk of re-identification, nor should anyone play it down. Claims of de-identification are made almost daily for really crucial datasets, like compulsorily retained metadata, public health data, biometric templates, social media activity used for advertising, and web searches. Some of these claims are made with statistical rigor, using formal standards like the HIPAA protocols; but other times the claim is casual, made with no qualification, with the aim of comforting end users.

    "De-identified" is a helluva promise to make, with far-reaching ramifications. Daniel says de-identification researchers use the term with caution, knowing there are technical qualifications around the finite probability of individuals remaining identifiable. But my position is that the fine print doesn't translate to the general public who only hear that a database is "anonymous". So I am afraid the term "de-identified" is meaningless outside academia, and in casual use is misleading.

    Barth-Jones objects to the conclusion that "it's virtually impossible to anonymise large data sets" but in an absolute sense, that claim is surely true. If any proportion of people in a dataset may be identified, then that data set is plainly not "anonymous". Moreover, as statistics and mathematical techniques (like facial recognition) improve, and as more ancillary datasets (like social media photos) become accessible, the proportion of individuals who may be re-identified will keep going up.

    [Readers who wish to pursue these matters further should look at the recent Harvard Law School online symposium on "Re-identification Demonstrations", hosted by Michelle Meyer, in which Daniel Barth-Jones and I participated, among many others.]

    Both sides of this vexed debate need more nuance. Privacy advocates have no wish to quell medical research per se, nor do they call for absolute privacy guarantees, but we do seek full disclosure of the risks, so that the cost-benefit equation is understood by all. One of the obvious lessons in all this is that "anonymous" or "de-identified" on their own are poor descriptions. We need tools that meaningfully describe the probability of re-identification. If statisticians and medical researchers take "de-identified" to mean "there is an acceptably small probability, namely X percent, of identification" then let's have that fine print. Absent the detail, lay people can be forgiven for thinking re-identification isn't going to happen. Period.
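    As one illustration of what such fine print could look like, the Python sketch below summarises a toy dataset by how many records share each combination of quasi-identifiers. The records, the field choice (postcode, birth year, sex) and the function name are all hypothetical, and the measure assumes an adversary who already knows those attributes for a target known to be in the data. It is a simple equivalence-class count, not the HIPAA protocol or any other formal standard.

```python
from collections import Counter

# Hypothetical records reduced to quasi-identifiers: (postcode, birth year, sex).
records = [
    ("3000", 1975, "F"),
    ("3000", 1975, "F"),
    ("3000", 1975, "F"),
    ("3051", 1982, "M"),
    ("3051", 1982, "M"),
    ("3121", 1990, "F"),
]


def reidentification_summary(rows):
    """Summarise identifiability from the sizes of groups sharing the same attributes."""
    class_sizes = Counter(rows)
    total = len(rows)
    unique = sum(1 for size in class_sizes.values() if size == 1)
    return {
        # Smallest group size: the "k" in k-anonymity.
        "k": min(class_sizes.values()),
        # Share of records that are the only one with their attribute combination.
        "unique_fraction": unique / total,
        # Each record's match risk is 1 / (its group size); the mean over all
        # records works out to (number of groups) / (number of records).
        "average_risk": len(class_sizes) / total,
    }


summary = reidentification_summary(records)
print(summary)
```

    A published figure of this kind, stated alongside the assumptions behind it, would tell lay people far more than the bare word "de-identified".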

    And we need policy and regulatory mechanisms to curb inappropriate re-identification. Anonymity is a brittle, essentially temporary, and inadequate privacy tool.

    I argue that the act of re-identification ought to be treated as an act of Algorithmic Collection of PII, and regulated as just another type of collection, albeit an indirect one. If a statistical process results in a person's name being added to a hitherto anonymous record in a database, it is as if the data custodian went to a third party and asked them "do you know the name of the person this record is about?". The fact that the data custodian was clever enough to avoid having to ask anyone about the identity of people in the re-identified dataset does not alter the privacy responsibilities arising. If the effect of an action is to convert anonymous data into personally identifiable information (PII), then that action collects PII. And in most places around the world, any collection of PII automatically falls under privacy regulations.

    It looks like we will never guarantee anonymity, but the good news is that for privacy, we don't actually need to. Privacy is the protection you need when your affairs are not anonymous, for privacy is a regulated state where organisations that have knowledge about you are restrained in what they do with it. Equally, the ability to de-anonymise should be restricted in accordance with orthodox privacy regulations. If a party chooses to re-identify people in an ostensibly de-identified dataset, without a good reason and without consent, then that party may be in breach of data privacy laws, just as they would be if they collected the same PII by conventional means like questionnaires or surveillance.

    Surely we can all agree that re-identification demonstrations serve to shine a light on the comforting claims made by governments, for instance that certain citizen datasets can be anonymised. In Australia, the government is now implementing telecommunications metadata retention laws, in the interests of national security; the metadata, we are told, is de-identified and "secure". In the UK, the National Health Service plans to make de-identified patient data available to researchers. Whatever the merits of data mining in diverse fields like law enforcement and medical research, my point is that any government's claims of anonymisation must be treated critically (if not skeptically), and subjected to strenuous and ongoing privacy impact assessment.

    Privacy, like security, can never be perfect. Privacy advocates must avoid giving the impression that they seek unrealistic guarantees of anonymity. There must be more to privacy than identity obscuration (to use a more technically correct term than "de-identification"). Medical research should proceed on the basis of reasonable risks being taken in return for beneficial outcomes, with strong sanctions against abuses including unwarranted re-identification. And then there wouldn't need to be a moral panic over re-identification if and when it does occur, because anonymity, while highly desirable, is not essential for privacy in any case.

    Posted in Social Media, Privacy, Identity, e-health, Big Data

    Identity Management Moves from Who to What

    The State Of Identity Management in 2015

    Constellation Research recently launched the "State of Enterprise Technology" series of research reports. These assess the current enterprise innovations which Constellation considers most crucial to digital transformation, and provide snapshots of the future usage and evolution of these technologies.

    My second contribution to the state-of-the-state series is "Identity Management Moves from Who to What". Here's an excerpt from the report:


    In spite of all the fuss, personal identity is not usually important in routine business. Most transactions are authorized according to someone’s credentials, membership, role or other properties, rather than their personal details. Organizations actually deal with many people in a largely impersonal way. People don’t often care who someone really is before conducting business with them. So in digital Identity Management (IdM), one should care less about who a party is than what they are, with respect to attributes that matter in the context we’re in. This shift in focus is coming to dominate the identity landscape, for it simplifies a traditionally multi-disciplined problem set. Historically, the identity management community has made too much of identity!
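    As an illustration only, and not drawn from the report, the "what, not who" idea can be sketched as an authorization check that looks solely at the attributes a party presents and never at a name or other personal details. The class, function, policy and attribute names below are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Presentation:
    """What a party reveals for one transaction: attributes only, no name."""
    attributes: dict = field(default_factory=dict)


def authorize(presentation, required):
    """Grant the transaction only if every required attribute is presented
    with exactly the required value. Who the party is never enters into it."""
    return all(
        presentation.attributes.get(name) == value
        for name, value in required.items()
    )


# Hypothetical policy for a hypothetical e-prescribing transaction.
PRESCRIBE_POLICY = {
    "profession": "medical practitioner",
    "registration_current": True,
}

current = Presentation(
    {"profession": "medical practitioner", "registration_current": True}
)
lapsed = Presentation(
    {"profession": "medical practitioner", "registration_current": False}
)

print(authorize(current, PRESCRIBE_POLICY))  # True
print(authorize(lapsed, PRESCRIBE_POLICY))   # False
```

    The sketch leaves out the hard part, which is proving that the presented attributes are genuine; that is where the hardware-backed cryptography discussed in the trends below comes in.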

    Six Digital Identity Trends for 2015

    1. Mobile becomes the center of gravity for identity. The mobile device brings convergence for a decade of progress in IdM. For two-factor authentication, the cell phone is its own second factor, protected against unauthorized use by PIN or biometric. Hardly anyone ever goes anywhere without their mobile - service providers can increasingly count on that without disenfranchising many customers. Best of all, the mobile device itself joins authentication to the app, intimately and seamlessly, in the transaction context of the moment. And today’s phones have powerful embedded cryptographic processors and key stores for accurate mutual authentication, and mobile digital wallets, as Apple’s Tim Cook highlighted at the recent White House Cyber Security Summit.

    2. Hardware is the key – and holds the keys – to identity. Despite the lure of the cloud, hardware has re-emerged as pivotal in IdM. All really serious security and authentication takes place in secure dedicated hardware, such as SIM cards, ATMs, EMV cards, and the new Trusted Execution Environment mobile devices. Today’s leading authentication initiatives, like the FIDO Alliance, are intimately connected to standard cryptographic modules now embedded in most mobile devices. Hardware-based identity management has arrived just in the nick of time, on the eve of the Internet of Things.

    3. The “Attributes Push” will shift how we think about identity. In the words of Andrew Nash, CEO of Confyrm Inc. (and previously the identity leader at PayPal and Google), “Attributes are at least as interesting as identities, if not more so.” Attributes are to identity as genes are to organisms – they are really what matters about you when you’re trying to access a service. By fractionating identity into attributes and focusing on what we really need to reveal about users, we can enhance privacy while automating more and more of our everyday transactions.

    The Attributes Push may recast social logon. Until now, Facebook and Google have been widely tipped to become “Identity Providers”, but even these giants have found federated identity easier said than done. A dark horse in the identity stakes – LinkedIn – may take the lead with its superior holdings in verified business attributes.

    4. The identity agenda is narrowing. For 20 years, brands and organizations have obsessed about who someone is online. And even before we’ve solved the basics, we over-reached. We've seen entrepreneurs trying to monetize identity, and identity engineers trying to convince conservative institutions like banks that “Identity Provider” is a compelling new role in the digital ecosystem. Now at last, the IdM industry agenda is narrowing toward more achievable and more important goals - precise authentication instead of general identification.

    [Figure: Digital Identity Stack]

    5. A digital identity stack is emerging. The FIDO Alliance and others face a challenge in shifting and improving the words people use in this space. Words, of course, matter, as do visualizations. IdM has suffered for too long under loose and misleading metaphors. One of the most powerful abstractions in IT was the OSI networking stack. A comparable sort of stack may be emerging in IdM.

    6. Continuity will shape the identity experience. Continuity will make or break the user experience as the lines blur between real world and virtual, and between the Internet of Computers and the Internet of Things. But at the same time, we need to preserve clear boundaries between our digital personae, or else privacy catastrophes await. “Continuous” (also referred to as “Ambient”) Authentication is a hot new research area, striving to provide more useful and flexible signals about the instantaneous state of a user at any time. There is an explosion in devices now that can be tapped for Continuous Authentication signals, and by the same token, rich new apps in health, lifestyle and social domains, running on those very devices, that need seamless identity management.

    A snapshot of my report "Identity Management Moves from Who to What" is available for download at Constellation Research. It expands on the points above, and sets out recommendations for enterprises to adopt the latest identity management thinking.

    Posted in Trust, Social Networking, Security, Privacy, Identity, Federated Identity, Constellation Research, Biometrics, Big Data

    Correspondence in Nature magazine

    I had a letter to the editor published in Nature on big data and privacy.

    Data protection: Big data held to privacy laws, too

    Stephen Wilson
    Nature 519, 414 (26 March 2015) doi:10.1038/519414a
    Published online 25 March 2015

    Letter as published

    Privacy issues around data protection often inspire over-engineered responses from scientists and technologists. Yet constraints on the use of personal data mean that privacy is less about what is done with information than what is not done with it. Technology such as new algorithms may therefore be unnecessary (see S. Aftergood, Nature 517, 435–436; 2015).

    Technology-neutral data-protection laws afford rights to individuals with respect to all data about them, regardless of the data source. More than 100 nations now have such data-privacy laws, typically requiring organizations to collect personal data only for an express purpose and not to re-use those data for unrelated purposes.

    If businesses come to know your habits, your purchase intentions and even your state of health through big data, then they have the same privacy responsibilities as if they had gathered that information directly by questionnaire. This is what the public expects of big-data algorithms that are intended to supersede cumbersome and incomplete survey methods. Algorithmic wizardry is not a way to evade conventional privacy laws.

    Stephen Wilson
    Constellation Research, Sydney, Australia.

    Posted in Science, Privacy, Big Data

    The Google Advisory Council

    In May 2014, the European Court of Justice (ECJ) ruled that under European law, people have the right to have certain information about them delisted from search engine results. The ECJ ruling was called the "Right to be Forgotten", despite it having little to do with forgetting (c'est la vie). Shortened as RTBF, it is also referred to more clinically as the "Right to be Delisted" (or simply as "Google Spain" because that was one of the parties in the court action). Within just a few months, the RTBF has triggered conferences, public debates, and a TEDx talk.

    Google itself did two things very quickly in response to the RTBF ruling. First, it mobilised a major team to process delisting requests. This is no mean feat -- over 200,000 requests have been received to date; see Google's transparency report. However, it's not surprising they got going so quickly, as they already have well-practiced processes for take-down notices for copyright and unlawful material.

    Secondly, the company convened an Advisory Council of independent experts to formulate strategies for balancing the competing rights and interests bound up in RTBF. The Advisory Council delivered its report in January; it's available online here.

    I declare I'm a strong supporter of RTBF. I've written about it here and here, and participated in an IEEE online seminar. I was impressed by the intellectual and eclectic make-up of the Council, which includes a past European Justice Minister, law professors, and a philosopher. And I do appreciate that the issues are highly complex. So I had high expectations of the Council's report.

    Yet I found it quite barren.

    Recap - the basics of RTBF

    EU Justice Commissioner Martine Reicherts in a speech last August gave a clear explanation of the scope of the ECJ ruling, and acknowledged its nuances. Her speech should be required reading. Reicherts summed up the situation thus:

      • What did the Court actually say on the right to be forgotten? It said that individuals have the right to ask companies operating search engines to remove links with personal information about them – under certain conditions - when information is inaccurate, inadequate, irrelevant, outdated or excessive for the purposes of data processing. The Court explicitly ruled that the right to be forgotten is not absolute, but that it will always need to be balanced against other fundamental rights, such as the freedom of expression and the freedom of the media – which, by the way, are not absolute rights either.

    High tension

    Everyone concerned acknowledges there are tensions in the RTBF ruling. The Google Advisory Council Report mentions these tensions (in Section 3) but sadly spends no time critically exploring them. In truth, all privacy involves conflicting requirements, and to that extent, many features of RTBF have been seen before. At p5, the Report mentions that "the [RTBF] Ruling invokes a data subject’s right to object to, and require cessation of, the processing of data about himself or herself" (emphasis added); the reader may conclude, as I have, that the computing of search results by a search engine is just another form of data processing.

    One of the most important RTBF talking points is whether it's fair that Google is made to adjudicate delisting requests. I have some sympathies for Google here, and yet this is not an entirely novel situation in privacy. A standard feature of international principles-based privacy regimes is the right of individuals to have erroneous personal data corrected (this is, for example, OECD Privacy Principle No. 7 - Individual Participation, and Australian Privacy Principle No. 13 - Correction of Personal Information). And at the top of p5, the Council Report cites the right to have errors rectified. So it is standard practice that a data custodian must have means for processing access and correction requests. Privacy regimes expect there to be dispute resolution mechanisms too, operated by the company concerned. None of this is new. What seems to be new to some stakeholders is the idea that producing search engine results is just another type of data processing.

    A little rushed

    The Council explains in the Introduction to the Report that it had to work "on an accelerated timeline, given the urgency with which Google had to begin complying with the Ruling once handed down". I am afraid that the Report shows signs of being a little rushed.

    • There are several spelling errors.
    • The contributions from non-English speakers could have done with some editing.
    • Less trivially, many of the footnotes need editing; it's not always clear how a person's footnoted quote supports the text.
    • More importantly, the Advisory Council surely operated with Terms of Reference, yet there is no clear explanation of what those were. At the end of the introduction, we're told the group was "convened to advise on criteria that Google should use in striking a balance, such as what role the data subject plays in public life, or whether the information is outdated or no longer relevant. We also considered the best process and inputs to Google’s decision making, including input from the original publishers of information at issue, as potentially important aspects of the balancing exercise." I'm surprised there is not a more complete and definitive description of the mission.
    • It's not actually clear what sort of search we're all talking about. Not until p7 of the Report does the qualified phrase "name-based search" first appear. Are there other types of search for which the RTBF does not apply?
    • Above all, it's not clear that the Council has reached a proper conclusion. The Report makes a number of suggestions in passing, and there is a collection of "ideas" at the back for improving the adjudication process, but there is no cogent set of recommendations. That may be because the Council didn't actually reach consensus.

    And that's one of the most surprising things about the whole exercise. Of the eight independent Council members, five wrote "dissenting opinions". The work of an expert advisory committee is not normally framed as a court-like determination, from which members might dissent. And even if it were, to have the majority of members "dissent" casts doubt on the completeness or even the constitution of the process. Is there anything definite to be dissented from?

    Jimmy Wales, the Wikipedia founder and chair, was especially strident in his individual views at the back of the Report. He referred to "publishers whose works are being suppressed" (p27 of the Report), and railed against the report itself, calling its recommendation "deeply flawed due to the law itself being deeply flawed". Can he mean the entire Charter of Fundamental Rights of the EU and European Convention on Human Rights? Perhaps Wales is the sort of person who denies there are any nuances in privacy, because "suppressed" is an exaggeration if we accept that RTBF doesn't cause anything to be forgotten. In my view, it poisons the entire effort when unqualified insults are allowed to be hurled at the law. If Wales thinks so little of the foundation of both the ECJ ruling and the Advisory Council, he might have declined to take part.

    A little hollow

    Strangely, the Council's Report is altogether silent on the nature of search. It's such a huge part of their business that I have to think the strength of Google's objection to RTBF is energised by some threat it perceives to its famously secret algorithms.

    The Google business was founded on its superior PageRank search method, and the company has spent fantastic funds on R&D, allowing it to keep a competitive edge for a very long time. And the R&D continues. Curiously, just as everyone is debating RTBF, Google researchers published a paper about a new "knowledge based" approach to evaluating web pages. Surely if page ranking was less arbitrary and more transparent, a lot of the heat would come out of RTBF.

    Of all the interests to balance in RTBF, Google's business objectives are actually a legitimate part of the mix. Google provides marvelous free services in exchange for data about its users which it converts into revenue, predominantly through advertising. It's a value exchange, and it need not be bad for privacy. A key component of privacy is transparency: people have a right to know what personal information about them is collected, and why. The RTBF analysis seems a little hollow without frank discussion of what everyone gets out of running a search engine.

    Further reading

    Posted in Social Media, Privacy, Internet, Big Data