Yet another anonymity promise broken

In 2016, the Australian government released, for research purposes, an extract of public health insurance data, comprising the 30-year billing history of ten percent of the population, with medical providers and patients purportedly de-identified. Melbourne University researcher Dr Vanessa Teague and her colleagues famously found quite quickly that many of the providers were readily re-identified. The dataset was withdrawn, though not before many hundreds of copies were downloaded from the government website.

The government’s responses to the re-identification work were emphatic but sadly not positive. For one thing, legislation was written to criminalize the re-identification of ostensibly ‘anonymised’ data, which would frustrate work such as Teague’s regardless of its probative value to ongoing privacy engineering (the bill has yet to be passed). For another, the Department of Health insisted that no patient information had been compromised.

It seems less ironic than inevitable that the patients’ anonymity could not, in fact, be taken as read. In follow-up work released today, Teague, together with Dr Chris Culnane and Dr Ben Rubinstein, has published a paper showing how patients in that data release may indeed be re-identified.

The ability to re-identify patients from this sort of Open Data release is frankly catastrophic. The release of imperfectly de-identified healthcare data poses real dangers to patients with socially difficult conditions. This is surely well understood. What we now need to contend with is the question of whether Open Data practices like this deliver benefits that justify the privacy risks. That’s going to be a tricky debate, for the belief in data science is bordering on religious.

It beggars belief that any government official would promise "anonymity" any more. These promises just cannot be kept.

Re-identification has become a professional sport. Researchers are constantly finding artful new ways to triangulate individuals’ identities, drawing on diverse public information, ranging from genealogical databases to social media photos. But it seems that no matter how many times privacy advocates warn against these dangers, the Open Data juggernaut just rolls on. Concerns are often dismissed as academic, or as trivial compared with the supposed fruits of research conducted on census data, Medicare records and the like.
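
To make the mechanics concrete, here is a minimal sketch of a linkage attack of the kind these researchers use. Everything in it is hypothetical (the field names, the records, the sources); the point is that a handful of quasi-identifiers shared between a “de-identified” release and some public dataset is often enough to pin a record to one person.

```python
# Illustrative linkage attack: join a "de-identified" release to a public
# dataset on shared quasi-identifiers. All names and records are invented.

deidentified_release = [
    {"record_id": "A17", "birth_year": 1972, "postcode": "3052", "sex": "F",
     "diagnosis": "condition X"},
    {"record_id": "B03", "birth_year": 1985, "postcode": "2000", "sex": "M",
     "diagnosis": "condition Y"},
]

public_profiles = [
    {"name": "Jane Citizen", "birth_year": 1972, "postcode": "3052", "sex": "F"},
    {"name": "John Nobody", "birth_year": 1985, "postcode": "2601", "sex": "M"},
]

QUASI_IDENTIFIERS = ("birth_year", "postcode", "sex")

def link(release, profiles):
    """Yield (name, record) pairs where the quasi-identifiers match uniquely."""
    for record in release:
        key = tuple(record[q] for q in QUASI_IDENTIFIERS)
        matches = [p for p in profiles
                   if tuple(p[q] for q in QUASI_IDENTIFIERS) == key]
        if len(matches) == 1:  # a unique match re-identifies the record
            yield matches[0]["name"], record

for name, record in link(deidentified_release, public_profiles):
    print(f"{name} -> {record['diagnosis']}")  # Jane Citizen -> condition X
```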

In "Health Data in an Open World (PDF)" Teague et al warn (not for the first time) that "there is a misconception that [protecting the privacy of individuals in these datasets] is either a solved problem, or an easy problem to solve” (p2). They go on to stress “there is no good solution for publishing sensitive unit-record level data that protects privacy without substantially degrading the usefulness of the data" (p3).

What is the cost-benefit of the research done on these data releases? Statisticians and data scientists say their work informs government policy, but is that really true? Let’s face it: “evidence-based policy” has become quite a joke in Western democracies. There are umpteen really big public interest issues where science and evidence are not influencing policy settings at all. So I am afraid statisticians need to be more modest about the practical importance of their findings when they mount bland “balance” arguments that the benefits outweigh the risks to privacy.

If there is a balance to be struck, then the standard way to make the calculation is a Privacy Impact Assessment (PIA). A PIA can formally assess the risk of “de-identified” data being re-identified, and where that risk is real, it can recommend additional, layered privacy safeguards.

So where are all the PIAs?

Open Data is almost a religion. Where is the evidence that evidence-based policy making really works?

I was a scientist and I remain a whole-hearted supporter of publicly funded research. But science must be done with honest appraisal of the risks. It is high time for government officials to revisit their pat assertions of privacy and security. If the public loses confidence in the health system's privacy protection, then some people with socially problematic conditions might simply withdraw from treatment, or hold back vital details when they engage with healthcare providers. In turn, that would clearly damage the purported value of the data being collected and shared.

Big Data-driven research on massive public data sets just seems a little too easy to me. We need to discuss alternatives to massive public releases. One option is to confine research data extracts to secure virtual data rooms, and grant access only to specially authorised researchers. These people would be closely monitored and audited; they would comprise a small set of researchers; their access would be subject to legally enforceable terms & conditions.

There are compromises we all need to make in research on human beings. Let’s be scientific about science-based policy. Let’s rigorously test our faith in Open Data, and let’s please stop taking “de-identification” for granted. It’s really something of a magic spell.

Posted in Big Data, Government, Privacy

A hidden message from Ed Snowden to the Identerati

The KNOW Identity Conference in Washington DC last week opened with a keynote fireside chat between tech writer Manoush Zomorodi and Edward Snowden.

Once again, the exiled security analyst gave us a balanced and nuanced view of the state of security, privacy, surveillance, government policy, and power. I have always found him to be a rock-solid voice of reason. Like most security policy analysts, Snowden sees security and privacy as symbiotic: they can be eroded together, and they must be bolstered together. When asked (inevitably) about the "security-privacy balance", Snowden rejects the premise of the question, as many of us do, but he has an interesting take, arguing that governments tend to surveil rather than secure.

The interview was timely, for it gave Snowden the opportunity to comment on the “WannaCry” ransomware episode which affected so many e-health systems recently. He highlighted the tragedy that cyber weapons developed by governments keep leaking and falling into the hands of criminals. For decades, there has been an argument that cryptography is a type of “Dual-Use Technology”; like radio-isotopes, plastic explosives and supercomputers, it can be used in warfare, and thus the NSA and other security agencies try to include encryption in the “Wassenaar Arrangement” of export restrictions. The so-called “Crypto Wars” policy debate is usually seen as governments seeking to stop terrorists from encrypting their communications. But even if crypto export control worked, it would not address security agencies’ carelessness with their own cyber weapons.

But identity was the business of the conference. What did Snowden have to say about that?

  • Identifiers and identity are not the same thing. Identifiers are for computers, but “identity is about the self”, differentiating yourself from others.
  • Individuals need names, tokens and cryptographic keys to be able to express themselves online, to trade, and to exchange value.
  • “Vendors don’t need your true identity”; notwithstanding legislated KYC rules for some sectors, unique identification is rarely needed in routine business.
  • Historically, identity has not been a component of many commercial transactions.
  • The original Web of Trust, for establishing a level of confidence in people through mutual attestation, was “crude and could not scale”. But new “programmatic, frictionless, decentralised” techniques are possible.
  • He thought a “cloud of verifiers” in a social fabric could be more reliable, avoiding single points of failure in identity.
  • When pressed, Snowden said he was not actually thinking of blockchain (and that he saw blockchain as being specifically good for showing that “a certain event happened at a certain time”).

Now, what are identity professionals to make of Ed Snowden’s take on all this?

For anyone who has worked in identity for years, he said nothing new, and the identerati might be tempted to skip Snowden. On the other hand, in saying nothing new, perhaps Snowden has shown that the identity problem space is fully defined.

There is a vital meta-message here.

In my view, identity professionals still spend too much time in analysis. We’re still writing new glossaries and standards. We’re still modelling. We’re still working on new “trust frameworks”. And all for what? Let’s reflect on the very ordinariness of Snowden’s account of digital identity. He’s one of the sharpest minds in security and privacy, and yet he doesn’t find anything new to say about identity. That’s surely a sign of maturity, and that it’s time to move on. We know what the problem is: What facts do we need about each other in order to deal digitally, and how do we make those facts available?

Snowden seems to think it’s not a complicated question, and I would agree with him.

Posted in Security, Privacy, Identity, Government

Uniquely difficult

I was talking with government identity strategists earlier this week. We were circling (yet again) definitions of identity and attributes, and revisiting the reasonable idea that digital identities are "unique in a context". Regular readers will know I'm very interested in context. But in the same session we were discussing the public's understandable anxiety about national ID schemes. And I had a little epiphany that the word "unique" and the very idea of it may be unhelpful.

The association of uniqueness with the troubling idea of national identity is not just perception; there is a real tendency for identity and access management (IDAM) systems to over-identify, with an obvious privacy penalty. Security pros tend to feel instinctively that the more they know about people, the more secure we all will be.

Whenever we think “uniqueness” is important, I wonder whether other, more precise objectives really apply. Is “singularity” a better word for the property we’re looking for? Or the mouthful “non-ambiguity”? In different use cases, what we really need to know can vary:

  • Is the person (or entity) accessing a service the same as last time?
  • Is the person exercising a credential cleared to use it? Delegation of digital identity means one entity can act for several others, complicating “uniqueness”.
  • Does the Relying Party (RP) know the user well enough for the RP’s purposes? That doesn’t always mean uniquely.

I observe that when IDAM schemes come loaded with reference to uniqueness, it tends to bias the way RPs do their identification and risk management designs. There can arise an expectation that uniqueness is important, no matter what. Yet a great deal of fraud exploits weaknesses at transaction time, not enrollment time: no matter if you are identified uniquely, you can still get defrauded by an attacker who takes over or bypasses your authenticator. So uniqueness in and of itself doesn't always help.

If people do want to use the word "unique" then they should have the discipline to always qualify it, as mentioned, as "unique in a context".
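One way to make “unique in a context” concrete is to derive a different identifier for each context, so that a user is unambiguous within one relying party’s domain but not linkable across domains. The sketch below is only an illustration of the idea, not any particular scheme’s design; the master secret and the context labels are hypothetical.

```python
import hashlib
import hmac

def context_identifier(user_secret: bytes, context: str) -> str:
    """Derive a stable identifier scoped to a single context (e.g. one RP).

    The same user presents a different, unlinkable identifier in each
    context, yet is consistently recognisable within each one.
    """
    return hmac.new(user_secret, context.encode(), hashlib.sha256).hexdigest()

secret = b"example-user-master-secret"  # hypothetical, held by the user's agent
print(context_identifier(secret, "health-portal.example"))
print(context_identifier(secret, "tax-office.example"))
# Two different identifiers: unique within each context, unlinkable across them.
```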

Finally, it's worth remembering that the word has long been degraded by the biometrics industry, with its habit of calling almost any biological trait "unique". There's a sad lack of precision here. No biometric as measured is ever unique! Every mode, even the much vaunted iris, has a non-zero False Match Rate.
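
A quick back-of-the-envelope calculation shows why a non-zero False Match Rate rules out uniqueness at population scale. The figures below are illustrative assumptions, not vendor claims:

```python
# Expected falsely matching pairs in a population, given a per-comparison
# False Match Rate (FMR). Both figures are assumptions for illustration.

fmr = 1e-6                 # a generous one-to-one FMR for a strong modality
population = 25_000_000    # roughly the population of Australia

pairs = population * (population - 1) / 2   # all possible cross-comparisons
expected_false_matches = pairs * fmr

print(f"{expected_false_matches:,.0f} expected falsely matching pairs")
# ~312 trillion pairs x 1e-6 -> over 300 million expected false matches
```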

What's in a word? A lot! I'd like to see more rigorous use of the word "unique". At least let's be aware of what it means subliminally. With the word bandied around so much, engineers can tend to think uniqueness is always a designed objective, and laypeople can presume that every authentication scheme is out to fingerprint them. Literally.

Posted in Privacy, Identity, Government, Biometrics, Security

Does government have the innovation appetite?

Under new Prime Minister Malcolm Turnbull, innovation for once is the policy du jour in Australia. Innovation is associated with risk taking, but too often, government wants others to take the risk. It wants venture capitalists to take investment risk, and start-ups to take R&D risks. Is it time now for government to walk the talk?

State and federal agencies remain the most important buyers of IT in Australia. To stimulate domestic R&D and advance an innovation culture, governments should be taking some bold procurement risk, punting to some degree on new technology. Major projects like driver licence technology upgrades, the erstwhile Human Services Access Card, the national broadband roll-out, and national e-health systems, would be ideal environments in which to preferentially select next generation, home-grown products.

Obviously government must be prudent spending public money on new technology. Yet at the same time, there is a public interest argument for selecting newer solutions: in the rapidly changing online environment, citizens stand to benefit from the latest innovations, bred in response to current challenges.

What do entrepreneurs need most to help them innovate and prosper? It's metaphorical oxygen!

Innovators need:

  • access to prospective customers, so we may showcase disruptive technologies
  • procurement processes that admit, nay encourage, some technology risk taking
  • agile tender specifications that call for the unexpected in responses, prompting disruptive technologies
  • open-mindedness from big prime contractors, who too often are deaf to inventive SMEs
  • curiosity for innovation amongst business people
  • optimism amongst buyers that small local players might have something special to offer
  • and a reversal of the classic Australian taboo against sales.

Too often, innovative entrepreneurs are met with the admonition “you’re only trying to sell us something”. Well, yes, we are, but it’s because we believe we have something to meet real needs, and that customers actually need to buy something.

Posted in Innovation, Government