On Identification and identity

The third of four reflections after Kate Carruthers and I spoke on her Data Revolution podcast.

Now we turn to the vexed topic of identity.

Digital Identity is nothing like Identity

Kate and I discussed verifiable credentials and how they help with data minimisation. This brought us to digital identity practices.

New South Wales is one jurisdiction getting really close to good data minimisation solutions, based on the hugely popular digital driver’s licence and the next generation offering they call the “NSW Digital ID”.

But I wish they wouldn’t call it a “Digital ID” because it’s actually a lot less than identity.

It’s really collections of factoids.

Look at the archetypal use case. To enter a licensed club in Sydney as a casual visitor, I currently need to prove that I’m over 18 and that I live within a certain radius.

These facts are not my identity — they’re just contextually important attributes under registered club regulations. At the moment, clubs look at driver licences for these details (actually, the acute privacy problem is that clubs scan entire licences and log the details for compliance purposes).

But soon people will be able to prove these specific facts using the New South Wales MyServiceNSW digital wallet, without leaking any other superfluous details. This will be great, but calling it a “Digital ID” or a “digital identity” is confusing and unnecessary.
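To make the club example concrete, here is a minimal sketch of data minimisation in Python. It is an illustration only, not how the NSW wallet works: the record layout, the field names, the `club_entry_facts` helper and the 5 km radius are all assumptions made up for this example. The point is simply that the club receives two yes/no facts rather than the whole licence.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LicenceRecord:
    # Hypothetical full record, i.e. everything a club captures when it scans a licence.
    name: str
    licence_number: str
    date_of_birth: date
    address: str
    distance_from_club_km: float  # assumed to be worked out from the address elsewhere


def age_on(day: date, date_of_birth: date) -> int:
    """Whole years of age on the given day."""
    years = day.year - date_of_birth.year
    # Subtract one if this year's birthday has not happened yet.
    if (day.month, day.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


def club_entry_facts(record: LicenceRecord, today: date, radius_km: float) -> dict:
    """Return only the two facts the club needs, and nothing else from the record."""
    return {
        "over_18": age_on(today, record.date_of_birth) >= 18,
        "within_radius": record.distance_from_club_km <= radius_km,
    }


# Illustrative values only.
visitor = LicenceRecord(
    name="Alex Example",
    licence_number="00000000",
    date_of_birth=date(2000, 6, 15),
    address="1 Example Street, Sydney",
    distance_from_club_km=3.2,
)
facts = club_entry_facts(visitor, today=date(2024, 6, 14), radius_km=5.0)
print(facts)
```

Nothing in `facts` reveals the visitor’s name, address, date of birth or licence number, which is the whole idea.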

Which identity are we talking about?

Terminologically, the trouble is that there are two different sorts of identity.

I have my identity as Steve Wilson, the person, which is something I feel so very strongly. It’s analogue and biological and social. This identity is uniquely what makes me me. It’s sovereign (in a way).

But that identity is nothing like what we’re dealing with online!

In the digital realm, we’re dealing with many different sets of factoids that are relevant in different contexts.

On the podcast, Kate wasn’t as worried about the words we use. She highlighted there are “many wrong and stupid names out there” such as Web 2.0 and Web3, which we get stuck with.

But we continue to witness the confusion and suspicion that go with the concept of “digital identity” especially in government hands.

Australia’s federal Minister with carriage of digital identity, Katy Gallagher, told an industry forum last week that “Importantly, [the federal] Digital ID is not a card, nor a unique number, nor a new Form of ID … It’s just an easy way of verifying who you are online, against existing government-held identity documents without having to hand over any physical information.” Clearly our lawmakers are conscious of the baggage that goes with digital identity, especially new forms.

So, if the language is problematic, not to mention inaccurate, then don’t use it! 

The new generation verifiable credentials are really all about attributes, credentials, facts and figures. We should call them precisely “attributes”, “credentials”, “facts” and “figures”.

Think about the scenario where I’m at home and a stranger comes to my door to fix my leaking pipe. They’re supposed to be a plumber and I want to check. If I asked them to “show me your identity”, they’d probably be insulted. All they want to prove (and all I need to know) is that they’re a licensed plumber.

So I believe it’s sloppy to keep referring to these diverse collections of facts and figures as “identity” because it’s so not identity.

By design, digital credentials are less than identity.

We all want the transactional data about us to be less than our identity. We want the data to be nothing more and nothing less than the contextualised facts needed to allow parties to do something specific together.

Words matter. The lack of precision in the way we talk about “digital identity” leads to over-identification, over-disclosure, and inappropriate identity re-use (usually in the name of user convenience).

The words we use shape the way we think about the problem. If identity is really not the thing we need to know, then it’s actual madness to misname it.

Identity, as we’ve seen, crosses over between laypeople and deeply technical engineers; between professionals and citizens. It’s critical that we speak with precision.

Identification versus Identity

This brought Kate and me to look at the none-too-subtle difference between identification and identity.

The process or ceremony of identification is a nearly daily experience. Some other party always wants to check who I am before dealing with me. But you don’t need to dig deep to appreciate what’s really going on here. It is extremely rare that anyone needs to know who I am.

Rather, an “identification” almost always checks a handful of specific facts about me. Different facts matter in different use cases. So every business does identification differently (and that is the simplest explanation for the repeated failure of large-scale identity federations).

So here’s the truth about the current digital “identity crisis”. Identification performed online has only bare facts to go by.

Look at the aftermath of the Optus breach. It is completely ridiculous that I am vulnerable to fraudsters because some simple facts and figures about me have become available on the dark web. It is ridiculous that criminals can simply replay these numbers behind my back and “assume” my identity.

Where exactly is KYC broken?

To open a new online bank account, all I need to know is the details on somebody’s passport and driver’s licence.

As far as I know, no criminal tries to open a fake account over the counter anymore. Criminals once went to shady bazaars to buy counterfeit passports for $100 and driver’s licences for $50, then took them to a suburban bank branch to open a fake account.

Now they can buy the same data online for about a tenth of the price.

What’s really interesting is that the identification logic (the so-called Know Your Customer or KYC protocol) has not changed and is therefore still valid.

Whether the account is opened online or in-branch, the bank still just wants to know four or five facts to indicate it’s probably a known person opening the account. So, the same facts suffice online and offline — so long as those facts are correct.

The problem we have as a nation in responding to so-called identity theft is that it is actually data theft. We are failing to recognise this and deal with it appropriately because we are failing to name it correctly.

Cybercriminals don’t steal anyone’s identity. They just steal enough facts and figures about me that they can pretend to be me online.

I love that our government is responding to the data breach epidemic with real muscle. But it disturbs me that they seem to be moving towards a new National ID as the answer. Our collective blind spot is largely a result of being imprisoned by the language of “digital identity”.

So we don’t need any new ID. What we need to do is to make our existing facts and figures better so that they can’t be replayed in fraudulent identification. We don’t have any identity problem in the wake of the Optus breach; we’ve got a data problem.

If we were focused on the core problem and just solved the provenance of data used in identification, then we could also solve the provenance of data in multiple digital scenarios. We could use the same technology — networked digital credentials and data signing — to tackle digital authenticity across the whole of cyberspace.
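As a rough sketch of what “data signing” buys us, the Python below has an issuer stamp a fact so that a relying party can check where it came from. Everything here is an assumption for illustration: the `stamp` and `verify` helpers, the field names and the demo key are invented, and a keyed hash (HMAC) from the standard library stands in for the public-key signatures that real verifiable credentials use.

```python
import hashlib
import hmac
import json


def canonical(fact: dict) -> bytes:
    """Serialise a fact deterministically so the same fact always yields the same bytes."""
    return json.dumps(fact, sort_keys=True, separators=(",", ":")).encode("utf-8")


def stamp(fact: dict, issuer_key: bytes) -> dict:
    """Issuer side: attach a tag computed over the fact (stand-in for a digital signature)."""
    tag = hmac.new(issuer_key, canonical(fact), hashlib.sha256).hexdigest()
    return {"fact": fact, "tag": tag}


def verify(stamped: dict, issuer_key: bytes) -> bool:
    """Relying-party side: recompute the tag and compare in constant time."""
    expected = hmac.new(issuer_key, canonical(stamped["fact"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, stamped["tag"])


# Illustrative values only; a real issuer would hold a private signing key.
issuer_key = b"demo-key-not-for-production"
fact = {
    "attribute": "licence_number",
    "value": "00000000",
    "issuer": "Example Licensing Authority",
}

genuine = stamp(fact, issuer_key)

# A bare number lifted from a breach carries no valid tag from the issuer.
forged = {"fact": dict(fact, value="99999999"), "tag": "0" * 64}

# Altering a genuinely stamped fact invalidates its tag.
tampered = {"fact": dict(fact, value="11111111"), "tag": genuine["tag"]}

print(verify(genuine, issuer_key), verify(forged, issuer_key), verify(tampered, issuer_key))
```

This only demonstrates provenance: a bare or altered number fails the check. On its own it would not stop someone copying a stamped fact whole, so a real scheme would also bind the credential to a key held in the owner’s wallet and have the verifier issue a fresh challenge.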

Further reading: Data Supply Chain Quality

With AI, we are worried about the source of the data used to train algorithms. We’re worried about algorithmic transparency (when an adverse insurance rating is made about me, I’d like to know what the algorithm is).

We could “stamp” the products of Big Data and AI to name the algorithms used, the provenance of the training data, and the governance conditions.

So there’s a vital recurring pattern in all these contemporary digital problems, which boils down to data and metadata.

We depend on data. We all live on data. Digital businesses thrive on the stuff. We all need to have better data.

We know how to measure data quality and gauge the properties that matter. We could imprint the quality measures like a hallmark on every piece of data that matters in a straightforward way using proven verifiable credentials technology.

Lockstep’s Data Verification Platform is a scheme to rationalise and organise data flows between data originators such as government and the risk owners who rely on accurate data to guide decisions. Join us in conversation.

If you’d like to follow the development of the Data Verification Platform model, please subscribe for email updates.