The goal is identification, not an identity

Or “The Grammar of Identification”

Governments talk about creating a “digital identity” as if it’s a one-off task. But really what they’re creating is a process for “identification”, a process that’s repeated again and again, in many different contexts.

Sure, identification sounds like it creates an identity. Indeed, Kim Cameron’s Laws of Identity got us to think that identification “provides” identity. But it doesn’t, and I’ll explain why: the conflation of identification with identity has contributed to the repeated failure of grand identity federations, and more.

There is of course a repeated structure to the process of identification, a sort of grammar. But every organisation performs identification differently, and for good reasons.

Enrolment and onboarding, two steps that require identification, tend to draw on a standard set of ID documents, but different customers will present different selections of IDs from that set.

Some government services might be restricted to citizens, for example, so proof of citizenship will be vital. But for other services you might need to be a resident in a particular area, in which case proof of residential address becomes more important.

Admitting a graduate to a professional association will entail close examination of their college qualifications. However, the association might take the person’s name and address for granted, on the basis that these details would have been checked previously by the colleges.

Consider the Know Your Customer (KYC) identification processes in banking. They draw on a near-universal superset of official ID documents, including passport, birth certificate, driver licence, and residency papers, from which the prospective customer can choose and present a certain minimum number.

But the precise ways in which IDs are checked vary between banks. Some might simply inspect the physical document and record the details. Others might look up the details in a database in real time. Online, they might match a selfie taken live by the user with an ID document, using an online verification bureau and real-time facial recognition.

Each identification follows its own script. The variations in scripts mean that identifications done by different organisations are rarely equivalent.

While we might say casually that a customer has been “identified” and given an identifier, these identifiers are usually meaningless outside the business concerned. Your bank won’t identify you solely by your energy provider’s account number.

This is why “reusable KYC” (aka “KYC Once”) is such an intractable problem. The experience of the Australian Payments Council’s trust framework is instructive: for five years the working group deliberated over digital identity use cases but it couldn’t reach agreement on KYC Once.

Counterintuitively, the identification process does not generally create a reusable identity.

There is no getting around this specificity of identification.

The government of Australia’s most populous state, New South Wales, is rolling out a NSW Digital ID, and the federal government has flagged a national digital ID as a response to the ongoing scourge of data theft.

The process for issuing the NSW Digital ID will be modelled on the proven enrolment process for Service NSW customers. Applicants will present one or two of their primary IDs, choosing from passport, driver licence, Medicare card, or migration visa.

As with the federal government’s myGovID, the more primary IDs presented and confirmed, the higher the resulting so-called “identity strength”.

As the NSW government puts it, government digital IDs will “take the hassle out of registering and applying for things you need”, and improve privacy by “putting you in control of your identity”.

It has even been implied that banks will be able to streamline their own KYC processes, even though the banks can’t agree to use each other’s existing identifications.

The promised reusability of new government digital IDs across private businesses doesn’t quite jibe with the conclusions of last month’s digital identity roundtable hosted by the large Australian bank NAB.

“Individuals rarely need to prove their identity. Rather, in most cases, they need to prove they possess an attribute for a particular purpose (i.e., I am over 18 years old and therefore legally allowed to purchase alcohol, or I am a licensed fisherperson and entitled to fish in these waters),” the cross-disciplinary NAB group noted.
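The over-18 example in that quote can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `AttributeClaim` type, its field names, and the `over_18_claim` function are hypothetical and do not come from the NAB roundtable or any real digital ID scheme. The point is that the relying party receives a yes/no answer tied to a purpose, not a date of birth and not an “identity”.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AttributeClaim:
    """Hypothetical container: one attribute, asserted for one purpose."""
    name: str
    value: bool
    purpose: str


def over_18_claim(date_of_birth: date, today: date) -> AttributeClaim:
    """Derive an over-18 claim without exposing the date of birth itself."""
    years = today.year - date_of_birth.year
    # Subtract a year if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return AttributeClaim(name="over_18", value=years >= 18,
                          purpose="alcohol purchase")


claim = over_18_claim(date(2000, 6, 15), today=date(2024, 6, 14))
print(claim.name, claim.value, claim.purpose)
```

Nothing in the claim says who the person is. Whether the seller accepts it still depends on who checked the date of birth and how, which is the question the rest of this article turns on.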

“The concepts of ‘digital identity’ and ‘digital identification’ are often conflated, causing confusion.”

To repeat, I argue that the outcome of identifications performed at different organisations should not be thought of as “identities” because in practice they are not useful across organisations; they are not equivalent. Calling identification outcomes “identities” is natural shorthand but it carries an expectation of reusability that is rarely possible.

Replacing established customer onboarding processes — especially finance-sector KYC — with a newly minted government ID is easier said than done.

And Lockstep argues that shouldn’t be the goal. We don’t have an “identity problem”. We have a data problem. To improve identification processes, we should improve the data that goes into them, not invent brand new types of data. We should make primary IDs non-replayable and verifiable at scale.

Over the long term, new digital IDs might become a regular part of the vernacular, alongside all the other government cards and credentials. Some businesses will find the new digital IDs useful as an additional quality signal. But no new digital ID can ever be a plug-in replacement for existing identification scripts.

Risk management does not interoperate that simply.

Now to put all that mathematically…

When an applicant is said to have been “identified”, this is a proxy for a quite nuanced statement: that a certain subset of facts or attributes has been checked and accepted by the organisation to the quality level required by its risk management processes.

The outcome of identifying subject S can therefore be broken down into a list of n attributes {A1, A2,…,An}, each of which has been checked and accepted.

But what does “checked” really mean? If one attribute is “Australian driver licence” then was the licence checked in person or online? If in person, who actually checked it, and how were they trained? Was the document ratified by the government’s document verification service, ID Match?

Banks care deeply about the training of the staff who check ID documents. They prefer to do KYC themselves, relying on their own personnel, internal methods, and controls rather than relying on someone else’s potentially inferior procedures.

Looking at individual cases of KYC, even within the same bank, different documents may have been checked. So to fully describe a specific customer identification, we need not only the list of items that were checked and accepted, but also metadata about how the checking was done.
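One way to picture “the list of items plus metadata” is as a small record structure. The following Python is a rough sketch only: the class names and the metadata fields (`channel`, `checked_by`, `verified_against_source`) are my own assumptions, loosely drawn from the questions asked above, and are not a schema used by any bank or government.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CheckedAttribute:
    """One item that was checked, plus metadata about how (hypothetical fields)."""
    attribute: str                  # e.g. "Australian driver licence"
    channel: str                    # "in person" or "online"
    checked_by: str                 # who performed the check
    verified_against_source: bool   # e.g. confirmed with the issuer's records


@dataclass(frozen=True)
class IdentificationOutcome:
    """Everything one organisation checked and accepted for one subject."""
    organisation: str
    checks: tuple[CheckedAttribute, ...]

    def attributes(self) -> set[str]:
        return {c.attribute for c in self.checks}


# Two customers of the same (made-up) bank, identified with different documents.
customer_1 = IdentificationOutcome("Example Bank", (
    CheckedAttribute("passport", "in person", "branch staff", True),
    CheckedAttribute("driver licence", "in person", "branch staff", False),
))
customer_2 = IdentificationOutcome("Example Bank", (
    CheckedAttribute("birth certificate", "online", "verification bureau", True),
    CheckedAttribute("driver licence", "online", "verification bureau", True),
))

print(sorted(customer_1.attributes()))
print(sorted(customer_2.attributes()))
```

Even inside one bank the two records differ, both in which documents appear and in how the shared document was checked.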

Let’s denote the specific method of checking by a superscript; checking by one organisation is indicated by an asterisk * and another by a hash #. We can therefore write:

  • Outcome of identification * = {A1*, A2*, A3*}
  • Outcome of identification # = {A3#, A4#, A5#}

Figure: A spider diagram showing how two different identifications do not overlap. (Source: Stephen Wilson)

It is impossible to say that these outcomes are equivalent or interoperable when the scripts are not the same. As the visualisation shows, only portions of the two identifications overlap.
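The comparison can be made concrete with plain sets. In this sketch each checked item is modelled as an (attribute, method) pair, which is my own encoding of the superscript notation rather than anything formal. Compared by attribute name alone, the two outcomes share A3. Compared as attribute-plus-method, they share nothing.

```python
# Each element is (attribute, method of checking), mirroring A1*, A3#, etc.
outcome_star = {("A1", "*"), ("A2", "*"), ("A3", "*")}
outcome_hash = {("A3", "#"), ("A4", "#"), ("A5", "#")}


def attribute_names(outcome):
    """Drop the method metadata and keep only the attribute names."""
    return {name for name, _method in outcome}


# Ignoring how the checks were done, one attribute is common to both.
shared_names = attribute_names(outcome_star) & attribute_names(outcome_hash)

# Taking the method into account, no check is common to both.
shared_checks = outcome_star & outcome_hash

print(shared_names)   # {'A3'}
print(shared_checks)  # set()
```

Whether A3* can stand in for A3# is a risk decision for the organisation that relies on it, which is why the metadata matters more than the attribute label.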

These details of the identification tell a story. If the outcome of an identification is rolled up into a single datum such as a new account number, then the real value of the data lies in its metadata.

Lockstep’s Data Verification Platform is a scheme to rationalise and organise data flows between data originators such as government and the risk owners who rely on accurate data to guide decisions. Join us in conversation.

Image: Digital composition by Stilgherrian. Includes elements by Pronesto and Brett via Wikimedia Commons, used under a Creative Commons Attribution-Share Alike 4.0 International license.