How the language and metaphors of identity mislead us

It’s a grave problem that we identity pros can be speaking at cross purposes. The language is problematic. We carry intuitions and ingrained mental models of everyday identity into the technical field, where arbitrary definitions of “identity” are imposed. But we mash up the two senses, the analog “this is who I am” and the digital “this is how I am known”, and we get confused. Unnoticed cognitive dissonance could be at work.

That’s why I try to reframe “digital identity” around data: to demystify things.

My personal-professional journey saw me shift focus first from identity to attributes. From 2004 I was advocating PKI certificates as electronic business cards rather than open-ended proof of identity.

At the Cloud Identity Summit 2013 I detected a distinct push towards attributes. I gave a talk on the “iconoclasts” panel about the ecological problems in federated identity. On the same panel, one of the really prominent identity thinkers, Andrew Nash, said that “attributes are more interesting than identity”.

Then maybe five years ago I saw that attributes are, in essence, a special class of data. They’re usually personal data, but machines and IoT devices have attributes too. So I became interested in reframing our digital identity work around data, mainly because it de-personalises this wretched topic.

I want to find a level at which all digital identity specialists can agree, whether they be government federation types, trust framework authors, self-sovereign identity libertarians, old-school PKI enthusiasts, or payments risk managers.

What if we reframe how we think about digital identity around data? What if we ask “What does the provider of a service need to know about the customer?” And the supplementary design question: “What detailed signals does the service provider need regarding the customer’s qualifications, associations, permissions, consent, jurisdiction and so on?”

A lawyer or contract specialist would think in terms of parties.

The so-called first party is the service provider or merchant; the second party is the customer or receiver of the service; and a [trusted] third party is often called upon to vouch for important details of the first or second. Usually the detail being vouched for is a specific credential held by the customer that entitles them to the service.

When it comes to digital identity, we’re usually talking about credentials. Here the second party (customer) is also referred to as the user, subject, or [credential] holder. The first party is the relying party or verifier, in that the first party verifies that the second party holds the credential.

If we reframe digital identity around data in this way, can’t we all agree that the parties involved need specific pieces of data to underpin a given transaction?
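To make the question “What does the service provider need to know?” concrete, here is a minimal sketch in Python. It is only an illustration: the names `REQUIRED_SIGNALS` and `missing_signals`, and the example values, are hypothetical and not drawn from any standard or product.

```python
# Hypothetical example: the signals one relying party (first party)
# needs before it will go ahead with one kind of transaction.
REQUIRED_SIGNALS = {"qualification", "jurisdiction", "consent"}


def missing_signals(presented, required=REQUIRED_SIGNALS):
    """Return the required signals that the customer has not presented."""
    return set(required) - set(presented)


# Illustrative data only: what a customer (second party) presents.
customer_data = {
    "qualification": "licensed pharmacist",
    "jurisdiction": "AU",
}

# The relying party asks only about the data it needs, not "who" the customer is.
still_needed = missing_signals(customer_data)
```

The point of the sketch is that nothing in it mentions “identity” at all; the transaction is framed entirely by the specific data the relying party needs.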

I think it’s useful to see digital signatures as just another instance of metadata, another signal that helps provide confidence in the authenticity of the transaction.

If there’s any merit in the metaphor of data as the new crude oil, then it serves to stress the perils. Both oil and excess data are toxic.

No metaphor is perfect. Some critics say that data is infinite while oil is not. But what I really like about the oil metaphor is that the black gold rush forced regulatory changes. It created a whole new jurisprudence around exploration rights and land rights, and new classes of intangible assets.

We do need new legal thinking around data, don’t we?

The supply chain comparisons are also instructive.

A petrochemical supply chain has assays all along the way, from extracting the crude, through cracking it, to distributing the fractions to end users such as pharmaceutical manufacturers. These assays assure the intermediary buyers, and ultimate end users, that the product is fit for purpose.

We need to start organising data supply chains in the same way, so we know more about the quality of the data at every step.
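As a rough sketch of what an assay at every step might look like for data, consider the Python below. It is an assumption-laden illustration rather than a description of any existing system: the classes `Assay` and `DataBatch`, the step names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Assay:
    """A quality check recorded at one step of a hypothetical data supply chain."""
    step: str        # e.g. "collection", "cleaning", "distribution"
    checked_by: str  # who performed the check
    passed: bool     # whether the data met that step's quality bar


@dataclass
class DataBatch:
    """A piece of data plus the assays it has accumulated along the way."""
    payload: dict
    assays: list = field(default_factory=list)

    def record(self, assay):
        """Attach the result of one quality check to the batch."""
        self.assays.append(assay)

    def fit_for_purpose(self, required_steps):
        """True only if every required step has a passing assay on record."""
        passed_steps = {a.step for a in self.assays if a.passed}
        return all(step in passed_steps for step in required_steps)


# Illustrative data only.
batch = DataBatch(payload={"postcode": "2000"})
batch.record(Assay("collection", "source registry", True))
batch.record(Assay("cleaning", "data broker", True))
```

A downstream buyer could then call `batch.fit_for_purpose(["collection", "cleaning"])` before relying on the data, much as a buyer of a petrochemical fraction relies on the assays performed upstream.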

Photo by T K.