We can over-stretch our metaphors.
Is a passport an "identifier"?
Is a drivers licence an identifier?
Is a credit card an identifier?
Is a professional membership card an identifier?
Is an employee badge an identifier?
Is a building access card an identifier?
Is a house key an identifier?
Is a car key an identifier?
Or putting the questions another way ...
Is a car key a "key"?
Is a house key a key?
Is a building access card a key?
Is an employee badge a key?
Is a professional membership card a key [to access an association]?
Is a credit card a key [to a payments system]?
Is a drivers licence a key [to access the privileges of road usage]?
Is a passport a key [to enter another country]?
From When does a key become an identifier?, 28 April 2005.
In my recent post "Identity is in the eye of the beholder" I tried to unpack the language of "identity provision". I argued that IdPs do not and cannot "provide identity" because identification is carried out by Relying Parties. It may seem like a sterile view in these days of "self narrated" and bring-your-own identities but I think the truth is that identity is actually determined by Relying Parties. The state of being "identified" may be assisted (to a very great extent) by information provided by others including so-called "Identity" Providers but ultimately it is the RP that identifies me.
I note that the long standing dramaturgical analysis of social identity of Erving Goffman actually says the same thing, albeit in a softer way. That school of thought holds that identity is an emergent property, formed by the way we think others see us. In a social setting there are in effect many Relying Parties, all impressing upon us their sense of who we are. We reach an equilibrium over time, after negotiating all the different interrelating roles in the play of life. And the equilibrium can be starkly disrupted in what I've called the "High School Reunion Effect". So we do not actually curate our own identities with complete self-determination, but rather we allow our identities to be moulded dynamically to fit the expectations of those around us.
Now, in the digital realm, things are so much simpler, you might even say more elegant in an engineering fashion. I'd like to think that the dramaturgical frame sets a precedent for having identities impressed upon us. We should not take offense at this, and we should temper what we mean by "user centric" identities: it need not mean freely expressing all of our identities.
For more precision, maybe it would be useful to get into the habit of specifying the context whenever we talk of a Digital Identity. So here's a bit of mathematical nomenclature, but don't worry, it's not strenuous!
Let's designate the identification performed by a Relying Party RP on a Subject S as IRP-S.
If the RP has drawn on information provided by an "Identity Provider" (running with the dominant language for now), then we can write the identification as a function of the IdP:
Identification = IRP-S(IdP)
But it is still true that the state of identification is reached by the RP and not the IdP.
We can generalise from this to imagine Relying Parties using more than one IdP in making the identification of a subject:
Identification = IRP-S(IdP1,IdP2)
And then we could take things one step further, to recognise that the distinction between "identity providers" and "attribute providers" is arbitrary. So the most general formulation would show identification being a function of a number of attributes verified by the RP either for itself or on its behalf by external attribute providers:
Identification = IRP-S(A1,A2,...,An)
(where the source of the attribute information could be indicated in various ways).
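As a rough sketch of this notation, the Python below models identification as a function that the RP itself evaluates over a set of attributes, each tagged with its source. Everything here is hypothetical and chosen only for illustration: the `Attribute` type, the `identify` function, the attribute names and the sample values. The point it tries to show is that providers only supply inputs; the state of being "identified" is reached inside the RP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    name: str    # hypothetical attribute name, e.g. "account"
    value: str
    source: str  # who verified it: the RP itself or an external provider


def identify(required, attributes):
    """Hypothetical stand-in for IRP-S(A1,A2,...,An).

    `required` is the set of attribute names this particular RP needs;
    `attributes` are the verified attributes it has gathered.
    Returns the RP's own view of the Subject, or None if the RP
    has not reached the state of "identified".
    """
    known = {a.name: a for a in attributes}
    if not required.issubset(known):
        return None
    return {name: known[name].value for name in sorted(required)}


# Illustrative inputs only: two external providers plus the RP's own records.
attrs = [
    Attribute("name", "A. Subject", source="IdP1"),
    Attribute("address", "1 Example St", source="IdP2"),
    Attribute("account", "00000000", source="RP"),
]

# Two RPs with different needs arrive at different identifications.
bank_view = identify({"name", "account"}, attrs)
border_view = identify({"name", "passport_number"}, attrs)  # None: attribute missing
```

The same pool of attributes yields a different result at each RP, which is the sense in which the identification belongs to the RP and not to any one provider.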
The work we're trying to start in Australia on a Claims Verification ecosystem reflects this kind of thinking -- it may be more powerful and more practicable to have RPs assemble their knowledge of Subjects from a variety of sources.
That is to say, identity is in the eye of the Relying Party.
The word "identity" seems increasingly problematic to me. It's full of contradictions. On the one hand, it's a popular view that online identity should be "user centric"; many commentators call for users to be given greater determination in how they are identified. People like the idea of "narrating" their own identities, and "bringing their own identity" to work. Yet it's not obvious how governments, banks, healthcare providers or employers for instance can grant people much meaningful say in how they are identified. These sorts of organisations impress their particular forms of identity upon us in order to formalise the relationship they have with us and manage our access to services.
The language of orthodox Federated Identity institutionalises the idea that identity is a good that is "provided" to us through a formal supply chain elaborated in architectures like the Open Identity Exchange (OIX). It might make sense in some settings for individuals to exercise a choice of IdPs, for example choosing between Facebook or Twitter to log on to a social website, but users still don't have much influence on how the IdPs operate, nor on the decision made by Relying Parties about which IdPs they elect to recognise. Think about the choice we have of credit cards: you might prefer to use Diners Club over MasterCard, but if you're shopping at a place that doesn't accept Diners, your "choice" is constrained. You cannot negotiate in real time to have the store accept your chosen instrument (instead you can choose to get yourself a MasterCard or you can choose to go to a different store).
I think the concept of "identity" is so fluid that we should probably stop using it. Or at least use it with much more self-conscious precision.
I'd like you to consider that "Identity Providers" do not in fact provide identity. They really can't provide identity at all, but only assertions -- that is, elements of identity -- that are put together by others who are impacted by the validity of those elements. The act of identification is a part of risk management. It means getting to know a Subject so as to make certain risks more manageable. And it's always done by a Relying Party.
An identity is the outcome of an identification process in which claims about a Subject are verified, to the satisfaction of the Relying Party. An "identity" is basically a handle by which the Subject is known. Recall that the Laws of Identity usefully defined a Digital Identity as a set of claims about the Digital Subject. And we all know that identity is highly context dependent; on its own, an identity like "Acct No. 12345678" means little or nothing without knowing the context as well.
This line of reasoning reminds me once again of the technology neutral, functional definition of "authentication" used by the APEC eSecurity Task Group over a decade ago: the means by which a receiver of an electronic transaction or message makes a decision to accept or reject that transaction or message. Wouldn't life be so much simpler if we stopped overloading some bits of authentication knowledge with the label "identity" and going to such lengths to differentiate other bits of knowledge as "attributes"? What we need online is better means for reliably conveying precise pieces of information about each other, relevant to the transaction at hand. That's all.
Carefully unpacking the language of identity management, we see that no Identity Provider ever actually "identifies" people. In reality, identification is always done by Relying Parties by pulling together what they need to know about a Subject for their own purposes. One IdP might say "This is Steve Wilson", another "This is Stephen Kevin Wilson", another "This is @Steve_Lockstep", another "This is Stephen Wilson, CEO of Lockstep" and yet another "This is Stephen Wilson at 100 Park Ave Jonestown Visa 4000 1234 5678 9012". None of these assertions are my "identity"! My "identity" is different at every RP, each to their need.
See also An Algebra of Identity.
I was recently editing my long "ecological identity" paper from last year and I was reminded how we tend to complicate identity when we speak about it. Here's a passage from that paper, which argues that the language we use is important. I contend we don't need to introduce new technical definitions around identity. Furthermore, I think if we returned to plain language, we might actually see federated identity differently.
Why for instance do orthodox identity engineers insist that authentication and authorization are fundamentally different things? The idea that roles are secondary to identity dates back to 1960s-era Logical Access Control. It's an arbitrary distinction not usually seen in the real world. Authorization is what really matters in most business, not identity. For instance, no pharmacist identifies a doctor before relying on a prescription; the prescription itself, written on an official watermarked form, confers the necessary authority. Context is vital; in fact it's often the case that "the medium is the authentication" (with apologies to Marshall McLuhan).
What follows is extracted from Identities Evolve: Why federated identity is easier said than done, AusCERT Security Conference, 2011.
The word "identity" means different things to different people. I believe it is futile quoting dictionary definitions in an attempt to disambiguate something like identity (in fact, when a perfectly ordinary word attracts technical definition, it's a sure sign that misunderstanding is around the corner). Instead of forcing precision on the term, we should actually respect its ambiguity! Consider that in life we are completely at ease with the complexity and nuance of identity. We understand the different flavours of personal identity, national identity and corporate identity. We talk intuitively about identifying with friends, family, communities, companies, sports teams, suburbs, cities, countries, flags, causes, fashions and styles. In multiculturalism, whether or not we agree on the politics of this challenging topic, we understand what is meant by the mingling or the co-existence or the adoption of cultural identities. The idea of "multiple personality syndrome" makes perfect sense to lay people (regardless of its clinical controversies). Identity is not absolute, but instead dilates in time and space. Most of us know how it feels at a high school re-union to no longer identify with the young person we once were, and to have to edit ourselves in real time to better fit how we and others remember us. And it seems clear that we switch identities unconsciously, when for example we change from work garb to casual clothes, or when we wear our team's colours to a football match.
Yet when it comes to digital identity -- that is, knowing and showing who we are online -- we have made an embarrassing mess of it. Information technologists have taken it upon themselves to redefine the meaning of the word, while philosophically they don't even agree if we should possess one identity or more.
We don't need to make identity any more complicated than this: Identity is how someone is known. In life, people move in different circles and they often adopt different guises or identities in each of them. We have circles of colleagues, customers, fellow users, members, professionals, friends and so on -- and we often have distinct identities in each of them. The old saw "don't mix business and pleasure" plainly shows we instinctively keep some of our circles apart. The more formal circles -- which happen to be the ones of greatest interest in e-business -- have procedures that govern how people join them. To be known in a circle of a bank's customers or a company's employees or a profession means that you've met some prescribed criteria, thus establishing a relationship with the circle. [To build on my idea of impressed vs expressed identities, let's acknowledge that the way you know yourself is one thing, but the way others know you is something quite different.]
Kim Cameron's seminal Laws of Identity define a Digital Identity as "a set of claims made by one digital subject about itself or another digital subject". This is a relativistic definition; it stresses that context helps to grant meaning to any given identity. Cameron also recognised that this angle "does not jive with some widely held beliefs", especially the common presumption that all identities must be unique in any one setting. He stressed instead that uniqueness in a context might have featured in many early systems but it was not necessarily so in all contexts.
So a Digital Identity is essentially a proxy for how one is known in a given circle; it represents someone in that context. Digital Identity is a powerful abstraction that hides a host of formalities, like the identification protocol, and the terms & conditions for operating in a particular circle, fine-tuned to the business environment. All modern identity thinking stresses that identity is context dependent; what this means in practical terms is that an identifier is usually meaningless outside its circle. For example, if we know that someone's "account number" is 56236741, it's probably meaningless without giving the bank/branch number as well (and that's assuming the number is a bank account and not something from a different context altogether).
I contend that plain everyday language illuminates some of the problems that have hampered progress in federated identity. One of these is "interoperability", a term that has self-evidently good connotations but which passes without a lot of examination. What can it mean for identities to "interoperate" across contexts? People obviously belong to many circles at once, but the simple fact of membership of any one circle (say the set of chartered accountants in Australia) doesn't necessarily say anything about membership of another. That is to say, relationships don't "interoperate", and neither in general do identities.
Yet another breathless report crossed my desk via Twitter this morning where the rise of mobile payments is predicted to lead to cards and cash "disappearing", in this case by 2020. Notably, this hyperventilation comes not from a tech vendor but instead from a "research" company.
So I started to wonder why the success of mobile payments (or any other disruptive technology) is so often framed in terms of winner-take-all. Surely we can imagine new payments modalities being super successful without having to see plastic cards and cash disappear? It might just be that press releases and Twitter tend towards polar language. More likely, and not unrelatedly, it's because a lot of people really think this way.
It's especially ironic given how the term "ecosystem" tops most Buzzword Bingo cards these days. If commentators were to actually think ecologically for a minute they'd realise that the extinction of a Family or Order at the hands of another is very rare indeed.
In Identity Management, Levels of Assurance are an attempt to standardise the riskiness of online transactions and the commensurate authentication strength needed to secure them. Quaternary LOAs (levels 1/2/3/4) have been instituted by governments in the USA, Australia and elsewhere, and they're a cornerstone of federated identity programs like NSTIC.
All LOA formulations are based on risk management methodologies like the international standard ISO 31000. The common approach is for organisations to assess both the impact and expected likelihood of all important adverse events (threats) using metrics customised to the local business conditions and objectives. The severity of security threats can be calculated in all sorts of ways. Some organisations can put a dollar price on the impact of a threat; others look at qualitative or political effects. And the capacity to cover the downside means that the same sort of incident might be thought "minor" at a big pharmaceutical company but "catastrophic" at a small Clinical Research Organisation.
I've blogged before that one problem with LOAs is that risk ratings aren't transferable. Risk management standards like ISO 31000 are formulated for internal customised use, so their results are not inherently meaningful between organisations.
Just look at another type of risk rating: the colours of ski runs.
All ski resorts around the world badge the degree of difficulty of their runs the same way: Green, Blue, Black and sometimes Double Black. But do these labels mean anything between resorts? Is a Blue run at Aspen the same as a Blue at Thredbo? No. These colours are not like currency, so skiers are free to boast "that Black isn't nearly as tough as the Black I did last week".
LOAs are just like this. They're local. They're based on risk metrics (and risk appetites) that are not uniform across organisations. They cannot interoperate.
As far as I am aware, there are as yet no examples of LOA 3 or 4 credentials issued by one IdP being relied on by external Service Providers. When there's a lot at stake, organisations prefer to use their own identities and risk management processes. And it's the same with skiing. A risk averse skier at the top of a Black run needs more than the pat assurance of others; they will make up their own mind about the risk of going down the hill.
Most people think that Apple's Siri is the coolest thing they've ever seen on a smart phone. It certainly is a milestone in practical human-machine interfaces, and will be widely copied. The combination of deep search plus natural language processing (NLP) plus voice recognition is dynamite.
And Siri also marks a new milestone in privacy invasion. I predict Siri will become the poster girl for PII piracy, the exemplar of the sly bargain for Personal Information at the heart of most social media.
If you haven't had the pleasure ... Siri is a wondrous new function built into the latest iPhone. It’s the state-of-the-art in artificial intelligence and NLP. You speak directly to Siri, ask her questions (yes, she's female) and tell her what to do with many of your other apps. Siri integrates with mail, text messaging, maps, search, weather, calendar and so on. Ask her "Will I need an umbrella in the morning?" and she'll look up the weather for you – after checking your calendar to see what city you’ll be in tomorrow. It's amazing.
Natural Language Processing is a fabulous idea of course. It radically improves the usability of smart phones, and even their safety with much improved hands-free operation.
An important technical detail is that NLP is very demanding on computing power. In fact it's beyond the capability of today's smart phones, even if each of them alone is more powerful than all of NASA's computers in 1969! So all Siri's hard work is actually done on Apple's mainframe computers scattered around the planet. That is, all your interactions with Siri are sent into the cloud.
Imagine Siri was a human personal assistant. Imagine she's looking after your diary, placing calls for you, booking meetings, planning your travel, taking dictation, sending emails and text messages for you, reminding you of your appointments, even your significant other’s birthday. She's getting to know you all the while, learning your habits, your preferences, your personal and work-a-day networks.
And she's free!
Now, wouldn't the offer of a free human PA strike you as too good to be true?
When you dictate your mails and text messages to Siri, you’re providing Apple with content that's usually off limits to carriers, phone companies and ISPs. Siri is an end run around telecommunications intercept laws.
Of course there are many, many examples of where free social media apps mask a commercial bargain. Face recognition is the classic case. It was first made available on photo sharing sites as a neat way to organise one’s albums, but then Facebook went further by inviting photo tags from users and then automatically identifying people in other photos on others' pages. What's happening behind the scenes is that Facebook is running its face recognition templates over the billions of photos in their databases (which were originally uploaded for personal use long before face recognition was deployed). Given their business model and their track record, we can be certain that Facebook is using face recognition to identify everyone they possibly can, and thence work out fresh associations between countless people and situations accidentally caught on camera. Combine this with image processing and visual search technology (like Google's "Goggles") and the big social media companies have an incredible new eye in the sky. They can work out what we're doing, when, where and with whom. Nobody will need to expressly "like" anything anymore when Facebook can see what cars we're driving, what brands we're wearing, where we spend our vacations, what we're eating, what makes us laugh. Apple, Facebook and others have understandably invested hundreds of millions of dollars in image recognition start-ups and intellectual property; with these tools they convert the hitherto anonymous image collections in Picasa, Flickr and the like into content-addressable PII gold mines. It's the next frontier of Big Data.
Now, there wouldn't be much wrong with these sorts of arrangements if the social media corporations were up-front about them. In their Privacy Policies they should detail what Personal Information they are extracting and collecting from all the voice and image data; they should explain why they collect this information, what they plan to do with it, how long they will retain it, and how they promise to limit secondary usage. They should explain that biometrics technology allows them to generate brand new PII out of members' snapshots and utterances. And they should acknowledge that by rendering data identifiable, they become accountable in many places under privacy and data protection laws for its safekeeping as PII. It's just not good enough to vaguely reserve their rights to "use personal information to help us develop, deliver, and improve our products, services, content, and advertising". They should treat their customers -- and all those innocents about whom they collect PII indirectly -- with proper respect, and stop pretending that 'service improvement' is what they're up to.
Siri along with face recognition herald a radical new type of privatised surveillance, and on a breathtaking scale. While Facebook stealthily "x-rays" photo albums without consent, Apple now has even more intimate access to our daily routines and personal habits. And they don’t even pay as much as a penny for our thoughts.
As cool as Siri may be, I myself will decline to use any natural language processing while the software runs in the cloud, and while the service providers refuse to restrain their use of my voice data. I'll wait for NLP to be done on my device with my data kept private.
And I'd happily pay cold hard cash for that kind of app, instead of having an infomopoly embed itself in my personal affairs.
Imagine this. Two grain growers are neighbours. One farms wheat and the other corn. Both have invested a lot of money in their silos and grain handling equipment, all of which continues to be a significant cost in their operations. The corn farmer is an innovator and comes up with a bright idea. She approaches her neighbour and gives him the following proposition: since their infrastructure is such an overhead, why not, in the name of efficiency, join up and share their silos?
What farmer wouldn't reject this idea out of hand? If a grain grower needs more capacity, in theory they could re-engineer the entire storage and handling system to use someone else's silo, strike up new support arrangements with their equipment providers, and seek insurance to cover new risks of mixing up their grains. But it would be simpler, cheaper and quicker to just build themselves another silo!
"Break down the silos" is one of the catch cries of modern management practice, and it's a special rallying call in the Federated Identity movement. Nobody denies that myriad passwords and security devices have become a huge headache, but attempts to solve what is really a technology and human factors challenge by sharing identities and identity provisioning all too often come unstuck.
It's not for nothing that we call identity domains "silos". Grain silos are architecturally elegant, strong and safe; they are critical infrastructure for farmers.
Of all the metaphors in identity management, "silo" is actually one of the good ones. And you have to wonder when and why it became a dirty word in our industry. Identity silos are actually carefully constructed risk management arrangements and in IDAM, risk is the name of the game. As such, silos are not to be trifled with!
Yet another headline crossed my desk this morning reinforcing the orthodoxy that privacy is willingly compromised in return for some reward. This time it's "Many online consumers would trade privacy for discounts" [Internet Retailer, Dec 9].
Try Googling "trade privacy for" (with the quote marks). I got 181,000 hits! Amongst other things, people are said to trade their "privacy" for convenience, security, safety, cheaper loans and free phones.
There's a category error here. And sloppy language betrays sloppy thinking.
Increasingly what consumers are doing is trading their Personal Information for a gain of some sort, but not necessarily their privacy. Information privacy is a state where third parties that hold information about you respect that information, undertaking to not know more about you than they need, and to not re-use Personal Information arbitrarily.
We can and should preserve privacy when trading off Personal Information for mercantile benefits. There is no inherent problem in bargaining your PI with others who happen to value it, but to preserve privacy in these transactions, what we need from retailers et al is greater visibility of what they intend to do with the PI they collect, and more sophisticated tools so consumers can fully comprehend what's going on. And we need greater precision in the way we talk about privacy. Let's be clear: there can and should be a fair trade in Personal Information, but not in privacy.
Use of the word “unique” in biometrics constitutes false advertising.
There is little scientific basis for any of the common biometrics to be inherently “unique”. The iris is a notable exception, where the process of embryonic development of eye tissue is known to create random features. But there's little or no literature to suggest that finger vein patterns or gait or voice traits should be highly distinctive and randomly distributed in ways that create what security people call "entropy". In fact, one of the gold standards in biometrics - fingerprinting - has been shown to be based more on centuries old folklore than science (see the work of Simon Cole).
But more to the point, even if a trait is highly distinctive, the vagaries of real world measurement apparatus and conditions mean that every system commits false positives. Body parts age, sensors get grimy, lighting conditions change, and biometric systems must tolerate such variability. In turn, they make odd mistakes. In fact, consumer biometrics are usually tuned to deliberately increase the False Accept Rate, so as not to inconvenience too many bona fide users with a high False Reject Rate.
So no biometric system ever behaves like the trait is unique! Every system has a finite False Accept Rate; FARs of one or two percent are not uncommon. If one in fifty people are confused with someone else on a measured trait, how is that trait “unique”?
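To put rough numbers on this, here is a minimal sketch. It assumes every comparison is independent and carries the same false accept probability, which is a simplification I am introducing for illustration; real systems are messier. The function name is hypothetical.

```python
def chance_of_false_match(far, comparisons):
    """Probability of at least one false accept across `comparisons`
    independent trials, each with false accept rate `far`."""
    return 1.0 - (1.0 - far) ** comparisons


# A FAR of 2% means one false accept per fifty impostor comparisons on average.
expected_per_fifty = 0.02 * 50

# Chance that a person is confused with at least one of fifty others:
# roughly 0.64 under the independence assumption above.
p50 = chance_of_false_match(0.02, 50)
```

Under these assumptions, a trait with a 2% FAR is more likely than not to produce a false match somewhere in a group of just fifty people.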
The word "unique" should be banned in connection with biometrics. It's not accurate, and it's used to create over-statements in biometric product marketing.
This is not mere nit picking. The biometrics industry gets away with terrible hyperbole, aided and abetted by loose talk, lulling users into a false sense of security. Managers and strategists need to understand at every turn that there is no such thing as perfect security. Biometric systems fail. But when lay people hear “unique” they think that’s the end of the story. They’re not encouraged to look at the error rate specs and think deeply about what they really mean.
Exaggeration in use of the word "unique" is just the tip of the iceberg. Biometrics vendors are full of it:
Economical with the truth
- Major palm vein vendors claim spectacular error rates of FAR = 0.00008% and FRR = 0.01%. Their brochures show these specs side-by-side, without any mention of the fact that these are best case figures, and utterly impossible to achieve together. I've been asking one vendor for their Detection Error Tradeoff (DET) curves for years but I'm told they're commercial in confidence. The vendor won't even cough up the Equal Error Rate. And why? Because the tradeoff is shocking.
- The International Biometric Group in 2006 published the only palm vein DET curve I have managed to find, in its Comparative Biometric Testing Round 6 ("CBT 6"). Curiously this report is hard to find nowadays, but I have a copy if anyone wants to see it. The DET curves give the lie to the best case vendor specs. For when the palm vein system is tuned to its highest security setting with a best possible False Match Rate of 0.0007%, the False Non Match Rate deteriorates to 12%, or worse than one in ten. [Ref: CBT6 Executive Summary, p6]
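The tradeoff that a DET curve captures can be sketched in a few lines. The match scores below are invented purely for illustration and are not measurements from CBT 6 or from any product; the `error_rates` function and the threshold values are likewise my own assumptions. The sketch only shows the general mechanism: raising the accept threshold pushes the false accept rate down and the false reject rate up, so the best case for each cannot be had at the same setting.

```python
# Hypothetical similarity scores (higher = closer match). Made-up data.
genuine = [0.91, 0.85, 0.78, 0.74, 0.66, 0.62, 0.55, 0.48]   # same person
impostor = [0.12, 0.20, 0.27, 0.33, 0.41, 0.50, 0.58, 0.69]  # different people


def error_rates(threshold):
    """Return (FAR, FRR) when any score >= threshold is accepted."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


# Sweep the threshold; each row is one operating point on a DET-style curve.
curve = [(t, *error_rates(t)) for t in (0.3, 0.5, 0.7, 0.9)]
for t, far, frr in curve:
    print(f"threshold={t:.1f}  FAR={far:.3f}  FRR={frr:.3f}")
```

Quoting the lowest FAR and the lowest FRR from different rows of such a sweep, side by side, is exactly the practice complained about above.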
Clueless about privacy
- You'd think that biometric vendors would brush up on privacy. One of them attempted recently to calm fears over facial recognition by asserting that "a face is not, nor has it ever been, considered private". This red herring betrays a terrible misunderstanding of information privacy. Once faces are rendered personally identifiable by OSNs and names attached to the terabytes of hitherto anonymous snapshots in their stores, then that data becomes automatically subject to privacy law in many jurisdictions. It's a scandal of the highest order: albums innocently uploaded into the cloud over many years, now suddenly rendered identifiable, and trawled for commercially valuable intelligence, without consent, and without any explanation in the operators' Privacy Policies.
Ignoring published research
- And you'd think that for such a research-intensive field (where many products are barely out of the lab) vendors would be up to date. Yet one of them has repeatedly claimed that biometric templates "are nearly impossible to be reverse engineered". This is either a lie or willful ignorance. The academic literature has many examples of facial and fingerprint templates being reverse engineered by successive approximation methods to create synthetic raw biometrics that generate matches with target templates. Tellingly, the untruth that templates can't be reversed has been recently repeated in connection with the possible theft of biometric data of all Israeli citizens. When passwords or keys or any normal security secrets are breached, then the first thing we do is cancel them and re-issue the users with new ones, along with abject apologies for the inconvenience. But with biometrics, that's not an option. So no wonder vendors are so keen to stretch the truth about template security; to admit there is a risk of identity theft, without the ability to reinstate the biometrics of affected victims, would be catastrophic.
With more critical thinking, managers and biometric buyers would start to ask the tough questions. Such as: How are you testing this system? How do real life error rates compare with bench testing (which the FBI warns is always optimistic)? And what is the disaster recovery plan in the event that a criminal steals a user’s biometric?