I’ve been a critic of Blockchain. Frankly I’ve never seen such a massed rush of blood to the head for a new technology. Breathless books are being churned out about “trust infrastructure” and an “Internet of Value”. They say Blockchain will keep politicians and business people honest, and enable “billions of excluded people to enter the global economy”.
Most pundits overlook the simple fact that Blockchain only does one thing: it lets you move Bitcoin (a digital bearer token) from one account to another without an umpire. And it doesn’t even do that very well, for the Proof of Work algorithm is stupendously inefficient. Blockchain can't magically make merchants keep up their side of a bargain. Surprise! You can still get ripped off paying with Bitcoin. Blockchain simply doesn’t do what the futurists think it does. In their hot flushes, they tend to be caught in a limbo between the real possibilities of distributed consensus today and a future that no one is seeing clearly.
But Blockchain does solve what was thought to be an impossible problem, and in the right hands, that insight can convert to real innovation. I’m happy to see some safe pairs of hands now emerging in the Blockchain storm.
One example is an investment being made by Ping Identity in Swirlds and its new “hashgraph” distributed consensus platform. Hashgraph has been designed from the ground up to deliver many of Blockchain’s vital properties (consensus on the order of events, and redundancy) in a far more efficient and robust manner.
And what is Ping doing with this platform? Well, they’re not rushing out with vague promises to manufacture "trust"; instead they’re taking baby steps on real problems in identity management. For starters, they’re applying the new hashgraph platform to Distributed Session Management (DSM). This is the challenge of verifiably shutting down all of a user’s multiple log-on sessions around the web when they take a break, suffer a hack, or lose their job. It's one of the great headaches of enterprise identity administration and is exploited in a great many cyberattacks.
Ping’s identity architects have carefully set out the problem they’re trying to solve, why it’s hard, and how existing approaches don’t deliver the desired security properties for session management. They then evaluated a number of consensus approaches – not just Blockchain but also Paxos and Raft – and discussed their limitations. The Ping team then landed on hashgraph, which appears to meet the needs, and also looks like it can deliver a range of advanced features.
In my view, Ping Identity’s work is the very model of mature security design. It’s an example of the care and attention to detail that other innovators should follow.
Swirlds’ founder Dr Leemon Baird will be presenting hashgraph in more detail to the Cloud Identity Summit in New Orleans tomorrow (June 7th).
In "We are hopelessly hooked" (New York Review of Books, February 25), political historian Jacob Weisberg canvasses the social impact of digital technology. He describes mobile and social media as “self-depleting and antisocial” but I would prefer “different-social”, not merely for the vernacular but because the new media's sadder side is a lot like what's gone before.
In reviewing four recent contributions to the field - from Sherry Turkle, Joseph Reagle and Nir Eyal - Weisberg dwells in various ways on the twee dichotomy of experience online and off. For many of us, the distinction between digital and "IRL" (the sardonic abbreviation of "in real life") is becoming entirely arbitrary, which I like to show through an anecdote.
I was a mid-career technology researcher and management consultant when I joined Twitter in 2009. It quickly supplanted all my traditional newsfeeds and bulletin boards, by connecting me to individuals whom I came to trust to pass on what really mattered. More slowly, I curated my circles, built up a following, and came to enjoy the recognition that would ordinarily come from regular contact, if the travel were affordable from far-flung Australia. By 2013 I had made it as a member of the “identerati” – a loose international community of digital identity specialists. Thus, on my first trip to the US in many years, I scored a cherished invitation to a private pre-conference party with 50 or so of these leaders.
On the night, as I made my way through unfamiliar San Francisco streets, I had butterflies. I had met just one of my virtual colleagues face-to-face. How would I be received “IRL”? The answer turned out to be: effortlessly. Not one person asked the obvious question – Steve, tell us about yourself! – for everyone knew me already. And this surprising ease wasn’t just about skipping formalities; I found we had genuine intimacy from years of sharing and caring, all on Twitter.
Weisberg quotes Joseph Reagle in "Reading the Comments..." looking for “intimate serendipity” in successful online communities. It seems both authors are overlooking how serendipity catalyses all human relationships. It’s always something random that turns acquaintances into friends. And happy accidents may be more frequent online, not in spite of all the noise but because of it. We all live for chance happenings, and the much-derided Fear Of Missing Out is not specific to kids nor the Internet. Down the generations, FOMO has always kept teenagers up past their bed time; but it’s also why we grown-ups outstay our welcome at dinner parties and hang out at dreary corporate banquets.
Weisberg considers Twitter’s decay into anarchy and despair to be inevitable, and he may be right, but is it simply for want of supervision? We know sudden social decay all too well; just think of the terribly real-life “Lord of the Flies”.
Sound moral bearings are set by good parents, good teachers, and – if we’re lucky – good peers. At this point in history, parents and teachers are famously less adept than their charges in the new social medium, but this will change. Digital decency will be better impressed on kids when all their important role models are online.
It takes a village to raise a child. The main problem today is that virtual villages are still at version 1.0.
We all know that digital transformation is imminent, but getting there is far from easy. The digital journey is fraught with challenges, not least of which is customer access. "Online" is not what it used to be; the online world by many measures is bigger than the “real world” and it’s certainly not just a special corner of a network we occasionally log into. Many customers spend a substantial part of their lives online. The very word "online" is losing its meaning, with offline becoming a very unusual state. So enterprises are finding they need to totally rethink customer identity, bringing together the perspectives of the CTO for risk management and engineering, and the CMO for the voice of the customer.
Consider this. The customer experience of online identity was set in concrete in the 1960s when information technology meant mainframes and computers only sat in “laboratories”. That was when we had the first network logon. The username and password was designed by sys admins for sys admins.
Passwords were never meant to be easy. Ease of use was irrelevant to system administrators; everything about their job was hard, and if they had to manage dozens of account identifiers, so be it. The security of a password depends on it being hard to remember and therefore, in a sense, hard to use. The efficacy of a password is in fact inversely proportional to its ease of use! Isn't that a unique property in all consumer technology?
The tragedy is that the same access paradigm has been inherited from the mainframe era and passed right on through the Age of the PCs in the 1980s, to the Internet in the 2000s. Before we knew it, we all turned into heavy duty “computer” users. The Personal Computer was always regarded as a miniaturized mainframe, with a graphical user interface layered over one or more arcane operating systems, from which consumers never really escaped.
But now all devices are computers. (Famously, a phone today is more powerful than all of NASA’s 1969 moon landing IT put together.) And the user experience of “computing” has finally changed, and radically so. Few people ever touch an operating system anymore. The whole UX is at the app level. What people know now is all tiles and icons, spoken commands, and gestures. Swipe, drag, tap, flick.
Identity management is probably the last facet of IT to be dragged out of the mainframe era. It's all thanks to mobility. We don’t "log on" anymore, we unlock our device. Occasionally we might be asked to confirm who we are before we do something risky, like look up a health record or make a larger payment. The engineer might call it “trust elevation” or some such but the user feels it’s like a reassuring double check.
We might even stop talking about “Two Factor Authentication” now the mobile is so ubiquitous. The phone is your second factor now, a constant part of your life, hardly ever out of sight, and instantly noticed if lost or stolen. And under the covers, mobile devices can make use of many other signals – history, location, activity, behaviour – to effect continuous or ambient authentication, and look out for misuse.
So the user experience of identity per se is melting away. We simply click on an app within an activated device and things happen. The authentication UX has been dictated for decades by technologists, but now, for the first time, the CTO and the CMO are on the same page when it comes to customer identity.
To explore these crucial trends, Ping Identity is hosting a webinar on June 2, Consumerization Killed the Identity Paradigm. To learn more about customer identity and how to implement it successfully in your enterprise, please join me and Ping Identity’s CTO Patrick Harding and CMO Brian Bell.
Almost everything you read about the blockchain is wrong. No new technology since the Internet itself has excited so many pundits, but blockchain just doesn’t do what most people seem to think it does. We’re all used to hype, and we can forgive genuine enthusiasm for shiny new technologies, but many of the claims being made for blockchain are just beyond the pale. It's not going to stamp out corruption in Africa; it's not going to crowdsource policing of the financial system; it's not going to give firefighters unlimited communication channels. So just what is it about blockchain?
The blockchain only does one thing (and it doesn’t even do that very well). It provides a way to verify the order in which entries are made to a ledger, without any centralized authority. In so doing, blockchain solves what security experts thought was an unsolvable problem – preventing the double spend of electronic cash without a central monetary authority. It’s an extraordinary solution, and it comes at an extraordinary price. A large proportion of the entire world’s computing resource has been put to work contributing to the consensus algorithm that continuously watches the state of the ledger. And it has to be so, in order to ward off brute force criminal attack.
How did an extravagant and very technical solution to a very specific problem capture the imagination of so many? Perhaps it’s been so long since the early noughties’ tech wreck that we’ve lost our herd immunity to the viral idea that technology can beget trust. Perhaps, as Arthur C. Clarke said, any sufficiently advanced technology looks like magic. Perhaps because the crypto currency Bitcoin really does have characteristics that could disrupt banking (and all the world hates the banks) blockchain by extension is taken to be universally disruptive. Or perhaps blockchain has simply (but simplistically) legitimized the utopian dream of decentralized computing.
Blockchain is antiauthoritarian and ruthlessly “trust-free”. The blockchain algorithm is rooted in politics; it was expressly designed to work without needing to trust any entity or coalition. Anyone at all can join the blockchain community and be part of the revolution.
The point of the blockchain is to track every single Bitcoin movement, detecting and rejecting double spends. Yet the blockchain APIs also allow other auxiliary data to be written into Bitcoin transactions, and thus tracked. So the suggested applications for blockchain extend far beyond payments, to the management of almost any asset imaginable, from land titles and intellectual property, to precious stones and medical records.
From a design perspective, the most troubling aspect of most non-payments proposals for the blockchain is the failure to explain why it’s better than a regular database. Blockchain does offer enormous redundancy and tamper resistance, thanks to a copy of the ledger staying up-to-date on thousands of computers all around the world, but why is that so much better than a digitally signed database with a good backup?
Remember what blockchain was specifically designed to do: resolve the order of entries in the ledger, in a peer-to-peer mode, without an administrator. When it comes to all-round security, blockchain falls short. It’s neither necessary nor sufficient for any enterprise security application I’ve yet seen. For instance, there is no native encryption for confidentiality; neither is there any access control for reading transactions, or writing new ones. The security qualities of confidentiality, authentication and, above all, authorization, all need to be layered on top of the basic architecture. ‘So what’ you might think; aren’t all security systems layered? Well yes, but the important missing layers undo some of the core assumptions blockchain is founded on, and that’s bad for the security architecture. In particular, as mentioned, blockchain needs massive scale, but access control, “permissioned” chains, and the hybrid private chains and side chains (put forward to meld the freedom of blockchain to the structures of business) all compromise the system’s integrity and fraud resistance.
And then there’s the slippery notion of trust. By “trust”, cryptographers mean so-called “out of band” or manual mechanisms, over and above the pure math and software, that deliver a security promise. Blockchain needs none of that ... so long as you confine yourself to Bitcoin. Many carefree commentators like to say blockchain and Bitcoin are separable, yet the connection runs deeper than they know. Bitcoins are the only things that are actually “on” the blockchain. When people refer to putting land titles or diamonds “on the blockchain”, they’re using a shorthand that belies blockchain’s limitations. To represent any physical thing in the ledger requires a schema – a formal agreement as to which symbols in the data structure correspond to what property in the real world – and a binding of the owner of that property to the special private key (known in the trade as a Bitcoin wallet) used to sign each ledger entry. Who does that binding? How exactly do diamond traders, land dealers, doctors and lawyers get their blockchain keys in the first place, and how does the world know who’s who? These questions bring us back to the sorts of hierarchical authorities that blockchain was supposed to get rid of.
There is no utopia in blockchain. The truth is that when we fold real world management, permissions, authorities and trust, back on top of the blockchain, we undo the decentralization at the heart of the design. If we can’t get away from administrators then the idealistic peer-to-peer consensus algorithm of blockchain is academic, and simply too much to bear.
I’ve been studying blockchain for two years now. My latest in-depth report was recently published by Constellation Research.
I was talking with government identity strategists earlier this week. We were circling (yet again) definitions of identity and attributes, and revisiting the reasonable idea that digital identities are "unique in a context". Regular readers will know I'm very interested in context. But in the same session we were discussing the public's understandable anxiety about national ID schemes. And I had a little epiphany that the word "unique" and the very idea of it may be unhelpful. I wonder if we could avoid using the word "uniqueness" wherever we can.
The link from uniqueness to troublesome national identity is not just perception; there is a real tendency for identity and access management (IDAM) systems to over-identify, with an obvious privacy penalty. Security professionals feel instinctively that the more they know about people, the more secure we all will be.
Whenever we think uniqueness is important, I wonder if there are really other more precise objectives that apply? Is "singularity" a better word for the property we're looking for? Or the mouthful "non-ambiguity"? In different use cases, what we really need to know can vary:
- Is the person (or entity) accessing service the same as last time?
- Is the person exercising a credential clear to use it? (Delegation of digital identity actually makes "uniqueness" moot.)
- Does the Relying Party (RP) know the user "well enough" for the RP's purposes? That doesn't always mean uniquely.
I observe that when IDAM schemes come loaded with reference to uniqueness, it tends to bias the way RPs do their identification and risk management designs. There is an expectation that uniqueness is important no matter what. Yet it is emerging that much fraud (most fraud?) exploits weaknesses at transaction time, not enrollment time: even if you are identified uniquely, you can still get defrauded by an attacker who takes over or bypasses your authenticator. So uniqueness in and of itself doesn't always help.
If people do want to use the word "unique" then they should have the discipline to always qualify it, as mentioned, as "unique in a context". But I have to say that "unique in a context" is not "unique".
Finally it's worth remembering that the word has long been degraded by the biometrics industry with their habit of calling most any biological trait "unique". There's a sad lack of precision here. No biometric as measured is ever unique! Every mode, even iris, has a non-zero False Match Rate.
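To put a rough number on that, here is a small sketch of how a non-zero False Match Rate compounds as a biometric is compared against more and more enrolled records. The function name and the figures are my own illustrative assumptions, not measurements of any real system, and the arithmetic assumes each comparison is independent.

```python
def prob_any_false_match(fmr: float, comparisons: int) -> float:
    """Chance of at least one false match across independent comparisons.

    Assumes every comparison fails independently with probability `fmr`.
    """
    return 1.0 - (1.0 - fmr) ** comparisons


# Hypothetical figures, for illustration only.
fmr = 1e-6
for n in (1, 1_000, 1_000_000):
    print(n, prob_any_false_match(fmr, n))
```

With a one-in-a-million False Match Rate and a million comparisons, the chance of at least one false match works out to roughly 63 percent, which is hardly "unique".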
What's in a word? A lot! I'd like to see more rigorous use of the word "unique". At least let's be aware of what it means subliminally to the people we're talking with - be they technical or otherwise. With the word bandied around so much, engineers can tend to think uniqueness is always a designed objective, and laypeople can presume that every authentication scheme is out to fingerprint them. Literally.
You’ll have to forgive the deliberate inaccuracy in the title, but I just couldn’t resist the wordplay. The topic of this blog is the use of the blockchain for identity, which is not exactly Bitcoin. By my facetiousness, and by my analysis, you’ll see I don’t yet take the identity use case seriously.
In 2009, Bitcoin was launched. A paper was self-published by a person or persons going by the nom de plume Satoshi Nakamoto, called “Bitcoin: A Peer-to-Peer Electronic Cash System” and soon after an open source software base appeared at http://www.bitcoin.org. Bitcoin offered a novel solution to the core problem in electronic cash: how to prevent double spending without reverting to a central authority. Nakamoto’s conception is strongly anti-authoritarian, almost anarchic, with an absolute rejection of fiat currency, reserve banks and other central institutions. Bitcoin and its kin aim to change the world, and by loosening the monopolies in traditional finance, they may well do that.
Separate to that, the core cryptographic technology in Bitcoin is novel, and so surprising, it's almost magical. Add to that spell the promise of security and anonymity, and we have a powerful mix that some people see excitedly as stretching far beyond mere money, and into identity. So is that a reasonable step?
Bitcoin’s secret sauce
A decentralised digital currency scheme requires some sort of community-wide agreement on when someone spends a virtual coin, so she cannot spend it again. Bitcoin’s trick is to register every single transaction on one public tamper-proof ledger called the blockchain, which is refreshed in such a way that the whole community in effect votes on the order in which transactions are added or, equivalently, the time when each coin is spent.
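The double-spend rule itself is simple once the order of transactions is settled. The sketch below is a toy of my own devising (the `Ledger` class and the coin and owner names are hypothetical), and it cheats in the one way that matters: a single program decides the order. Bitcoin's contribution is reaching agreement on that order with no such umpire.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """Toy ledger: each coin ID maps to its current owner."""
    owners: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def issue(self, coin, owner):
        self.owners[coin] = owner

    def spend(self, coin, sender, receiver):
        # A spend is valid only if, at this point in the agreed order,
        # the sender still owns the coin.
        if self.owners.get(coin) != sender:
            return False
        self.owners[coin] = receiver
        self.history.append((coin, sender, receiver))
        return True


ledger = Ledger()
ledger.issue("coin-1", "alice")
first = ledger.spend("coin-1", "alice", "bob")     # accepted
second = ledger.spend("coin-1", "alice", "carol")  # rejected: already spent
print(first, second)
```

Whichever of Alice's two spends is ordered first wins; the other is rejected. Everything difficult about the blockchain lies in agreeing which came first.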
The blockchain ledger is periodically hashed to keep it to a manageable length, but all transactions are visible, archived in effect for all time. No proof of identity or KYC check is needed to register a Bitcoin account, and currency – denominated "BTC" – may be transferred freely to any other account. Hence Bitcoin may be called anonymous (but the unique account identifiers are set in stone, providing a rock solid money trail that has been the undoing of many criminal Bitcoin users).
The continuous arbitration of blockchain entries is effected by a peer-to-peer network of servers that race each other to double-check a special hash value for the refreshed chain. The particular server that wins each race is rewarded for its effort with newly minted Bitcoin. The ongoing background computation that keeps a network like this honest is referred to technically as "Proof of Work"; with Bitcoin, since there is a monetary reward, it’s called mining.
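For readers who have not seen a Proof of Work, here is a deliberately simplified sketch. It is not Bitcoin's actual block format or difficulty rule; the `mine` and `check` functions, the leading-zeros test and the sample data are my own assumptions, chosen only to show the asymmetry: finding the special hash value takes many attempts, while checking it takes one.

```python
import hashlib


def mine(block_data: bytes, difficulty: int = 4):
    """Search for a nonce whose SHA-256 digest starts with `difficulty` hex zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(block_data + str(nonce).encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def check(block_data: bytes, nonce: int, difficulty: int = 4) -> bool:
    """Verify a claimed nonce with a single hash."""
    digest = hashlib.sha256(block_data + str(nonce).encode()).hexdigest()
    return digest.startswith("0" * difficulty)


data = b"alice pays bob 1 BTC"   # stand-in for a block of transactions
nonce, digest = mine(data)
print(nonce, digest, check(data, nonce))
```

Each extra hex zero demanded multiplies the expected search effort by sixteen, which is the knob a network like this turns to keep the race expensive.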
Whether or not Bitcoin lasts as a form of electronic cash, there is a groundswell of enthusiasm for the blockchain as a new type of public ledger for a much broader range of transactions, including “identity”. The scare quotes are deliberate on my part, reflecting that the blockchain-for-identity speculations have not been clear about what part of the identity puzzle they might solve.
For identity applications, the reality of Bitcoin mining creates some particular challenges which I will return to. But first let’s look at the positive influence of Bitcoin and then review some of its cryptographic building blocks.
People will argue about its true originality, but we can regard Bitcoin and the blockchain as providing an innovative and practical solution to the unsolved double-spend problem. I like Bitcoin as the latest example of a wondrous pattern in applied mathematics. Conundrums widely accepted as impossible are, in fact, solved quite often, after which frenetic periods of innovation can follow. The first surprise or prototype solution is typically inefficient but it can inspire fresh thinking and lead to more polished methods.
One of the greatest examples is Merkle’s Puzzles, a theoretical method invented by Ralph Merkle in 1974 for establishing a shared secret number between two parties who need only exchange public pieces of data. This was the holy grail for cryptography, for it meant that a secret key could be set up without having to carry the secret from one correspondent to the other (after all, if you can securely transfer a key across a long distance, you can do the same with your secret message and thus avoid the hassle of encryption altogether). Without going into detail, Merkle’s solution could not be used in the real world, but it solved what was thought to be an unsolvable problem. In quick succession, practical algorithms followed from Diffie & Hellman, and Rivest, Shamir & Adleman (the names behind “RSA”) and thus was born public key cryptography.
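As an illustration of the practical algorithms that followed (this is the Diffie-Hellman idea, not Merkle's Puzzles), the sketch below shows two parties arriving at the same secret number while exchanging only public values. The parameters are my own toy choices, not a vetted group, and should not be used for anything real.

```python
import secrets

# Public parameters, known to everyone. Toy choices for illustration:
# a Mersenne prime modulus and a small base, not a vetted group.
P = 2**127 - 1
G = 3


def keypair():
    """Pick a random private exponent and derive the public value."""
    private = secrets.randbelow(P - 3) + 2   # in the range 2..P-2
    public = pow(G, private, P)
    return private, public


alice_priv, alice_pub = keypair()
bob_priv, bob_pub = keypair()

# Only alice_pub and bob_pub cross the wire.
alice_secret = pow(bob_pub, alice_priv, P)
bob_secret = pow(alice_pub, bob_priv, P)
print(alice_secret == bob_secret)
```

Neither private exponent is ever transmitted, yet both sides compute the same shared value.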
Bitcoin likewise has spurred dozens of new digital currencies, with different approaches to ledgers and arbitration, and different ambitions too (including Ripple, Ethereum, Litecoin, Dogecoin, and Colored Coins). They all promise to break the monopoly that banks have on payments, radically cut costs and settlement delays, and make electronic money more accessible to the unbanked of the world. These are what we might call liquidity advantages of digital currencies. These objectives (plus the more political promises of ending fiat currency and rendering electronic cash transactions anonymous or untraceable) are certainly all important but they are not my concern in this blog.
Bitcoin’s public sauce
Before looking at identity, let’s review some of the security features of the blockchain. We will see that safekeeping of each account holder’s private keys is paramount – as it is with all Internet payments systems and PKIs.
While the blockchain is novel, many elements of Bitcoin come from standard public key cryptography and will be familiar to anyone in security. What’s called a Bitcoin “address” (the identifier of someone you will send currency to) is actually derived from a public key. To send any Bitcoin money from your own address, you use the matching private key to sign a data object, which is sent into the network to be processed and ultimately added to the blockchain.
The only authoritative record of anyone’s Bitcoin balance is held on the blockchain. Account holders typically operate a wallet application which shows their balance and lets them spend it, but, counter-intuitively, the wallet holds no money. All it does is control a private key (and provide a user experience of the definitive blockchain). The only way you have to spend your balance (that is, transfer part of it to another account address) is to use your private key. What follows from this is an unforgiving reality of Bitcoin: your private key is everything. If a private key is lost or destroyed, then the balance associated with that key is frozen forever and cannot be spent. And thus there has been a string of notorious mishaps where computers or disk drives holding Bitcoin wallets have been lost, together with millions of dollars of value they controlled. Furthermore, numerous pieces of malware have – predictably – been developed to steal Bitcoin private keys from regular storage devices (and law enforcement agencies have intercepted suspects’ private keys in the battle against criminal use of Bitcoin).
You would expect the importance of Bitcoin private key storage to have been obvious from the start, to ward off malware and destruction, and to allow for reliable backup. But it was surprisingly late in the piece that “hardware wallets” emerged, the best known of which is probably now the Trezor, which first appeared in 2013. The use of hardware security modules for private key management in soft wallets or hybrid wallets has been notably ad hoc. It appears crypto currency proponents pay more attention to the algorithms and the theory than to practical cryptographic engineering.
Identifying with the blockchain
The enthusiasm for crypto currency innovation has proven infectious, and many commentators have promoted the blockchain in particular as something special for identity management. A number of start-ups are “providing” identity on the blockchain – including OneName, and ShoCard – although on closer inspection what this usually means is nothing more than reserving a unique blockchain identifier with a self-claimed pseudonym.
Prominent financial services blogger Chris Skinner says "the blockchain will radically alter our futures" and envisages an Internet of Things where your appliances are “recorded [on the blockchain] as being yours using your digital identity token (probably a biometric or something similar)”. And the government of Honduras has hired American Bitcoin technology firm Factom to build a blockchain-based land title registry, which they claim will be “immutable”, resistant to insider fraud, and extensible to “more secure mortgages, contracts, and mineral rights”.
While blockchain aficionados have been quick to make a leap to identity, the opposite is not the case. The identerati haven’t had much to say about blockchain at all. Ping Identity CTO Patrick Harding mentioned it in his keynote address at the 2015 Cloud Identity Summit, and got a meek response from the audience when he asked who knew what blockchain is (I was there). Harding’s suggestions were modest, exploratory and cautious. And only now has blockchain figured prominently in the twice-yearly freeform Internet Identity Workshop unconference in Silicon Valley. I'm afraid it's telling that all the initial enthusiasm for blockchain "solving" identity has come from non-identity professionals.
What identity management problem would be solved by using the blockchain? The most prominent challenges in digital identity include the following:
What does the blockchain have to offer?
Certainly, pseudonymity is important in some settings, but is rare in economically important personal business, and in any case is not unique to the blockchain. The secure recording of transactions is very important, but that’s well-solved by regular digital signatures (which remain cryptographically verifiable essentially for all time, given the digital certificate chain). Most important identity transactions are pretty private, so recording them all in a single public register instead of separate business-specific databases is not an obvious thing to do.
The special thing about the blockchain and the proof-of-work is that they prevent double-spending. I’ve yet to see a blockchain-for-identity proposal that explains what the equivalent “double identify” problem really is and how it needs solving. And if there is such a thing, the price to fix it is to record all identity transactions in public forever.
The central user action in all blockchain applications is to “send” something to another address on the blockchain. This action is precisely a digital (asymmetric cryptographic) signature, essentially the same as any conventional digital signature, created by hashing a data object and encrypting the hash with one’s private key. The integrity and permanence of the action comes from the signature itself; it is immaterial where the signature is stored.
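To make the hash-then-sign pattern concrete, here is a toy sketch using textbook RSA. Bitcoin itself uses elliptic curve signatures (ECDSA) rather than RSA, and the key below is built from two well-known Mersenne primes purely so the example runs with the standard library; it has no padding and offers no real security. The function names and the sample entry are my own assumptions.

```python
import hashlib

# Toy key from two known Mersenne primes: far too small and too
# structured for real use. Needs Python 3.8+ for pow(e, -1, phi).
P, Q = 2**127 - 1, 2**89 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent


def digest(message: bytes) -> int:
    """Hash the data object and reduce it into the modulus range."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """'Encrypt' the hash with the private key."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone holding the public key (N, E) can check the signature."""
    return pow(signature, E, N) == digest(message)


entry = b"alice sends 1 BTC to bob"
sig = sign(entry)
record = {"entry": entry, "sig": sig}   # a file, a database row, a ledger...
print(verify(record["entry"], record["sig"]))
print(verify(b"alice sends 9 BTC to bob", record["sig"]))
```

The check depends only on the entry, the signature and the public key. Where the record happens to be kept plays no part in it.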
What the blockchain does is prevent a user from performing the same action more than once, by using the network to arbitrate the order in which digital signatures are created. In regular identity matters, this objective simply doesn’t arise. The primitive actions in authentication are to leave one’s unique identifying mark (or signature) on a persistent transaction, or to present one’s identity in real time to a service. Apart from peer-to-peer arbitration of order, the blockchain is just a public ledger - and a rather slow one at that. Many accounts of blockchain uses beyond payments simply speak of its inviolability or perpetuity. In truth, any old system of digitally signed database entries is reasonably inviolable. Tamper resistance and integrity come from the digital signatures, not the blockchain. And as mentioned, the blockchain itself doesn't provide any assurance of who really did what - for that we need separate safeguards on users' private keys, plus reliable registration of users and their relevant attributes (which incidentally cannot be done without some authority, unless self-attestation is good enough).
In addition to not offering much advantage in identity management, there are at least two practical downsides to recording non-Bitcoin activity on the blockchain, both related to the proof-of-work. The peer-to-peer resolution of the order of transactions takes time. With Bitcoin, the delay is about 10 minutes; that’s the time taken for an agreed new version of the blockchain to be distilled after each transaction. Clearly, in real time access control use cases, when you need to know who someone is right away, such delay is unacceptable. The other issue is cost. Proof-of-work, as the name is meant to imply, consumes real resources, and elicits a real reward.
So for arbitrary identity transactions, what are the economics of using the blockchain? Who would pay, who would be paid, and what market forces would price identity, in this utopia where all accounts are equal?
On one of the IDAM industry mailing lists recently, a contributor noted in passing that:
- "I replaced ‘identity’ throughout the document with ‘attribute’ and barring a few grammar issues everything still works."
We're getting warm.
Seriously, when will identity engineers come round and do just that: dispense with the word "identity"? We don't need to change our job descriptions or re-badge the whole "identity management" sector but I do believe we need to stop saying things like "federate identity" or "provide identity".
The writing has been on the wall for some time.
"Identity" is actually a macro for how a Relying Party (RP) knows each of its Subjects. Identification is the process by which an RP is satisfied it knows enough about a Subject -- a customer, a trading partner, an employee and so on -- that it can deal with that Subject with acceptable residual risk. Identification is just the surface of the relationship between Subject and RP. The risks of misidentification are ultimately borne by the RP -- even if they can be mitigated to some extent through contracts with third parties that have helped the RP establish identity.
The most interesting work in IDAM (especially the "Vectors of Trust" or VoT, initiated by Justin Richer) is now about better management of the diverse and context-dependent signals, claims and/or attributes that go into a multivariate authentication decision. And that reminds me of the good old APEC definition of authentication -- "the means by which a receiver of an electronic transaction or message makes a decision to accept or reject that transaction or message" -- which notably made no mention of identity at all!
We really should now go the whole way and replace "identity" with "attributes". In particular, we should realise there are no "Identity Providers" -- they're all just Attribute Providers. No third party ever actually "provides" a Subject with their identity; that was a naive industrial sort of metaphor that reduces identity to a commodity, able to be bought and sold. It is always the Relying Party that "identifies" a Subject for their (the RP's) purposes. And therefore it is the Relying Party that bestows identity.
The mangled notion of "Identity Provider" seems to me to have contaminated IDAM models for a decade. Just think how much easier it would be to get banks, DMVs, social networks, professional associations, employers and the rest to set up modest Attribute Providers instead of grandiose and monopolistic Identity Providers!
As Yubico CEO Stina Ehrensvard says, "any organization that has tried to own and control online identity has failed".
There's a simple reason for that: identity is not what we thought it was. As we are beginning to see, if we did a global replace of "identity" with "attribute", all our technical works would still make sense. The name change is not mere word-smithing, for the semantics matter. By using the proper name for what we are federating, we will come a lot closer to the practical truth of the identity management problem, and after reframing the way we talk about the problems, we will solve them.
In the latest course of a 15-month security feast, BlackBerry has announced it is acquiring mobile device management (MDM) provider Good Technology. The deal is said to be definitive, for US$425 million in cash.
As BlackBerry boldly re-positions itself as a managed service play in the Internet of Things, adding an established MDM capability to its portfolio will bolster its claim -- which still surprises many -- to be handset neutral. But the Good buy is much more than that. It has to be seen in the context of John Chen's drive for cross-sector security and privacy infrastructure for the IoT.
As I reported from the recent BlackBerry Security Summit in New York, the company has knitted together a comprehensive IoT security fabric. Look at how they paint their security platform:
And see how Good will slip neatly into the Platform Services column. It's the latest in what is now a $575 million investment in non-organic security growth (following purchases of Secusmart, Watchdox, Movirtu and Athoc).
According to BlackBerry,
- Good will bring complementary capabilities and technologies to BlackBerry, including secure applications and containerization that protects end user privacy. With Good, BlackBerry will expand its ability to offer cross-platform EMM solutions that are critical in a world with varying deployment models such as bring-your-own-device (BYOD); corporate owned, personally enabled (COPE); as well as environments with multiple user interfaces and operating systems. Good has expertise in multi-OS management with 64 percent of activations from iOS devices, followed by a broad Android and Windows customer base. This experience combined with BlackBerry’s strength in BlackBerry 10 and Android management – including Samsung KNOX-enabled devices – will provide customers with increased choice for securely deploying any leading operating system in their organization.
The strategic acquisition of Good Technology will also give the Identity-as-a-Service sector a big kick. IDaaS has become a crowded space with at least ten vendors (CA, Centrify, IBM, Microsoft, Okta, OneLogin, Ping, SailPoint, Salesforce, VMware) competing strongly around a pretty well settled set of features and functions. BlackBerry themselves launched an IDaaS a few months ago. At the Security Summit, I asked their COO Marty Beard what is going to distinguish their offering in such a tight market, and he said, simply, mobility. Presto!
But IDaaS is set to pivot. We all know that mobility is now the locus of security, and we've seen VMware parlay its AirWatch investment into a competitive new cloud identity service. This must be more than a catch-up play with so many entrenched IDaaS vendors.
Here's the thing. I foresee identity actually disappearing from the user experience, which more and more will just be about the apps. I discussed this development in a really fun "Identity Innovators" video interview recorded with Ping at the recent Cloud Identity Summit. For identity to become seamless with the mobile application UX, we need two things. Firstly, federation protocols so that different pieces of software can hand over attributes and authentication signals to one another, and these are all in place now. But secondly we also need fully automated mobile device management as a service, and that's where Good truly fits with the growing BlackBerry platform.
Now stay tuned for new research coming soon via Constellation on the Internet of Things, identity, privacy and software reliability.
See also The State of Identity Management in 2015.
Identity online is a vexed problem. The majority of Internet fraud today can be related to weaknesses in the way we authenticate people electronically. Internet identity is terribly awkward too. Unfortunately today we still use password techniques dating back to 1960s mainframes that were designed for technicians, by technicians.
Our identity management problems also stem from over-reach. For one thing, the information era heralded new ways to reach and connect with people, with almost no friction. We may have taken too literally the old saw “information wants to be free.” Further, traditional ways of telling who people are, through documents and “old boys networks”, create barriers, which are anathema to new-school Internet thinkers.
For the past 10-to-15 years, a heady mix of ambitions has informed identity management theory and practice: improve usability, improve security and improve “trust.” Without ever pausing to unravel the rainbow, the identity and access management industry has created grandiose visions of global “trust frameworks” to underpin a utopia of seamless stranger-to-stranger business and life online.
Well-resourced industry consortia and private-public partnerships have come and gone over the past decade or more. Numerous “trust” start-up businesses have launched and failed. Countless new identity gadgets, cryptographic algorithms and payment schemes have been tried.
And yet the identity problem is still with us. Why is identity online so strangely resistant to these well-meaning efforts to fix it? In particular, why is federated identity so dramatically easier said than done?
Identification is a part of risk management. In business, service providers use identity to manage the risk that they might be dealing with the wrong person. Different transactions carry different risks, and identification standards are varied accordingly. Conversely, if a provider cannot be sure enough who someone is, they now have the tools to withhold or limit their services. For example, when an Internet customer signs in from an unusual location, payment processors can put a cap on the dollar amounts they will authorize.
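The payment example above can be sketched in a few lines of Python. This is only an illustration of the idea that identification strength and transaction limits move together; the customer history, the country codes and both dollar limits are hypothetical values chosen for the sketch, not figures from any real payment processor.

```python
from dataclasses import dataclass

@dataclass
class SignIn:
    customer_id: str
    country: str  # where the customer appears to be signing in from

# Hypothetical record of where each customer normally signs in from.
USUAL_COUNTRIES = {"alice": {"AU", "NZ"}}

# Hypothetical limits: full service when the sign-in looks normal,
# a reduced cap when the provider is less sure who it is dealing with.
DEFAULT_LIMIT = 5000.00
REDUCED_LIMIT = 200.00

def authorisation_limit(sign_in: SignIn) -> float:
    """Return the largest amount the provider will authorise for this sign-in."""
    usual = USUAL_COUNTRIES.get(sign_in.customer_id, set())
    return DEFAULT_LIMIT if sign_in.country in usual else REDUCED_LIMIT

def authorise(sign_in: SignIn, amount: float) -> bool:
    """Accept the payment only if it fits under the risk-adjusted limit."""
    return amount <= authorisation_limit(sign_in)

print(authorise(SignIn("alice", "AU"), 1000.00))  # usual location
print(authorise(SignIn("alice", "BR"), 1000.00))  # unusual location, capped
```

The service is limited rather than refused outright: the same customer can still make small payments from the unusual location.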
Across our social and business walks of life, we have distinct ways of knowing people, which yield a rich array of identities by which we know and show who we are to others. These identities have evolved over time to suit different purposes. Different relationships rest on different particulars, and so identities naturally become specific, not general.
The human experience of identity is one of ambiguity and contradictions. Each of us simultaneously holds a weird and wonderful ensemble of personal, family, professional and social identities. Each is different, sometimes radically so. Some of us lead quite secret lives, and I’m not thinking of anything salacious, but maybe just the role-playing games that provide important escapes from the humdrum.
Most of us know how it feels when identities collide. There’s no better example than what I call the High School Reunion Effect: that strange dislocation you feel when you see acquaintances for the first time in decades. You’ve all moved on, you’ve adopted new personae in new contexts – not the least of which is the one defined by a spouse and your own new family. Yet you find yourself re-winding past identities, relating to your past contemporaries as you all once were, because it was those school relationships, now fossilised, that defined you.
Frankly, we’ve made a mess of the pivotal analogue-to-digital conversion of identity. In real life we know identity is malleable and relative, yet online we’ve rendered it crystalline and fragile.
We’ve come close to the necessary conceptual clarity. Some 10 years ago a network of “identerati” led by Kim Cameron of Microsoft composed the “Laws of Identity,” which contained a powerful formulation of the problem to be addressed. The Laws defined Digital Identity as “a set of claims made [about] a digital subject.”
Your Digital Identity is a proxy for a relationship, pointing to a suite of particulars that matter about you in a certain context. When you apply for a bank account, when you subsequently log on to Internet banking, when you log on to your work extranet, or to Amazon or PayPal or Twitter, or if you want to access your electronic health record, the relevant personal details are different each time.
The flip side of identity management is privacy. If authentication concerns what a Relying Party needs to know about you, then privacy is all about what they don’t need to know. Privacy amounts to information minimization; security professionals know this all too well as the “Need to Know” principle.
All attempts at grand global identities to date have failed. The Big Certification Authorities of the 1990s reckoned a single, all-purpose digital certificate would meet the needs of all business, but they were wrong. Ever more sophisticated efforts since then have also failed, such as the Infocard Foundation, Liberty Alliance and the Australian banking sector’s Trust Centre.
Significantly, federation for non-trivial identities only works within regulatory monocultures – for example the US Federal Bridge CA, or the Scandinavian BankID network – where special legislation authorises banks and governments to identify customers by the one credential. The current National Strategy for Trusted Identities in Cyberspace has pondered legislation to manage liability but has balked. The regulatory elephant remains in the room.
As an aside, obviously social identities like Facebook and Twitter handles federate very nicely, but these are issued by organisations that don't really know who we are, and they're used by web sites that don't really care who we are; social identity federation is a poor model for serious identity management.
A promising identity development today is the Open Identity Foundation’s Attribute Exchange Network, a new architecture seeking to organise how identity claims may be traded. The Attribute Exchange Network resonates with a growing realization that, in the words of Andrew Nash, a past identity lead at Google and at PayPal, “attributes are at least as interesting as identities – if not more so.”
If we drop down a level and deal with concrete attribute data instead of abstract identities, we will start to make progress on the practical challenges in authentication: better resistance to fraud and account takeover, easier account origination and better privacy.
My vision is that by 2019 we will have a fresh marketplace of Attribute Providers. The notion of “Identity Provider” should die off, for identity is always in the eye of the Relying Party. What we need online is an array of respected authorities and agents that can vouch for our particulars. Banks can provide reliable electronic proof of our payment card numbers; government agencies can attest to our age and biographical details; and a range of private businesses can stand behind attributes like customer IDs, membership numbers and our retail reputations.
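To make the Attribute Provider idea concrete, here is a minimal sketch of a Relying Party deciding for itself which attributes it needs and whose word it will take for each. The provider names, attribute names and policy are all hypothetical, and the sketch assumes each claim has already been verified as genuinely coming from the named provider (for example by checking a digital signature), which is not shown.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, Set, Tuple

@dataclass(frozen=True)
class AttributeClaim:
    name: str      # e.g. "age"
    value: Any     # the particular being vouched for
    provider: str  # the Attribute Provider standing behind it

# Hypothetical Relying Party policy: for each attribute it cares about,
# the providers it accepts and the test the value must pass.
Policy = Dict[str, Tuple[Set[str], Callable[[Any], bool]]]

WINE_SHOP_POLICY: Policy = {
    "age": ({"gov-registry"}, lambda v: v >= 18),
    "card_verified": ({"bank-a", "bank-b"}, lambda v: v is True),
}

def accept(claims: Iterable[AttributeClaim], policy: Policy) -> bool:
    """Accept only if every required attribute is attested by a provider
    this Relying Party accepts and the value satisfies its test."""
    claims = list(claims)
    for name, (providers, test) in policy.items():
        satisfied = any(
            c.name == name and c.provider in providers and test(c.value)
            for c in claims
        )
        if not satisfied:
            return False
    return True

claims = [
    AttributeClaim("age", 34, "gov-registry"),
    AttributeClaim("card_verified", True, "bank-a"),
]
print(accept(claims, WINE_SHOP_POLICY))
```

Nothing in the sketch is an "identity": the Relying Party assembles the particulars it needs, from the sources it respects, and makes its own accept or reject decision.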
In five years' time I expect we will adopt a much more precise language to describe how to deal with people online, and it will reflect more faithfully how we’ve transacted throughout history. As the old Italian proverb goes: It is nice to “trust” but it’s better not to.
This article first appeared as "Abandoning identity in favor of attributes" in Secure ID News, 2 December, 2014.
Acknowledgement: Daniel Barth-Jones kindly engaged with me after this blog was initially published, and pointed out several significant factual errors, for which I am grateful.
In 2014, the New York City Taxi & Limousine Commission (TLC) released a large "anonymised" dataset containing 173 million taxi rides taken in 2013. Soon after, software developer Vijay Pandurangan managed to reverse the hashing of the taxi registration numbers. Subsequently, privacy researcher Anthony Tockar combined the data with public photos of celebrities getting in or out of cabs to recreate their trips. See Anna Johnston's analysis here.
This re-identification demonstration has been used by some to bolster a general claim that anonymity online is increasingly impossible.
On the other hand, medical research advocates like Columbia University epidemiologist Daniel Barth-Jones argue that the practice of de-identification can be robust and should not be dismissed as impractical on the basis of demonstrations such as this. The identifiability of celebrities in these sorts of datasets is a statistical anomaly, reasons Barth-Jones, and should not be used to frighten regular people out of participating in medical research on anonymised data. He wrote in a blog that:
- "However, it would hopefully be clear that examining a miniscule proportion of cases from a population of 173 million rides couldn’t possibly form any meaningful basis of evidence for broad assertions about the risks that taxi-riders might face from such a data release (at least with the taxi medallion/license data removed as will now be the practice for FOIL request data)."
As a health researcher, Barth-Jones is understandably worried that re-identification of small proportions of special cases is being used to exaggerate the risks to ordinary people. He says that the HIPAA de-identification protocols if properly applied leave no significant risk of re-id. But even if that's the case, HIPAA processes are not applied to data across the board. The TLC data was described as "de-identified" and the fact that any people at all (even stand-out celebrities) could be re-identified from data does create a broad basis for concern - "de-identified" is not what it seems. Barth-Jones stresses that in the TLC case, the de-identification was fatally flawed [technically: it's no use hashing data like registration numbers with limited value ranges because the hashed values can be reversed by brute force] but my point is this: who among us can tell the difference between poorly de-identified and "properly" de-identified?
And how long can "properly de-identified" last? What does it mean to say casually that only a "minuscule proportion" of data can be re-identified? In this case, the re-identification of celebrities was helped by the fact that lots of photos of them are readily available on social media, yet there are so many photos in the public domain now that regular people are going to become easier to identify too.
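The technical flaw noted in brackets above is easy to demonstrate. The following Python sketch assumes a hypothetical registration number format (one digit, one letter, two digits) and uses SHA-256 as the hash; neither is claimed to match the TLC release exactly. The weakness lies in the small number of possible inputs, not in the particular hash function chosen.

```python
import hashlib
import string
from itertools import product

def hash_id(value: str) -> str:
    """Unsalted hash of an identifier, standing in for the 'anonymisation' step."""
    return hashlib.sha256(value.encode("ascii")).hexdigest()

def candidate_ids():
    """Enumerate every identifier in the assumed digit-letter-digit-digit format."""
    for d1, letter, d2, d3 in product(
        string.digits, string.ascii_uppercase, string.digits, string.digits
    ):
        yield f"{d1}{letter}{d2}{d3}"

def build_reverse_table() -> dict:
    """Hash every possible identifier once, mapping each digest back to its input."""
    return {hash_id(c): c for c in candidate_ids()}

reverse_table = build_reverse_table()

# A "de-identified" value as it might appear in a released dataset.
published = hash_id("7B42")
recovered = reverse_table[published]
print(len(reverse_table), recovered)  # prints: 26000 7B42
```

With only 26,000 possible values in this assumed format, the whole lookup table is built in a fraction of a second on an ordinary laptop, after which every hashed identifier in the dataset can be reversed by a dictionary lookup.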
But my purpose here is not to play what-if games, and I know Daniel advocates statistically rigorous measures of identifiability. We agree on that -- in fact, over the years, we have agreed on most things. The point I am trying to make in this blog post is that, just as nobody should exaggerate the risk of re-identification, nor should anyone play it down. Claims of de-identification are made almost daily for really crucial datasets, like compulsorily retained metadata, public health data, biometric templates, social media activity used for advertising, and web searches. Some of these claims are made with statistical rigor, using formal standards like the HIPAA protocols; but other times the claim is casual, made with no qualification, with the aim of comforting end users.
"De-identified" is a helluva promise to make, with far-reaching ramifications. Daniel says de-identification researchers use the term with caution, knowing there are technical qualifications around the finite probability of individuals remaining identifiable. But my position is that the fine print doesn't translate to the general public who only hear that a database is "anonymous". So I am afraid the term "de-identified" is meaningless outside academia, and in casual use is misleading.
Barth-Jones objects to the conclusion that "it's virtually impossible to anonymise large data sets" but in an absolute sense, that claim is surely true. If any proportion of people in a dataset may be identified, then that data set is plainly not "anonymous". Moreover, as statistics and mathematical techniques (like facial recognition) improve, and as more ancillary datasets (like social media photos) become accessible, the proportion of individuals who may be re-identified will keep going up. [Readers who wish to pursue these matters further should look at the recent Harvard Law School online symposium on "Re-identification Demonstrations", hosted by Michelle Meyer, in which Daniel Barth-Jones and I participated, among many others.]
Both sides of this vexed debate need more nuance. Privacy advocates have no wish to quell medical research per se, nor do they call for absolute privacy guarantees, but we do seek full disclosure of the risks, so that the cost-benefit equation is understood by all. One of the obvious lessons in all this is that "anonymous" or "de-identified" on their own are poor descriptions. We need tools that meaningfully describe the probability of re-identification. If statisticians and medical researchers take "de-identified" to mean "there is an acceptably small probability, namely X percent, of identification" then let's have that fine print. Absent the detail, lay people can be forgiven for thinking re-identification isn't going to happen. Period.
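As one illustration of what such fine print might look like, the sketch below computes a simple per-record re-identification risk: for each record, one divided by the number of records sharing the same combination of quasi-identifiers (an equivalence-class measure in the spirit of k-anonymity). This is an assumed, simplified measure and not the HIPAA method; the field names, sample records and threshold are hypothetical.

```python
from collections import Counter

def reidentification_risk(records, quasi_identifiers):
    """Return a risk of 1/k for each record, where k is the number of
    records sharing that record's quasi-identifier values."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    class_sizes = Counter(keys)
    return [1 / class_sizes[k] for k in keys]

def summarise(risks, threshold):
    """Condense per-record risks into figures that could be disclosed."""
    return {
        "max_risk": max(risks),
        "mean_risk": sum(risks) / len(risks),
        "share_above_threshold": sum(r > threshold for r in risks) / len(risks),
    }

# Hypothetical released records with direct identifiers already removed.
records = [
    {"age_band": "30-39", "postcode": "2000", "sex": "F"},
    {"age_band": "30-39", "postcode": "2000", "sex": "F"},
    {"age_band": "30-39", "postcode": "2000", "sex": "F"},
    {"age_band": "30-39", "postcode": "2000", "sex": "F"},
    {"age_band": "80-89", "postcode": "2000", "sex": "M"},
]

risks = reidentification_risk(records, ["age_band", "postcode", "sex"])
report = summarise(risks, threshold=0.25)
print(report)
```

In this toy dataset four records hide among each other, but the fifth is unique on its quasi-identifiers and so carries a risk of 1. A statement like "20 percent of records exceed a 25 percent risk" tells a lay reader far more than the bare word "de-identified".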
And we need policy and regulatory mechanisms to curb inappropriate re-identification. Anonymity is a brittle, essentially temporary, and inadequate privacy tool.
I argue that the act of re-identification ought to be treated as an act of Algorithmic Collection of PII, and regulated as just another type of collection, albeit an indirect one. If a statistical process results in a person's name being added to a hitherto anonymous record in a database, it is as if the data custodian went to a third party and asked them "do you know the name of the person this record is about?". The fact that the data custodian was clever enough to avoid having to ask anyone about the identity of people in the re-identified dataset does not alter the privacy responsibilities arising. If the effect of an action is to convert anonymous data into personally identifiable information (PII), then that action collects PII. And in most places around the world, any collection of PII automatically falls under privacy regulations.
It looks like we will never guarantee anonymity, but the good news is that for privacy, we don't actually need to. Privacy is the protection you need when your affairs are not anonymous, for privacy is a regulated state where organisations that have knowledge about you are restrained in what they do with it. Equally, the ability to de-anonymise should be restricted in accordance with orthodox privacy regulations. If a party chooses to re-identify people in an ostensibly de-identified dataset, without a good reason and without consent, then that party may be in breach of data privacy laws, just as they would be if they collected the same PII by conventional means like questionnaires or surveillance.
Surely we can all agree that re-identification demonstrations serve to shine a light on the comforting claims made by governments, for instance, that certain citizen datasets can be anonymised. In Australia, the government is now implementing telecommunications metadata retention laws, in the interests of national security; the metadata, we are told, is de-identified and "secure". In the UK, the National Health Service plans to make de-identified patient data available to researchers. Whatever the merits of data mining in diverse fields like law enforcement and medical research, my point is that any government's claims of anonymisation must be treated critically (if not skeptically), and subjected to strenuous and ongoing privacy impact assessment.
Privacy, like security, can never be perfect. Privacy advocates must avoid giving the impression that they seek unrealistic guarantees of anonymity. There must be more to privacy than identity obscuration (to use a more technically correct term than "de-identification"). Medical research should proceed on the basis of reasonable risks being taken in return for beneficial outcomes, with strong sanctions against abuses including unwarranted re-identification. And then there wouldn't need to be a moral panic over re-identification if and when it does occur, because anonymity, while highly desirable, is not essential for privacy in any case.