Blockchain is an algorithm and distributed data structure designed to manage electronic cash without any central administrator. The original blockchain was invented in 2008 by the pseudonymous Satoshi Nakamoto to support Bitcoin, the first large-scale peer-to-peer crypto-currency, completely free of government and institutions.
Blockchain is a Distributed Ledger Technology (DLT). Most DLTs have emerged in Bitcoin's wake. Some seek to improve blockchain's efficiency, speed or throughput; others address different use cases, such as more complex financial services, identity management, and "Smart Contracts".
The central problem in electronic cash is Double Spend. If electronic money is just data, nothing physically stops a currency holder trying to spend it twice. It was long thought that a digital reserve was needed to oversee and catch double-spends, but Nakamoto rejected all financial regulation, and designed an electronic cash without any umpire.
The Bitcoin (BTC) blockchain crowd-sources the oversight. Each and every attempted spend is broadcast to a community, which in effect votes on the order in which transactions occur. Once a majority agrees all transactions seen in the recent past are unique, they are cryptographically sealed into a block. A chain thereby grows, each new block linked to the previously accepted history, preserving every spend ever made.
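The chaining can be sketched in a few lines of Python. This is a toy illustration only (real Bitcoin blocks hash an 80-byte header containing a Merkle root and a proof-of-work nonce), but it shows how each new block commits to the one before it:

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """Hash a block's contents, including the previous block's hash."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

# A toy chain: each block records the hash of its predecessor,
# so altering any historical entry invalidates every later link.
genesis = {"prev": "0" * 64, "transactions": ["coinbase -> alice: 50"]}
block1 = {"prev": block_hash(genesis), "transactions": ["alice -> bob: 10"]}
block2 = {"prev": block_hash(block1), "transactions": ["bob -> carol: 5"]}

# Tampering with the genesis block breaks the link stored in block1.
tampered = dict(genesis, transactions=["coinbase -> mallory: 50"])
assert block1["prev"] == block_hash(genesis)
assert block1["prev"] != block_hash(tampered)
```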
A Bitcoin balance is managed with an electronic wallet which protects the account holder's private key. Blockchain uses conventional public key cryptography to digitally sign each transaction with the sender's private key and direct it to a recipient's public key. The only way to move Bitcoin is via the private key: lose or destroy your wallet, and your balance will remain frozen in the ledger, never to be spent again.
The blockchain's network of thousands of nodes is needed to reach consensus on the order of ledger entries, free of bias, and resistant to attack. The order of entries is the only thing agreed upon by the blockchain protocol, for that is enough to rule out double spends.
The integrity of the blockchain requires a great many participants (and consequently, the notorious power consumption). One of the cleverest parts of the BTC blockchain is its incentive for participating in the expensive consensus-building process. Every time a new block is accepted, the system randomly rewards one participant with a bounty (currently 12.5 BTC). This is how new Bitcoins are minted or "mined".

Blockchain has security qualities geared towards incorruptible cryptocurrency. The ledger is immutable so long as a majority of nodes remain independent, for a fraudster would require infeasible computing power to forge a block and recalculate the chain to be consistent. With so many nodes calculating each new block, redundant copies of the settled chain are always globally available.
Contrary to popular belief, blockchain is not a general purpose database or "trust machine". It only reaches consensus about one specific technicality – the order of entries in the ledger – and it requires a massive distributed network to do so only because its designer-operators choose to reject central administration. For regular business systems, blockchain's consensus is of questionable benefit.
Posted in Blockchain
A few days ago, it was reported that Julian Assange "read out a bitcoin block hash to prove he was alive". This was in response to rumours that he had died. It was a neat demonstration not only that he was not dead, but also of a couple of limits to the blockchain that are still not widely appreciated. It showed that blockchain on its own provides little value beyond cryptocurrency; in particular, on its own, blockchain doesn’t ‘prove existence’. And further, we can see that when blockchain is hybridised with other security processes, it is no longer terribly unique.
What Assange did was broadcast himself reading out the hexadecimal digits of the most recent block hash at the time (January 10th). Because the hash value is unique to the transaction history of the blockchain and cannot be predicted, quoting the hash value on January 10th proves that the broadcast was not made earlier than that day. It’s equivalent to holding up a copy of a newspaper to show that a video has to be contemporary.
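The "not earlier than" logic rests on the fact that the latest block hash commits to the entire transaction history and so cannot be computed in advance. A minimal Python sketch of this property (toy hashing, not Bitcoin's actual double-SHA-256 header format):

```python
import hashlib

def chain_tip_hash(history):
    """Fold a list of transactions into a single running hash."""
    h = "0" * 64
    for tx in history:
        h = hashlib.sha256((h + tx).encode()).hexdigest()
    return h

history = ["alice->bob:10", "bob->carol:5"]
tip = chain_tip_hash(history)

# Quoting `tip` proves a statement was made no earlier than the moment
# `tip` became known, just like holding up today's newspaper. Any
# change to the history yields a different, unpredictable tip value.
assert chain_tip_hash(["alice->mallory:10", "bob->carol:5"]) != tip
```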
With regards to proof of existence, the evidence on the blockchain comes from the digital signatures created by account holders’ private keys. A blockchain entry certainly proves that a certain private key existed at the time of the entry, but on its own, blockchain doesn’t prove who controls the key. A major objective of blockchain as a crypto-currency engine was indeed to remove any central oversight of keys and account holders.
Quoting the blockchain hash value from January 10th doesn’t prove Assange was alive that day. It is the combination of the broadcast and the blockchain that tells us he was alive.
If this is an example of blockchain providing proof-of-existence (or “proof of life” according to some reports) then the video is like a key management layer: it augments the blockchain by binding the physical person to the data structure. Yet the combination of a video and the blockchain doesn’t provide any unique advantages over, for example, a video plus the day’s newspaper, or a video plus a snapshot of the day’s stock market ticker tape or lotto numbers.
The pure blockchain was designed to manage decentralised electronic cash and it does that with great distinction. But blockchain needs to be combined with other processes to achieve the many other non-cryptocurrency use cases, and those combinations erode its value. If you need to wrap blockchain with other security mechanisms to achieve some outcome, you will find that the consensus algorithm becomes redundant, and that simpler systems can get the job done.
In a Huffington Post blog "Why the Blockchain Still Lacks Mass Understanding" William Mougayar describes the blockchain as "philosophically inclined technology". It's one of his rare instances of understatement. Like most blockchain visionaries, Mougayar massively exaggerates what this thing does, overlooking what it was designed for, and stretching it to irrelevance. If "99% of people still don’t understand the blockchain" it's because Mougayar and his kind are part of the problem, not part of the solution.
Let's review. This technology is more than philosophically "inclined". Blockchain was invented by someone who flatly rejected fiat currency, government regulation and financial institutions. Satoshi Nakamoto wanted an electronic cash devoid of central oversight or 'digital reserve banks'. And he solved what was thought to be an unsolvable problem, with an elaborate and brilliant algorithm that has a network of thousands of computers vote on the order in which transactions appear in a pool. The problem is Double Spend; the solution is to have a crowd watch every spend to see that no Bitcoin is spent twice.
But that's all blockchain does. It creates consensus about the order of entries in the ledger. It does not and cannot reach consensus about anything else, not without additional off-chain processes like user registration, permissions management, access control and encryption. Yet these all require the sort of central administration that Nakamoto railed against. Nakamoto designed an amazing solution to the Double Spend problem, but nothing else. Nakamoto him/herself said that if you still need third parties in your ledger, then the blockchain loses its benefits.
THAT is what most people misunderstand about blockchain. Appreciate what blockchain was actually for, and you will see that most applications of this philosophically single-minded technology, beyond its original anarchic scope, simply don't add up.
Or Reorientating how engineers think about privacy.
From my chapter Blending the practices of Privacy and Information Security to navigate Contemporary Data Protection Challenges in “Trans-Atlantic Data Privacy Relations as a Challenge for Democracy”, Kloza & Svantesson (editors), in press.
One of the leading efforts to inculcate privacy into engineering practice has been the “Privacy by Design” movement. Commonly abbreviated "PbD", it is a set of guidelines developed in the 1990s by the then privacy commissioner of Ontario, Ann Cavoukian. The movement seeks to embed privacy “into the design specifications of technologies, business practices, and physical infrastructures”. PbD is basically the same good idea as building in security, or building in quality, because retrofitting these things too late in the design lifecycle leads to higher costs* and compromised, sub-optimal outcomes.
Privacy by Design attempts to orientate technologists to privacy with a set of simple callings:
- 1. Proactive not Reactive; Preventative not Remedial
- 2. Privacy as the Default Setting
- 3. Privacy Embedded into Design
- 4. Full Functionality – Positive-Sum, not Zero-Sum
- 5. End-to-End Security – Full Lifecycle Protection
- 6. Visibility and Transparency – Keep it Open
- 7. Respect for User Privacy – Keep it User-Centric.
PbD is a well-meaning effort, and yet its language comes from a culture quite different from engineering. PbD’s maxims rework classic privacy principles without providing much that’s tangible to working systems designers.
The most problematic aspect of Privacy by Design is its idealism. Politically, PbD is partly a response to the cynicism of national security zealots and the like who tend to see privacy as quaint or threatening. Infamously, NSA security consultant Ed Giorgio was quoted in “The New Yorker” of 21 January 2008 as saying “privacy and security are a zero-sum game”. Of course most privacy advocates (including me) find that proposition truly chilling. And yet PbD’s response is frankly just too cute with its slogan that privacy is a “positive sum game”.
The truth is privacy is full of contradictions and competing interests, and we ought not sugar coat it. For starters, the Collection Limitation principle – which I take to be the cornerstone of privacy – can contradict the security or legal instinct to always retain as much data as possible, in case it proves useful one day. Disclosure Limitation can conflict with usability, because Personal Information may become siloed for privacy’s sake and less freely available to other applications. And above all, Use Limitation can restrict the revenue opportunities that digital entrepreneurs might otherwise see in all the raw material they are privileged to have gathered.
Now, by highlighting these tensions, I do not for a moment suggest that arbitrary interests should override privacy. But I do say it is naive to flatly assert that privacy can be maximised along with any other system objective. It is better that IT designers be made aware of the many trade-offs that privacy can entail, and that they be equipped to deal with real world compromises implied by privacy just as they do with other design requirements. For this is what engineering is all about: resolving conflicting requirements in real world systems.
So a more sophisticated approach than “Privacy by Design” is privacy engineering, in which privacy takes its place within information systems design alongside all the other practical considerations that IT professionals weigh up every day, including usability, security, efficiency, profitability, and cost.
See also my "Getting Started Guide: Privacy Engineering" from Constellation Research.
* Not unrelatedly, I wonder if we should re-examine the claim that retrofitting privacy, security and/or quality after a system has been designed and realised leads to greater cost! Cold hard experience might suggest otherwise. Clearly, a great many organisations persist with bolting on these sorts of features late in the day -- or else advocates wouldn't have to keep telling them not to. And the Minimum Viable Product movement is almost a license to defer quality and other non-essential considerations. All businesses are cost conscious, right? So averaged across a great many projects over the long term, could it be that businesses have in fact settled on the most cost effective timing of security engineering, and it's not as politically correct as we'd like?!
Last month, over September 26-27, I attended a US government workshop on The Use of Blockchain in Healthcare and Research, organised by the Department of Health & Human Services Office of the National Coordinator (ONC) and hosted at NIST headquarters in Gaithersburg, Maryland. The workshop showcased a number of winning entries from ONC's Blockchain Challenge, and brought together a number of experts and practitioners from NIST and the Department of Homeland Security.
I presented an invited paper "Blockchain's Challenges in Real Life" (PDF) alongside other new research by Mance Harmon from Ping Identity, and Drummond Reed from Respect Network. All the workshop presentations, the Blockchain Challenge winners' papers and a number of the unsuccessful submissions are available on the ONC website. You will find contributions from major computer companies and consultancies, leading medical schools and universities, and a number of unaffiliated researchers.
I also sat on a panel session about identity innovation, joining entrepreneurs from Digital Bazaar, Factom, Respect Network, and XCELERATE, all of which are conducting R&D projects funded by the DHS Science and Technology division.
Around the same time as the workshop, I happened to finalise two new Constellation Research papers, on security and R&D practices for blockchain technologies. And that was timely, because I am afraid that once again, I have immersed myself in some of the most current blockchain thinking, only to find that key pieces of the puzzle are still missing.
Disclosure: I traveled to the Blockchain in Healthcare workshop as a guest of ONC, which paid for my transport and accommodation.
Three observations from the Workshop
There were two things I just did not get as I read the winning Blockchain Challenge papers and listened to the presentations. And I observe that there is one crucial element that most of the proposals are missing.
Firstly, one of the most common themes across all of the papers was interoperability. A great challenge in e-health is indeed interoperability. Disparate health systems speak different languages, using different codes for the same medical procedures. Adoption of new standard terminologies and messaging standards, like HL7 and ICD, is infamously slow, often taking a decade or longer. Large clinical systems are notoriously complex to implement, so along the way they invariably undergo major customisation, which makes each installation peculiar to its setting, and resistant to interfacing with other systems.
In the USA, Health Information Exchanges (HIEs) have been a common response to these problems, the idea being that an intermediary switching system can broker understanding between local e-health programs. But as anyone in the industry knows, HIEs have been easier said than done, to say the least.
According to many of the ONC challenge papers, blockchain is supposed to bring a breakthrough, yet no one has explained how a ledger will make the semantics of all these e-health silos suddenly compatible. Blockchain is a very specific protocol that addresses the order of entries in a distributed ledger, to prevent Double Spend without an administrator. Nothing about blockchain's fundamentals relates to the contents of messages, healthcare semantics, medical codes and so on. It just doesn't "do" interoperability! The complexity in healthcare is intrinsic to the subject matter; it cannot be willed away with any new storage technology.
The second thing I just didn't get about the workshop was the idea that blockchain will fix healthcare information silos. Several speakers stressed the problem that data is fragmented, concentrated in local repositories, and hard to find when needed. All true, but I don't see what blockchain can do about this. A consensus was reached at the workshop that personal information and Protected Health Information (PHI) should not be stored on the blockchain in any significant amounts (not just because of its sensitivity but also the sheer volume of electronic health records and images in particular). So if we're agreed that the blockchain could only hold pointers to health data, what difference can it make to the current complex of record systems?
And my third problem at the workshop was the stark omission of key management. This is the central administrative challenge in any security system: getting the right cryptographic keys and credentials into the right hands, so all parties can be sure who they are dealing with. The genius of the original Bitcoin blockchain is that it allows people to exchange guaranteed value without needing to know anything about each other; it dispenses with key management altogether, and may be unique in the history of security for doing so (see also Blockchain has no meaning). But when we do need to know who's who in a health system – to be certain when various users really are authorised medicos, researchers, insurers or patients – then key management must return to the mix. And then things get complicated, much more complicated than the utopian setting of Bitcoin.
Moreover, healthcare is hierarchical. Inherent to the system are management structures, authorizations, credentialing bodies, quality assurance and audits – all the things that blockchain's creator Satoshi Nakamoto expressly tried to get rid of. As I explained in my workshop speech, if a blockchain deployment still has to involve third parties, then the benefits of the algorithm are lost. So said Nakamoto him/herself!
In my view, most blockchain for healthcare projects will discover, sooner or later, that once the necessary key management arrangements are taken care of, their choice of distributed ledger technology becomes inconsequential.
New Constellation Research on Blockchain Technologies
Security for blockchains and Distributed Ledger Technologies (DLTs) has evolved quickly. As soon as interest in blockchain grew past crypto-currency into mainstream business applications, it became apparent that the core ledger would need to be augmented with permissions for access control, and encryption for confidentiality. But what few people appreciate is that these measures conflict with the rationale of the original blockchain algorithm, which was expressly meant to dispel administration layers. The first of my new papers looks at these tensions, what they mean for public and private blockchain systems, and paints a picture of third generation DLTs.
The uncomfortable marriage of ad hoc security and the early blockchain is indicative of a broader problem I've written about many times: too much blockchain "innovation" is proceeding with insufficient rigor. Which brings us to the second of my new papers. In the rush to apply blockchain to broader payments and real world assets, few entrepreneurs have been clear and precise about the problems they think they’re solving. If the R&D is not properly grounded, then the resulting solutions will be weak and will ultimately fail in the market. It must be appreciated that the original blockchain was only a prototype. Great care needs to be taken to learn from it and more rigorously adapt fast-evolving DLTs to enterprise needs.
Constellation ShortList™ for Distributed Ledger Technologies Labs
Finally, Constellation Research has launched a new product, the Constellation ShortList™. These are punchy lists by our analysts of leading technologies in dozens of different categories, which will each be refreshed on a short cycle. The objective is to help buyers of technology when choosing offerings in new areas.
My Constellation ShortList™ for blockchain-related solution providers is now available here.
What do land titles, marriage certificates, diamonds, ballots, aircraft parts and medical records have in common? They are all apparently able to be managed "on the blockchain". But enough with the metaphors. What does it mean to be "on the blockchain"?
To put a physical asset "on" the blockchain requires two mappings. Firstly, the asset needs to be mapped onto a token. For example, the serial number or barcode of a part or a diamond is inserted as metadata into a blockchain transaction, to codify the transfer of ownership of the asset. Secondly, asset owners need to be mapped onto their respective blockchain wallet public keys (through the sort of agent or third party which Nakamoto, let's remember, expressly tried to get rid of with the P2P consensus algorithm). The mapping can be pseudonymous, but buyers and sellers of land for instance, need to be confident that the counterparties control the keys they claim to.
How does the "naked" blockchain get away without these mappings? It's because Bitcoins don't exist off-chain. In fact they don't exist "on" the chain either; the blockchain itself only records subtractions and additions of balances.
Furthermore, possession of the private key is the only thing that matters with Bitcoin. Control a wallet's private key and you control the wallet balance. The protocol doesn't care who is in control; it will simply ensure that a quantity of Bitcoin will be transferred from one wallet to another, regardless of who "owns" them.
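A toy ledger in this spirit might look like the following Python sketch (a hypothetical structure for illustration, not Bitcoin's actual UTXO model). Balances exist only as the result of replaying every transfer between keys, and the protocol never asks who holds a key:

```python
from collections import defaultdict

# A toy ledger: it records only transfers between wallet public keys.
# "Balances" are just the running result of replaying every entry in
# order; nothing in the protocol identifies the people behind the keys.
ledger = [
    ("COINBASE", "key_A", 50),   # a mining reward mints new coin
    ("key_A",    "key_B", 20),
    ("key_B",    "key_C", 5),
]

def balances(entries):
    bal = defaultdict(int)
    for sender, recipient, amount in entries:
        if sender != "COINBASE":
            # The order of entries is what rules out overdrafts
            # (the ledger's equivalent of a double spend).
            assert bal[sender] >= amount, "double spend / overdraft"
            bal[sender] -= amount
        bal[recipient] += amount
    return dict(bal)

print(balances(ledger))  # {'key_A': 30, 'key_B': 15, 'key_C': 5}
```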
So unlike any other cryptographic security system, Bitcoin key pairs need not be imbued with any extrinsic significance, or associated with (bound to) any real world attributes. Bitcoins have no symbolic meaning. And in fact that is blockchain's magic trick!
But to make tokens stand for anything else - anything real - breaks the spell. Symbols are defined by authorities, and keys and attributes can only be assigned by third parties. If you have administrators, you just don't need the additional overhead of the blockchain, which exists purely to get around Nakamoto's express assumption that nobody in his system of electronic cash was to be trusted.
Bitcoin is often said to be anonymous, but its special property is actually that it has no meaning. It's truly amazing that such a thing can have value and be relied upon, which is a testament to its architecture. Blockchain was deliberately designed for a non-fiat cryptocurrency. It's brilliant yet very specific to its intended trustless environment. To re-introduce trusted processes simply undoes the benefits of blockchain.
I’ve been a critic of Blockchain. Frankly I’ve never seen such a massed rush of blood to the head for a new technology. Breathless books are being churned out about “trust infrastructure” and an “Internet of Value”. They say Blockchain will keep politicians and business people honest, and enable “billions of excluded people to enter the global economy”.
Most pundits overlook the simple fact that Blockchain only does one thing: it lets you move Bitcoin (a digital bearer token) from one account to another without an umpire. And it doesn’t even do that very well, for the Proof of Work algorithm is stupendously inefficient. Blockchain can't magically make merchants keep up their side of a bargain. Surprise! You can still get ripped off paying with Bitcoin. Blockchain simply doesn’t do what the futurists think it does. In their hot flushes, they tend to be caught in a limbo between the real possibilities of distributed consensus today and a future that no one is seeing clearly.
But Blockchain does solve what was thought to be an impossible problem, and in the right hands, that insight can convert to real innovation. I’m happy to see some safe pairs of hands now emerging in the Blockchain storm.
One example is an investment being made by Ping Identity in Swirlds and its new “hashgraph” distributed consensus platform. Hashgraph has been designed from the ground up to deliver many of Blockchain’s vital properties (consensus on the order of events, and redundancy) in a far more efficient and robust manner.
And what is Ping doing with this platform? Well they’re not rushing out with vague promises to manufacture "trust" but instead they’re taking baby steps on real problems in identity management. For starters, they’re applying the new hashgraph platform to Distributed Session Management (DSM). This is the challenge of verifiably shutting down all of a user’s multiple log-on sessions around the web when they take a break, suffer a hack, or lose their job. It's one of the great headaches of enterprise identity administration and is exploited in a great many cyberattacks.
Ping’s identity architects have carefully set out the problem they’re trying to solve, why it’s hard, and how existing approaches don’t deliver the desired security properties for session management. They then evaluated a number of consensus approaches - not just Blockchain but also Paxos and Raft – and discussed their limitations. The Ping team then landed on hashgraph, which appears to meet the needs, and also looks like it can deliver a range of advanced features.
In my view, Ping Identity’s work is the very model of mature security design. It’s an example of the care and attention to detail that other innovators should follow.
Swirlds founder Dr Leemon Baird will be presenting hashgraph in more detail to the Cloud Identity Summit in New Orleans tomorrow (June 7th).
In "We are hopelessly hooked" (New York Review of Books, February 25), political historian Jacob Weisberg canvasses the social impact of digital technology. He describes mobile and social media as “self-depleting and antisocial”, but I would prefer “different-social”: not merely for the vernacular, but because the new media's sadder side is a lot like what's gone before.
In reviewing four recent contributions to the field - from Sherry Turkle, Joseph Reagle and Nir Eyal - Weisberg dwells in various ways on the twee dichotomy of experience online and off. For many of us, the distinction between digital and "IRL" (the sardonic abbreviation of "in real life") is becoming entirely arbitrary, which I like to show through an anecdote.
I was a mid-career technology researcher and management consultant when I joined Twitter in 2009. It quickly supplanted all my traditional newsfeeds and bulletin boards, by connecting me to individuals who I came to trust to pass on what really mattered. More slowly, I curated my circles, built up a following, and came to enjoy the recognition that would ordinarily come from regular contact, had travel from far-flung Australia been affordable. By 2013 I had made it as a member of the “identerati” – a loose international community of digital identity specialists. Thus, on my first trip to the US in many years, I scored a cherished invitation to a private pre-conference party with 50 or so of these leaders.
On the night, as I made my way through unfamiliar San Francisco streets, I had butterflies. I had met just one of my virtual colleagues face-to-face. How would I be received “IRL”? The answer turned out to be: effortlessly. Not one person asked the obvious question – Steve, tell us about yourself! – for everyone knew me already. And this surprising ease wasn’t just about skipping formalities; I found we had genuine intimacy from years of sharing and caring, all on Twitter.
Weisberg quotes Joseph Reagle in "Reading the Comments..." looking for “intimate serendipity” in successful online communities. It seems both authors are overlooking how serendipity catalyses all human relationships. It’s always something random that turns acquaintances into friends. And happy accidents may be more frequent online, not in spite of all the noise but because of it. We all live for chance happenings, and the much-derided Fear Of Missing Out is not specific to kids nor the Internet. Down the generations, FOMO has always kept teenagers up past their bed time; but it’s also why we grown-ups outstay our welcome at dinner parties and hang out at dreary corporate banquets.
Weisberg considers Twitter’s decay into anarchy and despair to be inevitable, and he may be right, but is it simply for want of supervision? We know sudden social decay all too well; just think of the terribly real-life “Lord of the Flies”.
Sound moral bearings are set by good parents, good teachers, and – if we’re lucky – good peers. At this point in history, parents and teachers are famously less adept than their charges in the new social medium, but this will change. Digital decency will be better impressed on kids when all their important role models are online.
It takes a village to raise a child. The main problem today is that virtual villages are still at version 1.0.
We all know that digital transformation is imminent, but getting there is far from easy. The digital journey is fraught with challenges, not least of which is customer access. "Online" is not what it used to be; the online world by many measures is bigger than the “real world” and it’s certainly not just a special corner of a network we occasionally log into. Many customers spend a substantial part of their lives online. The very word "online" is losing its meaning, with offline becoming a very unusual state. So enterprises are finding they need to totally rethink customer identity, bringing together the perspectives of the CTO, for risk management and engineering, and the CMO, for the voice of the customer.
Consider this. The customer experience of online identity was set in concrete in the 1960s when information technology meant mainframes and computers only sat in “laboratories”. That was when we had the first network logon. The username and password was designed by sys admins for sys admins.
Passwords were never meant to be easy. Ease of use was irrelevant to system administrators; everything about their job was hard, and if they had to manage dozens of account identifiers, so be it. The security of a password depends on it being hard to remember and therefore, in a sense, hard to use. The efficacy of a password is in fact inversely proportional to its ease of use! Isn't that a unique property in all consumer technology?
The tragedy is that the same access paradigm has been inherited from the mainframe era and passed right on through the Age of the PCs in the 1980s, to the Internet in the 2000s. Before we knew it, we all turned into heavy duty “computer” users. The Personal Computer was always regarded as a miniaturized mainframe, with a graphical user interface layered over one or more arcane operating systems, from which consumers never really escaped.
But now all devices are computers. Famously, a phone today is more powerful than all of NASA’s 1969 moon landing IT put together. And the user experience of “computing” has finally changed, and radically so. Few people ever touch an operating system anymore. The whole UX is at the app level. What people know now is all tiles and icons, spoken commands, and gestures. Swipe, drag, tap, flick.
Identity management is probably the last facet of IT to be dragged out of the mainframe era. It's all thanks to mobility. We don’t "log on" anymore, we unlock our device. Occasionally we might be asked to confirm who we are before we do something risky, like look up a health record or make a larger payment. The engineer might call it “trust elevation” or some such but the user feels it’s like a reassuring double check.
We might even stop talking about “Two Factor Authentication” now the mobile is so ubiquitous. The phone is your second factor now, a constant part of your life, hardly ever out of sight, and instantly noticed if lost or stolen. And under the covers, mobile devices can make use of many other signals – history, location, activity, behaviour – to effect continuous or ambient authentication, and look out for misuse.
So the user experience of identity per se is melting away. We simply click on an app within an activated device and things happen. The authentication UX has been dictated for decades by technologists, but now, for the first time, the CTO and the CMO are on the same page when it comes to customer identity.
To explore these crucial trends, Ping Identity is hosting a webinar on June 2, Consumerization Killed the Identity Paradigm. To learn more about customer identity and how to implement it successfully in your enterprise, please join me and Ping Identity’s CTO Patrick Harding and CMO Brian Bell.
For the past few years, a crucial case has been playing out in Australia's legal system over the treatment of metadata in privacy law. The next chapter is due to be written soon in the Federal Court.
It all began when a journalist with a keen interest in surveillance, Ben Grubb, wanted to understand the breadth and depth of metadata, and so requested that mobile network operator Telstra provide him a copy of his call records. Grubb sought to exercise his right to access Personal Information under the Privacy Act. Telstra held back much of Grubb's call data, arguing that metadata is not Personal Information and so is not subject to the access principle. Grubb appealed to the Australian Privacy Commissioner, who ruled that metadata is identifiable and hence represents Personal Information. Telstra took the case to the Administrative Appeals Tribunal, which found in Telstra's favour on a surprising interpretation of "Personal Information". The Commissioner then appealed to the next legal authority up the line.
At yesterday's launch of Privacy Awareness Week in Sydney, the Privacy Commissioner Timothy Pilgrim informed us that the full bench of the Federal Court is due to consider the case in August. This could be significant for data privacy law worldwide, for it all goes to the reach of these sorts of regulations.
I always thought the nuance in Personal Information was in the question of "identifiability" -- which could be contested case by case -- and those good old ambiguous legal modifiers like 'reasonably' or 'readily'. So it was a great surprise that the Administrative Appeals Tribunal, in overruling the Privacy Commissioner in Ben Grubb v Telstra, was exercised instead by the meaning of the word "about".
Recall that the Privacy Act (as amended in 2012) defines Personal Information as:
- "Information or an opinion about an identified individual, or an individual who is reasonably identifiable: (a) whether the information or opinion is true or not; and (b) whether the information or opinion is recorded in a material form or not."
The original question at the heart of Grubb v Telstra was whether mobile phone call metadata falls under this definition. Commissioner Pilgrim found that call metadata is identifiable to the caller (especially by the phone company itself, which keeps extensive records linking metadata to customer records) and therefore counts as Personal Information.
When it reviewed the case, the tribunal agreed with Pilgrim that the metadata was identifiable, but in a surprise twist, found that the metadata is not actually about Ben Grubb but instead is about the services provided to him.
- Once his call or message was transmitted from the first cell that received it from his mobile device, the [metadata] that was generated was directed to delivering the call or message to its intended recipient. That data is no longer about Mr Grubb or the fact that he made a call or sent a message or about the number or address to which he sent it. It is not about the content of the call or the message ... It is information about the service it provides to Mr Grubb but not about him. See AATA 991 (18 December 2015) paragraph 112.
To me it's passing strange that information about calls made by a person is not also regarded as being about that person. Can information not be about more than one thing, namely about a customer's services and the customer?
Think about what metadata can be used for, and how broadly-framed privacy laws are meant to stem abuse. If Ben Grubb was found, for example, to have repeatedly called the same Indian takeaway shop, would we not infer something about him and his taste for Indian food? Even if he called the takeaway shop just once and never again, we might still conclude something about him, small as the sample is: perhaps that he doesn't like Indian food (remember that in Australian law, Personal Information doesn't necessarily have to be correct).
By the AAT's logic, a doctor's appointment book would not represent any Personal Information about her patients but only information about the services she has delivered to them. But in fact the appointment list of an oncologist, for instance, would tell us a lot about her patients' cancer.
Given the many ways that metadata can invade our privacy (not to mention that people may be killed based on metadata) it's important that the definition of Personal Information be broad, and that it has a low threshold. Any amount of metadata tells us something about the person.
I appreciate that the 'spirit of the law' is not always what matters, but let's compare the definition of Personal Information in Australia with corresponding concepts elsewhere (see more detail below). In the USA, Personally Identifiable Information is any data that may "distinguish" an individual; in the UK, Personal Data is anything that "relates" to an individual; in Germany, it is anything "concerning" someone. Clearly the intent is consistent worldwide: if data can be linked to a person, then it comes under data privacy law.
Which is how it should be. Technology-neutral privacy law is framed broadly in the interests of consumer protection. I hope that the Federal Court, in drilling into the definition of Personal Information, upholds what the Privacy Act is for.
Personal Information definitions around the world.
Personal Information, Personal Data and Personally Identifiable Information are variously and more or less equivalently defined as follows (references are hyperlinked in the names of each country):
- data which relate to a living individual who can be identified
- any information concerning the personal or material circumstances of an identified or identifiable individual
- information about an identifiable individual
- information which can be used to distinguish or trace an individual's identity ...
- information or an opinion ... about an identified individual, or an individual who is reasonably identifiable.