I was discussing definitions of Personally Identifiable Information (PII) with some lawyers today, one of whom took exception to the US General Services Administration definition: information that can be used to distinguish or trace an individual’s identity, either alone or when combined with other personal or identifying information that is linked or linkable to a specific individual". This lawyer concluded rather hysterically that under such a definition, "nobody can use the internet without a violation".
Similarly, I've seen engineers in Australia recoil at the possibility that IP and MAC Addresses might be treated as PII because it is increasingly easy to link them to the names of device owners. I was recently asked "Why are they stopping me collecting IP addresses?". The answer is, they're not.
There are a great many misconceptions about privacy, but the idea that 'if it's personal you can't use it' is by far the worst.
Nothing in any broad-based data privacy law I know of says personal information cannot be collected or used.
Rather, what data privacy laws actually say is: if you're collecting and using PII, be careful.
Privacy is about restraint. The general privacy laws of Australia, Europe and 100-odd countries say things like don't collect PII without consent, don't collect PII beyond what you demonstrably need, don't use PII collected for one purpose for other unrelated purposes, tell individuals if you can what PII you hold about them, give people access to the PII you have, and do not retain PII for longer than necessary.
Such rules are entirely reasonable, and impose marginal restrictions on the legitimate conduct of business. And they align very nicely with standard security practice which promotes the Need To Know principle and the Principle of Least Privilege.
Compliance with Privacy Principles does add some overhead to data management compared with anonymous data. If re-identification techniques and ubiquitous inter-connectedness means that hardly any data is going to stay anonymous anymore, then yes, privacy laws mean that data should be treated more cautiously than was previously the case. And what exactly is wrong with that?
If data is the new gold then it's time data custodians took more care.
Few technologies are so fundamental and yet so derided at the same time as public key infrastructure. PKI is widely thought of as obsolete or generically intrusive yet it is ubiquitous in SIM cards, SSL, chip and PIN cards, and cable TV. Technically, public key infrastructure Is a generic term for a management system for keys and certificates; there have always been endless ways to build PKIs (note the plural) for different communities, technologies, industries and outcomes. And yet “PKI” has all too often come to mean just one way of doing identity management. In fact, PKI doesn’t necessarily have anything to do with identity at all.
This blog is an edited version of a feature I once wrote for SC Magazine. It is timely in the present day to re-visit the principles that make for good PKI implementations and contextualise them in one of the most contemporary instances of PKI: the FIDO Alliance protocols for secure attribute management. In my view, FIDO realises PKI ‘as nature intended’.
In their earliest conceptions in the early-to-mid 1990s, digital certificates were proposed to authenticate nondescript transactions between parties who had never met. Certificates were construed as the sole means for people to authenticate one another. Most traditional PKI was formulated with no other context; the digital certificate was envisaged to be your all-purpose digital identity.
Orthodox PKI has come in for spirited criticism. From the early noughties, many commentators pointed to a stark paradox: online transaction volumes and values were increasing rapidly, in almost all cases without the help of overt PKI. Once thought to be essential, with its promise of "non repdudiation", PKI seemed anything but, even for significant financial transactions.
There were many practical problems in “big” centralised PKI models. The traditional proof of identity for general purpose certificates was intrusive; the legal agreements were complex and novel; and private key management was difficult for lay people. So the one-size-fits-all electronic passport failed to take off. But PKI's critics sometimes throw the baby out with the bathwater.
In the absence of any specific context for its application, “big” PKI emphasized proof of personal identity. Early certificate registration schemes co-opted identification benchmarks like that of the passport. Yet hardly any regular business transactions require parties to personally identify one another to passport standards.
”Electronic business cards”
Instead in business we deal with others routinely on the basis of their affiliations, agency relationships, professional credentials and so on. The requirement for orthodox PKI users to submit to strenuous personal identity checks over and above their established business credentials was a major obstacle in the adoption of digital certificates.
It turns out that the 'killer applications' for PKI overwhelmingly involve transactions with narrow contexts, predicated on specific credentials. The parties might not know each other personally, but invariably they recognize and anticipate each other's qualifications, as befitting their business relationship.
Successful PKI came to be characterized by closed communities of interest, prior out-of-band registration of members, and in many cases, special-purpose application software featuring additional layers of context, security and access controls.
So digital certificates are much more useful when implemented as application-specific 'electronic business cards,' than as one-size-fits-all electronic passports. And, by taking account of the special conditions that apply to different e-business processes, we have the opportunity to greatly simplify the registration processes, user experience and liability arrangements that go with PKI.
The real benefits of digital signatures
There is a range of potential advantages in using PKI, including its cryptographic strength and resistance to identity theft (when implemented with private keys in hardware). Many of its benefits are shared with other technologies, but at least two are unique to PKI.
First, digital signatures provide robust evidence of the origin and integrity of electronic transactions, persistent over time and over 'distance’ (that is, the separation of sender and receiver). This greatly simplifies audit logging, evidence collection and dispute resolution, and cuts the future cost of investigation and fraud. If a digitally signed document is archived and checked at a later date, the quality of the signature remains undiminished over many years, even if the public key certificate has long since expired. And if a digitally signed message is passed from one relying party to another and on to many more, passing through all manner of intermediate systems, everyone still receives an identical, verifiable signature code authenticating the original message.
Electronic evidence of the origin and integrity of a message can, of course, be provided by means other than a digital signature. For example, the authenticity of typical e-business transactions can usually be demonstrated after the fact via audit logs, which indicate how a given message was created and how it moved from one machine to another. However, the quality of audit logs is highly variable and it is costly to produce legally robust evidence from them. Audit logs are not always properly archived from every machine, they do not always directly evince data integrity, and they are not always readily available months or years after the event. They are rarely secure in themselves, and they usually need specialists to interpret and verify them. Digital signatures on the other hand make it vastly simpler to rewind transactions when required.
Secondly, digital signatures and certificates are machine readable, allowing the credentials or affiliations of the sender to be bound to the message and verified automatically on receipt, enabling totally paperless transacting. This is an important but often overlooked benefit of digital signatures. When processing a digital certificate chain, relying party software can automatically tell that:
- the message has not been altered since it was originally created
- the sender was authorized to launch the transaction, by virtue of credentials or other properties endorsed by a recognized Certificate Authority
- the sender's credentials were valid at the time they sent the message; and
- the authority which signed the certificate was fit to do so.
One reason we can forget about the importance of machine readability is that we have probably come to expect person-to-person email to be the archetypal PKI application, thanks to email being the classic example to illustrate PKI in action. There is an implicit suggestion in most PKI marketing and training that, in regular use, we should manually click on a digital signature icon, examine the certificate, check which CA issued it, read the policy qualifier, and so on. Yet the overwhelming experience of PKI in practice is that it suits special purpose and highly automated applications, where the usual receiver of signed transactions is in fact a computer.
Characterising good applications
Reviewing the basic benefits of digital signatures allows us to characterize the types of e-business applications that merit investment in PKI.
Applications for which digital signatures are a good fit tend to have reasonably high transaction volumes, fully automatic or straight-through processing, and multiple recipients or multiple intermediaries between sender and receiver. In addition, there may be significant risk of dispute or legal ramifications, necessitating high quality evidence to be retained over long periods of time. These include:
- Tax returns
- Customs reporting
- E-health care
- Financial trading
- Electronic conveyancing
- Superannuation administration
- Patent applications.
This view of the technology helps to explain why many first-generation applications of PKI were problematic. Retail internet banking is a well-known example of e-business which flourished without the need for digital certificates. A few banks did try to implement certificates, but generally found them difficult to use. Most later reverted to more conventional access control and backend security mechanisms.Yet with hindsight, retail funds transfer transactions did not have an urgent need for PKI, since they could make use of existing backend payment systems. Funds transfer is characterized by tightly closed arrangements, a single relying party, built-in limits on the size of each transaction, and near real-time settlement. A threat and risk assessment would show that access to internet banking can rest on simple password authentication, in exactly the same way as antecedent phone banking schemes.
Trading complexity for specificity
As discussed, orthodox PKI was formulated with the tacit assumption that there is no specific context for the transaction, so the digital certificate is the sole means for authenticating the sender. Consequently, the traditional schemes emphasized high standards of personal identity, exhaustive contracts and unusual legal devices like Relying Party Agreements. They also often resorted to arbitrary 'reliance limits,' which have little meaning for most of the applications listed on the previous page. Notoriously, traditional PKI requires users to read and understand certification practice statements (CPS).
All that overhead stemmed from not knowing what the general-purpose digital certificate was going to be used for. On the other hand, if particular digital certificates are constrained to defined applications, then the complexity surrounding their specific usage can be radically reduced.
The role of PKI in all contemporary 'killer applications' is fundamentally to help automate the online processing of electronic transactions between parties with well-defined credentials. This is in stark contrast to the way PKI has historically been portrayed, where strangers Alice and Bob use their digital certificates to authenticate context-free general messages, often presumed to be sent by email. In reality, serious business messages are never sent stranger-to-stranger with no context or cues as to the parties' legitimacy.
Using generic email is like sending a fax on plain paper. Instead, business messaging is usually highly structured. Parties have an expectation that only certain types of transactions are going to occur between them and they equip themselves accordingly (for instance, a health insurance office is not set up to handle tax returns). The sender is authorized to act in defined types of transactions by virtue of professional credentials, a relevant license, an affiliation with some authority, endorsement by their employer, and so on. And the receiver recognizes the source of those credentials. The sender and receiver typically use prescribed forms and/or special purpose application software with associated user agreements and license conditions, adding context and additional layers of security around the transaction.
PKI got smart
When PKI is used to help automate the online processing of transactions between parties in the context of an existing business relationship, we should expect the legal arrangements between the parties to still apply. For business applications where digital certificates are used to identify users in specific contexts, the question of legal liability should be vastly simpler than it is in the general purpose PKI scenario where the issuer does not know what the certificates might be used for.
The new vision for PKI means the technology and processes should be no more of a burden on the user than a bank card. Rather than imagine that all public key certificates are like general purpose electronic passports, we can deploy multiple, special purpose certificates, and treat them more like electronic business cards. A public key certificate issued on behalf of a community of business users and constrained to that community can thereby stand for any type of professional credential or affiliation.
We can now automate and embed the complex cryptography deeply into smart devices -- smartcards, smart phones, USB keys and so on -- so that all terms and conditions for use are application focused. As far as users are concerned, a smartcard can be deployed in exactly the same way as any magnetic stripe card, without any need to refer to - or be limited by - the complex technology contained within (see also Simpler PKI is on the cards). Any application-specific smartcard can be issued under rules and controls that are fit for their purpose, as determined by the community of users or an appropriate recognized authority. There is no need for any user to read a CPS. Communities can determine their own evidence-of-identity requirements for issuing cards, instead of externally imposed personal identity checks. Deregulating membership rules dramatically cuts the overheads traditionally associated with certificate registration.
Finally, if we constrain the use of certificates to particular applications then we can factor the intended usage into PKI accreditation processes. Accreditation could then allow for particular PKI scheme rules to govern liability. By 'black-boxing' each community's rules and arrangements, and empowering the community to implement processes that are fit for its purpose, the legal aspects of accreditation can be simplified, reducing one of the more significant cost components of the whole PKI exercise (having said that, it never ceases to amaze how many contemporary healthcare PKIs still cling onto face-to-face passport grade ID proofing as if that's the only way to do digital certificates).
The preceding piece is a lightly edited version of the article ”Rethinking PKI” that first appeared in Secure Computing Magazine in 2003. Now, over a decade later, we’re seeing the same principles realised by the FIDO Alliance.
The FIDO protocols U2F and UAF enable specific attributes of a user and their smart devices to be transmitted to a server. Inherent to the FIDO methods are digital certificates that confer attributes and not identity, relatively large numbers of private keys stored locally in the users’ devices (and without the users needing to be aware of them as such) and digital signatures automatically applied to protocol messages to bind the relevant attributes to the authentication exchanges.
Surely, this is how PKI should have been deployed all along.
A repeated refrain of cynics and “infomopolists” alike is that privacy is dead. People are supposed to know that anything on the Internet is up for grabs. In some circles this thinking turns into digital apartheid; some say if you’re so precious about your privacy, just stay offline.
But socialising and privacy are hardly mutually exclusive; we don’t walk around in public with our names tattooed on our foreheads. Why can’t we participate in online social networks in a measured, controlled way without submitting to the operators’ rampant X-ray vision? There is nothing inevitable about trading off privacy for conviviality.
The privacy dangers in Facebook and the like run much deeper than the self-harm done by some peoples’ overly enthusiastic sharing. Promiscuity is actually not the worst problem, neither is the notorious difficulty of navigating complex and ever changing privacy settings.
The advent of facial recognition presents far more serious and subtle privacy challenges.
Facebook has invested heavily in face recognition technology, and not just for fun. Facebook uses it in effect to crowd-source the identification and surveillance of its members. With facial recognition, Facebook is building up detailed pictures of what people do, when, where and with whom.
You can be tagged without consent in a photo taken and uploaded by a total stranger.
The majority of photos uploaded to personal albums over the years were not intended for anything other than private viewing.
Under the privacy law of Australia and data protection regulations in dozens of other jurisdictions, what matters is whether data is personally identifiable. The Commonwealth Privacy Act 1988 (as amended in 2014) defines “Personal Information” as: “information or an opinion about an identified individual, or an individual who is reasonably identifiable”.
Whenever Facebook attaches a member’s name to a photo, they are converting hitherto anonymous data into Personal Information, and in so doing, they become subject to privacy law. Automated facial recognition represents an indirect collection of Personal Information. However too many people still underestimate the privacy implications; some technologists naively claim that faces are “public” and that people can have no expectation of privacy in their facial images, ignoring that information privacy as explained is about the identifiability and identification of data; the words “public” and “private” don’t even figure in the Privacy Act!
If a government was stealing into our photo albums, labeling people and profiling them, there would be riots. It's high time that private sector surveillance - for profit - is seen for what it is, and stopped.
Ed Snowden was interviewed today as part of the New Yorker festival. This TechCruch report says Snowden "was asked a couple of variants on the question of what we can do to protect our privacy. His first answer called for a reform of government policies." He went on to add some remarks about Google, Facebook and encryption and that's what the report chose to focus on. The TechCrunch headline: "Snowden's Privacy Tips".
Mainstream and even technology media reportage does Snowden a terrible disservice and takes the pressure off from government policy.
I've listened to the New Yorker online interview. After being asked by a listener what they should do about privacy, Snowden gave a careful, nuanced, and comprehensive answer over five minutes. His very first line was this is an incredibly complex topic and he did well to stick to plain language throughout. He canvassed a great many issues including: the need for policy reform, the 'Nothing to Hide' argument, the inversion of civil rights when governments ask us to justify the right to be left alone, the collusion of companies and governments, the poor state of product security and usability, the chilling effect on industry of government intervention in security, metadata, and the radicalisation of computer scientists today being comparable with physicists in the Cold War.
Only after all that, and a follow up question about 'ordinary people', did Snowden say 'don't use Dropbox'.
Consistently, when Snowden is asked what to do about privacy, his answers are primarily about politics not technology. When pressed, he dispenses the odd advice about using Tor and disk encryption, but Snowden's chief concerns (as I have discussed in depth previously) are around accountability, government transparency, better cryptology research, better security product quality, and so on. He is no hacker.
I am simply dismayed how Snowden's sophisticated analyses are dumbed down to security tips. He has never been a "cyber Agony Aunt". The proper response to NSA overreach has to be agitation for regime change, not do-it-yourself cryptography. That is Snowden's message.
Tonight, Australian Broadcasting Corporation’s Four Corners program aired a terrific special, "Privacy Lost" written and produced by Martin Smith from the US public broadcaster PBS’s Frontline program.
Here we have a compelling demonstration of the importance and primacy of Collection Limitation for protecting our privacy.
UPDATE: The program we saw in Australia turns out to be a condensed version of PBS's two part The United States of Secrets from May 2014.
About the program
Martin Smith summarises brilliantly what we know about the NSA’s secret surveillance programs, thanks to the revelations of Ed Snowden, the Guardian’s Glenn Greenwald and the Washington Post’s Barton Gellman; he holds many additional interviews with Julia Angwin (author of “Dragnet Nation”), Chris Hoofnagle (UC Berkeley), Steven Levy (Wired), Christopher Soghoian (ACLU) and Tim Wu (“The Master Switch”), to name a few. Even if you’re thoroughly familiar with the Snowden story, I highly recommend “Privacy Lost” or the original "United States of Secrets" (which unlike the Four Corners edition can be streamed online).
The program is a ripping re-telling of Snowden’s expose, against the backdrop of George W. Bush’s PATRIOT Act and the mounting suspicions through the noughties of NSA over-reach. There are freshly told accounts of the intrigues, of secret optic fibre splitters installed very early on in AT&T’s facilities, scandals over National Security Letters, and the very rare case of the web hosting company Calyx who challenged their constitutionality (and yet today, with the letter withdrawn, remains unable to tell us what the FBI was seeking). The real theme of Smith’s take on surveillance then emerges, when he looks at the rise of data-driven businesses -- first with search, then advertising, and most recently social networking -- and the “data wars” between Google, Facebook and Microsoft.
In my view, the interplay between government surveillance and digital businesses is the most important part of the Snowden epic, and it receives the proper emphasis here. The depth and breadth of surveillance conducted by the private sector, and the insights revealed about what people might be up to creates irresistible opportunities for the intelligence agencies. Hoofnagle tells us how the FBI loves Facebook. And we see the discovery of how the NSA exploits the tracking that’s done by the ad companies, most notably Google’s “PREF” cookie.
One of the peak moments in “Privacy Lost” comes when Gellman and his specialist colleague Ashkan Soltani present their evidence about the PREF cookie to Google – offering an opportunity for the company to comment before the story is to break in the Washington Post. The article ran on December 13, 2013; we're told it was then the true depth of the privacy problem was revealed.
My point of view
Smith takes as a given that excessive intrusion into private affairs is wrong, without getting into the technical aspects of privacy (such as frameworks for data protection, and various Privacy Principles). Neither does he unpack the actual privacy harms. And that’s fine -- a TV program is not the right place to canvass such technical arguments.
When Gellman and Soltani reveal that the NSA is using Google’s tracking cookie, the government gets joined irrefutably to the private sector in a mass surveillance apparatus. And yet I am not sure the harm is dramatically worse when the government knows what Facebook and Google already know.
Privacy harms are tricky to work out. Yet obviously no harm can come from abusing Personal Information if that information is not collected in the first place! I take away from “Privacy Lost” a clear impression of the risks created by the data wars. We are imperiled by the voracious appetite of digital businesses that hang on indefinitely to masses of data about us, while they figure out ever cleverer ways to make money out of it. This is why Collection Limitation is the first and foremost privacy protection. If a business or government doesn't have a sound and transparent reason for having Personal Information about us, then they should not have it. It’s as simple as that.
Martin Smith has highlighted the symbiosis between government and private sector surveillance. The data wars not only made dozens of billionaires but they did much of the heavy lifting for the NSA. And this situation is about to get radically more fraught. On the brink of the Internet of Things, we need to question if we want to keep drowning in data.
The "Right to be Forgotten" debate reminds me once again of the cultural differences between technology and privacy.
On September 30, I was honoured to be part of a panel discussion hosted by the IEEE on RTBF; a recording can be viewed here. In a nutshell, the European Court of Justice has decided that European citizens have the right to ask search engine businesses to suppress links to personal information, under certain circumstances. I've analysed and defended the aims of the ECJ in another blog.
One of the IEEE talking points was why RTBF has attracted so much scorn. My answer was that some critics appear to expect perfection in the law; when they look at the RTBF decision, all they see is problems. Yet nobody thinks this or any law is perfect; the question is whether it helps improve the balance of rights in a complex and fast changing world.
It's a little odd that technologists in particular are so critical of imperfections in the law, when they know how flawed is technology. Indeed, the security profession is almost entirely concerned with patching problems, and reminding us there will never be perfect security.
Of course there will be unwanted side-effects of the new RTBF rules and we should trust that over time these will be reviewed and dealt with. I wish that privacy critics could be more humble about this unfolding environment. I note that when social conservatives complain about online pornography, or when police decry encryption as a tool of criminals, technologists typically play those problems down as the unintended consequences of new technologies, which on average overwhelmingly do good not evil.
And it's the same with the law. It really shouldn't be necessary to remind anyone that laws have unintended consequences, for they are the stuff of the entire genre of courtroom drama. So everyone take heart: the good guys nearly always win in the end.
The European Court of Justice recently ruled on the so-called "Right to be Forgotten" granting members of the public limited rights to request that search engines like Google suppress links to Personal Information under some circumstances. The decision has been roundly criticised by technologists and by American libertarians -- acting out the now familiar ritualised privacy arguments around human rights, freedom of speech, free market forces and freedom to innovate (and hence the bad pun in the title of this article). Surprisingly even some privacy advocates like Jules Polonetsky (quoted in The New Yorker) has a problem with the ECJ judgement because he seems to think it's extremist.
Of the various objections, the one I want to answer here is that search engines should not have to censor "facts" retrieved from the "public domain".
On September 30, I am participating in a live panel discussion of the Right To Be Forgotten, hosted by the IEEE; you can register here and download a video recording of the session later.
Update: recording now available here.
In an address on August 18, the European Union's Justice Commissioner Martine Reicherts made the following points about the Right to be Forgotten (RTBF):
- "[The European Court of Justice] said that individuals have the right to ask companies operating search engines to remove links with personal information about them -- under certain conditions. This applies when information is inaccurate, for example, or inadequate, irrelevant, outdated or excessive for the purposes of data processing. The Court explicitly ruled that the right to be forgotten is not absolute, but that it will always need to be balanced against other fundamental rights, such as the freedom of expression and the freedom of the media -- which, by the way, are not absolute rights either".
In the current (September 29, 2014) issue of New Yorker, senior legal analyst Jeffrey Toobin looks at RTBF in the article "The Solace of Oblivion". It's a balanced review of a complex issue, which acknowledges the transatlantic polarization of privacy rights and freedom of speech.
Toobin interviewed Kent Walker, Google's general counsel. Walker said Google likes to think of itself as a "card catalogue": "We don't create the information. We make it accessible. A decision like [the ECJ's], which makes us decide what goes inside the card catalogue, forces us into a role we don't want."
But there's a great deal more to search than Walker lets on.
Google certainly does create fresh Personal Information, and in stupendous quantities. Their search engine is the bedrock of a hundred billion dollar business, founded on a mission to "organize the world's information". Google search is an incredible machine, the result of one of the world's biggest ever and ongoing software R&D projects. Few of us now can imagine life without Internet search and instant access to limitless information that would otherwise be utterly invisible. Search really is magic – just as Arthur C. Clarke said any sufficiently advanced technology would be.
On its face therefore, no search result is a passive reproduction of data from a "public domain". Google makes the public domain public.
But while search is free, it is hyper profitable, for the whole point of it is to underpin a gigantic advertising business. The search engine might not create the raw facts and figures in response to our queries, but it covertly creates and collects symbiotic metadata, complicating the picture. Google monitors our search histories, interests, reactions and habits, as well as details of the devices we're using, when and where and even how we are using them, all in order to divine our deep predilections. These insights are then provided in various ways to Google's paying customers (advertisers) and are also fed back into the search engine, to continuously tune it. The things we see courtesy of Google are shaped not only by their page ranking metrics but also by the company's knowledge of our preferences (which it forms by watching us across the whole portfolio of search, Gmail, maps, YouTube, and the Google+ social network). When we search for something, Google tries to predict what we really want to know.
In the modern vernacular, Google hacks the public domain.
The collection and monetization of personal metadata is inextricably linked to the machinery of search. The information Google serves up to us is shaped and transformed to such an extent, in the service of Google's business objectives, that it should be regarded as synthetic and therefore the responsibility of the company. Their search algorithms are famously secret, putting them beyond peer review; nevertheless, there is a whole body of academic work now on the subtle and untoward influences that Google exerts as it filters and shapes the version of reality it thinks we need to see.
Some objections to the RTBF ruling see it as censorship, or meddling with the "truth". But what exactly is the state of the truth that Google purportedly serves up? Search results are influenced by many arbitrary factors of Google's choosing; we don't know what those factors are, but they are dictated by Google's business interests. So in principle, why is an individual's interests in having some influence over search results any less worthy than Google's? The "right to be forgotten" is an unfortunate misnomer: it is really more of a 'limited right to have search results filtered differently'.
If Google's machinery reveals Personal Information that was hitherto impossible to find, then why shouldn't it at least participate in protecting the interests of the people affected? I don't deny that modern technology and hyper-connectivity creates new challenges for the law, and that traditional notions of privacy may be shifting. But it's not a step-change, and in the meantime, we need to tread carefully. There are as many unintended consequences and problems in the new technology as there are in the established laws. The powerful owners of benefactors of these technologies should accept some responsibility for the privacy impacts. With its talents and resources, Google could rise to the challenge of better managing privacy, instead of pleading that it's not their problem.
Another week, another security collaboration launch!
"Simply Secure" calls itself “a small but growing organization [with] expertise in usability research, design, software development, and product management". Their mission has to do with improving the security functions that built-in so badly in most software today. Simply Secure is backed by Google and Dropbox, and supported by a diverse advisory board.
It's early days (actually early day, singular) so it might be churlish to point out that Simply Secure's strategic messaging is a little uneven ... except that the words being used to describe it shed light on the clarity of the thinking.
My first exposure to Simply Secure came last night, when I read an article in the Guardian by Cory Doctorow (who is one of their advisers). Doctorow places enormous emphasis on privacy; the word “privacy" outnumbers “security" 16 to three in the body of his column. Another admittedly shorter report about the launch by The Next Web doesn't mention privacy at all. And then there's the Simply Secure blog post, which cites privacy a great deal but every single time in conjunction with security, as in “security and privacy". That repeated phrasing conveys, to me at least, some discomfort. As I say, it's early days and the team is doubtless sorting out how to weigh and progress these closely related objectives.
But I hope they do it quickly. On the face of it, Simply Secure might only scratch the surface of privacy.
Doctorow's Guardian article is mostly concerned with encryption and the terrible implementations that have plagued us since the dawn of the Internet. It's definitely important that we improve here – and radically. If the Simply Secure initiative does nothing but make encryption easier to integrate into commodity software, that would be a great thing. I'm all for it. But it won't necessarily or even probably lead to better privacy, because privacy is about restraint not secrecy or anonymity.
As we go about our lives, we actually want to be known by others, but we want those who know us to be restrained in what they do with the knowledge they have about us. Privacy is the protection you need when your affairs are not secret.
I know Doctorow knows this – I've seen his terrific little speech on the steps on Comic-Con about PRISM. So I'm confused by his focus on cryptography.
How far does encryption get us? If we're using social networks, or if we're shopping and opting in to loyalty programs or selected targeted marketing, or if we're sharing our medical records with relatives, medicos, hospitals and researchers, then encryption becomes moot. We need mechanisms to restrain what the receivers of our personal information do with it. We all know the business model at work behind “free" online services; using encryption to protect privacy in social networking for instance would be like using an armoured van to deliver your valuables to Bernie Madoff.
Another limitation of user-centric or user-managed encryption has to do with Big Data. A great deal of personal information about us is created and collected unseen behind our backs, by sensors, and by analytics processes than manage to work out who we are by linking disparate data streams together. How could SS ameliorate those sorts of problems? If the SS vision includes encryption at rest as well as in transit, then how will the user control or even see all the secondary uses of their encrypted personal information?
There's a combativeness in Doctorow's explanation of Simply Secure and his tweets from yesterday on the topic. His aim is expressly to thwart the surveillance state, which in his view includes a symbiosis (if not conspiracy) between government and internet companies, where the former gets their dirty work done by the latter. I'm sure he and I both find that abhorrent in equal measure. But I argue the proper response to these egregious behaviours is political not technological (and political in the broad sense; I love that Snowden talks as much about accountability, legal processes, transparency and research as he does about encryption). If you think the government is exploiting the exploiters, then DIY encryption is a pretty narrow counter-measure. This is not the sort of society we want to live in, so let's work to change the establishment, rather than try to take it on in a crypto shoot-out.
Yes security technology is important but it's not nearly as important for privacy as the Rule of Law. Data privacy regimes instil restraint. The majority of businesses come to know that they are not at liberty to over-collect personal information, nor to re-use personal information unexpectedly and without consent. A minority of organisations flout data privacy principles, for example by slyly refining raw data into valuable personal knowledge, exploiting the trust citizens and users put in them. Some of these outfits flourish in the United States – the Canary Islands of privacy. Worldwide, the policing of privacy is patchy indeed, yet there have been spectacular legal victories in Europe and elsewhere against the excessive practices of really big companies like Facebook with their biometric data mining of photo albums, and Google's drift net-like harvesting of traffic from unencrypted Wi-Fi networks.
Pragmatically, I'm afraid encryption is such a fragile privacy measure. Once secrecy is penetrated, we need regulations to stem exploitation of our personal information.
By all means, let's improve cryptographic engineering and I wish the Simply Secure initiative all the best. So long as they don't call security privacy.
I have a new academic paper due to be published in October, in the Australian Journal of Telecommunications and the Digital Economy. Here is an extract.
Update: see Telecommunications Society members page.
The collision between Big Data and privacy law
We live in an age where billionaires are self-made on the back of the most intangible of assets – the information they have about us. The digital economy is awash with data. It's a new and endlessly re-useable raw material, increasingly left behind by ordinary people going about their lives online. Many information businesses proceed on the basis that raw data is up for grabs; if an entrepreneur is clever enough to find a new vein of it, they can feel entitled to tap it in any way they like. However, some tacit assumptions underpinning today's digital business models are naive. Conventional data protection laws, older than the Internet, limit how Personal Information is allowed to flow. These laws turn out to be surprisingly powerful in the face of 'Big Data' and the 'Internet of Things'. On the other hand, orthodox privacy management was not framed for new Personal Information being synthesised tomorrow from raw data collected today. This paper seeks to bridge a conceptual gap between data analytics and privacy, and sets out extended Privacy Principles to better deal with Big Data.
'Big Data' is a broad term capturing the extraction of knowledge and insights from unstructured data. While data processing and analysis is as old as computing, the term 'Big Data' has recently attained special meaning, thanks to the vast rivers of raw data that course unseen through the digital economy, and the propensity for entrepreneurs to tap that resource for their own profit, or to build new analytic tools for enterprises. Big Data represents one of the biggest challenges to privacy and data protection society has seen. Never before has so much Personal Information been available so freely to so many.
Big Data promises vast benefits for a great many stakeholders (Michael & Miller 2013: 22-24) but the benefits may be jeopardized by the excesses of a few overly zealous businesses. Some online business models are propelled by a naive assumption that data in the 'public domain' is up for grabs. Many think the law has not kept pace with technology, but technologists often underestimate the strength of conventional data protection laws and regulations. In particular, technology neutral privacy principles are largely blind to the methods of collection, and barely distinguish between directly and indirectly collected data. As a consequence, the extraction of Personal Information from raw data constitutes an act of collection and as such is subject to longstanding privacy statutes. Privacy laws such as that of Australia don't even use the words 'public' and 'private' to qualify the data flows concerned (Privacy Act 1988).
On the other hand, orthodox privacy policies and static data usage agreements do not cater for the way Personal Information can be synthesised tomorrow from raw data collected today. Privacy management must evolve to become more dynamic, instead of being preoccupied with unwieldy policy documents and simplistic technical notices about cookies.
Thus the fit between Big Data and data privacy standards is complex and sometimes surprising. While existing laws are not to be underestimated, there is a need for data privacy principles to be extended, to help individuals remain abreast of what's being done with information about them, and to foster transparency regarding the new ways for personal information to be generated.
Conclusion: Making Big Data privacy real
A Big Data dashboard like the one described could serve several parallel purposes in aid of progressive privacy principles. It could reveal dynamically to users what PII can be collected about them through Big Data; it could engage users in a fair and transparent exchange of value-for-PII transaction; and it could enable dynamic consent where users are able to opt in to Big Data processes, and opt out and in again, over time, as their understanding of the PII bargain evolves.
Big Data holds big promises, for the benefit of many. There are grand plans for population-wide electronic health records, new personalised financial services that leverage massive retail databases, and electricity grid management systems that draw on real-time consumption data from smart meters in homes, to extend the life of aging 'poles and wires' while reducing greenhouse gas emissions. The value to individuals and operators alike of these programs is amplified as computing power grows, new algorithms are researched, and more and more data sets are joined together. Likewise, the privacy risks are compounded. The potential value of Personal Information in the modern Big Data landscape cannot be represented in a static business model, and neither can the privacy pros and cons be captured in a fixed policy document. New user interfaces and visualisations like a 'Big Data dashboard' are needed to bring dynamic extensions to traditional privacy principles, and help people appreciate and intelligently negotiate the insights that can be extracted about them from the raw material that is data.
A Social Media Week Sydney event #SMWSydney
Law Lounge, Sydney University Law School
New Law School Building
Eastern Ave, Camperdown
Fri, Sep 26 - 10:00 AM - 11:30 AM
How can you navigate privacy fact and fiction, without the geeks and lawyers boring each other to death?
It's often said that technology has outpaced privacy law. Many digital businesses seem empowered by this brash belief. And so they proceed with apparent impunity to collect and monetise as much Personal Information as they can get their hands on.
But it's a myth!
Some of the biggest corporations in the world, including Google and Facebook, have been forcefully brought to book by privacy regulations. So, we have to ask ourselves:
- what does privacy law really mean for social media in Australia?
- is privacy "good for business"?
- is privacy "not a technology issue"?
- how can digital businesses navigate fact & fiction, without their geeks and lawyers boring each other to death?
In this Social Media Week Master Class I will:
- unpack what's "creepy" about certain online practices
- show how to rate data privacy issues objectively
- analyse classic misadventures with geolocation, facial recognition, and predicting when shoppers are pregnant
- critique photo tagging and crowd-sourced surveillance
- explain why Snapchat is worth more than three billion dollars
- analyse the regulatory implications of Big Data, Biometrics, Wearables and The Internet of Things.
We couldn't have timed this Master Class better, coming two weeks after the announcement of the Apple Watch, which will figure prominently in the class!
So please come along, for a fun and in-depth a look at social media, digital technology, the law, and decency.
About the presenter
Steve Wilson is a technologist, who stumbled into privacy 12 years ago. He rejected those well meaning slogans (like "Privacy Is Good For Business!") and instead dug into the relationships between information technology and information privacy. Now he researches and develops design patterns to help sort out privacy, alongside all the other competing requirements of security, cost, usability and revenue. His latest publications include:
- "Big Privacy: The new standard for Big Data Privacy" from Constellation Research, and
- "The collision between Big Data and privacy law" due out in October in the Australian Journal of Telecommunications and the Digital Economy.