Mobile: +61 (0) 414 488 851
Email: swilson@lockstep.com.au

It's not too late for privacy

Have you heard the news? "Privacy is dead!"

The message is urgent. It's often yelled in prominent headlines, with an implied challenge. The new masters of the digital universe urge the masses: C'mon, get with the program! Innovate! Don't be so precious! Don't you grok that Information Wants To Be Free? Old fashioned privacy is holding us back!

The stark choice posited between privacy and digital liberation is rarely examined with much intellectual rigor. Often, "privacy is dead" is just a tired fatalistic response to the latest breach or the latest eye-popping digital development, like facial recognition, or a smartphone's location monitoring. In fact, those who earnestly assert that privacy is over are almost always trying to sell us something, be it sneakers, or a political ideology, or a wanton digital business model.

Is it really too late for privacy? Is the "genie out of the bottle"? Even if we accepted the ridiculous premise that privacy is at odds with progress, no, it's not too late, for a couple of reasons. Firstly, the pessimism (or barely disguised commercial opportunism) generally confuses secrecy with privacy, and secondly, frankly, we ain't seen nothin' yet!

Conflating privacy and secrecy

Technology certainly has laid us bare. Behavioural modeling, facial recognition, Big Data mining, natural language processing and so on have given corporations X-Ray vision into our digital lives. While exhibitionism has been cultivated and normalised by the informopolists, even the most guarded social network users may be defiled by data prospectors who, without consent, upload their contact lists, pore over their photo albums, and mine their shopping histories.

So yes, a great deal about us has leaked out into what some see as an infinitely extended neo-public domain. And yet we can be public and retain our privacy at the same time.

Some people seem defeated by privacy's definitional difficulties, yet information privacy is simply framed, and corresponding data protection laws readily understood. Information privacy is basically a state where those who know us are restrained in what they can do with the knowledge they have about us. Privacy is about respect, and protecting individuals against exploitation. It is not about secrecy or even anonymity. There are few cases where ordinary people really want to be anonymous. We actually want businesses to know -- within limits -- who we are, where we are, what we've done, what we like, but we want them to respect what they know, to not share it with others, and to not take advantage of it in unexpected ways. Privacy means that organisations behave as though it's a privilege to know us. Privacy can involve businesses and governments giving up a little bit of power.

Many have come to see privacy as literally a battleground. The grassroots Cryptoparty movement came together around the heady belief that privacy means hiding from the establishment. Cryptoparties teach participants how to use Tor and PGP, and they spread a message of resistance. They take inspiration from the Arab Spring where encryption has of course been vital for the security of protestors and organisers. One Cryptoparty I attended in Sydney opened with tributes from Anonymous, and a number of recorded talks by activists who ranged across a spectrum of political issues like censorship, copyright, national security and Occupy. I appreciate where they're coming from, for the establishment has always overplayed its security hand, and run roughshod over privacy. Even traditionally moderate Western countries have governments charging like china shop bulls into web filtering and ISP data retention, all in the name of a poorly characterised terrorist threat. When governments show little sympathy for netizenship, and absolutely no understanding of how the web works, it's unsurprising that sections of society take up digital arms in response.

Yet going underground with encryption is a limited privacy stratagem, for DIY crypto is incompatible with the majority of our digital dealings. In fact the most nefarious, uncontrolled and ultimately the most dangerous privacy harms come from mainstream Internet businesses and not government. Assuming one still wants to shop online, use a credit card, tweet, and hang out on Facebook, we still need privacy protections. We need limitations on how our Personally Identifiable Information (PII) is used by all the services we deal with. We need department stores to refrain from extracting sensitive health information from our shopping habits, merchants to not use our credit card numbers as customer reference numbers, shopping malls to not track patrons by their mobile phones, and online social networks to not x-ray our photo albums by biometric face recognition.

I note that some Cryptoparty bookings are managed by the US event organiser Eventbrite, which has a detailed Privacy Policy setting out how it promises to handle personal information provided by attendees. It does seem reasonable to me, but like all private sector data protection arrangements, there's a lot going on there. So ironically, when registering for a Cryptoparty, you could not use encryption! For privacy, you have to either trust Eventbrite to have a reasonable policy and to stick to it, or you might rely on government regulations, if applicable. When registering, you give a little Personal Information to the organisers, and you should expect that they will be restrained in what they do with it.

Going out in public never was a license for others to invade our privacy. We ought not to respond to online privacy invasions as if cyberspace is a new Wild West. We have always relied on regulatory systems of consumer protection to curb the excesses of business and government, and we should insist on the same in the digital age. We should not have to hide away if privacy is agreed to mean respecting the PII of customers, users and citizens, and restraining what data custodians do with that precious resource.

We ain't seen nothin' yet!

I ask anyone who thinks it's too late to reassert our privacy to think for a minute about where we're heading. We're still in the early days of the social web, and the information "innovators" have really only just begun. Look at what they've done so far:

  • Facial recognition converts vast stores of anonymous photos into PII, without consent, and without limit. Facebook's deployment of biometric technology was especially clever. For years they crowd-sourced the creation of face recognition templates and the calibration of their algorithms, without ever mentioning biometrics in their privacy policy or help pages. Even now Facebook's Data Use Policy is entirely silent on biometric templates and what they allow themselves to do with them. Meanwhile, third party services like Facedeals are starting to use Facebook's photo resources for commercial facial recognition in public.
  • It's difficult to overstate the value of facial recognition to businesses like Facebook which have just one asset: the knowledge they have about their members and their associates. Combined with image analysis and content addressable image banks, facial recognition lets Facebook work out what we're doing, when, where and with whom, pirating billions of everyday images given over by members to a business that doesn't even mention these priceless resources in its privacy policy.

  • Big Data. The most notorious recent example of the power of data mining comes from Target's covert research into identifying customers who are pregnant based on their buying habits. Big Data practitioners are so enamoured with their ability to extract secrets from "public" data they seem blithely unaware that by generating fresh PII from their raw materials they are in fact collecting it as far as Information Privacy Law is concerned. As such, they’re legally liable for the privacy compliance of their cleverly synthesised data, just as if they had expressly gathered it all by questionnaire.

  • Natural Language Processing (NLP) is the secret sauce in Apple's Siri, allowing her to take commands and dictation. Every time you dictate an email or a text message to Siri, Apple gets hold of the content of telecommunications that are normally out of bounds to the phone companies. Siri is like a free PA that reports your daily activities back to the secretarial agency. There is no mention at all of Siri in Apple's Privacy Policy despite the limitless collection of intimate personal information.

As an aside, I'm not one of those who fret that technology has outstripped privacy law. Principles-based Information Privacy law copes well with most of this technology. OECD privacy principles (enacted in over 100 countries) and the US FIPPs require that companies be transparent about what PII they collect and why, and that they limit the ways in which PII is used for unrelated purposes and how it may be disclosed. These principles are decades old and yet they have been recently re-affirmed by German regulators over Facebook's surreptitious use of facial recognition. I expect that Siri will attract like scrutiny as it rolls out in continental Europe.

So what's next?

  • Google Glass may, in the privacy stakes, surpass both Siri and facial recognition of static photos. If actions speak louder than words, imagine the value to Google of digitising and knowing exactly what we do in real time.
  • Facial recognition as a Service and the sale of biometric templates may be tempting for the photo sharing sites. If and when biometric authentication spreads into retail payments and mobile device security, these systems will face the challenge of enrollment. It might be attractive to share face templates previously collected by Facebook and voice prints by Apple.

So, is it really too late for privacy? The information magnates and national security zealots may hope so, but surely even cynics will see there is a great deal at stake, and that it might be just a little too soon to rush to judge something as important as this.

Posted in Social Networking, Social Media, Privacy, Culture, Big Data

I never trusted trust

From the archives.

  • "It is often put simply that in e-business, authentication means that you know who you're dealing with. Authentication is inevitably cited as one of the four or five 'pillars of security' (the others being integrity, non-repudiation, confidentiality and, sometimes, availability).
  • "To be a little more precise, let's examine the functional definition of authentication adopted by the Asia Pacific Economic Co-operation (APEC) E-Security Task Group, namely the means by which the recipient of a transaction or message can make an assessment as to whether to accept or reject that transaction.
  • "Note that this definition does not have identity as an essential element, let alone the complex notion of 'trust'. Identity and trust all too frequently complicate discussions around authentication. Of course, personal identity is important in many cases, but it should not be enshrined in the definition of authentication. Rather, the fundamental issue is one’s capacity to act in the transaction at hand. Depending on the application, this may have more to do with credentials, qualifications, memberships and account status, than identity per se, especially in business transactions."

Making Sense of your Authentication Options in e-Business
Journal of the PricewaterhouseCoopers Cryptographic Centre of Excellence, No. 5, 2001.

See also http://lockstep.com.au/library/quotes.
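The functional definition quoted above can be sketched in a few lines of code. This is only an illustration, not anything from the APEC material or the original paper: the action name, the credential labels and the `REQUIRED_CREDENTIALS` table are all hypothetical. The point is that the accept-or-reject decision turns on the sender's capacity to act, and no name or identity field is consulted at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    action: str             # what the sender is asking to do
    credentials: frozenset  # memberships, qualifications, account status


# Hypothetical policy table: the credentials each action requires.
REQUIRED_CREDENTIALS = {
    "prescribe_medication": {"registered_doctor", "account_in_good_standing"},
}


def accept(transaction):
    """Accept or reject a transaction based on capacity to act, not identity."""
    required = REQUIRED_CREDENTIALS.get(transaction.action)
    if required is None:
        return False  # unknown actions are rejected by default
    return required <= transaction.credentials


order = Transaction(
    "prescribe_medication",
    frozenset({"registered_doctor", "account_in_good_standing"}),
)
print(accept(order))
```

Identity could of course be added as one more credential where an application really needs it, but in this sketch it is just another attribute rather than the foundation of the decision.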

Posted in Identity, Trust

Types of Personal Information

It's really vital that technologists, software developers, architects and analysts appreciate that privacy law takes a broad view of "Personal Information" and how it may be collected. In essence, whenever any information pertaining to an identifiable individual comes to be in your IT system by whatever means, you may be deemed to have collected Personal Information for the purposes of the law (for example, Australia's Privacy Act 1988 Cth). And what follows from any PI collection is a range of legal obligations relating to the 10 National Privacy Principles.

A while back I tried to illuminate the problem space from a technologist's standpoint, in a paper called Mapping privacy requirements onto the IT function (Privacy Law & Policy Reporter, 2003). At the time it seemed useful to me to break down different types of Personal Information, because I had found that most application developers only thought about questionnaires and web forms. I wrote then:

Personal data collection can be considered under five categories:
(1) overt collection via application forms, web forms, call centres, face to face interviews, questionnaires, warranty cards and so on;
(2) automatic collection, especially via audit logs and transaction histories;
(3) generated data, which includes evaluative data and inferences drawn from collected data, for the purposes of service customisation (for example buying preferences), business risk management (such as insurance risk scores from claims histories) and so on;
(4) acquired data which has been transferred from a third party, with or without payment for the data, including cases where personal information is acquired as part of a corporate takeover; and
(5) ephemeral data, which is a special category of automatic or generated data, produced as a side effect of other operations. Ephemeral data is reasonably presumed to be transient but can be inadvertently retained. For example, some systems prompt users for pre-arranged challenge-response information -- classically their mother’s maiden name -- when dealing with a forgotten password. The data provided can be left behind in computer memory or logs, or even scribbled on a sticky note by a help desk operator, and this can represent a major privacy breach if it is not protected from unauthorised parties.

This may still be a useful orientation for many engineers and technologists. They need to remember that even if it's found lying around in the public domain, or even if they've conjured it up from Big Data by clever data analysis, if they have got their hands on Personal Information, then they have collected it.
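For engineers who think best in code, here is one way the five categories might be used to tag a data inventory. It is a minimal sketch under my own assumptions: the store names are invented, and the `identifiable` flag stands in for what is really a legal judgement about whether data pertains to an identifiable individual.

```python
from dataclasses import dataclass
from enum import Enum


class CollectionType(Enum):
    OVERT = "overt"          # forms, call centres, questionnaires
    AUTOMATIC = "automatic"  # audit logs, transaction histories
    GENERATED = "generated"  # inferences and evaluative data
    ACQUIRED = "acquired"    # transferred from a third party
    EPHEMERAL = "ephemeral"  # side effects presumed to be transient


@dataclass(frozen=True)
class DataStore:
    name: str                        # hypothetical store name
    collection_type: CollectionType
    identifiable: bool               # pertains to an identifiable individual?


def personal_information_holdings(stores):
    """Return every store that counts as a collection of Personal Information.

    The collection type plays no part in the test: if the data is about an
    identifiable individual, it has been collected, however it got there.
    """
    return [store for store in stores if store.identifiable]


inventory = [
    DataStore("signup_form_submissions", CollectionType.OVERT, True),
    DataStore("web_server_audit_log", CollectionType.AUTOMATIC, True),
    DataStore("inferred_buying_preferences", CollectionType.GENERATED, True),
    DataStore("purchased_marketing_list", CollectionType.ACQUIRED, True),
    DataStore("password_reset_challenge_cache", CollectionType.EPHEMERAL, True),
    DataStore("aggregate_page_view_counts", CollectionType.AUTOMATIC, False),
]

for store in personal_information_holdings(inventory):
    print(f"{store.name}: {store.collection_type.value}")
```

The category is kept on each record only so that developers are prompted to look beyond web forms when they audit their systems; it does not change whether privacy obligations apply.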

Speaking of Big Data, I wonder if the categorisation of Personal Data could now be improved or extended?

Posted in Privacy, Big Data

The fundamental privacy challenges in biometrics

The EPIC privacy tweet chat of October 16 included "the Privacy Perils of Biometric Security". Consumers and privacy advocates are often wary of this technology, sometimes fearing a hidden agenda. To be fair, function creep and unauthorised sharing of biometric data are issues that are anticipated by standard data protection regulations and can be well managed by judicious design in line with privacy law.

However, there is a host of deeper privacy problems in biometrics that are not often aired.

  • The privacy policies of social media companies rarely devote reasonable attention to biometric technologies like facial recognition or natural language processing. Only recently has Facebook openly described its photo tagging templates. Apple on the other hand continues to be completely silent about Siri in its Privacy Policy, despite the fact that when Siri takes dictated emails and text messages, Apple is collecting and retaining without limit personal telecommunications that are strictly out of bounds even to the carriers! Some speculate that biometric voice recognition is a natural next step for Siri, but it's not a step that can be taken without giving notice today that personally identifiable voice data may in future be used for that purpose.
  • Personal Information (in Australia) is defined in the law as "information or an opinion ... whether true or not about an individual whose identity is apparent ..." [emphasis added]. This definition is interesting in the context of biometrics. Because biometrics are fuzzy, we can regard a biometric identification as a sort of opinion. Technically, a biometric match is declared when the probability of a scanned trait corresponding to an enrolled template exceeds some preset threshold, like 95%. When a false match results, mistaking say "Alice" for "Bob", it seems to me that the biometric system has created Personal Information about both Alice and Bob. There will be raw data, templates, audit files and metadata in the system pertaining to both individuals, some of it right and some of it wrong, but all of which needs to be accounted for under data protection and information privacy law.
  • In privacy, proportionality is important. The foremost privacy principle is Collection Limitation: organisations must not collect more personal information than they reasonably need to carry out their business. Biometric security is increasingly appearing in mundane applications with almost trivial security requirements, such as school canteens. Under privacy law, biometrics implementations in these sorts of environments may be hard to justify.
  • Even in national security deployments, biometrics lead to over-collection, exceeding what may be reasonable. Very little attention is given in policy debates to exception management, such as the cases of people who cannot enroll. The inevitable failure of some individuals to enroll in a biometric can have obvious causes (like missing digits or corneal disease) and not so obvious ones. The only way to improve false positive and false negative performance for a biometric at the same time is to tighten the mathematical modelling underpinning the algorithm (see also "Failure to enroll" at http://lockstep.com.au/blog/2012/05/06/biometrics-must-be-fallible). This can constrain the acceptable range of the trait being measured, leading to outliers being rejected altogether. So for example, accurate fingerprint scanners need to capture a sharp image, making enrollment sometimes difficult for the elderly or manual workers. It's not uncommon for a biometric modality to have a Fail-to-Enroll rate of 1%. Now, what is to be done with those unfortunates who cannot use the biometric? In the case of border control, additional identifying information must be collected. Biometric security sets what the public are told is a 'gold standard' for national security, so there is a risk that individuals who through no fault of their own are 'incompatible' with the technology will form a de facto underclass. Imagine the additional opprobrium that would go with being in a particular ethnic or religious minority group and having the bad luck to fail biometric enrollment. The extra interview questions that go with sorting out these outliers at border control points are a collection necessitated not by any business need but rather by the pitfalls of the technology.
  • And finally, there is something of a cultural gap between privacy and technology that causes blind spots amongst biometrics developers. Too many times, biometrics advocates misapprehend what information privacy is all about. It's been said more than once that "faces are not private" and there is "no expectation of privacy" with regard to one's face in public. Even if they were true, these judgement calls are moot, for information privacy laws are concerned with any data about identifiable individuals. So when facial recognition technology takes anonymous imagery from CCTV or photo albums and attaches names to it, Personal Information is being collected, and the law applies. It is this type of crucial technicality that Facebook has smacked into headlong in Germany.
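To make the threshold point in the list above concrete, here is a toy calculation. The similarity scores are invented for illustration and do not come from any real biometric system; `error_rates` is a hypothetical helper, not a vendor API. It shows that moving the match threshold only trades one kind of error for the other: a lenient threshold lets an impostor like "Bob" be taken for "Alice", while a strict one turns genuine users away.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false match rate, false non-match rate) at a given threshold.

    A match is declared when a comparison score meets or exceeds the threshold.
    """
    false_matches = sum(score >= threshold for score in impostor_scores)
    false_non_matches = sum(score < threshold for score in genuine_scores)
    return (
        false_matches / len(impostor_scores),
        false_non_matches / len(genuine_scores),
    )


# Made-up scores: each is the system's "opinion" that two samples match.
genuine = [0.99, 0.97, 0.96, 0.94, 0.91, 0.88, 0.97, 0.98, 0.93, 0.96]
impostor = [0.40, 0.55, 0.62, 0.71, 0.80, 0.90, 0.96, 0.35, 0.50, 0.66]

for threshold in (0.85, 0.95, 0.985):
    fmr, fnmr = error_rates(genuine, impostor, threshold)
    print(f"threshold {threshold}: false match {fmr:.0%}, false non-match {fnmr:.0%}")
```

With these numbers the false match rate falls from 20% to 10% to 0% as the threshold rises, while the false non-match rate climbs from 0% to 40% to 90%. Every one of those wrong decisions still leaves behind records about an identifiable person.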

Posted in Privacy, Biometrics

Fixed my RSS feed

Dear Lockstep RSS subscribers.

I found out only last week that my blog's RSS feed has been broken since March. It should now be fixed. As a result you may see a flood of updates. This post is just to summarise what you missed, so you can triage the deluge!


CNP fraud is just online carding

  • about a Lockstep Technologies presentation, looking at the common factors in payment card fraud offline and online.

Card Not Present now three quarters of all fraud

  • condenses the latest Australian card fraud stats.

Ski runs and LOAs

  • very briefly compares Levels of Assurance with another well known risk rating scheme.

A penny for your marketable thoughts?

  • about Siri, the free personal assistant and corporate spy.

A software engineer's memoir (work in progress)

  • some random recollections from my days working on defibrillator and pacemaker software, which has become topical, with more patient groups agitating for access to their data and for more information about pacemaker software reliability.


Killing two birds with one chip

  • argues that we should treat CNP fraud in exactly the same way as we dispatched skimming and carding.

Guilty until proven innocent

  • my thoughts on how the Blackstone Formulation doesn't apply to politicians and others in positions of trust.

Photo data as crude oil

  • no wonder Facebook paid so much for Instagram!


Let's embrace Identity Plurality

  • modern identity theory teaches us there is no one identity -- and yet most schemes just can't help but over-federate.

Understanding biometrics and their necessary fallibility

  • what vendors won't tell you about tradeoffs inherent to all biometrics; a biometric that never got you wrong would still be unusable.


Identities are brittle but crystal clear - update

  • about how highly evolved identities break when we try to bend them to new contexts.


Taking stock of the IdM scene

  • amidst the OAuth 2.0 standards committee shenanigans, tries to remind people that identity strategy could be made simpler.

The double standard in biometrics analysis

  • shows that recent developments in reverse engineering iris detection would be treated seriously by information security professionals -- but not by biometrics advocates.

Card fraud in Australia even worse than feared

  • only six months on from my last dispatch, and fraud is accelerating unexpectedly.


Surfacing identity

  • presents some new ways to diagram the match (or mismatch) between IdPs and RPs.

Memetic engineering our identities

  • is a brief summary of my thesis that the key to federation lies in understanding how identity attributes, methods and technologies have evolved.

Is quantum computing all it's cracked up to be?

  • is a brief inquiry about why it's hard to manage realistically large numbers of qubits, and whether large RSA keys might still escape attack by Shor's Algorithm by simply adding more bits.

A response to M2SYS on reverse engineering

  • A biometric vendor blocks me on Twitter and on their blog, so here is my public response to their naive position on reverse engineering.


Identity is not a thing

  • suggests we accept that some identities are bestowed upon us by RPs according to their rules, and that we can only hope to self-curate a small number of our identities.

I hope there's good food for thought here, and as always, I look forward to your comments and feedback.