Lockstep

Mobile: +61 (0) 414 488 851
Email: swilson@lockstep.com.au

The beginning of privacy!

The headlines proclaim that the newfound ability to re-identify anonymous DNA donors means The End Of Privacy!

No it doesn't; it only means the end of anonymity.

Anonymity is not the same thing as privacy. Anonymity keeps people from knowing what you're doing, and it's a vitally important quality in many settings. But in general we want people (at least some people) to know what we're up to, so long as they respect that knowledge. That's what privacy is all about. Anonymity is a terribly blunt instrument for protecting privacy, and it's also fragile. If anonymity is all you have, then you're in deep trouble when someone manages to defeat it.

New information technologies have clearly made anonymity more difficult, yet it does not follow that we must lose our privacy. Instead, these developments bring into stark relief the need for stronger regulatory controls that compel restraint in the way third parties deal with Personal Information that comes into their possession.

A great example is Facebook's use of facial recognition. When Facebook members innocently tag one another in photos, Facebook creates biometric templates with which it then automatically processes all photo data (previously anonymous), looking for matches. This is how they can create tag suggestions, but Facebook is notoriously silent on what other applications it has for facial recognition. Now and then we get a hint, with, for example, news of the Facedeals start-up last year. Facedeals accesses Facebook's templates (under conditions that remain unclear) and uses them to spot customers as they enter a store to automatically check them in. It's classic social technology: kinda sexy, kinda creepy, but clearly in breach of Collection, Use and Disclosure privacy principles.

And indeed, European regulators have found that Facebook's facial recognition program is unlawful. The chief problem is that Facebook never properly disclosed to members what goes on when they tag one another, and they never sought consent to create biometric templates with which to subsequently identify people throughout their vast image stockpiles. Facebook has been forced to shut down their facial recognition operations in Europe, and they've destroyed their historical biometric data.

So privacy regulators in many parts of the world have real teeth. They have proven that re-identification of anonymous data by facial recognition is unlawful, and they have managed to stop a very big and powerful company from doing it.

This is how we should look at the implications of the DNA 'hacking'. Indeed, Melissa Gymrek from the Whitehead Institute said in an interview: "I think we really need to learn to deal with the fact that we cannot ever make data sets truly anonymous, and that I think the key will be in regulating how we are allowed to use this genetic data to prevent it from being used maliciously."

Perhaps this episode will bring even more attention to the problem in the USA, and further embolden regulators to enact broader privacy protections there. Perhaps the very extremeness of the DNA hacking does not spell the end of privacy so much as its beginning.

Posted in Social Media, Science, Privacy, Biometrics, Big Data

It's not too late for privacy

Have you heard the news? "Privacy is dead!"

It's an urgent, impatient sort of line in the sand, drawn by the new masters of the digital universe, as a challenge to everyone else. C'mon, get with the program! Innovate! Don't be so precious - so very 20th century! Don't you dig that Information Wants To Be Free? Clearly, old fashioned privacy is holding us back!

The stark choice posited between privacy and digital liberation is rarely examined with much diligence; often it's actually a fatalistic response to the latest breach or the latest eye popping digital development. In fact, those who earnestly assert that privacy is dead are almost always trying to sell us something, be it a political ideology, or a social networking prospectus, or sneakers targeted at an ultra-connected, geolocated, behaviorally qualified nano market segment.

Is it really too late for privacy? Is the genie out of the bottle? Even if we accepted the ridiculous premise that privacy is at odds with progress, no it's not too late, firstly because the pessimism (or commercial opportunism) generally confuses secrecy with privacy, and secondly because frankly, we ain't seen nothin' yet!

Conflating privacy and secrecy

Technology certainly has laid us bare. Behavioural modeling, facial recognition, Big Data mining, natural language processing and so on have given corporations x-ray vision into our digital lives. While exhibitionism has been cultivated and normalised by the infomopolists, even the most guarded social network users may be defiled by Big Data wizards who without consent upload their contact lists, pore over their photo albums, and mine their shopping histories, as is their wanton business model.

So yes, a great deal about us has leaked out into what some see as an extended public domain. And yet we can be public and retain our privacy at the same time.

Some people seem defeated by privacy's definitional difficulties, yet information privacy is simply framed, and corresponding data protection laws readily understood. Information privacy is basically a state where those who know us are restrained in what they can do with the knowledge they have about us. Privacy is about respect, and protecting individuals against exploitation. It is not about secrecy or even anonymity. There are few cases where ordinary people really want to be anonymous. We actually want businesses to know -- within limits -- who we are, where we are, what we've done, what we like, but we want them to respect what they know, to not share it with others, and to not take advantage of it in unexpected ways. Privacy means that organisations behave as though it's a privilege to know us.

Many have come to see privacy as literally a battleground. The grassroots Cryptoparty movement has come together around a belief that privacy means hiding from the establishment. Cryptoparties teach participants how to use Tor and PGP, and spread a message of resistance. They take inspiration from the Arab Spring where encryption has of course been vital for the security of protestors and organisers. The one Cryptoparty I've attended so far in Sydney opened with tributes from Anonymous, and a number of recorded talks by activists who ranged across a spectrum of social and technosocial issues like censorship, copyright, national security and Occupy. I appreciate where they're coming from, for the establishment has always overplayed its security hand. Even traditionally moderate Western countries have governments charging like china shop bulls into web filtering and ISP data retention, all in the name of a poorly characterised terrorist threat. When governments show little sympathy for netizenship, and absolutely no understanding of how the web works, it's unsurprising that sections of society take up digital arms in response.

Yet going underground with encryption is a limited privacy stratagem, for DIY crypto is incompatible with the majority of our digital dealings. In fact the most nefarious, uncontrolled and ultimately the most dangerous privacy harms come from mainstream Internet businesses and not government. Assuming one still wants to shop online, use a credit card, tweet, and hang out on Facebook, we still need privacy protections. We need limitations on how our Personally Identifiable Information (PII) is used by all the services we deal with; we need department stores to refrain from extracting sensitive health information from our shopping habits, merchants to not use our credit card numbers as customer reference numbers, and online social networks to not x-ray our photo albums by biometric face recognition. I note that some Cryptoparty bookings are managed by the US event organiser Eventbrite, which has a detailed Privacy Policy setting out how it promises to handle personal information provided by attendees. It does seem reasonable to me, but like all private sector data protection arrangements, there's a lot going on there.

So ironically, when registering for a cryptoparty, you could not use encryption! For privacy, you have to either trust Eventbrite to have a reasonable policy and to stick to it, or you might rely on government regulations, if applicable. When registering, you give a little Personal Information to the organisers, and we expect that they will be restrained in what they do with it.

Going out in public never was a license for others to invade our privacy. We ought not to respond to online privacy invasions as if cyberspace is a new Wild West. We have always relied on regulatory systems of consumer protection to curb the excesses of business and government, and we should insist on the same in the digital age. We should not have to hide away if privacy is agreed to mean respecting the PII of customers, users and citizens, and restraining what data custodians do with that precious resource.

We ain't seen nothin' yet!

I ask anyone who thinks it's too late to reassert our privacy to think for a minute about where we're heading. We're still in the early days of the social web, and the information "innovators" have really only just begun. Look at what they've done so far:


  • Facial recognition converts vast stores of anonymous photos into PII, without consent, and without limit. Facebook's deployment of biometric technology was especially clever. For years they crowd-sourced the creation of templates and the calibration of their algorithms, without ever mentioning facial recognition in their privacy policy or help pages. Even now Facebook's Data Use Policy is entirely silent on biometric templates and what they allow themselves to do with them. Meanwhile, third party services like Facedeals are starting to use Facebook's photo resources for commercial facial recognition in public.
  • It's difficult to overstate the value of facial recognition to businesses like Facebook which have just one asset: the knowledge they have about their members. Combined with image analysis and content addressable image banks, facial recognition lets Facebook work out what we're doing, when, where and with whom, pirating billions of everyday images given over by members to a business that doesn't even mention these priceless resources in its privacy policy.

  • Big Data. The most notorious recent example of the power of data mining comes from Target's covert research into identifying customers who are pregnant based on their buying habits. Big Data practitioners are so enamoured with their ability to extract secrets from "public" data they seem blithely unaware that by generating fresh PII from their raw materials they are in fact collecting it as far as Information Privacy Law is concerned. As such, they’re legally liable for the privacy compliance of their cleverly synthesised data, just as if they had expressly gathered it all by questionnaire.

  • Natural Language Processing (NLP) is the secret sauce in Apple's Siri, allowing her to take commands -- and dictation. Every time you dictate an email or a text message to Siri, Apple gets hold of the content of telecommunications that are normally out of bounds to the phone companies. Siri is like a free PA that reports your daily activities back to the secretarial agency. There is no mention at all of Siri in Apple's Privacy Policy despite the limitless collection of intimate personal information.

As an aside, I'm not one of those who fret that technology has outstripped privacy law. Principles-based Information Privacy law copes well with most of this technology. OECD privacy principles (enacted in over seventy countries) and the US FIPPs require that companies be transparent about what PII they collect and why, and that they limit the ways in which PII is used for unrelated purposes, and how it may be disclosed. These principles are decades old and yet they have been re-affirmed recently by German regulators over Facebook's surreptitious use of facial recognition. I expect that Siri will attract like scrutiny as it rolls out in continental Europe.

So what's next?


  • Google Glass may, in the privacy stakes, surpass both Siri and facial recognition of static photos. If actions speak louder than words, imagine the value to Google of digitising and knowing exactly what we do in real time.

  • Facial recognition as a Service and the sale of biometric templates may be tempting for the photo sharing sites. If and when biometric authentication spreads into retail payments and mobile device security, these systems will face the challenge of enrollment. It might be attractive to share face templates previously collected by Facebook and voice prints by Apple.



So, is it really too late for privacy? The infomopolists and national security zealots may hope so, but surely even cynics will see there is a great deal at stake, and that it might be just a little too soon to rush to judge something as important as this.

Posted in Social Networking, Social Media, Privacy, Culture, Big Data

Photo data as crude oil

It's been said that "data is the new oil". The immense stores of Personal Information gifted to Facebook, Google et al by their users are like crude oil reserves: raw material to be tapped, refined, processed and value-added.

[ Update 19 Dec 2012: Instagram has predictably revised its Privacy Policy and Terms of Use to integrate better with Facebook. They put it this way: "As part of our new collaboration, we've learned that by being able to share insights and information with each other, we can build better experiences for our users" (emphasis added). Of course it was inevitable that they would lubricate the disclosure of Instagram pictures with their owners and beyond. Facebook didn't buy them for nothing. ]

[ Update 8 Dec 2012: Some have pooh-poohed the comparison with crude oil, including the New York Times' Jer Thorp. No metaphor is ever complete, and this one might distract some people, but the idea is not meant to be about fossil fuels and finite resources. Rather it alludes to the undifferentiated nature of raw data and the high tech ways in which Big Data is refined to create wondrous new products. I like the historical and political context of the oil metaphor too. Right now we are at a historical point comparable to that of the Black Gold prospectors of the 1800s; new supply chains and business models are being devised to exploit this new bounty. The parallels with the oil industry remind us that Big Data is Big Business! ]

I'm especially interested in photo data, and the rapid evolution of tools for monetising it. These tools range from embedded metadata in the uploaded photos, through to increasingly sophisticated object recognition and facial recognition algorithms.

Image analysis can extract place names and product names from photos, and recognise objects. It can re-identify faces using biometric templates that users have helpfully created by tagging their friends in entirely unrelated images. Image analysis lets social media companies work out what you're doing, when and where, and who you're doing it with. If Facebook can work out from a photo that you're enjoying a coffee at a recognisable retail outlet, they don't need you to expressly "Like" it. Nor do you have to actively check in to the cafe when most phones tag their photos with geolocation data. Instead, Facebook will automatically file away another little bit of Personal Information, to be melded into the amazingly rich picture they're relentlessly building up.

The ability to extract value from photo data defines a new black-gold rush. Like petroleum engineering, Image Analysis is high tech stuff. There is extraordinary R&D going on in face recognition and object recognition, and the "infomopolies" like Apple, Google and Facebook pay big bucks for IP and startups in this space.

I think there is only one way to look at Facebook's acquisition of Instagram. With 250 million new pictures being added every day, Instagram is like an undeveloped crude oil field. As such, a billion dollars seems like a bargain.

So Facebook's core business isn't all of a sudden photo sharing. It always was and always will be PI refining.

[ Figure: Oil cracking. Image analysis as cracking tower. ]


Posted in Social Media, Privacy, Big Data

A penny for your marketable thoughts?

Most people think that Apple's Siri is the coolest thing they've ever seen on a smart phone. It certainly is a milestone in practical human-machine interfaces, and will be widely copied. The combination of deep search plus natural language processing (NLP) plus voice recognition is dynamite.

And Siri also marks a new milestone in privacy invasion. I predict Siri will become the poster girl for PII piracy, the exemplar of the sly bargain for Personal Information at the heart of most social media.

If you haven't had the pleasure ... Siri is a wondrous new function built into the latest iPhone. It’s the state-of-the-art in artificial intelligence and NLP. You speak directly to Siri, ask her questions (yes, she's female) and tell her what to do with many of your other apps. Siri integrates with mail, text messaging, maps, search, weather, calendar and so on. Ask her "Will I need an umbrella in the morning?" and she'll look up the weather for you – after checking your calendar to see what city you’ll be in tomorrow. It's amazing.

Natural Language Processing is a fabulous idea of course. It radically improves the usability of smart phones, and even their safety with much improved hands-free operation.

An important technical detail is that NLP is very demanding on computing power. In fact it's beyond the capability of today's smart phones, even if each of them alone is more powerful than all of NASA's computers in 1969! So all Siri's hard work is actually done on Apple's mainframe computers scattered around the planet. That is, all your interactions with Siri are sent into the cloud.

Imagine Siri was a human personal assistant. Imagine she's looking after your diary, placing calls for you, booking meetings, planning your travel, taking dictation, sending emails and text messages for you, reminding you of your appointments, even your significant other’s birthday. She's getting to know you all the while, learning your habits, your preferences, your personal and work-a-day networks.

And she's free!

Now, wouldn't the offer of a free human PA strike you as too good to be true?

Indeed it would. So realise this about Siri: she's continuously reporting back to Apple about your every move. If Apple were a PA placement agency, what they get in return for the free secretarial services is a full transcript of all you've said, everyone you've been in touch with, everything you've done. Apple won't say what they plan to do with all this data, how long they'll keep it, nor who they'll share it with. Apple's Privacy Policy (dated October 2011, accessed 12 March 2012) doesn't even mention Siri nor the collection of the voice-to-text data.

When you dictate your mails and text messages to Siri, you’re providing Apple with content that's usually off limits to carriers, phone companies and ISPs. Siri is an end run around telecommunications intercept laws.

Of course there are many, many examples of where free social media apps mask a commercial bargain. Face recognition is the classic case. It was first made available on photo sharing sites as a neat way to organise one’s albums, but then Facebook went further by inviting photo tags from users and then automatically identifying people in other photos on others' pages. What's happening behind the scenes is that Facebook is running its face recognition templates over the billions of photos in their databases (which were originally uploaded for personal use long before face recognition was deployed). Given their business model and their track record, we can be certain that Facebook is using face recognition to identify everyone they possibly can, and thence work out fresh associations between countless people and situations accidentally caught on camera. Combine this with image processing and visual search technology (like Google's "Goggles") and the big social media companies have an incredible new eye in the sky. They can work out what we're doing, when, where and with whom. Nobody will need to expressly "like" anything anymore when Facebook can see what cars we're driving, what brands we're wearing, where we spend our vacations, what we're eating, what makes us laugh. Apple, Facebook and others have understandably invested hundreds of millions of dollars in image recognition start-ups and intellectual property; with these tools they convert the hitherto anonymous image collections in Picasa, Flickr and the like into content-addressable PII gold mines. It's the next frontier of Big Data.

Now, there wouldn't be much wrong with these sorts of arrangements if the social media corporations were up-front about them. In their Privacy Policies they should detail what Personal Information they are extracting and collecting from all the voice and image data; they should explain why they collect this information, what they plan to do with it, how long they will retain it, and how they promise to limit secondary usage. They should explain that biometrics technology allows them to generate brand new PII out of members' snapshots and utterances. And they should acknowledge that by rendering data identifiable, they become accountable in many places under privacy and data protection laws for its safekeeping as PII. It's just not good enough to vaguely reserve their rights to "use personal information to help us develop, deliver, and improve our products, services, content, and advertising". They should treat their customers -- and all those innocents about whom they collect PII indirectly -- with proper respect, and stop pretending that 'service improvement' is what they're up to.

Siri along with face recognition herald a radical new type of privatised surveillance, and on a breathtaking scale. While Facebook stealthily "x-rays" photo albums without consent, Apple now has even more intimate access to our daily routines and personal habits. And they don’t even pay as much as a penny for our thoughts.

As cool as Siri may be, I myself will decline to use any natural language processing while the software runs in the cloud, and while the service providers refuse to restrain their use of my voice data. I'll wait for NLP to be done on my device with my data kept private.

And I'd happily pay cold hard cash for that kind of app, instead of having an infomopoly embed itself in my personal affairs.

Posted in Social Media, Privacy, Language, Biometrics, Big Data, Social Networking

Information companies and the Use Limitation Principle

Google has copped a lot of flak over its move to join up all services with the cover story that it's simply rationalising its privacy policies. Amongst those defending Google is another information company, Bloomberg. In this post, I want to draw attention to details of Australian privacy law that Bloomberg is oblivious to. Other jurisdictions with OECD based data protection legislation (and that's a lot of the world) may present the same challenge to Google's and Bloomberg's simplistic view of privacy. Let's take a closer look.

In an editorial on March 1, Bloomberg positively thrilled to an alleged over-reaction of privacy advocates:

You’d think Google had announced it would start collecting terabytes of data about you, your neighbor and your dog, if he’s ever online.

Then Bloomberg's editors asserted:

You’d be wrong: Google already does that. Google is not collecting any new information; rather, it is sharing (with itself) more of the information it already has [emphasis added].

But it is Bloomberg that's wrong.

The Use Limitation principle holds that custodians of Personal Information should not put that PI to secondary uses unrelated to the primary purpose for which it was collected. Nobody using Blogger or YouTube for instance over the years could have foreseen that one day their posts and videos would be mashed up with Google's boundless data mines and put to any old commercial purpose Google sees fit.

Use Limitation is really basic. One cannot really believe Google doesn't get it; their ambit claim that what they're doing is good for privacy because now there's a single simple privacy policy just doesn't pass muster.

But in Australia, the situation for the big infomopolies is potentially even more restrictive, with recent legally enforceable interpretations of the Use Limitation principle expressly nullifying the presumption that 'sharing information with itself' is ok for heterogeneous organisations.

The Privacy Commissioner for the State of Victoria has advised that "entities within the Victorian public sector should not assume that, because one part of the organisation collected some personal information, this can be disclosed to any other part of the organisation without regard for [the Use & Disclosure Principle]". Ref: Guidelines to the Information Privacy Principles, Office of the Victorian Privacy Commissioner, Edition 3, November 2011.

This advice derives from a tribunal ruling elsewhere in Australia, which I discussed at length in another blog post: http://lockstep.com.au/blog/2011/09/04/the-ultimate-opt-out. In that case, patient information collected by a counsellor in a hospital was shared without the patient's consent with another specialist, and the patient's rights were ruled to have been violated.

The relevance of these matters in the current discussion about Google amalgamating services is that the Australian legal system has taken a conservative view of what it means to share personal information within large organisations. Technically, the ruling is that individuals have the right to be informed about internal disclosures, and they may have the right to withdraw their consent.

Let's remember that Australian law is not as strict as that of European states like Germany, and is not enforced as energetically. With OECD principles forming the basis for all these sorts of data protection regulations, I suspect that European states will reach the same conclusions, that Google is not in fact entirely free to share information 'with itself'.

Case law around OECD Privacy Principles is clearly fluid. Big infomopolies need to take more care not to presume what the law actually says.

But let's be less legalistic about this, and instead make this appeal to Google: If you truly have the interests of customers at heart, then please heed civil rights, reconsider how people expect their treasured private information to be handled, and try not to take their online permissiveness for granted.

Posted in Social Media, Privacy

Strippers are better off than Facebook users

Journalist Farhad Manjoo at Slate recently lampooned the privacy interests of Facebook users, quipping sarcastically that "the very idea of making Facebook a more private place borders on the oxymoronic, a bit like expecting modesty at a strip club". Funny.

A stripper might seem the archetype of promiscuity but she has a great deal of control over what's going on. There are strict limits to what she does and moreover, what others including the club are allowed to do to her. Strip club customers are banned from taking photos and exploiting the actors' exuberance, and only the most unscrupulous club would itself take advantage of the show for secondary purposes.

Facebook offers no such protection to their own members.

While people do need to be prudent on the Internet, the real privacy problem with Facebook is not the promiscuity of some of its members, but the blatant and boundless way that it pirates personal information. Regardless of the privacy settings, Facebook reserves all rights to do anything it likes with PI, behind the backs of even its most reserved users. That is the fundamental and persistent privacy breach. It's obscene.

Update 5 Dec 2011

Farhad Manjoo took me to task on Twitter and the Slate site [though his comments at Slate have since disappeared] saying I misunderstood the strip club analogy. He said what he really meant was propriety, not modesty: visitors to strip clubs shouldn't expect propriety and Facebook users shouldn't expect privacy. But I don't see how refining the metaphor makes his point any clearer or, to be frank, any less odious. I haven't been to a lot of strip clubs, but I think that their patrons know pretty much what to expect. Facebook on the other hand is deceptive (and has been officially determined to be so by the FTC). Strip clubs are overt; Facebook is tricky.

Manjoo blames the victims, saying that if people want privacy they shouldn't use Facebook at all. The headline on his article says users are as much to blame for Facebook's privacy woes as Mark Zuckerberg. This is just tacit acceptance of a Wild West, everyone-for-themselves morality that runs through so much of the Internet. We should debate the difference between what is and what ought to be happening on the Internet, rather than accepting rampant piracy of PI and leaving hapless users to their own devices. The sorts of privacy intrusions that Facebook foists on its users are not intrinsic. Facebook doesn't have to construct biometric templates without the subjects' permission as soon as someone else tags them in photos, neither does it have to continuously run those biometric templates over third party photo data (probably uploaded for other reasons). Facebook could if it desired delete the biometric templates when users ask for tags to be removed, or at the very least alert users to what's going on in the background with photo tags. If photo tagging were just for the fun of the users, rather than commercial exploitation, Facebook would promise in its Privacy Policy not to put biometric templates to secondary purposes. But no, Facebook doesn't even mention these things in its Policy.

Some of us -- including both Manjoo and me -- have realised that everything Facebook does is calculated to extract commercial value from the Personal Information it collects and creates. But I don't belittle Facebook's users for falling for the trickery.

Posted in Social Networking, Social Media, Privacy, Internet, Culture

If it sounds too good to be true, it probably is

Imagine a new secretarial agency that provides you with a Personal Assistant. They're a really excellent PA. They look after your diary, place calls, make bookings, plan your travel, send messages for you, take dictation. Like all good PAs, they get to know you, so they'll even help decide where to have dinner.

And you'll never guess: there's no charge!

But ... at the end of each day, the PA reports back to their agency, and provides a full transcript of all you've said, everyone you've been in touch with, everything you've done. The agency won't say what they plan to do with all this data, how long they'll keep it, nor who they'll share it with.

If you're still interested in this deal, here's the PA's name: Siri.

Seriously now ... Siri may be a classic example of the unfair bargain at the core of free social media. Natural language processing is a fabulous idea of course, and will improve the usability of smart phones many times over. But Siri is only "free" because Apple are harvesting personal information with the intent to profit from it. A cynic could even call it a Trojan Horse.

There wouldn't be anything wrong with this bargain if Apple were up-front about it. In their Privacy Policy they should detail what Personal Information they are collecting out of all the voice data; they should explain why they collect it, what they plan to do with it, how long they will retain it, and how they might limit secondary usage. It's not good enough to vaguely reserve their rights to "use personal information to help us develop, deliver, and improve our products, services, content, and advertising".

Apple's Privacy Policy today (dated 21 June 2010 [*]) in fact makes no mention of voice data at all, nor of the import of contacts and other PI from the iPhone to help train its artificial intelligence algorithms.

I myself will decline to use Siri while the language processing is done in the cloud, and while Apple does not constrain its use of my voice data. I'll wait for NLP to be done on the device with the data kept private. And I'd happily pay for that app.

Update 28 Nov 2011

Apple updated their Privacy Policy in October, but curiously, the document still makes no mention of Siri, nor voice data in general. By rights (literally in Europe) Apple's Privacy Policy should detail amongst other things why it retains identifiable voice data, and what future use it plans to make of the data.

Posted in Social Networking, Social Media, Privacy, Cloud

Calling for a moratorium on SM facial recognition

Further update 27 Nov 2012: It is reported that new facial recognition services invite you to upload photos to find lookalikes in pornography. That is, you can find out if the "girl next door" has a secret life. It's such an egregious threat to privacy that I call again for a moratorium on facial recognition. Mine will be neither a popular nor a politically correct view, but I reckon this technology is so intrinsically unsafe that we should suspend its use while we agree on ways to control its application.

I'd like to see a moratorium on commercial facial recognition.

It should be of acute concern that photos originally uploaded for personal use are being rendered personally identifiable, and put to secondary purposes by social media companies, who are silent in their Privacy Policies about what those purposes might be. The essence of privacy is control, and with facial recognition, people are utterly impotent: you might be identified through FR by virtue of being snapped at a party or in a public place by someone you never even met.

It’s clear that sites like Facebook have facial recognition bots poring over their image libraries, because this is how they generate tag suggestions. Crucially, when a user asks for a tag to be removed, Facebook does not automatically remove the underlying template that joins the distilled biometric data to the user’s name. That requires a separate and obscure request not mentioned in their Privacy Policy. [Update August 2012: earlier this year, Facebook made welcome changes. More information is now provided about facial recognition, and when tag suggestions are turned off, templates are indeed deleted. Their Privacy Policy, however, still leaves much to be desired, for it does not restrain Facebook's secondary use of biometric templates.]

Identifiable faces in photos are an incredible resource. Combined with image analysis for picking out features like place names, buildings and logos, FR enables social media companies to work out countless new connections to add to their commercial lifeblood. They will be able to work out what we like - the brands we wear, cars we drive, the phones we use, airlines we fly, the places we frequent - without us having to expressly 'Like' anything.

Many users may be unaware of the rich metadata that goes with their photos and which then supercharges their linkages, including data about when a photo was taken, and in many cases where, thanks to GPS or geolocation in their camera phones. And then there is the metadata that the social media service adds, like the name of the user who uploaded the photo. And from now on, who else is in it.

By encouraging its members to tag their friends, and then making tag suggestions which are validated by their subjects, Facebook is crowd-sourcing the calibration of its FR algorithms. Even if users are wily enough to have the templates deleted, at the very least Facebook still benefits from the learning to improve its mathematics. All this volunteer testing and training by Facebook’s members is another example of the unfair bargain and false pretenses under which Facebook harvests Personal Information.

As we're seeing in Europe, it appears that current Data Protection laws will put the brakes on facial recognition. There is a straightforward threshold issue: facial recognition converts hitherto anonymous image data into Personally Identifiable Information (in enormous volumes) and thus OECD-style Privacy Principles apply. The custodians of this PII must in many jurisdictions account for the necessity of collecting it, they must limit themselves in how they use and disclose it, and they must be transparent in their Privacy Policies about these matters. Collection and use of biometric data may also be subject to consent rules; it may be necessary for individuals to consent in advance to the creation of biometric templates, which is a thorny issue when so many photographs in Facebook were taken by other people and uploaded without the subjects even being aware of it.

Where is all this heading?

Automated facial recognition is a lot like granting social media companies x-ray vision into millions and millions of personal photo albums. As the FR bots do their work, it’s equivalent to magically tattooing names onto the foreheads of people in the photos. And then they can figure out where everyone was, at different points in time, who they were hanging with, and what they were doing. In effect, social media companies can stitch together global surveillance tapes.

Today those "tapes" will be patchy, but they will become steadily more complete and detailed over time, as users innocently upload more and more imagery, and as the biometric efficacy improves.

Posted in Social Networking, Social Media, Privacy, Biometrics

Other thoughts on Real Names

I'm going to follow my own advice and not accept the premise of Google's and Facebook's Real Names policy that it somehow is good for quality. My main rebuttal of Real Names is that it's a commercial tactic and not a well-grounded, worthy social policy.

But here are a few other points I would make if I did want to argue the merits of anonymity - a quality and basic right I honestly thought was unimpeachable!

Nothing to hide? Puhlease!

Much of the case for Real Names riffs on the tired old 'nothing to hide' argument. This tough-love kind of view that respectable people should not be precious about privacy tends to be the preserve of middle-class, middle-aged white men who through accident of birth have never personally experienced persecution, or had grounds to fear it.

I wish more of the privileged captains of the Internet could imagine that expressing one's political or religious views (for example) brings personal risks to many of the dispossessed or disadvantaged in the world. And as Identity Woman points out, we're not just talking about resistance fighters in the Middle East but also women in 21st century America who are pilloried for challenging the sexist status quo!

Some have argued that people who fear for their own safety should take their networking offline. That's an awfully harsh perpetuation of the digital divide. I don't deny that there are other ways for evil states to track us down online, and that using pseudonyms is no guarantee of safety. The Internet is indeed a risky place for conducting resistance for those who have mortal fears of surveillance. But ask the people who recently rose up on the back of social media if the risks were worth it, and the answer will be yes. Now ask them if the balance changes under a Real Names policy. And who benefits?

Some of the Internet metaphors are so bad they’re not even wrong

Some continue to compare the Internet with a "public square" and suggest there should be no expectation of privacy. In response, I note first of all that the public-private dichotomy is a red herring. Information privacy law is about controlling the flow of Personally Identifiable Information. Most privacy law doesn't care whether PII has come from the public domain or not: corporations and governments are not allowed to exploit PII harvested without consent.

Let's remember the standard set piece of spy movies where agents retreat to busy squares to have their most secret conversations. One's everyday activities in "public" are actually protected in many ways by the nature of the traditional social medium. Our voices don't carry far, and we can see who we're talking to. Our disclosures are limited to the people in our vicinity, we can whisper or use body language to obfuscate our messages, there is no retention of our PII, and so on. These protections are shattered by information technologies.

If Google's and Facebook's call for the end of anonymity were to extend to public squares, we'd be talking about installing CCTVs, tattooing people's names on their foreheads, recording everyone's comings and goings, and providing those records to any old private company to make whatever commercial use of them they see fit.

Medical OSN apartheid

What about medical social networking, which is one of the next frontiers for patient-centric care, especially in mental health? Are patients supposed to use their real names for "transparency" and "integrity"? Of course not, because studies show participation in healthcare in general depends on privacy, and many patients decline to seek treatment if they fear they will be exposed.

Now, Real Names advocates would no doubt seek to make medical OSN a special case, but that would imply an expectation that all healthcare discussions be taken off regular social circles. That's just not how real life socialising occurs.

Anonymity != criminality

There's a recurring angle that anonymity is somehow unlawful or unscrupulous. This attitude is based more on guesswork than criminology. If there were serious statistics on crime being aided and abetted by anonymity then we could debate this point, but there aren't. All we have are wild pronouncements like Eugene Kaspersky's call for an Internet Passport. It seems to me that a great deal of crime is enabled by having too much identity online. It's ludicrous that I should hand over so much Personal Information to establish my bona fides in silly little transactions, when we all know that data is being hoovered up and used behind our backs by identity thieves.

And the idea that OSNs have crime prevention at heart when they force us to use "real names" is a little disingenuous when their response to bullying, child pornography, paedophilia and so on has for so long been characterised by keeping themselves at a cool distance.

What’s real anyway?

What’s so real about "real names" anyway? It's not like Google or Facebook can check them (in fact, when it suited their purposes, the OSNs previously disclaimed any ability to verify names).

But more to the point, given names are arbitrary. It's perfectly normal for people growing up to not "identify with" the names their parents picked for them (or indeed to not identify with their parents at all). We all put some distance between our adult selves and our childhoods. A given family name is no more real in any social sense than any other handle we choose for ourselves.

Posted in Social Media, Security, Privacy, Nymwars, Internet, Identity, e-health, Culture, Social Networking

'Cybernfreude' and Wikileaks

Wikileaks has long been invaluable. So I hate to think that in flooding us with mostly mundane diplomatic cables, they may have over-played their hand. By humiliating governments, they may have provoked the authorities into truly radical controls over the Internet. And for what? To feed the front pages of an increasingly tabloid press. Honestly, there hasn't been a single revelation all week in the Slatternly Morning Herald befitting investigative journalism. Shrill gossip has trumped real scandal. The signal-to-noise ratio is so low now that it devalues and demeans Wikileaks' other beneficiaries.

I signed GetUp's petition in support of Wikileaks, but I ducked the protest march. I suspect many of the protestors are simply relishing the embarrassment of our loathed political leaders. Wikileaks should be bigger than this. The movement now seems to be sustained largely by what I would call cybernfreude: taking pleasure in the online misfortune of others.

Posted in Social Media, Security, Privacy