Let's embrace Identity Plurality
In information security we’ve been saddled for years with the tacit assumption that deep down we each have one “true” identity, and that the best way to resolve rights and responsibilities is to render that identity as unique. This “singular identity” paradigm has had a profound and unhelpful influence on security and its sub-disciplines like authentication, PKI, biometrics and federated identity management.
Federated Identity is basically a sort of mash-up of the things that are known about us in different contexts. When describing federated identity, its proponents often point out how driver licences are presented to boot-strap a new relationship. But it is a category error to abstract this case as an example of Federated ID, because while a licence might prove your identity when joining a video store, it does not persist in that relationship. Instead the individual is given a new identity: that of a video store member.
A less trivial example is your identity as an employee. When you sign on, HR might sight your driver licence to make sure they get your legal name correct. But thereafter you carry a company ID badge – your identity in that context. You do not present your driver licence to get in the door at work.
Federated Identity posits, often implicitly, that we only really need one identity. The "Identity 2.0" movement properly stresses the multiplicity of our relationships but it usually seeks to hang all relationships off one ID. The beguiling yet utopian OSCON 2005 presentation by Dick Hardt shows vividly how many ways there are to be known (although Hardt went a step too far when he tried to create a single, albeit fuzzy, uber identity transcending all contexts).
I favor an alternate view - that each of us actually exercises a portfolio of separate identities and that we switch between them in different contexts. This is not an academic distinction; it really makes a big difference where you draw the line on how much you need to know to set a unique identity.
Kim Cameron’s seminal Laws of Identity deliberately promoted the plurality of identity. Cameron included a fresh definition of digital identity as “a set of claims made by one digital subject about itself or another digital subject”. He knew that this relativist definition might be unfamiliar, admitting that it “does not jive with some widely held beliefs – for example that within a given context, identities have to be unique”.
That "widely held belief" seems to be a special product of the computer age. Before the advent of “Identity Management”, we lived happily in a world of plural identities. Each of us could be by turns a citizen, an employee, a chartered professional, a customer, a bank account holder, a credit cardholder, a patient, a club member, another club official, and so on. It was seemingly only after we started getting computer accounts that it occurred to people to think in terms of one "primary" identity threading a number of secondary roles. Conventional Access Control insists on a singular authentication of who I am, followed by multiple authorisations of what I am entitled to do. This principle was laid down by computer scientists in the 1970s.
The idea that we need to establish a true identity before granting access to particular services is unhelpful to many modern online services. Consider the importance of confidentiality in "apomediation" (where people seek medical information from non technical but "expert" patients) and online psychological counselling. Few will enrol in these important new patient-managed healthcare services if they have to identify themselves before providing an alias. Instead, participants in medical social networking will feel strongly that their avatars’ identities in and of themselves are real.
Despite the efforts of Kim Cameron and others, the singular identity paradigm has proved hard to shake. In practice, and despite the plurality in the Laws of Identity, most federated identity formulations actually reuse identities across totally unrelated contexts, in order to conveniently hang multiple roles off the one identity.
The old paradigm also explains the surprisingly easy acceptance of biometrics. The very idea of biometric authentication plays straight into the world view that each user has one “true” identity. Yet these technologies are deeply problematic; in practice their accuracy is disappointing; worse, in the event a biometric is ever stolen, it's impossible with any of today's solutions to cancel and re-issue the identity. Biometrics’ overwhelming intuitive appeal must be based on an idea that what matters in all transactions is the biological person. But it’s not. In most real world transactions, the role is all that matters. Only rarely (such as when investigating fraud) do we go to the forensic extreme of knowing the person.
There are grave risks if we insist on the individual being bodily involved in routine transactions. It would make everything intrinsically linked, violating inherently and irreversibly the most fundamental privacy principle: Don’t collect personal information when it’s not required.
Why are so many people willing to embrace biometrics in spite of their risks and imperfections? It may be because we’ve been inadvertently seduced by the idea of a single identity.
Posted in Identity, Federated Identity, Culture, Biometrics
For all the talk of ecosystems ...
Yet another breathless report crossed my desk via Twitter this morning where the rise of mobile payments is predicted to lead to cards and cash "disappearing", in this case by 2020. Notably, this hyperventilation comes not from a tech vendor but instead from a "research" company.
So I started to wonder why the success of mobile payments (or any other disruptive technology) is so often framed in terms of winner-take-all. Surely we can imagine new payments modalities being super successful without having to see plastic cards and cash disappear? It might just be that press releases and Twitter tend towards polar language. More likely, and not unrelatedly, it's because a lot of people really think this way.
It's especially ironic given how the term "ecosystem" tops most Buzzword Bingo cards these days. If commentators were to actually think ecologically for a minute they'd realise that the extinction of a Family or Order at the hands of another is very rare indeed.
Understanding biometrics and their necessary fallibility
In practice, the most important thing about biometrics is their fallibility. Because of the vagaries of human traits and the way they vary from day to day, biometrics have to cope with the same person appearing a little different each time they front up. Inevitably this means that occasionally a biometric system will confuse one person with another. So what? Well, there are two major foibles of all biometrics that go unmentioned by most vendors:
1. There is an inherent trade off in all biometrics, between their ability to discriminate between different people (specificity) and their ability to properly recognise all users (sensitivity). You can't have it both ways; a system that is very specific will be more inclined to reject a legitimate user, and conversely, a system that never fails to recognise you will also tend to occasionally confuse you with someone else. Yet biometrics vendors often quote their best case False Reject and False Accept figures side by side, as if they're achievable simultaneously.
2. The only way to improve sensitivity and specificity at the same time is to tighten the enrolment and scanning conditions and/or the mathematical models that underpin the algorithms. In other words, to make the systems choosier. This is why really serious biometrics like face recognition for passports and driver licences require stringent lighting conditions and image quality, and why we should be wary of biometrics in mobile devices where there is almost no control over lighting and sound.
Uncertainty accumulates
The least technical criticism of biometrics concerns the fallibility of all measurement methods. Cameras, sensors and microphones – like human eyes and ears – are imperfect, and the ability of a biometric authentication system to distinguish between subtly different people is limited by the precision of the input devices.
Even if the underlying biological traits of interest are truly unique, it does not follow that our machinery will be able to measure them faithfully. Take the iris. This biometric is often promoted with the impressive claim that the probability of two individuals’ iris patterns matching is one in ten to the power of 78. These are literally astronomical odds; there are fewer atoms in the universe than 10-to-the-78. Yet does this figure necessarily tell us how accurate the end-to-end biometric system really is? Consider the fact that there are ten billion stars in the Milky Way. If two people look up in the night sky and each pick a star at random, is the probability of a match one in ten billion? Of course not, because of the limits of our measurement apparatus, in this case the naked eye. Interference too affects the precision of any measurement; the odds of two people in a big city picking the same star might be no better than one in a hundred.
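The star-picking argument can be put as a few lines of arithmetic. This is only a sketch under simplifying assumptions of my own: both picks are uniform and independent, and the measuring device sorts the true states into equally likely bins. The function name and the figure of one million resolvable iris states are purely illustrative, not measured values.

```python
from fractions import Fraction

def match_probability(true_states, resolvable_states):
    """Chance that two independent random picks look identical to the device.

    Assumes uniform picks and equally likely resolvable bins (a simplification).
    The device can never distinguish more states than it can resolve, so the
    effective number of states is the smaller of the two counts.
    """
    effective_states = min(true_states, resolvable_states)
    return Fraction(1, effective_states)

if __name__ == "__main__":
    # Ten billion stars, but a city observer resolves only about a hundred.
    print(match_probability(10**10, 100))  # prints 1/100
    # A trait with 10**78 theoretical states, read by a hypothetical sensor
    # that resolves only a million of them.
    print(match_probability(10**78, 10**6))  # prints 1/1000000
```

However large the theoretical figure, the answer is set by the weaker of the trait and the apparatus.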
The Sensitivity-Specificity tradeoff: False Positives and False Negatives
Biometric authentication entails a long chain of processing steps, all of which are imperfect. Each step introduces a small degree of uncertainty, as shown in the schematic below. Uncertainty is inescapable even before the first processing step, because the body part being measured can never appear exactly the same. The angle and pressure of a finger on a scanner, the distance of a face from a camera, the tone and volume of the voice, the background noise and lighting, the cleanliness of a lens all change from day to day. A biometric system cannot afford to be too sensitive to subtle variations, or else it can fail to recognise its target; a biometric must tolerate variation in the input, and inevitably this means the system can sometimes confuse its target for someone else.

Therefore all biometric systems inevitably commit two types of error:
1. A “False Negative” is when the system fails to recognise someone who is legitimately enrolled. False Negatives arise if the system cannot cope with subtle changes to the person’s features, the way they present themselves to the scanner, slight variations between scanners at different sites, and so on.
2. A “False Positive” is when the system confuses a stranger with someone else who is already enrolled. This may result from the system being rather too tolerant of variability from one day to another, or from site to site.
False Positives and False Negatives are inescapably linked. If we wish to make a given biometric system more specific – so that it is less likely to confuse strangers with enrolled users – then it will inevitably become less sensitive, tending to wrongly reject legitimate enrolled users more often.
The following schematics illustrate how a highly specific biometric system tends to commit more False Negatives, while a highly sensitive system exhibits relatively more False Positives.

A design decision has to be made when implementing biometrics as to which type of error is less problematic. Where stopping impersonation is paramount, such as in a data centre or missile silo, a biometric system would be biased towards false negatives. Where user convenience is rated highly and where the consequences of fraud are not irreversible, as with Automatic Teller Machines, a biometric might be biased more towards false positives. For border control applications, the sensitivity-specificity trade-off is a very difficult problem, with significant downsides associated with both types of error – either immigration security breaches, or long queues of restless passengers.
Any biometric system, in principle at least, can be tuned towards higher sensitivity or higher specificity, depending on the overall desired balance of security versus convenience. The performance at different thresholds is conventionally shown by a "Detection Error Tradeoff" (DET) curve.
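To make the tuning concrete, here is a minimal sketch of how moving a single acceptance threshold trades one error for the other. The two bell-shaped match-score distributions are hypothetical, chosen by me for illustration rather than taken from any real product, and the function names are my own.

```python
from statistics import NormalDist

# Hypothetical match-score distributions (assumptions, not measured data).
GENUINE = NormalDist(mu=0.70, sigma=0.10)   # scores when the right person presents
IMPOSTOR = NormalDist(mu=0.40, sigma=0.10)  # scores when a stranger presents

def error_rates(threshold):
    """Return (FAR, FRR) when any score at or above the threshold is accepted."""
    far = 1.0 - IMPOSTOR.cdf(threshold)  # strangers scoring high enough to pass
    frr = GENUINE.cdf(threshold)         # legitimate users scoring too low
    return far, frr

def det_points(thresholds):
    """Tabulate (threshold, FAR, FRR), i.e. points along a DET curve."""
    return [(t, *error_rates(t)) for t in thresholds]

if __name__ == "__main__":
    for t, far, frr in det_points([0.45, 0.50, 0.55, 0.60, 0.65]):
        print(f"threshold={t:.2f}  FAR={far:.4%}  FRR={frr:.4%}")
```

Raising the threshold pushes FAR down and FRR up, and lowering it does the reverse. In this toy model the two rates cross at a threshold of 0.55, which is the Equal Error Rate point.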
Biometrics vendors tend to keep their DET curves confidential, and usually release commercial solutions where the ratio of False Accept Rate (FAR) to False Reject Rate (FRR) is fixed. The following DET curves are over ten years old but they remain some of the few examples that are publicly available, and they usefully compare several biometric technologies side by side.

Ref: "Biometric Product Testing Final Report" Issue 1.0, 2001 by the UK Government Communications Electronics Security Group (CESG).
Vendors occasionally specify the "Equal Error Rate" for their solutions. It's important to understand what this spec is for. No real world biometric that I'm aware of is deployed with FAR and FRR tuned to be the same. Instead, the EER should be used as a benchmark for broadly comparing different technologies.
EER provides another useful ready reckoner. If a vendor specifies, for example, FAR = 0.0001% and FRR = 0.01%, and yet you find that the EER is, say, 1% -- that is, greater than both the quoted FAR and FRR -- then you know that the vendor is quoting best case figures that cannot be realised simultaneously. Just look at the DET curves above. When the False Accept Rate is 0.1% (i.e. false positives of 1 in 1000), the False Reject Rate ranges from at least 5% to as much as 30%. And we can see that an FAR of 0.0001% is really extreme; for most biometrics, such specificity leads to False Rejects of one in two or worse, rendering the solution unusable.
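The ready reckoner reduces to a one-line test. The sketch below assumes a well-behaved DET curve, where FAR only falls and FRR only rises as the threshold is tightened; in that case every achievable operating point has at least one of its two error rates at or above the EER. The function name is my own, and the figures are the example numbers from the text, in percent.

```python
def could_be_simultaneous(far, frr, eer):
    """Necessary (not sufficient) check that a quoted FAR/FRR pair can co-exist.

    Assumes a monotone DET curve: on one side of the EER point FAR >= EER,
    on the other side FRR >= EER, so max(FAR, FRR) can never fall below EER.
    All three arguments must use the same units (here, percent).
    """
    return max(far, frr) >= eer

if __name__ == "__main__":
    # The vendor figures from the text: both are below the 1% EER.
    print(could_be_simultaneous(far=0.0001, frr=0.01, eer=1.0))  # prints False
    # A pair read off a single operating point might look like this instead.
    print(could_be_simultaneous(far=0.1, frr=5.0, eer=1.0))      # prints True
```

A result of False means the pair cannot have come from one operating point. A result of True only means the pair has not been ruled out.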
Failure To Enrol
Over and above the issues of False Positives and False Negatives is the unfortunate fact that not everyone will be able to enrol in a given biometric authentication system. At its extremes, this reality is obvious: individuals with missing fingers, or a severe speech impediment for example, may never be able to use certain biometrics.
However, failure to enrol has a deeper significance for more normal users. To minimise False Positives and False Negatives at the same time (as illustrated in the next figure), a biometric method generally must tighten requirements on the quality of its input data. A fingerprint scanner for instance will perform better on high definition images, where more fingerprint features can be reliably extracted. If a fingerprint detector sets a relatively stringent cut-off for the quality of the image, then it may not be possible to enrol people who happen to have inherently faint fingerprints, such as the elderly, or those with particular skin conditions.

More subtle still is the effect of modelling assumptions within biometric algorithms. In order to make sense of biological traits, the algorithm has to have certain expectations built into it as to how the features of interest generally appear and how those features vary across the population; after all, it is the quantifiable variation in features which allows for different individuals to be told apart. Therefore, face and voice recognition algorithms in particular might be optimised for the statistical characteristics of certain racial groups or nationalities, making it difficult for people from other groups to be enrolled.
The impossibility of enrolling 100% of the population into any biometric security system has important implications for public policy. Clearly there can be at least the perception of discrimination against certain minority groups, if factors like age, foreign accent, ethnicity, disabilities, and/or medical conditions impede the effectiveness of a biometric system. And careful consideration must be given to what fall-back security provisions will be offered to those who cannot be enrolled. If there is a presumption that a biometric somehow provides superior security, then special measures may be necessary to provide equivalent security for the un-enrolled minority.
Posted in Biometrics
Guilty until proven innocent
Once again, in relation to charges levelled against their own, politicians have claimed that like everyone else, they deserve the presumption of innocence. But the old saw "innocent until proven guilty" is no universal human right. It is merely a corollary of the 18th century Blackstone's Formulation: "Better that ten guilty persons escape than that one innocent suffer".
For persons in positions of trust -- politicians, police officers, customs officers, judges and so on -- different calculations apply. The community cuts public officers less slack, because the consequences of their misconduct are far reaching. When only one bad apple can spoil the barrel, Blackstone's Formulation patently does not apply. It is probably better that 10 innocent politicians (or police officers or airport baggage handlers) lose their jobs than for one wrongdoer to stay in place.
If politicians agree to be held to higher standards than members of the public, then as part of the bargain, they cede the presumption of innocence.
Photo data as crude oil
It's been said that "data is the new oil". The immense stores of Personal Information gifted to Facebook, Google et al by their users are like crude oil reserves: raw material to be tapped, refined, processed and value-added. I'm especially interested in photo data, and the rapid evolution of tools for monetising it. These tools range from embedded metadata in the uploaded photos, through to increasingly sophisticated object recognition and facial recognition algorithms.
Image analysis can extract place names and product names from photos, and recognise objects. It can re-identify faces using biometric templates that users have helpfully created by tagging their friends in entirely unrelated images. Image analysis lets social media companies work out what you're doing, when and where, and who you're doing it with. If Facebook can work out from a photo that you're enjoying a coffee at a recognisable retail outlet, they don't need you to expressly "Like" it. Nor do you have to actively check in to the cafe when most phones tag their photos with geolocation data. Instead, Facebook will automatically file away another little bit of Personal Information, to be melded into the amazingly rich picture they're relentlessly building up.
The ability to extract value from photo data defines a new black-gold rush. Like petroleum engineering, Image Analysis is high tech stuff. There is extraordinary R&D going on in face recognition and object recognition, and the "infomopolies" like Apple, Google and Facebook pay big bucks for IP and startups in this space.
I think there is only one way to look at Facebook's acquisition of Instagram. With 250 million new pictures being added every day, Instagram is like an undeveloped crude oil field. As such, a billion dollars seems like a bargain.
So Facebook's core business isn't all of a sudden photo sharing. It always was and always will be PI refining:


Posted in Social Media, Privacy
Killing two birds with one chip
Last week saw the biggest credit card data breach for a while, with around 1.5 million card numbers being stolen by organised crime from processor Global Payments [updated figures per Global Payments investor conference call, Apr 2nd].
So now there will be another few rounds of debate about how to harden these cardholder databases against criminal infiltration, and whether or not the processor was PCI-DSS compliant. Meanwhile, stolen card numbers can be replayed with impunity and all the hapless customers can do is monitor their accounts for suspicious activity -- which can occur years later.
These days, the main use for stolen payment card data is Card Not Present (CNP) fraud. Traditional "carding" -- where data stolen by skimming is duplicated onto blank mag stripe cards to fool POS terminals or ATMs -- has been throttled in most places by Chip-and-PIN, leaving CNP as organised crime's preferred modus operandi. CNP fraud now makes up three quarters of all card fraud in markets like Australia, and is growing at 40-50% p.a.
All card fraud exploits a specific weakness in the "Four Party" card settlement system shown below. The model is decades old, and remains the foundation of internationally interoperable cards. In a triumph of technology neutrality, the four party arrangement was unchanged by the advent of e-commerce. The one problem with the system is that merchants accepting card numbers may be vulnerable to stolen numbers. Magnetic stripe terminals and Internet servers are unable to tell original cardholder data from copies replayed by fraudsters.

The most important improvement to the payments system was and still is to make card numbers non-replayable. Chip-and-PIN stops carding thanks to cryptographic processes implemented in hardware (the chip) where they cannot be tampered with, and where the secret keys that criminals would need are inaccessible. In essence, a Chip-and-PIN card encrypts customer data within the secure chip (actually, digitally signs it) using keys that never leave the confines of the integrated circuit. Even if a criminal obtains the card holder data, they are unable to apply the additional cryptographic transformations to create legible EMV card-present transactions. This is how Chip-and-PIN stemmed skimming and carding.

CNP fraud is just online carding, fuelled by industrial scale theft of customer records by organised crime, like the recent Global Payments episode. While the PCI-DSS regime reduces accidental losses and amateur attacks, it remains powerless to stop determined criminals, let alone corrupt insiders. When card numbers are available by the tens of millions, and worth several dollars each ($25 or more for platinum cards) truly nothing can stop them from being purloined.
The best way to tackle CNP fraud is to leverage the same hardware based cryptography that prevents skimming and carding.

Lockstep Technologies has developed and proven such a solution. Our award winning Stepwise digitally signs CNP transactions within an EMV chip, rendering card details sent to the merchant non-replayable. The merchant server checks a Stepwise CNP transaction using standard public key libraries; a valid Stepwise transaction can only have come from a genuine Chip-and-PIN card under the control of its holder.
All serious transaction and payments systems use hardware cryptography. The classic examples include mobile telephones' SIM cards, EMV chips, the Hardware Security Modules mandated by financial regulators in all ATMs, and the "secure elements" of NFC devices. With well designed hardware security, we gain a robust upper hand in the cybercrime arms race. So let's stop struggling with flabby distracting systems like 3D Secure, and let's stop pretending that PCI-DSS audits will stop organised crime getting hold of card numbers by the million. Instead, let's kill two birds with one stone and use chips to secure both card present and CNP transactions.
Stepwise creates uniquely secure, fast and easy-to-use CNP payments. It has zero impact on the security certifications of digital signature capable EMV chips, and zero impact on existing four party card processing arrangements.
For more details, please see http://lockstep.com.au/technologies/stepwise.
Posted in Smartcards, Payments, Fraud
CNP fraud is just online carding
I recently posted the latest Card Not Present fraud figures for Australia. Technologically, CNP fraud is not a novel problem. We already have the tools and the cardholder habits to solve the CNP problem. We should look at the experience of skimming and carding, which was another tech problem that demanded a smart tech solution.
Card Not Present fraud is simply online carding.
A magnetic stripe card keeps the cardholder's details as a string of ones and zeroes, stored in the clear, and presents that string to a POS terminal or ATM. It's easy for a criminal to scan the ones and zeroes and copy them to a blank card.
In general terms, EMV or Chip-and-PIN cards work by encrypting those ones and zeros in the chip so they can only be correctly decoded by the terminal equipment. In reality the explanation is somewhat more complex, involving asymmetric cryptography, but for the purposes of explaining the parallel between skimming/carding and CNP fraud, we can skip the details. The salient point is that EMV cards prevent carding by using encryption inside the secure chip using keys that cannot be tampered with or substituted by an attacker.
As with mag stripe cards, conventional Card Not Present transactions transmit cleartext cardholder data, this time to a merchant server. On its own, a server cannot tell the difference between the original data and a copy, just as a POS terminal cannot tell an original bank-issued card from a criminal's copy.
Lockstep Technologies was first to see the parallel between skimming/carding and CNP fraud. Our solution "Stepwise" uses the same cryptographic technology in chip cards that prevents carding to digitally sign transactions created at a browser or mobile device. Stepwise signatures can be verified at any merchant server, using standard built-in software libraries and a widely distributed "master key".
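For readers who like to see the principle in code, here is a toy sketch of why a signed transaction is non-replayable where a bare card number is not. It is emphatically not the Stepwise protocol or EMV. The tiny RSA key, the merchant-issued nonce, the message layout and all the names are my own assumptions for illustration, and the key is far too small for real use.

```python
import hashlib
import secrets

# Toy RSA key standing in for the key pair held inside a chip (illustrative only).
P = 2**61 - 1                        # a Mersenne prime
Q = 2**89 - 1                        # another Mersenne prime
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent; never leaves the "chip"

def _digest(message: bytes) -> int:
    """Hash the message and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def chip_sign(card_number: str, amount_cents: int, nonce: str) -> int:
    """Runs inside the card: sign the card data bound to this one transaction."""
    message = f"{card_number}|{amount_cents}|{nonce}".encode()
    return pow(_digest(message), D, N)

class Merchant:
    """Holds only the public key (N, E) and remembers the nonces it has issued."""

    def __init__(self):
        self._open_nonces = set()

    def new_nonce(self) -> str:
        nonce = secrets.token_hex(16)
        self._open_nonces.add(nonce)
        return nonce

    def accept(self, card_number, amount_cents, nonce, signature) -> bool:
        if nonce not in self._open_nonces:
            return False             # unknown or already used nonce: a replay
        message = f"{card_number}|{amount_cents}|{nonce}".encode()
        if pow(signature, E, N) != _digest(message):
            return False             # signature does not match this transaction
        self._open_nonces.discard(nonce)  # each nonce is good for one purchase
        return True

if __name__ == "__main__":
    merchant = Merchant()
    card = "4111111111111111"        # placeholder test number, not a real card
    nonce = merchant.new_nonce()
    signature = chip_sign(card, 2500, nonce)
    print(merchant.accept(card, 2500, nonce, signature))  # genuine transaction
    print(merchant.accept(card, 2500, nonce, signature))  # same data replayed
```

The first call succeeds and the replay fails. A thief who holds the card number and even an old signature still cannot produce a fresh signature for a new nonce, because the private exponent never leaves the chip.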

I presented the Stepwise solution to the Payments Innovation stream at Cards & Payments Australia 2012 last week. The presentation is available here.
See also technical details here and a live demo on the ABC TV "New Inventors" program.
Posted in Smartcards, Payments, Fraud
Card Not Present now three quarters of all fraud
The Australian Payments Clearing Association (APCA) releases card fraud statistics every six months for the preceding 12-month period. Lockstep monitors these figures, condenses them and plots the trend data.
Here's the latest picture of Australian payment card fraud in three major categories over the past six financial years.

Card fraud by skimming and counterfeiting is holding steady, thanks to the security of EMV chip-and-PIN cards. Card Not Present (CNP) fraud is the preferred modus operandi of organised crime, and continues to grow unabated. The increase in CNP fraud from last financial year was 46%; CNP now represents 71% -- or nearly three quarters -- of total annual card fraud.
What's to be done about this never ending problem?
- The credit card associations' flagship online payment protocol "3D Secure", rolled out selectively and tentatively overseas, is loathed by customers and merchants alike. 3D Secure is virtually unknown in Australia.
- There have been various attempts to stem the tide of stolen cardholder details that fuels CNP fraud. Examples include 'big iron' software changes like "Tokenization" and the PCI-DSS security audit regime, which has proven expensive and largely futile. Arguments raged over whether Heartland Payments Systems (which suffered the world's biggest card data theft in 2009) was "really" PCI-DSS compliant. It's become so arbitrary that by the time the Sony PSN was breached last year with the loss of up to 70 million credit cards (nobody really knows how many) the question of whether Sony was PCI compliant never even came up.
- Or we could get smart and exploit the same cryptographic security that allows chip cards to stop skimming, to protect cardholder details between the user's device and the merchant server. See Lockstep Technologies' award winning Stepwise CNP security solution.
Ski runs and LOAs
In Identity Management, Levels of Assurance are an attempt to standardise the riskiness of online transactions and the commensurate authentication strength needed to secure them. Quaternary LOAs (levels 1/2/3/4) have been instituted by governments in the USA, Australia and elsewhere, and they're a cornerstone of federated identity programs like NSTIC.
All LOA formulations are based on risk management methodologies like the international standard ISO 31000. The common approach is for organisations to assess both the impact and expected likelihood of all important adverse events (threats) using metrics customised to the local business conditions and objectives. The severity of security threats can be calculated in all sorts of ways. Some organisations can put a dollar price on the impact of a threat; others look at qualitative or political effects. And the capacity to cover the downside means that the same sort of incident might be thought "minor" at a big pharmaceutical company but "catastrophic" at a small Clinical Research Organisation.
I've blogged before that one problem with LOAs is that risk ratings aren't transferable. Risk management standards like ISO 31000 are formulated for internal customised use, so their results are not inherently meaningful between organisations.
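A few lines of code show how the same incident earns different labels in different places. This is a sketch only: the rating bands, the dollar figures and the idea of scoring impact as a share of an organisation's loss capacity are hypothetical choices of mine, not anything prescribed by ISO 31000.

```python
def impact_rating(loss, loss_capacity):
    """Rate an incident by the share of the organisation's loss capacity it consumes.

    The bands below are hypothetical; each organisation would set its own.
    """
    share = loss / loss_capacity
    if share < 0.01:
        return "minor"
    if share < 0.10:
        return "moderate"
    if share < 0.50:
        return "major"
    return "catastrophic"

if __name__ == "__main__":
    incident_loss = 2_000_000  # the same hypothetical incident in both cases
    # A big pharmaceutical company able to absorb billions.
    print(impact_rating(incident_loss, 5_000_000_000))  # prints minor
    # A small Clinical Research Organisation with a few million in reserve.
    print(impact_rating(incident_loss, 3_000_000))      # prints catastrophic
```

The label that comes out depends as much on who is doing the rating as on the incident itself, which is why a rating issued by one organisation tells another so little.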
Just look at another type of risk rating: the colours of ski runs.
All ski resorts around the world badge the degree of difficulty of their runs the same way: Green, Blue, Black and sometimes Double Black. But do these labels mean anything between resorts? Is a Blue run at Aspen the same as a Blue at Thredbo? No. These colours are not like currency, so skiers are free to boast "that Black isn't nearly as tough as the Black I did last week".
LOAs are just like this. They're local. They're based on risk metrics (and risk appetites) that are not uniform across organisations. They cannot interoperate.
As far as I am aware, there are as yet no examples of LOA 3 or 4 credentials issued by one IdP being relied on by external Service Providers. When there's a lot at stake, organisations prefer to use their own identities and risk management processes. And it's the same with skiing. A risk averse skier at the top of a Black run needs more than the pat assurance of others; they will make up their own mind about the risk of going down the hill.
Posted in Language, Federated Identity
A penny for your marketable thoughts?
Most people think that Apple's Siri is the coolest thing they've ever seen on a smart phone. It certainly is a milestone in practical human-machine interfaces, and will be widely copied. The combination of deep search plus natural language processing (NLP) plus voice recognition is dynamite.
And Siri also marks a new milestone in privacy invasion. I predict Siri will become the poster girl for PII piracy, the exemplar of the sly bargain for Personal Information at the heart of most social media.
If you haven't had the pleasure ... Siri is a wondrous new function built into the latest iPhone. It’s the state-of-the-art in artificial intelligence and NLP. You speak directly to Siri, ask her questions (yes, she's female) and tell her what to do with many of your other apps. Siri integrates with mail, text messaging, maps, search, weather, calendar and so on. Ask her "Will I need an umbrella in the morning?" and she'll look up the weather for you – after checking your calendar to see what city you’ll be in tomorrow. It's amazing.
Natural Language Processing is a fabulous idea of course. It radically improves the usability of smart phones, and even their safety with much improved hands-free operation.
An important technical detail is that NLP is very demanding on computing power. In fact it's beyond the capability of today's smart phones, even if each of them alone is more powerful than all of NASA's computers in 1969! So all Siri's hard work is actually done on Apple's mainframe computers scattered around the planet. That is, all your interactions with Siri are sent into the cloud.
Imagine Siri was a human personal assistant. Imagine she's looking after your diary, placing calls for you, booking meetings, planning your travel, taking dictation, sending emails and text messages for you, reminding you of your appointments, even your significant other’s birthday. She's getting to know you all the while, learning your habits, your preferences, your personal and work-a-day networks.
And she's free!
Now, wouldn't the offer of a free human PA strike you as too good to be true?
Indeed it would. So realise this about Siri: she's continuously reporting back to Apple about your every move. If Apple were a PA placement agency, what they get in return for the free secretarial services is a full transcript of all you've said, everyone you've been in touch with, everything you've done. Apple won't say what they plan to do with all this data, how long they'll keep it, nor who they'll share it with. Apple's Privacy Policy (dated October 2011, accessed 12 March 2012) doesn't even mention Siri nor the collection of the voice-to-text data.
When you dictate your mails and text messages to Siri, you’re providing Apple with content that's usually off limits to carriers, phone companies and ISPs. Siri is an end run around telecommunications intercept laws.
Of course there are many, many examples of where free social media apps mask a commercial bargain. Face recognition is the classic case. It was first made available on photo sharing sites as a neat way to organise one’s albums, but then Facebook went further by inviting photo tags from users and then automatically identifying people in other photos on others' pages. What's happening behind the scenes is that Facebook is running its face recognition templates over the billions of photos in its databases (which were originally uploaded for personal use long before face recognition was deployed). Given their business model and their track record, we can be certain that Facebook is using face recognition to identify everyone they possibly can, and thence work out fresh associations between countless people and situations accidentally caught on camera. Combine this with image processing and visual search technology (like Google's "Goggles") and the big social media companies have an incredible new eye in the sky. They can work out what we're doing, when, where and with whom. Nobody will need to expressly "like" anything anymore when Facebook can see what cars we're driving, what brands we're wearing, where we spend our vacations, what we're eating, what makes us laugh. Apple, Facebook and others have understandably invested hundreds of millions of dollars in image recognition start-ups and intellectual property; with these tools they convert the hitherto anonymous image collections in Picasa, Flickr and the like into content-addressable PII gold mines. It's the next frontier of Big Data.
Now, there wouldn't be much wrong with these sorts of arrangements if the social media corporations were up-front about them. In their Privacy Policies they should detail what Personal Information they are extracting and collecting from all the voice and image data; they should explain why they collect this information, what they plan to do with it, how long they will retain it, and how they promise to limit secondary usage. They should explain that biometrics technology allows them to generate brand new PII out of members' snapshots and utterances. And they should acknowledge that by rendering data identifiable, they become accountable in many places under privacy and data protection laws for its safekeeping as PII. It's just not good enough to vaguely reserve their rights to "use personal information to help us develop, deliver, and improve our products, services, content, and advertising". They should treat their customers -- and all those innocents about whom they collect PII indirectly -- with proper respect, and stop pretending that 'service improvement' is what they're up to.
Siri along with face recognition herald a radical new type of privatised surveillance, and on a breathtaking scale. While Facebook stealthily "x-rays" photo albums without consent, Apple now has even more intimate access to our daily routines and personal habits. And they don’t even pay as much as a penny for our thoughts.
As cool as Siri may be, I myself will decline to use any natural language processing while the software runs in the cloud, and while the service providers refuse to restrain their use of my voice data. I'll wait for NLP to be done on my device with my data kept private.
And I'd happily pay cold hard cash for that kind of app, instead of having an infomopoly embed itself in my personal affairs.
Posted in Social Networking, Social Media, Privacy, Language, Biometrics