In 2002, a couple of Japanese visitors to Australia swapped passports with each other before walking through an automatic biometric border control gate being tested at Sydney airport. The facial recognition algorithm falsely matched each of them to the other's passport photo. These gentlemen were in fact part of an international aviation industry study group and were in the habit of trying to fool biometric systems then being trialed around the world.
When I heard about this successful prank, I quipped that the algorithms were probably written by white people - because we think all Asians look the same. Colleagues thought I was making a typical sick joke, but actually I was half-serious. It did seem to me that the choice of facial features thought to be most distinguishing in a facial recognition model could be culturally biased.
Since that time, border control face recognition has come a long way, and I have not heard of such errors for many years. Until today.
The San Francisco Chronicle of July 21 carries a front page story about the cloud storage services of Google and Flickr mislabeling some black people as gorillas (see updated story, online). It's a quite incredible episode. Google has apologized. Its Chief Architect of social, Yonatan Zunger, seems mortified judging by his tweets as reported, and is investigating.
The newspaper report quotes machine learning experts who suggest programmers with limited experience of diversity may be to blame. That seems plausible to me, although I wonder where exactly the algorithm R&D gets done, and how much control is to be had over the biometric models and their parameters along the path from basic research to application development.
So man has literally made software in his own image.
The public is now being exposed to Self Driving Cars, which are heavily reliant on machine vision, object recognition and artificial intelligence. If this sort of software can't tell people from apes in static photos given lots of processing time, how does it perform in real time, with fleeting images, subject to noise, and with much greater complexity? It's easy to imagine any number of real life scenarios where an autonomous car will have to make a split-second decision between two pretty similar looking objects appearing unexpectedly in its path.
The general expectation is that Self Driving Cars (SDCs) will be tested to exhaustion. And so they should. But if cultural partiality is affecting the work of programmers, it's possible that testers suffer the same blind spots without knowing it. Maybe the offending photo labeling programs were never verified with black people. So how are the test cases for SDCs being selected? What might happen when an SDC ventures into environments and neighborhoods where its programmers have never been?
Everybody in image processing and artificial intelligence should be humbled by the racist photo labeling. With the world being eaten by software, we need to reflect really deeply on how such design howlers arise. And frankly double check if we're ready to let computer programs behind the wheel.
If the digital economy is really the economy then it's high time we moved beyond hoping that we can simply train users to be safe online. Is the real economy only for heroes who can protect themselves in the jungle, writing their own code, as if they're carrying their own guns? Or do we as a community build structures and standards and insist on technologies that work for all?
For most people, the World Wide Web experience is still a lot like watching cartoons on TV. The human-machine interface is almost the same. The images and actions are just as synthetic; crucially, nothing on a web browser is real. Almost anything goes -- just as the Roadrunner defies gravity in besting Coyote, there are no laws of physics that temper the way one bit of multimedia leads to the next. Yes, there is a modicum of user feedback in the way we direct some of the action when browsing and e-shopping, but it's quite illusory; for the most part all we're really doing is flicking channels across a billion pages.
It's the suspension of disbelief when browsing that lies at the heart of many of the safety problems we're now seeing. Inevitably we lose our bearings in the totally synthetic World Wide Web. We don't even realise it, we're taken in by a virtual reality, and we become captive to social engineering.
But I don't think it's possible to tackle online safety by merely countering users' credulity. Education is not the silver bullet, because the Internet is really so technologically complex and abstract that it lies beyond the comprehension of most lay people.
Using the Internet 'safely' today requires deep technical skills, comparable to the level of expertise needed to operate an automobile circa 1900. Back then you needed to be able to do all your own mechanics [roughly akin to the mysteries of maintaining anti-virus software], look after the engine [i.e. configure the operating system and firewall], navigate the chaotic emerging road network [there's yet no trusted directory for the Internet, nor any road rules], and even figure out how to fuel the contraption [consumer IT supply chains are about as primitive as the gasoline industry was 100 years ago]. The analogy with the early car industry becomes especially sharp for me when I hear utopian open source proponents argue that writing one's own software is the best way to be safe online.
The Internet is so critical (I'd have thought this was needless to say) that we need ways of working online that don't require us to all be DIY experts.
I wrote a first draft of this blog six years ago, and at that time I called for patience in building digital literacy and sophistication. "It took decades for safe car and road technologies to evolve, and the Internet is still really in its infancy" I said in 2009. But I'm less relaxed about this now, on the brink of the Internet of Things. It's great that policy makers like the US FTC are calling on connected device makers to build in security and privacy, but I suspect the Internet of Things will require the same degree of activist oversight and regulation as does the auto industry, for the sake of public order and the economy. Do we have the appetite to temper breakneck innovation with safety rules?
A letter to the editor of The Saturday Paper, published Nov 15, 2014.
In his otherwise fresh and sympathetic “Web of abuse” (November 8-14), Martin McKenzie-Murray unfortunately concludes by focusing on the ability of victims of digital hate to “[rationally] assess their threat level”. More to the point, symbolic violence is still violent. The threat of sexual assault by men against women is inherently terrifying and damaging, whether it is carried out or not. Any attenuation of the threat of rape dehumanises all of us.
There’s a terrible double standard among cyber-libertarians. When good things happen online – such as the Arab Spring, WikiLeaks, social networking and free education – they call the internet a transformative force for good. Yet they can play down digital hate crimes as “not real”, and disown their all-powerful internet as just another communications medium.
Stephen Wilson, Five Dock, NSW.
The "Right to be Forgotten" debate reminds me once again of the cultural differences between technology and privacy.
On September 30, I was honoured to be part of a panel discussion hosted by the IEEE on RTBF; a recording can be viewed here. In a nutshell, the European Court of Justice has decided that European citizens have the right to ask search engine businesses to suppress links to personal information, under certain circumstances. I've analysed and defended the aims of the ECJ in another blog.
One of the IEEE talking points was why RTBF has attracted so much scorn. My answer was that some critics appear to expect perfection in the law; when they look at the RTBF decision, all they see is problems. Yet nobody thinks this or any law is perfect; the question is whether it helps improve the balance of rights in a complex and fast changing world.
It's a little odd that technologists in particular are so critical of imperfections in the law, when they know how flawed technology is. Indeed, the security profession is almost entirely concerned with patching problems, and reminding us there will never be perfect security.
Of course there will be unwanted side-effects of the new RTBF rules and we should trust that over time these will be reviewed and dealt with. I wish that privacy critics could be more humble about this unfolding environment. I note that when social conservatives complain about online pornography, or when police decry encryption as a tool of criminals, technologists typically play those problems down as the unintended consequences of new technologies, which on average overwhelmingly do good not evil.
And it's the same with the law. It really shouldn't be necessary to remind anyone that laws have unintended consequences, for they are the stuff of the entire genre of courtroom drama. So everyone take heart: the good guys nearly always win in the end.
It's long been said that if you're getting something for free online, then you're not the customer, you're the product. It's a reference to the one-sided bargain for personal information that powers so many social businesses - the way that "infomopolies" as I call them exploit the knowledge they accumulate about us.
Now it's been revealed that we're even lower than product: we're lab rats.
Facebook data scientist Adam Kramer, with collaborators from UCSF and Cornell, this week reported on a study in which they tested how Facebook users respond psychologically to positive and negative posts. Their experimental technique is at once ingenious and shocking. They took the real life posts of nearly 700,000 Facebook members, and manipulated them, turning them slightly up- or down-beat. And then Kramer et al. measured the emotional tone in how people reading those posts reacted in their own feeds. See Experimental evidence of massive-scale emotional contagion through social networks, Adam Kramer, Jamie Guillory & Jeffrey Hancock, in Proceedings of the National Academy of Sciences, v111.24, 17 June 2014.
The resulting scandal has been well-reported by many, including Kashmir Hill in Forbes, whose blog post nicely covers how the affair has unfolded, and includes a response by Adam Kramer himself.
Plenty has been written already about the dodgy (or non-existent) ethics approval, and the entirely contemptible claim that users gave "informed consent" to have their data "used" for research in this way. I draw attention to the fact that consent forms in properly constituted human research experiments are famously thick. They go to great pains to explain what's going on, the possible side effects and potential adverse consequences. The aim of a consent form is to leave the experimental subject in no doubt whatsoever as to what they're signing up for. Contrast this with the Facebook Experiment where they claim informed consent was represented by a fragment of one sentence buried in thousands of words of the data usage agreement. And Kash Hill even proved that the agreement was modified after the experiment started! These are not the actions of researchers with any genuine interest in informed consent.
I was also struck by Adam Kramer's unvarnished description of their motives. His response to the furore (provided by Hill in her blog) is, as she puts it, tone deaf. Kramer makes no attempt whatsoever at a serious scientific justification for this experiment:
- "The reason we did this research is because we care about the emotional impact of Facebook and the people that use our product ... [We] were concerned that exposure to friends’ negativity might lead people to avoid visiting Facebook.
That is, this large scale psychological experiment was simply for product development.
Some apologists for Facebook countered that social network feeds are manipulated all the time, notably by advertisers, to produce emotional responses.
Now that's interesting, because for their A-B experiment, Kramer and his colleagues took great pains to make sure the subjects were unaware of the manipulation. After all, the results would be meaningless if people knew what they were reading had been emotionally fiddled with.
In contrast, the ad industry has always insisted that today's digital consumers are super savvy, and they know the difference between advertising and real-life. Yet the foundation of the Facebook experiment is that users are unaware of how their online experience is being manipulated. The ad industry's illogical propaganda [advertising is just harmless fun, consumers can spot the ads, they're not really affected by ads all that much ... Hey, wait a minute] has only been further exposed by the Facebook Experiment.
Advertising companies and Social Networks are increasingly expert at covertly manipulating perceptions, and now they have the data, collected dishonestly, to prove it.
For the past year, oncologists at the Memorial Sloan Kettering Cancer Center in New York have been training IBM’s Watson – the artificial intelligence tour-de-force that beat all comers on Jeopardy – to help personalise cancer care. The Center explains that "combining [their] expertise with the analytical speed of IBM Watson, the tool has the potential to transform how doctors provide individualized cancer treatment plans and to help improve patient outcomes". Others are speculating already that Watson could "soon be the best doctor in the world".
I have no doubt that when Watson and things like it are available online to doctors worldwide, we will see overall improvements in healthcare outcomes, especially in parts of the world now under-serviced by medical specialists [having said that, the value of diagnosing cancer in poor developing nations is questionable if they cannot go on to treat it]. As with Google's self-driving car, we will probably get significant gains eventually, averaged across the population, from replacing humans with machines. Yet some of the foibles of computing are not well known and I think they will lead to surprises.
For all the wondrous gains made in Artificial Intelligence, where Watson is now the state of the art, A.I. remains algorithmic, and for that reason it has inherent limitations that don't get enough attention. Computer scientists and mathematicians have known for generations that some surprisingly straightforward problems resist computation. The Halting Problem provably has no algorithmic solution at all, and for the Travelling Salesperson Problem no efficient algorithm is known. If such simply stated challenges defeat or strain algorithms, we need to be more sober in our expectations of computerised intelligence.
A key limitation of any programmed algorithm is that it must make its decisions using a fixed set of inputs that are known and fully characterised (by the programmer) at design time. If you spring an unexpected input on any computer, it can fail, and yet that's what life is all about -- surprises. No mathematician seriously claims that what humans do is somehow magic; most believe we are computers made of meat. Nevertheless, when paradoxes like the Halting Problem abound, we can be sure that computing and cognition are not what they seem. We should hope these conundrums are better understood before putting too much faith in computers doing deep human work.
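The Halting Problem mentioned above can be made concrete with a minimal Python sketch (my illustration, not from the original post) of the classic diagonal argument: any candidate halting oracle, however clever, can be defeated by a program built specifically to contradict it.

```python
def naive_halts(program, arg):
    # A hypothetical halting "oracle". This concrete stand-in simply
    # guesses that everything halts; the argument below works against
    # ANY implementation, not just this naive one.
    return True

def contrarian(program):
    # Do the opposite of whatever the oracle predicts about
    # running `program` on its own source.
    if naive_halts(program, program):
        while True:  # loop forever, contradicting the oracle
            pass
    else:
        return  # halt immediately, contradicting the oracle

# We can't actually call contrarian(contrarian) here without hanging,
# but we can inspect the oracle's self-defeating prediction:
prediction = naive_halts(contrarian, contrarian)
# If prediction is True, contrarian(contrarian) loops forever;
# if False, it halts at once. Either way the oracle is wrong.
print(prediction)
```

Whatever logic `naive_halts` contains, `contrarian` inverts its verdict, so no general halting oracle can exist. The function names here are of course invented for the sketch.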
And yet, predictably, futurists are jumping ahead to imagine "Watson apps" in which patients access the supercomputer for themselves. Even if there were reliable algorithms for doctoring, I reckon the "Watson app" is a giant step, because of the complex way the patient's conditions are assessed and data is gathered for the diagnosis. That is, the taking of the medical history.
In these days of billion dollar investments in electronic health records (EHRs), we tend to think that medical decisions are all about the data. When politicians announce EHR programs they often boast that patients won't have to go through the rigmarole of giving their history over and over again to multiple doctors as they move through an episode of care. This is actually a serious misunderstanding of the importance in clinical decision-making of the interaction between medico and patient when the history is taken. It's subtle. The things a patient chooses to tell, the things they seem to be hiding, and the questions that make them anxious, all guide an experienced medico when taking a history, and provide extra cues (metadata if you will) about the patient’s condition.
Now, Watson may well have the ability to navigate this complexity and conduct a very sophisticated Q&A. It will certainly have a vastly bigger and more reliable memory of cases than any doctor, and with that it can steer a dynamic patient questionnaire. But will Watson be good enough to be made available direct to patients through an app, with no expert human mediation? Or will a host of new input errors result from patients typing their answers into a smart phone or speaking into a microphone, without any face-to-face subtlety (let alone human warmth)? It was true of mainframes and it’s just as true of the best A.I.: Bulldust in, bulldust out.
Finally, Watson's existing linguistic limitations are not to be underestimated. It is surely not trivial that Watson struggles with puns and humour. Futurist Mark Pesce when discussing Watson remarked in passing that scientists don’t understand the "quirks of language and intelligence" that create humour. The question of what makes us laugh does in fact occupy some of the finest minds in cognitive and social science. So we are a long way from being able to mechanise humour. And this matters because for the foreseeable future, it puts a great deal of social intercourse beyond AI's reach.
In between the extremes of laugh-out-loud comedy and a doctor’s dry written notes lies a spectrum of expressive subtleties, like a blush, an uncomfortable laugh, shame, and the humiliation that goes with some patients’ lived experience of illness. Watson may understand the English language, but does it understand people?
Watson can answer questions, but good doctors ask a lot of questions too. When will this amazing computer be able to hold the sort of two-way conversation that we would call a decent "bedside manner"?
Have a disruptive technology implementation story? Get recognised for your leadership. Apply for the 2014 SuperNova Awards for leaders in disruptive technology.
My Constellation Research colleague Alan Lepofsky has been working on new ways to characterise users in cyberspace. Frustrated with the oversimplified cliche of the "Digital Millennials", Alan has developed a fresh framework for categorizing users according to their comfort with technology and their actual knowledge of it. See his new research report "Segmenting Audiences by Digital Proficiency".
This sort of schema could help frame the answers to some vital open questions. In today's maelstrom of idealism and hyperbole, we're struggling to predict how things are going to turn out, and to build appropriate policies and management structures. We are still guessing how the digital revolution is really going to change the human condition. We're not yet rigorously measuring the sorts of true changes, if any, that the digital transformation is causing.
We hold such disparate views about cyberspace right now. When the Internet does good – for example through empowering marginalized kids at schools, fueling new entrepreneurship, or connecting disadvantaged communities – it is described as a power for good, a true "paradigm shift". But when it does bad – as when kids are bullied online or when phishing scams hook inexperienced users – then the Internet is said to be just another communications medium. Such inconsistent attitudes are with us principally because the medium is still so new. Yet we all know how important it is, and that far reaching policy decisions are being made today. So it’s good to see new conceptual frameworks for analyzing the range of ways that people engage with and utilise the Internet.
Vast fortunes are being made through online business models that purport to feed a natural hunger to be social. With its vast reach and zero friction, the digital medium might radically amplify aspects of the social drive, quite possibly beyond what nature intended. As supremely communal beings, we humans have evolved elaborate social bearings for getting on in diverse groups, and we've built social conventions that govern how we meet, collaborate, refer, partner, buy and sell, amalgamate, marry, and split. We are incredibly adept at reading body language, spotting untruths, and gaming each other for protection or for personal advantage. In cyberspace, few of the traditional cues are available to us; we literally lose our bearings online. And therefore naive Internet users fall prey to spam, fake websites and all manner of scams.
How are online users adapting to their new environment and evolving new instincts? I expect there will be interesting correlations between digital resilience and the sophistication measures in Alan’s digital proficiency framework. We might expect Digital Natives to be better equipped inherently to detect and respond to online threats, although they might be somewhat more at risk by virtue of being more active. I wonder too if the risk-taking behavior which exacerbates some online risks for adolescents would be relatively more common amongst Digital Immigrants? By the same token, the Digital Skeptics who are knowledgeable yet uncomfortable may be happy staying put in that quadrant, or only venturing out for selected cyber activities, because they’re consciously managing their digital exposure.
We certainly do need new ways like Alan's Digital Proficiency Framework to understand society’s complex "Analog to Digital" conversion. I commend it to you.
This article was first published as
Wilson, S. (2013). "What happened to privacy?" Issues, 104, 41-44.
Wilson, S. (2013). "What happened to privacy?" Australasian Science, Dec 2013, 30-33.
The cover of Newsweek magazine on 27 July 1970 featured a cartoon couple cowered by computer and communications technology, and the urgent all-caps headline “IS PRIVACY DEAD?”
Four decades on, Newsweek is dead, but we’re still asking the same question.
Every generation or so, our notions of privacy are challenged by a new technology. In the 1880s (when Warren and Brandeis developed the first privacy jurisprudence) it was photography and telegraphy; in the 1970s it was computing and consumer electronics. And now it’s the Internet, a revolution that has virtually everyone connected to everyone else (and soon everything) everywhere, and all of the time. Some of the world’s biggest corporations now operate with just one asset – information – and a vigorous “publicness” movement rallies around the purported liberation of shedding what are said by writers like Jeff Jarvis (in his 2011 book “Public Parts”) to be old fashioned inhibitions. Online Social Networking, e-health, crowd sourcing and new digital economies appear to have shifted some of our societal fundamentals.
However the past decade has seen a dramatic expansion of countries legislating data protection laws, in response to citizens’ insistence that their privacy is as precious as ever. And consumerized cryptography promises absolute secrecy. Privacy has long stood in opposition to the march of invasive technology: it is the classical immovable object met by an irresistible force.
So how robust is privacy? And will the latest technological revolution finally change privacy forever?
Soaking in information
We live in a connected world. Young people today may have grown tired of hearing what a difference the Internet has made, but a crucial question is whether relatively new networking technologies and sheer connectedness are exerting novel stresses to which social structures have yet to adapt. If “knowledge is power” then the availability of information probably makes individuals today more powerful than at any time in history. Search, maps, Wikipedia, Online Social Networks and 3G are taken for granted. Unlimited deep technical knowledge is available in chat rooms; universities are providing a full gamut of free training via Massive Open Online Courses (MOOCs). The Internet empowers many to organise in ways that are unprecedented, for political, social or business ends. Entirely new business models have emerged in the past decade, and there are indications that political models are changing too.
Most mainstream observers still tend to talk about the “digital” economy but many think the time has come to drop the qualifier. Important services and products are, of course, becoming inherently digital and whole business categories such as travel, newspapers, music, photography and video have been massively disrupted. In general, information is the lifeblood of most businesses. There are countless technology-billionaires whose fortunes have been made in industries that did not exist twenty or thirty years ago. Moreover, some of these businesses only have one asset: information.
Banks and payments systems are getting in on the action, innovating at a hectic pace to keep up with financial services development. There is a bewildering array of new alternative currencies like Linden dollars, Facebook Credits and Bitcoins – all of which can be traded for “real” (reserve bank-backed) money in a number of exchanges of varying reputation. At one time it was possible for Entropia Universe gamers to withdraw dollars at ATMs against their virtual bank balances.
New ways to access finance have arisen, such as peer-to-peer lending and crowd funding. Several so-called direct banks in Australia exist without any branch infrastructure. Financial institutions worldwide are desperate to keep up, launching amongst other things virtual branches and services inside Online Social Networks (OSNs) and even virtual worlds. Banks are of course keen to not have too many sales conducted outside the traditional payments system where they make their fees. Even more strategically, banks want to control not just the money but the way the money flows, because it has dawned on them that information about how people spend might be even more valuable than what they spend.
Privacy in an open world
For many of us, on a personal level, real life is a dynamic blend of online and physical experiences. The distinction between digital relationships and flesh-and-blood ones seems increasingly arbitrary; in fact we probably need new words to describe online and offline interactions more subtly, without implying a dichotomy.
Today’s privacy challenges are about more than digital technology: they really stem from the way the world has opened up. The enthusiasm of many for such openness – especially in Online Social Networking – has been taken by some commentators as a sign of deep changes in privacy attitudes. Facebook's Mark Zuckerberg for instance said in 2010 that “People have really gotten comfortable not only sharing more information and different kinds, but more openly and with more people - and that social norm is just something that has evolved over time”. And yet serious academic investigation of the Internet’s impact on society is (inevitably) still in its infancy. Social norms are constantly evolving but it’s too early to tell if they have reached a new and more permissive steady state. The views of information magnates in this regard should be discounted given their vested interest in their users' promiscuity.
At some level, privacy is about being closed. And curiously for a fundamental human right, the desire to close off parts of our lives is relatively fresh. Arguably it’s even something of a “first world problem”. Formalised privacy appears to be an urban phenomenon, unknown as such to people in villages where everyone knew everyone – and their business. It was only when large numbers of people congregated in cities that they became concerned with privacy. For then they felt the need to structure the way they related to large numbers of people – family, friends, work mates, merchants, professionals and strangers – in multi-layered relationships. So privacy was born of the first industrial revolution. It has taken prosperity and active public interest to create the elaborate mechanisms that protect our personal privacy from day to day and which we take for granted today: the postal services, direct dial telephones, telecommunications regulations, individual bedrooms in large houses, cars in which we can escape for a while, and now of course the mobile handset.
Privacy is about respect and control. Simply put, if someone knows me, then they should respect what they know; they should exercise restraint in how they use that knowledge, and be guided by my wishes. Generally, privacy is not about anonymity or secrecy. Of course, if we live life underground then unqualified privacy can be achieved, yet most of us exist in diverse communities where we actually want others to know a great deal about us. We want merchants to know our shipping address and payment details, healthcare providers to know our intimate details, hotels to know our travel plans and so on. Practical privacy means that personal information is not shared arbitrarily, and that individuals retain control over the tracks of their lives.
Big Data: Big Future
Big Data tools are being applied everywhere, from sifting telephone call records to spot crimes in the planning, to DNA and medical research. Every day, retailers use sophisticated data analytics to mine customer data, ostensibly to better uncover true buyer sentiments and continuously improve their offerings. Some department stores are interested in predicting such major life changing events as moving house or falling pregnant, because then they can target whole categories of products to their loyal customers.
Real time Big Data will become embedded in our daily lives, through several synchronous developments. Firstly computing power, storage capacity and high speed Internet connectivity all continue to improve at exponential rates. Secondly, there are more and more “signals” for data miners to choose from. No longer do you have to consciously tell your OSN what you like or what you’re doing, because new augmented reality devices are automatically collecting audio, video and locational data, and trading it around a complex web of digital service providers. And miniaturisation is leading to a whole range of smart appliances, smart cars and even smart clothes with built-in or ubiquitous computing.
The privacy risks are obvious, and yet the benefits are huge. So how should we think about the balance in order to optimise the outcome? Let’s remember that information powers the new digital economy, and the business models of many major new brands like Facebook, Twitter, Foursquare and Google incorporate a bargain for Personal Information. We obtain fantastic services from these businesses “for free” but in reality they are enabled by all that information we give out as we search, browse, like, friend, tag, tweet and buy.
The more innovation we see ahead, the more certain it seems that data will be the core asset of cyber enterprises. To retain and even improve our privacy in the unfolding digital world, we must be able to visualise the data flows that we’re engaged in, evaluate what we get in return for our information, and determine a reasonable trade of costs and benefits.
Is Privacy Dead? If the same rhetorical question needs to be asked over and over for decades, then it’s likely the answer is no.
Biometrics seems to be going gangbusters in the developing world. I fear we're seeing a new wave of technological imperialism. In this post I will examine whether the biometrics field is mature enough for the lofty social goal of empowering the world's poor and disadvantaged with "identity".
The independent Center for Global Development has released a report "Identification for Development: The Biometrics Revolution" which looks at 160 different identity programs using biometric technologies. By and large, it's a study of the vital social benefits to poor and disadvantaged peoples when they gain an official identity and are able to participate more fully in their countries and their markets.
The CGD report covers some of the kinks in how biometrics work in the real world, like the fact that a minority of people are unable to enroll at all, and must subsequently be treated carefully and fairly. But I feel the report takes biometric technology for granted. In contrast, independent experts have shown there is insufficient science for biometric performance to be predicted in the field. I conclude that biometrics are not ready to support such major public policy initiatives as ID systems.
The state of the science of biometrics
I recently came across a weighty assessment of the science of biometrics presented by one of the gurus, Jim Wayman, and his colleagues to the NIST IBPC 2010 biometric testing conference. The paper entitled "Fundamental issues in biometric performance testing: A modern statistical and philosophical framework for uncertainty assessment" should be required reading for all biometrics planners and pundits.
Here are some important extracts:
[Technology] testing on artificial or simulated databases tells us only about the performance of a software package on that data. There is nothing in a technology test that can validate the simulated data as a proxy for the “real world”, beyond a comparison to the real world data actually available. In other words, technology testing on simulated data cannot logically serve as a proxy for software performance over large, unseen, operational datasets. [p15, emphasis added].
In a scenario test, [False Non Match Rate and False Match Rate] are given as rates averaged over total transactions. The transactions often involve multiple data samples taken of multiple persons at multiple times. So influence quantities extend to sampling conditions, persons sampled and time of sampling. These quantities are not repeatable across tests in the same lab or across labs, so measurands will be neither repeatable nor reproducible. We lack metrics for assessing the expected variability of these quantities between tests and models for converting that variability to uncertainty in measurands.[p17].
To explain, a biometric "technology test" is when a software package is exercised on a standardised data set, usually in a bake-off such as NIST's own biometric performance tests over the years. And a "scenario test" is when the biometric system is tested in the lab using actual test subjects. The meaning of the two dense sentences underlined by me in the extracts is: technology test results from one data set do not predict performance on any other data set or scenario, and biometrics practitioners still have no way to predict the accuracy of their solutions in the real world.
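To make the two metrics concrete, here is a minimal sketch of how False Non Match Rate and False Match Rate are computed at a given decision threshold. The scores and threshold below are entirely made up for illustration; they are not drawn from the paper or from any real system.

```python
# Illustrative only: hypothetical similarity scores on a 0-1 scale.
# FNMR = fraction of genuine comparisons wrongly rejected.
# FMR  = fraction of imposter comparisons wrongly accepted.

def error_rates(genuine_scores, imposter_scores, threshold):
    """Return (FNMR, FMR) for a given decision threshold."""
    fnmr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
    fmr = sum(s >= threshold for s in imposter_scores) / len(imposter_scores)
    return fnmr, fmr

# Hypothetical data: genuine pairs tend to score higher than imposter pairs.
genuine = [0.91, 0.85, 0.78, 0.66, 0.95, 0.88]
imposter = [0.12, 0.35, 0.71, 0.08, 0.22, 0.41]

fnmr, fmr = error_rates(genuine, imposter, threshold=0.70)
print(fnmr, fmr)  # one genuine pair rejected, one imposter pair accepted
```

The point Wayman and his colleagues are making is that numbers like these describe only the particular samples tested: change the population, the capture device, the lighting or the elapsed time between samples, and the score distributions shift, so the measured rates do not transfer.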
The authors go on:
[To] report false match and false non-match performance metrics for [iris and face recognition] without reporting on the percentage of data subjects wearing contact lenses, the period of time between collection of the compared image sets, the commercial systems used in the collection process, pupil dilation, and lighting direction is to report "nothing at all". [pp17-18].
And they conclude, amongst other things:
[False positive and false negative] measurements have historically proved to be neither reproducible nor repeatable except in very limited cases of repeated execution of the same software package against a static database on the same equipment. Accordingly, "technology" test metrics have not aligned well with "scenario" test metrics, which have in turn failed to adequately predict field performance. [p22].
The limitations of biometric testing have repeatedly been stressed by no less an authority than the US FBI. In its State-of-the-Art Biometric Excellence Roadmap (SABER) Report, the FBI cautions that:
For all biometric technologies, error rates are highly dependent upon the population and application environment. The technologies do not have known error rates outside of a controlled test environment. Therefore, any reference to error rates applies only to the test in question and should not be used to predict performance in a different application. [p4.10]
The SABER report also highlighted a widespread weakness in biometric testing, namely that accuracy measurements usually only look at accidental errors:
The intentional spoofing or manipulation of biometrics invalidates the “zero effort imposter” assumption commonly used in performance evaluations. When a dedicated effort is applied toward fooling biometrics systems, the resulting performance can be dramatically different. [p1.4]
A few years ago, the Future of Identity in the Information Society Consortium ("FIDIS", a research network funded by the European Community’s Sixth Framework Program) wrote a major report on forensics and identity systems. FIDIS looked at the spoofability of many biometrics modalities in great detail (pp 28-69). These experts concluded:
Concluding, it is evident that the current state of the art of biometric devices leaves much to be desired. A major deficit in the security that the devices offer is the absence of effective liveness detection. At this time, the devices tested require human supervision to be sure that no fake biometric is used to pass the system. This, however, negates some of the benefits these technologies potentially offer, such as high-throughput automated access control and remote authentication. [p69]
Biometrics in public policy
To me, this is an appalling and astounding state of affairs. The prevailing public understanding of how these technologies work is utopian, based probably on nothing more than science fiction movies and the myth of biometric uniqueness. In stark contrast, scientists warn there is no telling how biometrics will work in the field, and the FBI warns that bench testing doesn't predict resistance to attack. It's very much like the manufacturer of a safe confessing to a bank manager that they don't know how it will stand up in an actual burglary.
This situation has bedeviled enterprise and financial services security for years. Without anyone admitting it, it's possible that the slow uptake of biometrics in retail and banking (save for Japan and its odd hand-vein ATMs) is a result of hard-headed security officers backing off when they look deep into the tech. But biometrics is going gangbusters in the developing world, with vendors thrilling to this much bigger and faster-moving market.
The stakes are so very high in national ID systems, especially in the developing world, where resistance to their introduction is relatively low, for various reasons. I'm afraid there is great potential for technological imperialism, given the historical opacity of this industry and its reluctance to engage with the issues.
To be sure vendors are not taking unfair advantage of the developing world ID market, they need to answer some questions:
- Firstly, how do they respond to Jim Wayman, the FIDIS Consortium and the FBI? Is it possible to predict how fingerprint readers, face recognition and iris scanners are going to operate, over years and years, in remote and rural areas?
- In particular, how good is liveness detection? Can these solutions be trusted in unattended operation for such critical missions as e-voting?
- What contingency plans are in place for biometric ID theft? Can the biometric be cancelled and reissued if compromised? Wouldn't it be catastrophic for the newly empowered identity holder to find themselves cut out of the system if their biometric can no longer be trusted?
The message is urgent. It's often yelled in prominent headlines, with an implied challenge. The new masters of the digital universe urge the masses: C'mon, get with the program! Innovate! Don't be so precious! Don't you grok that Information Wants To Be Free? Old fashioned privacy is holding us back!
The stark choice posited between privacy and digital liberation is rarely examined with much intellectual rigor. Often, "privacy is dead" is just a tired, fatalistic response to the latest breach or the latest eye-popping digital development, like facial recognition or a smartphone's location monitoring. In fact, those who earnestly assert that privacy is over are almost always trying to sell us something, be it sneakers, or a political ideology, or a wanton digital business model.
Is it really too late for privacy? Is the "genie out of the bottle"? Even if we accepted the ridiculous premise that privacy is at odds with progress, no, it's not too late, for a couple of reasons. Firstly, the pessimism (or barely disguised commercial opportunism) generally confuses secrecy with privacy, and secondly because, frankly, we ain't seen nothin' yet!
Technology certainly has laid us bare. Behavioural modeling, facial recognition, Big Data mining, natural language processing and so on have given corporations X-Ray vision into our digital lives. While exhibitionism has been cultivated and normalised by the informopolists, even the most guarded social network users may be defiled by data prospectors who, without consent, upload their contact lists, pore over their photo albums, and mine their shopping histories.
So yes, a great deal about us has leaked out into what some see as an infinitely extended neo-public domain. And yet we can be public and retain our privacy at the same time.
Some people seem defeated by privacy's definitional difficulties, yet information privacy is simply framed, and corresponding data protection laws readily understood. Information privacy is basically a state where those who know us are restrained in what they can do with the knowledge they have about us. Privacy is about respect, and protecting individuals against exploitation. It is not about secrecy or even anonymity. There are few cases where ordinary people really want to be anonymous. We actually want businesses to know -- within limits -- who we are, where we are, what we've done, what we like, but we want them to respect what they know, to not share it with others, and to not take advantage of it in unexpected ways. Privacy means that organisations behave as though it's a privilege to know us. Privacy can involve businesses and governments giving up a little bit of power.
Many have come to see privacy as literally a battleground. The grassroots Cryptoparty movement came together around the heady belief that privacy means hiding from the establishment. Cryptoparties teach participants how to use Tor and PGP, and they spread a message of resistance. They take inspiration from the Arab Spring where encryption has of course been vital for the security of protestors and organisers. One Cryptoparty I attended in Sydney opened with tributes from Anonymous, and a number of recorded talks by activists who ranged across a spectrum of political issues like censorship, copyright, national security and Occupy. I appreciate where they're coming from, for the establishment has always overplayed its security hand, and run roughshod over privacy. Even traditionally moderate Western countries have governments charging like china shop bulls into web filtering and ISP data retention, all in the name of a poorly characterised terrorist threat. When governments show little sympathy for netizenship, and absolutely no understanding of how the web works, it's unsurprising that sections of society take up digital arms in response.
Yet going underground with encryption is a limited privacy stratagem, for DIY crypto is incompatible with the majority of our digital dealings. In fact, the most nefarious, uncontrolled and ultimately the most dangerous privacy harms come from mainstream Internet businesses, not government. Assuming one still wants to shop online, use a credit card, tweet, and hang out on Facebook, we still need privacy protections. We need limitations on how our Personally Identifiable Information (PII) is used by all the services we deal with. We need department stores to refrain from extracting sensitive health information from our shopping habits, merchants to not use our credit card numbers as customer reference numbers, shopping malls to not track patrons by their mobile phones, and online social networks to not x-ray our photo albums with biometric face recognition.
Going out in public never was a license for others to invade our privacy. We ought not to respond to online privacy invasions as if cyberspace is a new Wild West. We have always relied on regulatory systems of consumer protection to curb the excesses of business and government, and we should insist on the same in the digital age. We should not have to hide away if privacy is agreed to mean respecting the PII of customers, users and citizens, and restraining what data custodians do with that precious resource.
I ask anyone who thinks it's too late to reassert our privacy to think for a minute about where we're heading. We're still in the early days of the social web, and the information "innovators" have really only just begun. Look at what they've done so far:
- Big Data. The most notorious recent example of the power of data mining comes from Target's covert research into identifying customers who are pregnant based on their buying habits. Big Data practitioners are so enamoured with their ability to extract secrets from "public" data they seem blithely unaware that by generating fresh PII from their raw materials they are in fact collecting it as far as Information Privacy Law is concerned. As such, they’re legally liable for the privacy compliance of their cleverly synthesised data, just as if they had expressly gathered it all by questionnaire.
As an aside, I'm not one of those who fret that technology has outstripped privacy law. Principles-based Information Privacy law copes well with most of this technology. OECD privacy principles (enacted in over 100 countries) and the US FIPPs require that companies be transparent about what PII they collect and why, and that they limit the ways in which PII is used for unrelated purposes and how it may be disclosed. These principles are decades old, and yet they were recently re-affirmed by German regulators over Facebook's surreptitious use of facial recognition. I expect that Siri will attract like scrutiny as it rolls out in continental Europe.
So what's next?
- Google Glass may, in the privacy stakes, surpass both Siri and facial recognition of static photos. If actions speak louder than words, imagine the value to Google of digitising and knowing exactly what we do in real time.
- Facial recognition as a Service and the sale of biometric templates may be tempting for the photo sharing sites. If and when biometric authentication spreads into retail payments and mobile device security, these systems will face the challenge of enrollment. It might be attractive to share face templates previously collected by Facebook and voice prints by Apple.
So, is it really too late for privacy? The information magnates and national security zealots may hope so, but surely even cynics will see there is a great deal at stake, and that it might be just a little too soon to rush to judgment on something as important as this.