Second Day Reflections from CIS Monterey.
Follow along on Twitter at #CISmcc (for the Monterey Conference Centre).
The attributes push
At CIS 2013 in Napa a year ago, several of us sensed a critical shift in focus amongst the identerati - from identity to attributes. OIX launched the Attributes Exchange Network (AXN) architecture, important commentators like Andrew Nash were saying, 'hey, attributes are more interesting than identity', and my own #CISnapa talk went so far as to argue we should forget about identity altogether. There was a change in the air, but still, it was all pretty theoretical.
Twelve months on, and the Attributes push has become entirely practical. If there was a Word Cloud for the NSTIC session, my hunch is that "attributes" would dominate over "identity". Several live NSTIC pilots are all about the Attributes.
ID.me is a new company started by US military veterans, with the aim of improving access for the veterans community to discounted goods and services and other entitlements. Founders Matt Thompson and Blake Hall are not identerati -- they're entirely focused on improving online access for their constituents to a big and growing range of retailers and services, and offer a choice of credentials for proving veterans bona fides. It's central to the ID.me model that users reveal as little as possible about their personal identities, while having their veterans' status and entitlements established securely and privately.
Another NSTIC pilot Relying Party is the financial service sector infrastructure provider Broadridge. Adrian Chernoff, VP for Digital Strategy, gave a compelling account of the need to change business models to take maximum advantage of digital identity. Broadridge recently announced a JV with Pitney Bowes called Inlet, which will enable the secure sharing of discrete and validated attributes - like name, address and social security number - in an NSTIC compliant architecture.
Yesterday I said in my CISmcc diary that I hoped to change my mind at #CISmcc about something, and half way through Day 2, I was delighted it was already happening. I've got a new attitude about NSTIC.
Over the past six months, I had come to fear NSTIC (http://www.nist.gov/nstic/) had lost its way. It's hard to judge totally accurately when lurking on the webcast from Sydney (at 4:00am) but the last plenary seemed pedestrian to me. And I'm afraid to say that some NSTIC committees have got a little testy. But today's NSTIC session here was a turning point. Not only are there a number of truly exciting pilots showing real progress, but Jeremy Grant has credible plans for improving accountability and momentum, and the new technology lead Paul Grassi is thinking outside the box and speaking out of school. The whole program seems fresh all over again.
In a packed presentation, Grassi impressed me enormously on a number of points:
- Firstly, he advocates a pragmatic NSTIC-focused extension of the old US government Authentication Guide NIST SP 800-63. Rather than a formal revision, a companion document might be most realistic. Along the way, Grassi really nailed an issue which we identity professionals need to talk about more: language. He said that there are words in 800-63 that are "never used anywhere else in systems development". No wonder, as he says, it's still "hard to implement identity"!
- Incidentally I chatted some more with Andrew Hughes about language; he is passionate about terms, and highlights that our term "Relying Party" is an especially terrible distraction for Service Providers whose reason-for-being has nothing to do with "relying" on anyone!
- Secondly, Paul Grassi wants to "get very aggressive on attributes", including emphasis on practical measurement (since that's really what NIST is all about). I don't think I need to say anything more about that than Bravo!
- And thirdly, Grassi asked "What if we got rid of LOAs?!". This kind of iconoclastic thinking is overdue, and was floated as part of a broad push to revamp the way government's orthodox thinking on Identity Assurance is translated to the business world. Grassi and Grant don't say LOAs can or should be abandoned by government, but they do see that shoving the rounded business concepts of identity into government's square hole has not done anyone much credit.
Just one small part of NSTIC annoyed me today: the persistent idea that federation hubs are inherently simpler than one-to-one authentication. They showed the following classic sort of 'before and after' shots, where it seems self-evident that a hub (here the Federal Cloud Credential Exchange FCCX) reduces complexity. The reality is that multilateral brokered arrangements between RPs and IdPs are far more complex than simple bilateral direct contracts. And moreover, the new forms of agreements are novel and untested in real world business. The time and cost and unpredictability of working out these new arrangements is not properly accounted for and has often been fatal to identity federations.
The dog barks and this time the caravan turns around
One of the top talking points at #CISmcc has of course been FIDO. The FIDO Alliance goes from strength to strength; we heard they have over 130 members now (remember it started with four or five less than 18 months ago). On Saturday afternoon there was a packed-out FIDO showcase with six vendors showing real FIDO-ready products. And today there was a three-hour deep dive into the two flagship FIDO protocols UAF (which enables better sharing of strong authentication signals such that passwords may be eliminated) and U2F (which standardises and strengthens Two Factor Authentication).
FIDO's marketing messages are improving all the time, thanks to a special focus on strategic marketing which was given its own working group. In particular, the Alliance is steadily clarifying the distinction between identity and authentication, and sticking adamantly to the latter. In other words, FIDO is really all about the attributes. FIDO leaves identity as a problem to be addressed further up the stack, and dedicates itself to strengthening the authentication signal sent from end-point devices to servers.
The protocol tutorials were excellent, going into detail about how "Attestation Certificates" are used to convey the qualities and attributes of authentication hardware (such as device model, biometric modality, security certifications, elapsed time since last user verification etc) thus enabling nice fine-grained policy enforcement on the RP side. To my mind, UAF and U2F show how nature intended PKI to have been used all along!
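To make the idea of fine-grained policy enforcement on the RP side a little more concrete, here is a minimal sketch in Python. It is not the FIDO data model: the class names, field names, certification label and thresholds are all hypothetical, chosen only to mirror the kinds of attributes mentioned above (device model, biometric modality, security certifications, elapsed time since last user verification). It also assumes the attestation has already been cryptographically verified, which is the hard part and is not shown.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class AuthenticatorAttributes:
    """Hypothetical attributes an RP has learned about an authenticator."""
    device_model: str
    biometric_modality: Optional[str]      # e.g. "fingerprint", or None
    certifications: FrozenSet[str]         # labels are made up for this sketch
    seconds_since_user_verification: int


@dataclass(frozen=True)
class RelyingPartyPolicy:
    """Hypothetical policy an RP might apply before accepting a login."""
    allowed_modalities: FrozenSet[str]
    required_certifications: FrozenSet[str]
    max_seconds_since_verification: int


def policy_violations(policy: RelyingPartyPolicy,
                      attrs: AuthenticatorAttributes) -> List[str]:
    """Return the reasons to refuse; an empty list means the policy is met."""
    violations = []
    if attrs.biometric_modality not in policy.allowed_modalities:
        violations.append(f"modality {attrs.biometric_modality!r} not allowed")
    missing = policy.required_certifications - attrs.certifications
    if missing:
        violations.append("missing certifications: " + ", ".join(sorted(missing)))
    if attrs.seconds_since_user_verification > policy.max_seconds_since_verification:
        violations.append("user verification too old")
    return violations


# Illustrative values only.
banking_policy = RelyingPartyPolicy(
    allowed_modalities=frozenset({"fingerprint", "iris"}),
    required_certifications=frozenset({"example-cert-level-2"}),
    max_seconds_since_verification=300,
)
fresh_fingerprint = AuthenticatorAttributes(
    "ExamplePhone X1", "fingerprint", frozenset({"example-cert-level-2"}), 20)
stale_voice = AuthenticatorAttributes(
    "ExampleToken", "voice", frozenset(), 3600)

print(policy_violations(banking_policy, fresh_fingerprint))
print(policy_violations(banking_policy, stale_voice))
```

The point of the sketch is that the RP decides on attributes of the authenticator, not on who the user is, which is why different services can set different bars for the same device.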
Some confusion remains as to why FIDO has two protocols. I heard some quiet calls for UAF and U2F to converge, yet that would seem to put the elegance of U2F at risk. And it's noteworthy that U2F is being taken beyond the original one time password 2FA, with at least one biometric vendor at the showcase claiming to use it instead of the heavier UAF.
Surprising use cases
Finally, today brought more fresh use cases from cohorts of users we socially privileged identity engineers for the most part rarely think about. Another NSTIC pilot partner is AARP, a membership organization providing "information, advocacy and service" to older people, retirees and other special needs groups. AARP's Jim Barnett gave a compelling presentation on the need to extend from the classic "free" business models of Internet services, to new economically sustainable approaches that properly protect personal information. Barnett stressed that "free" has been great and 'we wouldn't be where we are today without it' but it's just not going to work for health records for example. And identity is central to that.
There's so much more I could report if I had time. But I need to get some sleep before another packed day. All this changing my mind is exhausting.
Cheers again from Monterey.
First Day Reflections from CIS Monterey.
Follow along on Twitter at #CISmcc (for the Monterey Conference Centre).
The Cloud Identity Summit really is the top event on the identity calendar. The calibre of the speakers, the relevance and currency of the material, the depth and breadth of the cohort, and the international spread are all unsurpassed. It's been great to meet old cyber-friends in "XYZ Space" at last -- like Emma Lindley from the UK and Lance Peterman. And to catch up with such talented folks like Steffen Sorensen from New Zealand once again.
A day or two before, Ian Glazer of Salesforce asked in a tweet what we were expecting to get out of CIS. And I replied that I hoped to change my mind about something. It's unnerving to have your understanding and assumptions challenged by the best in the field ... OK, sometimes it's outright embarrassing ... but that's what these events are all about. A very wise lawyer said to me once, around 1999 at the dawn of e-commerce, that he had changed his mind about authentication a few times up to that point, and that he fully expected to change his mind again and again.
I spent most of Saturday in OpenID Foundation workshops. OIDF chair Don Thibeau enthusiastically stressed two new(ish) initiatives: Mobile Connect in conjunction with the mobile carrier trade association GSM Association @GSMA, and HIE Connect for the health sector. For the uninitiated, HIE means Health Information Exchange, namely a hub for sharing structured e-health records among hospitals, doctors, pharmacists, labs, e-health records services, allied health providers, insurers, drug & device companies, researchers and carers; for the initiated, we know there is some language somewhere in which the letters H.I.E. stand for "Not My Lifetime".
But seriously, one of the best (and pleasantly surprising) things about HIE Connect as the OIDF folks tell it, is the way its leaders unflinchingly take for granted the importance of privacy in the exchange of patient health records. Because honestly, privacy is not a given in e-health. There are champions on the new frontiers like genomics that actually say privacy may not be in the interests of the patients (or more's the point, the genomics businesses). And too many engineers in my opinion still struggle with privacy as something they can effect. So it's great -- and believe me, really not obvious -- to hear the HIE Connect folks -- including Debbie Bucci from the US Dept of Health and Human Services, and Justin Richer of Mitre and MIT -- dealing with it head-on. There is a compelling fit for the OAuth and OIDC protocols here, with their ability to manage discrete pieces of information about users (patients) and to permission them all separately. Having said that, Don and I agree that e-health records permissioning and consent is one of the great UI/UX challenges of our time.
Justin also highlighted that the RESTful patterns emerging for fine-grained permissions management in healthcare are not confined to healthcare. Debbie added that the ability to query rare events without undoing privacy is also going to be a core defining challenge in the Internet of Things.
MyPOV: We may well see tremendous use cases for the fruits of HIE Exchange before they're adopted in healthcare!
In the afternoon, we heard from Canadian and British projects that have been working with the Open Identity Exchange (OIX) program now for a few years each.
Emma Lindley presented the work they've done in the UK Identity Assurance Program (IDAP) with social security entitlements recipients. These are not always the first types of users we think of for sophisticated IDAM functions, but in Britain, local councils see enormous efficiency dividends from speeding up the issuance of eg disabled parking permits, not to mention reducing imposters, which cost money and lead to so much resentment of the well deserved. Emma said one Attributes Exchange beta project reduced the time taken to get a 'Blue Badge' permit from 10 days to 10 minutes. She went on to describe the new "Digital Sources of Trust" initiative which promises to reconnect under-banked and under-documented sections of society with mainstream financial services. Emma told me the much-abused word "transformational" really does apply here.
MyPOV: The Digital Divide is an important issue for me, and I love to see leading edge IDAM technologies and business processes being used to do something about it -- and relatively quickly.
Then Andre Boysen of SecureKey led a discussion of the Canadian identity ecosystem, which he said has stabilised nicely around four players: Federal Government, Provincial Govt, Banks and Carriers. Lots of operations and infrastructure precedents from the payments industry have carried over.
Andre calls the smart driver license of British Columbia the convergence of "street identity and digital identity".
MyPOV: That's great news - and yet comparable jurisdictions like Australia and the USA still struggle to join governments and banks and carriers in an effective identity synthesis without creating great privacy and commercial anxieties. All three cultures are similarly allergic to identity cards, but only in Canada have they managed to supplement drivers licenses with digital identities with relatively high community acceptance. For nearly a decade, Australia has been at a standstill in its national understanding of smartcards and privacy.
For mine, the CIS Quote of the Day came from Scott Rice of the OpenID Foundation. We all know the stark problem in our industry of the under-representation of Relying Parties in the grand federated identity projects. IdPs and carriers so dominate IDAM. Scott asked us to imagine a situation where "The auto industry was driven by steel makers". Governments wouldn't put up with that for long.
Can someone give us the figures? I wonder if Identity and Access Management is already more economically important than cars?!
Cheers from Monterey, Day 1.
This is a re-post from Constellation's website to raise awareness for the SuperNova awards.
Call for Applications for 2014 SuperNova Awards for Leaders in Disruptive Technology
Deadline August 1, 2014
In its fourth year, the Constellation SuperNova Awards will recognize seven individuals who demonstrate true innovation through their application and adoption of new and emerging technologies. As always, we’re searching for leaders and teams who have overcome the odds to successfully apply emerging and disruptive technologies for their organizations. Special emphasis will be given to projects that seek to redefine how the enterprise uses technology on a large scale.
We’re searching for the boldest, most transformative technology projects out there. Applications will be judged by Constellation analysts and some of the most influential thought leaders in enterprise technology. If you or someone you know has what it takes to compete in the SuperNova Awards, fill out the application here: http://constellationr.com/node/2108/apply
Learn more about last year's winners:
- Consumerization of IT & The New C-Suite - Chris Plescia, IT Leader, Collaboration, Nationwide
- Matrix Commerce - Alan Hilburn, Director – IT Transportation & Operations, PSC, LLC
- Data to Decisions - Roman Coba, Chief Information Officer, McCain Foods Limited
- Digital Marketing Transformation - Karen Simmons, Senior Director, Enterprise Data Warehouse, Kelley Blue Book Co., Inc.
- Future of Work - Greg Hicks, Director IT, Social and Collaborative Innovation, UnitedHealth Group
- Next Generation Customer Experience - Pierre Bourbonniere, Head of Marketing, La Societe de transport de Montreal (STM)
- Technology Optimization & Innovation - Don Whittington, Vice president and CIO, Florida Crystals Corporation
About the SuperNova Awards
The Constellation SuperNova Awards are the first and only awards to celebrate the leaders and teams who have overcome the odds to successfully apply emerging and disruptive technologies for their organizations. We at Constellation know advancing the adoption of disruptive technology is not easy. Disruptive technology adoption often faces resistance from supporters of the status quo, myopia, and financial constraint. We believe actors fighting these forces to champion disruptive technology within their organizations help, not only their organizations, but society as a whole to realize the potential of new and emerging technologies.
This annual search for innovators includes an all star judging panel, substantial prizes, invite-only admission and speaking opportunities at Constellation's premier innovation summit - Connected Enterprise.
Who can enter?
The awards are open to end users only. End users at vendor companies may enter the awards. Vendors and agencies may submit on their customer's behalf but must enter their customer's details and have their approval. We will disqualify any vendor applications without end user contact information.
Who should enter?
If you have overcome the odds to successfully implement a disruptive technology solution in your organization, we want to hear your story! Special attention is paid to implementation stories involving overcoming adversity and resulting in business model transformation.
Apply now: http://constellationr.com/node/2108/apply
The judging process is comprised of two phases.
- Phase I: Judging panel reviews applications to determine SuperNova Award finalists
- Phase II: Voting opens to the public. A combination of the public's and judges' votes will determine the winners of the SuperNova Awards. Judges' votes are weighted at 75% of the total.
Winners are announced at the SuperNova Awards Gala Dinner during Connected Enterprise.
A notable list of technology thought leaders, analysts, and journalists will judge the SuperNova Awards. See the full list of judges here: http://constellationr.com/events/supernova/2014/judges
Award categories center around Constellation's business research themes. Award categories:
- Consumerization of IT & The New C-Suite
- Data to Decisions
- Digital Marketing Transformation
- Future of Work
- Matrix Commerce
- Next Generation Customer Experience
- Technology Optimization & Innovation
The SuperNova Award Winners will be announced live, on stage, at the SuperNova Awards Gala Dinner on October 29, 2014 on the first night of Constellation's Connected Enterprise.
Finalists in each category will be awarded one complimentary ticket to Constellation's Connected Enterprise.
Winners in each category will win a one-year subscription to Constellation’s Research Library.
- May 22, 2014 application process begins.
- August 1, 2014 last day for submissions.
- August 22, 2014 finalists announced and invited to Connected Enterprise.
- September 8, 2014 voting opens to the public
- October 1, 2014 polls close
- October 29, 2014 Winners announced, SuperNova Awards Gala Dinner at Connected Enterprise
Apply now: http://constellationr.com/node/2108/apply
It's long been said that if you're getting something for free online, then you're not the customer, you're the product. It's a reference to the one-sided bargain for personal information that powers so many social businesses - the way that "infomopolies" as I call them exploit the knowledge they accumulate about us.
Now it's been revealed that we're even lower than product: we're lab rats.
Facebook data scientist Adam Kramer, with collaborators from UCSF and Cornell, this week reported on a study in which they tested how Facebook users respond psychologically to alternately positive and negative posts. Their experimental technique is at once ingenious and shocking. They took the real life posts of nearly 700,000 Facebook members, and manipulated them, turning them slightly up- or down-beat. And then Kramer et al. measured the emotional tone in how people reading those posts reacted in their own feeds. See Experimental evidence of massive-scale emotional contagion through social networks, Adam Kramer, Jamie Guillory & Jeffrey Hancock, in Proceedings of the National Academy of Sciences, v111.24, 17 June 2014.
The resulting scandal has been well-reported by many, including Kashmir Hill in Forbes, whose blog post nicely covers how the affair has unfolded, and includes a response by Adam Kramer himself.
Plenty has been written already about the dodgy (or non-existent) ethics approval, and the entirely contemptible claim that users gave "informed consent" to have their data "used" for research in this way. I want to draw attention here to Adam Kramer's unvarnished description of their motives. His response to the furore (provided by Hill in her blog) is, as she puts it, tone deaf. Kramer makes no attempt whatsoever at a serious scientific justification for this experiment:
- "The reason we did this research is because we care about the emotional impact of Facebook and the people that use our product ... [We] were concerned that exposure to friends’ negativity might lead people to avoid visiting Facebook."
That is, this large scale psychological experiment was simply for product development.
Some apologists have, I hear, countered that social network feeds are manipulated all the time, notably by advertisers, to produce emotional responses.
Now that's interesting, because for their A-B experiment, Kramer and his colleagues took great pains to make sure the subjects were unaware of the manipulation. After all, the results would be meaningless if people knew what they were reading had been emotionally fiddled with.
In contrast, the ad industry has always insisted that today's digital consumers are super savvy, and they know the difference between advertising and real-life. Advertising is therefore styled as just a bit of harmless fun. But this line is I think further exposed by the Facebook Experiment as self-serving mythology, crafted by people who are increasingly expert at covertly manipulating perceptions, and who now have the data, collected dishonestly, to prove it.
International hotels are a fantastic target for identity thieves. Hotel databases don't just hold credit card numbers and billing addresses (which are held for weeks in advance of a stay and for weeks afterwards to cover incidentals), but for many customers the hotel also has their home address, mobile phone number, driver licence number, airline memberships and arrival flight details. And even passport number is routinely collected by hotels in Asia. It's a complete cornucopia for criminals.
And the most dangerous, most difficult to control threat vector in the hotel industry won't be war-driving or SQL injection attacks or any of the other high tech hacking tools used by organised crime. It will be the inside job. Thousands of itinerant hotel workers in every corner of the world have the opportunity to access office systems after hours, and simply download the contents of central databases to a thumb drive.
The vulnerability of hotel databases to identity thieves has clear implications for national security. I trust that counter terrorism agencies are working on this problem? These databases reveal the forward travel plans for thousands of VIPs worldwide.
We should expect that organised criminals and terrorist organisations are tapped into hotel databases as we speak, and are mining them systematically.
We live in an age where billionaires are self-made on the back of the most intangible of assets – the information they have amassed about us. That information used to be volunteered in forms and questionnaires and contracts but increasingly personal information is being observed and inferred.
The modern world is awash with data. It’s a new and infinitely re-usable raw material. Most of the raw data about us is an invisible by-product of our mundane digital lives, left behind by the gigabyte by ordinary people who do not perceive it let alone understand it.
Many Big Data and digital businesses proceed on the basis that all this raw data is up for grabs. There is a particular widespread assumption that data in the "public domain" is free-for-all, and if you’re clever enough to grab it, then you’re entitled to extract whatever you can from it.
In the webinar, I'll try to show how some of these assumptions are naive. The public is increasingly alarmed about Big Data and averse to unbridled data mining. Excessive data mining isn't just subjectively 'creepy'; it can be objectively unlawful in many parts of the world. Conventional data protection laws turn out to be surprisingly powerful in the face of Big Data. Data miners ignore international privacy laws at their peril!
Today there are all sorts of initiatives trying to forge a new technology-privacy synthesis. They go by names like "Privacy Engineering" and "Privacy by Design". These are well meaning efforts but they can be a bit stilted. They typically overlook the strengths of conventional privacy law, and they can miss an opportunity to engage the engineering mind.
It’s not politically correct but I believe we must admit that privacy is full of contradictions and competing interests. We need to be more mature about privacy. Just as there is no such thing as perfect security, there can never be perfect privacy either. And this is where the professional engineering mindset should be brought in, to help deal with conflicting requirements.
If we’re serious about Privacy by Design and Privacy Engineering then we need to acknowledge the tensions. That’s some of the thinking behind Constellation's new Big Privacy compact. To balance privacy and Big Data, we need to hold a conversation with users that respects the stresses and strains, and involves them in working through the new privacy deal.
The webinar will cover these highlights of the Big Privacy pact:
- Respect and Restraint
- Super transparency
- And a fair deal for Personal Information.
Have a disruptive technology implementation story? Get recognised for your leadership. Apply for the 2014 SuperNova Awards for leaders in disruptive technology.
For the past year, oncologists at the Memorial Sloan Kettering Cancer Centre in New York have been training IBM’s Watson – the artificial intelligence tour-de-force that beat allcomers on Jeopardy – to help personalise cancer care. The Centre explains that "combining [their] expertise with the analytical speed of IBM Watson, the tool has the potential to transform how doctors provide individualized cancer treatment plans and to help improve patient outcomes". Others are speculating already that Watson could "soon be the best doctor in the world".
I have no doubt that when Watson and things like it are available online to doctors worldwide, we will see overall improvements in healthcare outcomes, especially in parts of the world now under-serviced by medical specialists [having said that, the value of diagnosing cancer in poor developing nations is questionable if they cannot go on to treat it]. As with Google's self-driving car, we will probably get significant gains eventually, averaged across the population, from replacing humans with machines. Yet some of the foibles of computing are not well known and I think they will lead to surprises.
For all the wondrous gains made in Artificial Intelligence, where Watson now is the state of the art, A.I. remains algorithmic, and for that, it has inherent limitations that don't get enough attention. Computer scientists and mathematicians have known for generations that some surprisingly straightforward problems have no algorithmic solution. That is, some tasks cannot be accomplished by any universal step-by-step codified procedure. The classic example is the Halting Problem. Others, like the Travelling Salesperson Problem, can be solved in principle but have no known efficient algorithm. If these simple challenges defeat algorithms, we need to be more sober in our expectations of computerised intelligence.
A key limitation of any programmed algorithm is that it must make its decisions using a fixed set of inputs that are known and fully characterised (by the programmer) at design time. If you spring an unexpected input on any computer, it can fail, and yet that's what life is all about -- surprises. No mathematician seriously claims that what humans do is somehow magic; most believe we are computers made of meat. Nevertheless, when paradoxes like the Halting Problem abound, we can be sure that computing and cognition are not what they seem. We should hope these conundrums are better understood before putting too much faith in computers doing deep human work.
And yet, predictably, futurists are jumping ahead to imagine "Watson apps" in which patients access the supercomputer for themselves. Even if there were reliable algorithms for doctoring, I reckon the "Watson app" is a giant step, because of the complex way the patient's conditions are assessed and data is gathered for the diagnosis. That is, the taking of the medical history.
In these days of billion dollar investments in electronic health records (EHRs), we tend to think that medical decisions are all about the data. When politicians announce EHR programs they often boast that patients won't have to go through the rigmarole of giving their history over and over again to multiple doctors as they move through an episode of care. This is actually a serious misunderstanding of the importance in clinical decision-making of the interaction between medico and patient when the history is taken. It's subtle. The things a patient chooses to tell, the things they seem to be hiding, and the questions that make them anxious, all guide an experienced medico when taking a history, and provide extra cues (metadata if you will) about the patient’s condition.
Now, Watson may well have the ability to navigate this complexity and conduct a very sophisticated Q&A. It will certainly have a vastly bigger and more reliable memory of cases than any doctor, and with that it can steer a dynamic patient questionnaire. But will Watson be good enough to be made available direct to patients through an app, with no expert human mediation? Or will a host of new input errors result from patients typing their answers into a smart phone or speaking into a microphone, without any face-to-face subtlety (let alone human warmth)? It was true of mainframes and it’s just as true of the best A.I.: Bulldust in, bulldust out.
Finally, Watson's existing linguistic limitations are not to be underestimated. It is surely not trivial that Watson struggles with puns and humour. Futurist Mark Pesce when discussing Watson remarked in passing that scientists don’t understand the "quirks of language and intelligence" that create humour. The question of what makes us laugh does in fact occupy some of the finest minds in cognitive and social science. So we are a long way from being able to mechanise humour. And this matters because for the foreseeable future, it puts a great deal of social intercourse beyond AI's reach.
In between the extremes of laugh-out-loud comedy and a doctor’s dry written notes lies a spectrum of expressive subtleties, like a blush, an uncomfortable laugh, shame, and the humiliation that goes with some patients’ lived experience of illness. Watson may understand the English language, but does it understand people?
Watson can answer questions, but good doctors ask a lot of questions too. When will this amazing computer be able to hold the sort of two-way conversation that we would call a decent "bedside manner"?
The latest Snowden revelations include the NSA's special programs for extracting photos and identifying people from the Internet. Amongst other things the NSA uses their vast information resources to correlate location cues in photos -- buildings, streets and so on -- with satellite data, to work out where people are. They even search especially for passport photos, because these are better fodder for facial recognition algorithms. The audacity of these government surveillance activities continues to surprise us, and their secrecy is abhorrent.
Yet an even greater scale of private sector surveillance has been going on for years in social media. With great pride, Facebook recently revealed its R&D in facial recognition. They showcased the brazenly named "DeepFace" biometric algorithm, which is claimed to be 97% accurate in recognising faces from regular images. Facebook has made a swaggering big investment in biometrics.
Data mining needs raw material, there's lots of it out there, and Facebook has been supremely clever at attracting it. It's been suggested that 20% of all photos now taken end up in Facebook. Even three years ago, Facebook held 10,000 times as many photographs as the Library of Congress.
And Facebook will spend big buying other photo lodes. Last year they tried to buy Snapchat for the spectacular sum of three billion dollars. The figure had pundits reeling. How could a start-up company with 30 people be worth so much? All the usual dot com comparisons were made; the offer seemed a flight of fancy.
But no, the offer was a rational consideration for the precious raw material that lies buried in photo data.
Snapchat generates at least 100 million new images every day. Three billion dollars was, pardon me, a snap. I figure that at a ballpark internal rate of return of 10%, a $3B investment is equivalent to $300M p.a., so even if the Snapchat volume stopped growing, Facebook would have been paying about one cent for every new snap, in perpetuity.
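The back-of-envelope sum can be checked in a few lines. This is only a sketch of the arithmetic: the $3B price, the 10% ballpark rate and the 100 million snaps a day come from the paragraph above, while the 365-day year, the flat volume and the function name are my own assumptions.

```python
# Back-of-envelope check of the cost per snap.
PRICE_USD = 3_000_000_000      # the reported $3B offer
RATE_OF_RETURN = 0.10          # ballpark internal rate of return
SNAPS_PER_DAY = 100_000_000    # "at least 100 million", assumed flat
DAYS_PER_YEAR = 365            # assumption


def cost_per_snap_cents(price, rate, snaps_per_day, days_per_year=DAYS_PER_YEAR):
    """Annualised cost of the purchase spread over a year's snaps, in cents."""
    annual_cost_usd = price * rate                  # $300M p.a. at 10%
    snaps_per_year = snaps_per_day * days_per_year  # 36.5 billion
    return 100 * annual_cost_usd / snaps_per_year


cents = cost_per_snap_cents(PRICE_USD, RATE_OF_RETURN, SNAPS_PER_DAY)
print(f"{cents:.2f} cents per snap")
```

On these assumptions the figure comes out a little under a cent, and any growth in Snapchat's volume would push it lower still.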
These days, we have learned from Snowden and the NSA that communications metadata is just as valuable as the content of our emails and phone calls. So remember that it's the same with photos. Each digital photo comes from a device that embeds metadata within the image, usually including the time and place the picture was taken. And of course each Instagram or Snapchat is a social post, sent by an account holder with a history and rich context, in which the image yields intimate real-time information about what they're doing, when and where.
From Snapchat's privacy policy: When you access or use our Services, we automatically collect information about you, including:
- Usage Information: When you send or receive messages via our Services, we collect information about these messages, including the time, date, sender and recipient of the Snap. We also collect information about the number of messages sent and received between you and your friends and which friends you exchange messages with most frequently.
- Log Information: We log information about your use of our websites, including your browser type and language, access times, pages viewed, your IP address and the website you visited before navigating to our websites.
- Device Information: We may collect information about the computer or device you use to access our Services, including the hardware model, operating system and version, MAC address, unique device identifier, phone number, International Mobile Equipment Identity ("IMEI") and mobile network information. In addition, the Services may access your device's native phone book and image storage applications, with your consent, to facilitate your use of certain features of the Services.
Snapchat goes on to declare it may use any of this information to "personalize and improve the Services and provide advertisements, content or features that match user profiles or interests" and it reserves the right to share any information with "vendors, consultants and other service providers who need access to such information to carry out work on our behalf".
So back to the data mining: nothing stops Snapchat -- or a new parent company -- running biometric facial recognition over the snaps as they pass through the servers, to extract additional "profile" information. And there's a kicker that makes Snapchats extra valuable for biometric data miners. The vast majority of Snapchats are selfies. So if you extract a biometric template from a snap, you already know who it belongs to, without anyone having to tag it. Snapchat would provide a hundred million auto-calibrations every day for facial recognition algorithms! On Facebook, the privacy aware turn off photo tagging, but with Snapchats, self identification is inherent to the experience and is unlikely ever to be disabled.
As I've discussed before, the morbid thrill of Snowden's spying revelations has tended to overshadow his sober observations that when surveillance by the state is probably inevitable, we need to be discussing accountability.
While we're all ventilating about the NSA, it's time we also attended to private sector spying and properly debated the restraints that may be appropriate on corporate exploitation of social data.
Personally I'm much more worried that an infomopoly has all my selfies.
My Constellation Research colleague Alan Lepofsky has been working on new ways to characterise users in cyberspace. Frustrated with the oversimplified cliche of the "Digital Millennials", Alan has developed a fresh framework for categorizing users according to their comfort with technology and their actual knowledge of it. See his new research report "Segmenting Audiences by Digital Proficiency".
This sort of schema could help frame the answers to some vital open questions. In today's maelstrom of idealism and hyperbole, we're struggling to predict how things are going to turn out, and to build appropriate policies and management structures. We are still guessing how the digital revolution is really going to change the human condition. We're not yet rigorously measuring the sorts of true changes, if any, that the digital transformation is causing.
We hold such disparate views about cyberspace right now. When the Internet does good – for example through empowering marginalized kids at schools, fueling new entrepreneurship, or connecting disadvantaged communities – it is described as a power for good, a true "paradigm shift". But when it does bad – as when kids are bullied online or when phishing scams hook inexperienced users – then the Internet is said to be just another communications medium. Such inconsistent attitudes are with us principally because the medium is still so new. Yet we all know how important it is, and that far reaching policy decisions are being made today. So it’s good to see new conceptual frameworks for analyzing the range of ways that people engage with and utilise the Internet.
Vast fortunes are being made through online business models that purport to feed a natural hunger to be social. With its vast reach and zero friction, the digital medium might radically amplify aspects of the social drive, quite possibly beyond what nature intended. As supremely communal beings, we humans have evolved elaborate social bearings for getting on in diverse groups, and we've built social conventions that govern how we meet, collaborate, refer, partner, buy and sell, amalgamate, marry, and split. We are incredibly adept at reading body language, spotting untruths, and gaming each other for protection or for personal advantage. In cyberspace, few of the traditional cues are available to us; we literally lose our bearings online. And therefore naive Internet users fall prey to spam, fake websites and all manner of scams.
How are online users adapting to their new environment and evolving new instincts? I expect there will be interesting correlations between digital resilience and the sophistication measures in Alan’s digital proficiency framework. We might expect Digital Natives to be better equipped inherently to detect and respond to online threats, although they might be somewhat more at risk by virtue of being more active. I wonder too if the risk-taking behavior which exacerbates some online risks for adolescents would be relatively more common amongst Digital Immigrants? By the same token, the Digital Skeptics who are knowledgeable yet uncomfortable may be happy staying put in that quadrant, or only venturing out for selected cyber activities, because they’re consciously managing their digital exposure.
We certainly do need new ways like Alan's Digital Proficiency Framework to understand society’s complex "Analog to Digital" conversion. I commend it to you.
I've just completed a major new Constellation Research report looking at how today's privacy practices cope with Big Data. The report draws together my longstanding research on the counter-intuitive strengths of technology-neutral data protection laws, and melds it with my new Constellation colleagues' vast body of work in data analytics. The synergy is honestly exciting and illuminating.
Big Data promises tremendous benefits for a great many stakeholders but the potential gains are jeopardised by the excesses of a few. Some cavalier online businesses are propelled by a naive assumption that data in the "public domain" is up for grabs, and with that they often cross a line.
For example, there are apps and services now that will try to identify pictures you take of strangers in public, by matching them biometrically against data supersets compiled from social networking sites and other publicly accessible databases. Many find such offerings quite creepy but they may be at a loss as to what to do about it, or even how to think through the issues objectively. Yet the very metaphor of data mining holds some of the clues. If, as some say, raw data is like crude oil, just waiting to be mined and exploited by enterprising prospectors, then surely there are limits, akin to mining permits?
Many think the law has not kept pace with technology, and that digital innovators are free to do what they like with any data they can get their hands on. But technologists repeatedly underestimate the strength of conventional data protection laws and regulations. The extraction of PII from raw data may be interpreted under technology neutral privacy principles as an act of Collection and as such is subject to existing statutes. Around the world, Google thus found they are not actually allowed to gather Personal Data that happens to be available in unencrypted Wi-Fi transmission as StreetView cars drive by homes and offices. And Facebook found they are not actually allowed to automatically identify people in photos through face recognition without consent. And Target probably would find, if they tried it outside the USA, that they cannot flag selected female customers as possibly pregnant by analysing their buying habits.
On the other hand, orthodox privacy policies and static user agreements do not cater for the way personal data can be conjured tomorrow from raw data collected today. Traditional privacy regimes require businesses to spell out what personally identifiable information (PII) they collect and why, and to restrict secondary usage. Yet with Big Data, with the best will in the world, a company might not know what data analytics will yield down the track. If mutual benefits for business and customer alike might be uncovered, a freeze-frame privacy arrangement may be counter-productive.
Thus the fit between data analytics and data privacy standards is complex and sometimes surprising. While existing laws are not to be underestimated, we do need something new. As far as I know it was Ray Wang in his Harvard Business Review blog who first called for a fresh privacy compact amongst users and businesses.
The spirit of data privacy is simply framed: organisations that know us should respect the knowledge they have, they should be open about what they know, and they should be restrained in what they do with it. In the Age of Big Data, let's have businesses respect the intelligence they extract from data mining, just as they should respect the knowledge they collect directly through forms and questionnaires.
I like the label "Big Privacy"; it is grandly optimistic, like "Big Data" itself, and at the same time implies a challenge to do better than regular privacy practices.
Ontario Privacy Commissioner Dr Ann Cavoukian writes about Big Privacy, describing it simply as "Privacy By Design writ large". But I think there must be more to it than that. Big Data is quantitatively but also qualitatively different from ordinary data analysis.
To summarise the basic elements of a Big Data compact:
- Respect and Restraint: In the face of Big Data’s temptations, remember that privacy is not only about what we do with PII; just as important is what we choose not to do.
- Super transparency: Who knows what lies ahead in Big Data? If data privacy means being open about what PII is collected and why, then advanced privacy means going further, telling people more about the business models and the sorts of results data mining is expected to return.
- Engage customers in a fair deal for PII: Information businesses ought to set out what PII is really worth to them (especially when it is extracted in non-obvious ways from raw data) and offer a fair "price" for it, whether in the form of "free" products and services, or explicit payment.
- Really innovate in privacy: There’s a common refrain that “privacy hampers innovation” but often that's an intellectually lazy cover for reserving the right to strip-mine PII. Real innovation lies in business practices which create and leverage PII while honoring privacy principles.
My report, "Big Privacy" Rises to the Challenges of Big Data, may be downloaded from the Constellation Research website.