
Digital Disruption - Melbourne

Ray Wang tells us that writing a book and launching a company are incredibly fulfilling things to do - but ideally, not at the same time. He thought it would take a year to write "Disrupting Digital Business", but since the writing overlapped with building Constellation Research, it took three! Then again, his book is all the richer for that experience.

Ray is on a world-wide book tour (tweeting under the hash tag #cxotour). I was thrilled to participate in the Melbourne leg last week. We convened a dinner at Melbourne restaurant The Deck and were joined by a good cross section of Australian private and public sector businesses. There were current and recent executives from Energy Australia, Rio Tinto, the Victorian Government and Australia Post among others, plus the founders of several exciting local start-ups. And we were lucky to have special guests Brian Katz and Ben Robbins - two renowned mobility gurus.

The format for all the launch events includes one or two short topical speeches from Constellation analysts and Associates, followed by a fireside chat with Ray. In Melbourne, we were joined by two of Australia's deep digital economy experts, Gavin Heaton and Joanne Jacobs. Gavin got us going on the night, surveying the importance of innovation, and the double-edged opportunities and threats of digital disruption.

Then Ray spoke off-the-cuff about his book, summarising years of technology research and analysis, and a great many cases of business disruption, old and new. Ray has an encyclopedic grasp of tech-driven successes and failures going back decades, yet his presentations are always up-to-the-minute and full of practical can-do calls to action. He's hugely engaging, and having him on a small stage for a change lets him have a real conversation with the audience.

Speaking with no notes and PowerPoint-free, Ray ranged across all sorts of disruptions in all sorts of sectors, including:


  • Sony's double-cassette Walkman (which Ray argues playfully was their "last innovation")
  • Coca-Cola going digital, and the speculative "ten cent sip"
  • the real lesson of the iPhone: geeks spend time arguing about whether Apple's technology is original or appropriated, when the point is that their phone disrupted 20 or more other business models
  • the contrasting Boeing 787 Dreamliner and Airbus A380 mega jumbo - radically different ways to maximise the one thing that matters to airlines: dollars per passenger-mile, and
  • Uber, which observers don't always fully comprehend as a rich mix of mobility, cloud and Big Data.

And I closed the scheduled part of the evening with a provocation on privacy. I asked the group to think about what it means to call any online business practice "creepy". Have community norms and standards really changed in the move online? What's worse: government surveillance for political ends, or private sector surveillance for profit? If we pay for free online services with our personal information, do regular consumers understand the bargain? And if cynics have been asking "Is Privacy Dead?" for over 100 years, doesn't it mean the question is purely rhetorical? Who amongst us truly wants privacy to be over?!

The discussion quickly attained a life of its own - muscular, but civilized. And it provided ample proof that whatever you think about privacy, it is complicated and surprising, and definitely disruptive! (For people who want to dig further into the paradoxes of modern digital privacy, Ray and I recently recorded a nice long chat about it).

Here are some of the Digital Disruption tour dates coming up:

Enjoy!

Posted in Social Media, Privacy, Internet, Constellation Research, Cloud, Big Data

You can de-identify but you can't hide

Acknowledgement: Daniel Barth-Jones kindly engaged with me after this blog was initially published, and pointed out several significant factual errors, for which I am grateful.

In 2014, the New York City Taxi & Limousine Commission (TLC) released a large "anonymised" dataset containing 173 million taxi rides taken in 2013. Soon after, software developer Vijay Pandurangan managed to undo the hashed taxi registration numbers. Privacy researcher Anthony Tockar then went further, combining the data with public photos of celebrities getting in or out of cabs to recreate their trips. See Anna Johnston's analysis here.

This re-identification demonstration has been used by some to bolster a general claim that anonymity online is increasingly impossible.

On the other hand, medical research advocates like Columbia University epidemiologist Daniel Barth-Jones argue that the practice of de-identification can be robust and should not be dismissed as impractical on the basis of demonstrations such as this. The identifiability of celebrities in these sorts of datasets is a statistical anomaly, reasons Barth-Jones, and should not be used to frighten regular people out of participating in medical research on anonymised data. He wrote in a blog that:

    • "However, it would hopefully be clear that examining a miniscule proportion of cases from a population of 173 million rides couldn’t possibly form any meaningful basis of evidence for broad assertions about the risks that taxi-riders might face from such a data release (at least with the taxi medallion/license data removed as will now be the practice for FOIL request data)."

As a health researcher, Barth-Jones is understandably worried that re-identification of small proportions of special cases is being used to exaggerate the risks to ordinary people. He says that the HIPAA de-identification protocols, if properly applied, leave no significant risk of re-identification. But even if that's the case, HIPAA processes are not applied to data across the board. The TLC data was described as "de-identified", and the fact that any people at all (even stand-out celebrities) could be re-identified from it does create a broad basis for concern - "de-identified" is not what it seems. Barth-Jones stresses that in the TLC case, the de-identification was fatally flawed [technically: it's no use hashing data like registration numbers with limited value ranges, because the hashed values can be reversed by brute force] but my point is this: who among us can tell the difference between poorly de-identified and "properly" de-identified?
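To make the brute-force point concrete, here is a minimal sketch in Python (the medallion value is hypothetical, and I've used just one common medallion format for illustration). Hashing identifiers drawn from a small value space offers no protection, because an attacker can simply hash every candidate and read the mapping backwards:

```python
# A minimal sketch, assuming (as reported in the TLC case) that a plain
# hash was applied to medallion numbers. One common medallion format is
# digit-letter-digit-digit, e.g. "5X55": only 26,000 possibilities.
import hashlib
import string
from itertools import product

def md5_hex(s: str) -> str:
    return hashlib.md5(s.encode()).hexdigest()

def build_lookup() -> dict:
    """Hash every candidate medallion and map each hash back to its plaintext."""
    table = {}
    for d1, letter, d2, d3 in product(string.digits, string.ascii_uppercase,
                                      string.digits, string.digits):
        medallion = f"{d1}{letter}{d2}{d3}"
        table[md5_hex(medallion)] = medallion
    return table  # 26,000 entries, built in well under a second

hashed = md5_hex("5X55")       # a "de-identified" field (hypothetical value)
print(build_lookup()[hashed])  # -> "5X55": the anonymisation is undone
```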

And how long can "properly de-identified" last? What does it mean to say casually that only a "minuscule proportion" of data can be re-identified? In this case, the re-identification of celebrities was helped by the fact that lots of photos of them are readily available on social media; but with so many photos of everyone now in the public domain, regular people too will only get easier to identify.

But my purpose here is not to play what-if games, and I know Daniel advocates statistically rigorous measures of identifiability. We agree on that -- in fact, over the years, we have agreed on most things. The point I am trying to make in this blog post is that just as nobody should exaggerate the risk of re-identification, nobody should play it down either. Claims of de-identification are made almost daily for really crucial datasets, like compulsorily retained metadata, public health data, biometric templates, social media activity used for advertising, and web searches. Some of these claims are made with statistical rigour, using formal standards like the HIPAA protocols; at other times the claim is casual, made with no qualification, with the aim of comforting end users.

"De-identified" is a helluva promise to make, with far-reaching ramifications. Daniel says de-identification researchers use the term with caution, knowing there are technical qualifications around the finite probability of individuals remaining identifiable. But my position is that the fine print doesn't translate to the general public who only hear that a database is "anonymous". So I am afraid the term "de-identified" is meaningless outside academia, and in casual use is misleading.

Barth-Jones objects to the conclusion that "it's virtually impossible to anonymise large data sets" but in an absolute sense, that claim is surely true. If any proportion of people in a dataset may be identified, then that data set is plainly not "anonymous". Moreover, as statistics and mathematical techniques (like facial recognition) improve, and as more ancillary datasets (like social media photos) become accessible, the proportion of individuals who may be re-identified will keep going up.
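To illustrate the mechanics with entirely made-up data: re-identification by linkage needs nothing more sophisticated than a join between the "anonymised" records and an ancillary dataset that shares a few quasi-identifiers. A toy sketch:

```python
# A toy linkage attack on invented data: no hacking required, just a join
# on shared quasi-identifiers (here, pickup time and place).
import pandas as pd

deidentified = pd.DataFrame({
    "pickup_time":  ["2013-07-08 23:45", "2013-07-09 01:10"],
    "pickup_place": ["Greenwich Village", "SoHo"],
    "fare":         [12.5, 21.0],
})

# Ancillary data, e.g. a timestamped celebrity photo posted on social media
ancillary = pd.DataFrame({
    "pickup_time":  ["2013-07-08 23:45"],
    "pickup_place": ["Greenwich Village"],
    "name":         ["A. Celebrity"],
})

# Matching quasi-identifiers is enough to re-attach a name to a trip
reidentified = deidentified.merge(ancillary, on=["pickup_time", "pickup_place"])
print(reidentified)  # the first "anonymous" record now carries a name
```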

[Readers who wish to pursue these matters further should look at the recent Harvard Law School online symposium on "Re-identification Demonstrations", hosted by Michelle Meyer, in which Daniel Barth-Jones and I participated, among many others.]

Both sides of this vexed debate need more nuance. Privacy advocates have no wish to quell medical research per se, nor do we call for absolute privacy guarantees, but we do seek full disclosure of the risks, so that the cost-benefit equation is understood by all. One of the obvious lessons in all this is that "anonymous" and "de-identified" on their own are poor descriptions. We need tools that meaningfully describe the probability of re-identification. If statisticians and medical researchers take "de-identified" to mean "there is an acceptably small probability, namely X percent, of identification", then let's have that fine print. Absent the detail, lay people can be forgiven for thinking re-identification isn't going to happen. Period.
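One well-known candidate for that fine print is k-anonymity: the size of the smallest group of records sharing the same combination of quasi-identifiers. A minimal sketch with illustrative fields; k = 1 means somebody in the dataset is unique, and therefore directly exposed:

```python
# A minimal k-anonymity measure over invented records.
from collections import Counter

def k_anonymity(records: list, quasi_identifiers: list) -> int:
    """Return the size of the smallest group sharing the same quasi-identifiers."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())

rows = [
    {"postcode": "2000", "birth_year": 1980, "sex": "F"},
    {"postcode": "2000", "birth_year": 1980, "sex": "F"},
    {"postcode": "3000", "birth_year": 1955, "sex": "M"},  # unique
]
print(k_anonymity(rows, ["postcode", "birth_year", "sex"]))  # -> 1
```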

And we need policy and regulatory mechanisms to curb inappropriate re-identification. Anonymity is a brittle, essentially temporary, and inadequate privacy tool.

I argue that the act of re-identification ought to be treated as an act of Algorithmic Collection of PII, and regulated as just another type of collection, albeit an indirect one. If a statistical process results in a person's name being added to a hitherto anonymous record in a database, it is as if the data custodian went to a third party and asked them "do you know the name of the person this record is about?". The fact that the data custodian was clever enough to avoid having to ask anyone about the identity of people in the re-identified dataset does not alter the privacy responsibilities arising. If the effect of an action is to convert anonymous data into personally identifiable information (PII), then that action collects PII. And in most places around the world, any collection of PII automatically falls under privacy regulations.

It looks like we will never guarantee anonymity, but the good news is that for privacy, we don't actually need to. Privacy is the protection you need when your affairs are not anonymous, for privacy is a regulated state where organisations that have knowledge about you are restrained in what they do with it. Equally, the ability to de-anonymise should be restricted in accordance with orthodox privacy regulations. If a party chooses to re-identify people in an ostensibly de-identified dataset, without a good reason and without consent, then that party may be in breach of data privacy laws, just as they would be if they collected the same PII by conventional means like questionnaires or surveillance.

Surely we can all agree that re-identification demonstrations serve to shine a light on the comforting claims made by governments, for instance, that certain citizen datasets can be anonymised. In Australia, the government is now implementing telecommunications metadata retention laws in the interests of national security; the metadata, we are told, is de-identified and "secure". In the UK, the National Health Service plans to make de-identified patient data available to researchers. Whatever the merits of data mining in diverse fields like law enforcement and medical research, my point is that any government's claims of anonymisation must be treated critically (if not skeptically), and subjected to strenuous and ongoing privacy impact assessment.

Privacy, like security, can never be perfect. Privacy advocates must avoid giving the impression that they seek unrealistic guarantees of anonymity. There must be more to privacy than identity obscuration (to use a more technically correct term than "de-identification"). Medical research should proceed on the basis of reasonable risks being taken in return for beneficial outcomes, with strong sanctions against abuses including unwarranted re-identification. And then there wouldn't need to be a moral panic over re-identification if and when it does occur, because anonymity, while highly desirable, is not essential for privacy in any case.

Posted in Social Media, Privacy, Identity, e-health, Big Data

Identity Management Moves from Who to What

The State Of Identity Management in 2015

Constellation Research recently launched the "State of Enterprise Technology" series of research reports. These assess the current enterprise innovations which Constellation considers most crucial to digital transformation, and provide snapshots of the future usage and evolution of these technologies.

My second contribution to the state-of-the-state series is "Identity Management Moves from Who to What". Here's an excerpt from the report:

Introduction

In spite of all the fuss, personal identity is not usually important in routine business. Most transactions are authorized according to someone’s credentials, membership, role or other properties, rather than their personal details. Organizations actually deal with many people in a largely impersonal way. People don’t often care who someone really is before conducting business with them. So in digital Identity Management (IdM), one should care less about who a party is than what they are, with respect to attributes that matter in the context we’re in. This shift in focus is coming to dominate the identity landscape, for it simplifies a traditionally multi-disciplined problem set. Historically, the identity management community has made too much of identity!
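A minimal sketch of the shift in code (the attribute names are mine, not any particular scheme's): the authorization decision consults what the party is, never who they are.

```python
# Attribute-based authorization: access turns on properties, not identity.
def authorize(presented: dict, policy: dict) -> bool:
    """Grant access if the presented attributes satisfy the policy."""
    return all(presented.get(k) == v for k, v in policy.items())

# A pharmacist ordering a restricted medicine: the supplier needs to know
# the buyer holds a current licence, not who they are.
presented = {"role": "pharmacist", "licence_current": True, "name": "..."}
policy    = {"role": "pharmacist", "licence_current": True}
print(authorize(presented, policy))  # True; the "name" field is never consulted
```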

Six Digital Identity Trends for 2015

[Image: State of IdM summary]

1. Mobile becomes the center of gravity for identity. The mobile device brings convergence for a decade of progress in IdM. For two-factor authentication, the cell phone is its own second factor, protected against unauthorized use by PIN or biometric. Hardly anyone ever goes anywhere without their mobile - service providers can increasingly count on that without disenfranchising many customers. Best of all, the mobile device itself joins authentication to the app, intimately and seamlessly, in the transaction context of the moment. And today’s phones have powerful embedded cryptographic processors and key stores for accurate mutual authentication, and mobile digital wallets, as Apple’s Tim Cook highlighted at the recent White House Cyber Security Summit.
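As a concrete sketch of the possession-proof pattern behind this (illustrative only, not any vendor's actual protocol): the device signs a fresh server challenge with a private key that, in practice, would be generated and held in the phone's secure hardware and unlocked by PIN or biometric. This uses Python's cryptography package:

```python
# Sketch of device-as-second-factor: sign a server challenge with a device key.
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

# Enrolment: key pair created (ideally inside secure hardware); the public
# key is registered with the service.
device_key = ec.generate_private_key(ec.SECP256R1())
registered_public_key = device_key.public_key()

# Authentication: the server issues a fresh random challenge; the device
# signs it after a local PIN/biometric check, proving possession of the phone.
challenge = os.urandom(32)
signature = device_key.sign(challenge, ec.ECDSA(hashes.SHA256()))

# The server verifies against the enrolled key (raises InvalidSignature on failure).
registered_public_key.verify(signature, challenge, ec.ECDSA(hashes.SHA256()))
print("second factor verified: the registered device signed the challenge")
```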

2. Hardware is the key – and holds the keys – to identity. Despite the lure of the cloud, hardware has re-emerged as pivotal in IdM. All really serious security and authentication takes place in secure dedicated hardware, such as SIM cards, ATMs, EMV cards, and the Trusted Execution Environments in new mobile devices. Today’s leading authentication initiatives, like the FIDO Alliance, are intimately connected to standard cryptographic modules now embedded in most mobile devices. Hardware-based identity management has arrived just in the nick of time, on the eve of the Internet of Things.

3. The “Attributes Push” will shift how we think about identity. In the words of Andrew Nash, CEO of Confyrm Inc. (and previously the identity leader at PayPal and Google), “Attributes are at least as interesting as identities, if not more so.” Attributes are to identity as genes are to organisms – they are really what matters about you when you’re trying to access a service. By fractionating identity into attributes and focusing on what we really need to reveal about users, we can enhance privacy while automating more and more of our everyday transactions.

The Attributes Push may recast social logon. Until now, Facebook and Google have been widely tipped to become “Identity Providers”, but even these giants have found federated identity easier said than done. A dark horse in the identity stakes – LinkedIn – may take the lead with its superior holdings in verified business attributes.

4. The identity agenda is narrowing. For 20 years, brands and organizations have obsessed about who someone is online, and even before we'd solved the basics, we over-reached. We've seen entrepreneurs trying to monetize identity, and identity engineers trying to convince conservative institutions like banks that “Identity Provider” is a compelling new role in the digital ecosystem. Now at last, the IdM industry agenda is narrowing toward more achievable and more important goals - precise authentication instead of general identification.

[Image: Digital identity stack]

5. A digital identity stack is emerging. The FIDO Alliance and others face a challenge in shifting and improving the words people use in this space. Words, of course, matter, as do visualizations. IdM has suffered for too long under loose and misleading metaphors. One of the most powerful abstractions in IT was the OSI networking stack. A comparable sort of stack may be emerging in IdM.

6. Continuity will shape the identity experience. Continuity will make or break the user experience as the lines blur between real world and virtual, and between the Internet of Computers and the Internet of Things. But at the same time, we need to preserve clear boundaries between our digital personae, or else privacy catastrophes await. “Continuous” (also referred to as “Ambient”) Authentication is a hot new research area, striving to provide more useful and flexible signals about the instantaneous state of a user at any time. There is an explosion in devices now that can be tapped for Continuous Authentication signals, and by the same token, rich new apps in health, lifestyle and social domains, running on those very devices, that need seamless identity management.
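A toy sketch of the idea (the signals, weights and threshold are all illustrative assumptions): fuse recent behavioural match scores - location, gait, typing cadence - into a rolling confidence, and demand an explicit factor when confidence drops.

```python
# Toy continuous authentication: a rolling average of match scores in [0, 1].
from collections import deque

class ContinuousAuthenticator:
    def __init__(self, window: int = 10, threshold: float = 0.6):
        self.scores = deque(maxlen=window)  # recent per-signal match scores
        self.threshold = threshold

    def observe(self, match_score: float) -> None:
        self.scores.append(match_score)

    def confidence(self) -> float:
        return sum(self.scores) / len(self.scores) if self.scores else 0.0

    def requires_step_up(self) -> bool:
        # Below threshold: ask for an explicit factor (PIN, fingerprint)
        return self.confidence() < self.threshold

auth = ContinuousAuthenticator()
for s in [0.9, 0.85, 0.4, 0.3, 0.2]:  # e.g. the phone changes hands
    auth.observe(s)
print(auth.confidence(), auth.requires_step_up())  # 0.53, True
```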

A snapshot of my report "Identity Management Moves from Who to What" is available for download at Constellation Research. It expands on the points above, and sets out recommendations for enterprises to adopt the latest identity management thinking.

Posted in Trust, Social Networking, Security, Privacy, Identity, Federated Identity, Constellation Research, Biometrics, Big Data

Correspondence in Nature magazine

I had a letter to the editor published in Nature on big data and privacy.

Data protection: Big data held to privacy laws, too

Stephen Wilson
Nature 519, 414 (26 March 2015) doi:10.1038/519414a
Published online 25 March 2015

Letter as published

Privacy issues around data protection often inspire over-engineered responses from scientists and technologists. Yet constraints on the use of personal data mean that privacy is less about what is done with information than what is not done with it. Technology such as new algorithms may therefore be unnecessary (see S. Aftergood, Nature 517, 435–436; 2015).

Technology-neutral data-protection laws afford rights to individuals with respect to all data about them, regardless of the data source. More than 100 nations now have such data-privacy laws, typically requiring organizations to collect personal data only for an express purpose and not to re-use those data for unrelated purposes.

If businesses come to know your habits, your purchase intentions and even your state of health through big data, then they have the same privacy responsibilities as if they had gathered that information directly by questionnaire. This is what the public expects of big-data algorithms that are intended to supersede cumbersome and incomplete survey methods. Algorithmic wizardry is not a way to evade conventional privacy laws.

Stephen Wilson
Constellation Research, Sydney, Australia.
steve@constellationr.com

Posted in Science, Privacy, Big Data

The Google Advisory Council

In May 2014, the European Court of Justice (ECJ) ruled that under European law, people have the right to have certain information about them delisted from search engine results. The ECJ ruling was called the "Right to be Forgotten", despite it having little to do with forgetting (c'est la vie). Shortened as RTBF, it is also referred to more clinically as the "Right to be Delisted" (or simply as "Google Spain" because that was one of the parties in the court action). Within just a few months, the RTBF has triggered conferences, public debates, and a TEDx talk.

Google itself did two things very quickly in response to the RTBF ruling. First, it mobilised a major team to process delisting requests. This is no mean feat -- over 200,000 requests have been received to date; see Google's transparency report. However, it's not surprising that they got going so quickly, as they already had well-practised processes for take-down notices for copyright and unlawful material.

Secondly, the company convened an Advisory Council of independent experts to formulate strategies for balancing the competing rights and interests bound up in RTBF. The Advisory Council delivered its report in January; it's available online here.

I declare I'm a strong supporter of RTBF. I've written about it here and here, and participated in an IEEE online seminar. I was impressed by the intellectual and eclectic make-up of the Council, which includes a past European Justice Minister, law professors, and a philosopher. And I do appreciate that the issues are highly complex. So I had high expectations of the Council's report.

Yet I found it quite barren.

Recap - the basics of RTBF

In a speech last August, EU Justice Commissioner Martine Reicherts gave a clear explanation of the scope of the ECJ ruling, and acknowledged its nuances. Her speech should be required reading. Reicherts summed up the situation thus:

    • What did the Court actually say on the right to be forgotten? It said that individuals have the right to ask companies operating search engines to remove links with personal information about them – under certain conditions - when information is inaccurate, inadequate, irrelevant, outdated or excessive for the purposes of data processing. The Court explicitly ruled that the right to be forgotten is not absolute, but that it will always need to be balanced against other fundamental rights, such as the freedom of expression and the freedom of the media – which, by the way, are not absolute rights either.

High tension

Everyone concerned acknowledges there are tensions in the RTBF ruling. The Google Advisory Council Report mentions these tensions (in Section 3) but sadly spends no time critically exploring them. In truth, all privacy involves conflicting requirements, and to that extent, many features of RTBF have been seen before. At p5, the Report mentions that "the [RTBF] Ruling invokes a data subject’s right to object to, and require cessation of, the processing of data about himself or herself" (emphasis added); the reader may conclude, as I have, that the computing of search results by a search engine is just another form of data processing.

One of the most important RTBF talking points is whether it's fair that Google is made to adjudicate delisting requests. I have some sympathies for Google here, and yet this is not an entirely novel situation in privacy. A standard feature of international principles-based privacy regimes is the right of individuals to have erroneous personal data corrected (this is, for example, OECD Privacy Principle No. 7 - Individual Participation, and Australian Privacy Principle No. 13 - Correction of Personal Information). And at the top of p5, the Council Report cites the right to have errors rectified. So it is standard practice that a data custodian must have means for processing access and correction requests. Privacy regimes expect there to be dispute resolution mechanisms too, operated by the company concerned. None of this is new. What seems to be new to some stakeholders is the idea that the results of a search engine are just another type of data processing.

A little rushed

The Council explains in the Introduction to the Report that it had to work "on an accelerated timeline, given the urgency with which Google had to begin complying with the Ruling once handed down". I am afraid that the Report shows signs of being a little rushed.


  • There are several spelling errors.
  • The contributions from non-English speakers could have done with some editing.
  • Less trivially, many of the footnotes need editing; it's not always clear how a person's footnoted quote supports the text.
  • More importantly, the Advisory Council surely operated with Terms of Reference, yet there is no clear explanation of what those were. At the end of the introduction, we're told the group was "convened to advise on criteria that Google should use in striking a balance, such as what role the data subject plays in public life, or whether the information is outdated or no longer relevant. We also considered the best process and inputs to Google’s decision making, including input from the original publishers of information at issue, as potentially important aspects of the balancing exercise." I'm surprised there is not a more complete and definitive description of the mission.
  • It's not actually clear what sort of search we're all talking about. Not until p7 of the Report does the qualified phrase "name-based search" first appear. Are there other types of search for which the RTBF does not apply?
  • Above all, it's not clear that the Council has reached a proper conclusion. The Report makes a number of suggestions in passing, and there is a collection of "ideas" at the back for improving the adjudication process, but there is no cogent set of recommendations. That may be because the Council didn't actually reach consensus.

And that's one of the most surprising things about the whole exercise. Of the eight independent Council members, five of them wrote "dissenting opinions". The work of an expert advisory committee is not normally framed as a court-like determination, from which members might dissent. And even if it was, to have the majority of members "dissent" casts doubt on the completeness or even the constitution of the process. Is there anything definite to be dissented from?

Jimmy Wales, the Wikipedia founder, was especially strident in his individual views at the back of the Report. He referred to "publishers whose works are being suppressed" (p27 of the Report), and railed against the report itself, calling its recommendation "deeply flawed due to the law itself being deeply flawed". Can he mean the entire Charter of Fundamental Rights of the EU and the European Convention on Human Rights? Perhaps Wales is the sort of person who denies there are any nuances in privacy, because "suppressed" is an exaggeration if we accept that RTBF doesn't cause anything to be forgotten. In my view, it poisons the entire effort when unqualified insults are hurled at the law. If Wales thinks so little of the foundation of both the ECJ ruling and the Advisory Council, he might have declined to take part.

A little hollow

Strangely, the Council's Report is altogether silent on the nature of search. It's such a huge part of their business that I have to think the strength of Google's objection to RTBF is energised by some threat it perceives to its famously secret algorithms.

The Google business was founded on its superior PageRank search method, and the company has spent fantastic sums on R&D, allowing it to keep a competitive edge for a very long time. And the R&D continues. Curiously, just as everyone is debating RTBF, Google researchers published a paper about a new "knowledge-based" approach to evaluating web pages. Surely if page ranking were less arbitrary and more transparent, a lot of the heat would come out of RTBF.

Of all the interests to balance in RTBF, Google's business objectives are actually a legitimate part of the mix. Google provides marvelous free services in exchange for data about its users which it converts into revenue, predominantly through advertising. It's a value exchange, and it need not be bad for privacy. A key component of privacy is transparency: people have a right to know what personal information about them is collected, and why. The RTBF analysis seems a little hollow without frank discussion of what everyone gets out of running a search engine.


Posted in Social Media, Privacy, Internet, Big Data

Free search.

Search engines are wondrous things. I myself use Google search umpteen times a day. I don't think I could work or play without it anymore. And yet I am a strong supporter of the contentious "Right to be Forgotten". The "RTBF" is hotly contested, and I am the first to admit it's a messy business. For one thing, it's not ideal that Google itself is required for now to adjudicate RTBF requests in Europe. But we have to accept that all of privacy is contestable. The balance of rights to privacy and rights to access information is tricky. RTBF has a long way to go, and I sense that European jurors and regulators are open and honest about this.

One of the starkest RTBF debating points is free speech. Does allowing individuals to have irrelevant, inaccurate and/or outdated search results blocked represent censorship? Is it an assault on free speech? There is surely a technical-legal question about whether the output of an algorithm represents "free speech", and as far as I can see, that question remains open. Am I the only commentator surprised by this legal blind spot? I have to say that such uncertainty destabilises a great deal of the RTBF dispute.

I am not a lawyer, but I have a strong sense that search outputs are not the sort of thing that constitutes speech. Let's bear in mind what web search is all about.

Google search is core to its multi-billion dollar advertising business. Search results are not unfiltered replicas of things found in the public domain, but rather the subtle outcome of complex Big Data processes. Google's proprietary search algorithm is famously secret, but we do know how sensitive it is to context. Most people will have noticed that search results change day by day and from place to place. But why is this?

When we enter search parameters, the result we get is actually Google's guess about what we are really looking for. Google in effect forms a hypothesis, drawing on much more than the express parameters, including our search history, browsing history, location and so on. And in all likelihood, search is influenced by the many other things Google gleans from the way we use its other properties -- gmail, maps, YouTube, hangouts and Google+ -- which are all linked now under one master data usage policy.

And here's the really clever thing about search. Google monitors how well it's predicting our real or underlying concerns. It uses a range of signals and metrics to assess what we do with search results, and it continuously refines those processes. This is what Google really gets out of search: deep understanding of what its users are interested in, and how they are likely to respond to targeted advertising. Each search result is a little test of Google's Artificial Intelligence, which, as some like to say, is getting to know us better than we know ourselves.
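As a caricature of that loop (pure conjecture on my part, certainly not Google's actual algorithm), personalisation can be as simple as re-scoring results against an inferred interest profile, with every click feeding the profile for next time:

```python
# A toy personalisation loop: re-rank by inferred interests, learn from clicks.
def personalised_rank(results: list, profile: dict) -> list:
    """Boost results whose topic the user has engaged with before."""
    return sorted(results,
                  key=lambda r: r["base_score"] + profile.get(r["topic"], 0.0),
                  reverse=True)

def record_click(profile: dict, topic: str, weight: float = 0.1) -> None:
    """The 'little test': each click strengthens the inferred interest."""
    profile[topic] = profile.get(topic, 0.0) + weight

profile = {"hotels": 0.3}  # inferred from history, mail, location, ...
results = [
    {"url": "clinic.example", "topic": "health", "base_score": 0.5},
    {"url": "hotel.example",  "topic": "hotels", "base_score": 0.4},
]
top = personalised_rank(results, profile)[0]  # the hotel outranks the clinic
record_click(profile, top["topic"])           # and the bias strengthens
```

Two users typing identical queries get different results, and the same user's results drift over time - exactly the behaviour we observe.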

As important as they are, it seems to me that search results are really just a by-product of a gigantic information business. They are nothing like free speech.

Posted in Privacy, Internet, Big Data

The Creepy Test

I'm going to assume readers know what's meant by the "Creepy Test" in privacy. Here's a short appeal to use the Creepy Test sparingly and carefully.

The most obvious problem with the Creepy Test is its subjectivity. One person's "creepy" can be another person's "COOL!!". For example, a friend of mine thought it was cool when he used Google Maps to check out a hotel he was going to, and the software usefully reminded him of his check-in time (evidently, Google had scanned his gmail and mashed up the registration details the next time he searched for the property). I actually thought this was way beyond creepy; imagine if it wasn't a hotel but a mental health facility, and Google was watching your psychiatric appointments.

In fact, for some people, creepy might actually be cool, in the same way as horror movies or chilli peppers are cool. There's already an implicit dare in the "Nothing To Hide" argument. Some brave souls seem to brag that they haven't done anything they don't mind being made public.

Our sense of what's creepy changes over time. We can get used to intrusive technologies, and that suits the agendas of infomopolists who make fortunes from personal data, hoping that we won't notice. On the other hand, objective and technology-neutral data privacy principles have been with us for over thirty years, and by and large, they work well to address contemporary problems like facial recognition, the cloud, and augmented reality glasses.

Using folksy terms in privacy might make the topic more accessible to laypeople, but it tends to distract from the technicalities of data privacy regulations. These are not difficult matters in the scheme of things; data privacy is technically about objective and reasonable controls on the collection, use and disclosure of personally identifiable information. I encourage anyone with an interest in privacy to spend time familiarising themselves with common Privacy Principles and the definition of Personal Information. And then it's easy to see that activities like Facebook's automated face recognition and Tag Suggestions aren't merely creepy; they are objectively unlawful!

Finally, and most insidiously, when emotive terms like creepy are used in debating public policy, it actually disempowers the critical voices. If "creepy" is the worst thing you can say about a given privacy concern, then you're marginalised.

We should avoid being subjective about privacy. By all means, let's use the Creepy Test to help spot potential privacy problems, and kick off a conversation. But as quickly as possible, we need to reduce privacy problems to objective criteria and, with cool heads, debate the appropriate responses.

See also A Theory of Creepy: Technology, Privacy and Shifting Social Norms by Omer Tene and Jules Polonetsky.

      • "Alas, intuitions and perceptions of 'creepiness' are highly subjective and difficult to generalize as social norms are being strained by new technologies and capabilities". Tene & Polonetsky.

Posted in Privacy, Biometrics, Big Data

The state of the state: Privacy enters Adolescence

Constellation Research recently launched the "State of Enterprise Technology" series of research reports. The series assesses the current state of the enterprise technologies Constellation considers crucial to digital transformation, and provides snapshots of the future usage and evolution of these technologies. Constellation will continue to publish reports in our State of Enterprise Technology series throughout Q1.

My first contribution to this series, "Privacy Enters Adolescence", focuses on Safety and Privacy. I've looked at the state of data privacy in 2015, and identified seven trends you should be aware of in order to protect your customers' information.

Here's an excerpt from the report:

Digital Safety and Privacy

Constellation's business theme of Digital Safety and Privacy is all about the art and science of maximizing the information assets of a business, including its most important assets – its people. Our research in this theme enables clients to capitalize on cloud, mobility, Big Data and the Internet of Things, without compromising the digital safety of the business, or the privacy and trust of their end users.

Seven Digital Safety and Privacy Trends for 2015


  • Consumers have not given up privacy - they've been tricked out of it. The impression is easily formed that people just don’t care about privacy anymore. Yet there is no proof that privacy is dead. In fact, a robust study of young adults has shown no major difference between them and older people on the importance of privacy.
  • Private sector surveillance is overshadowed by government intrusion, but is arguably just as bad. There is nothing inevitable about private sector surveillance. Consumers are waking up to the fact that digital business models are generating unprecedented fortunes on the back of the personal data they are giving away in loyalty programs, social networks, search, cloud email, and fitness trackers. Most people remain blissfully ignorant of what's being done with all that data, but we see budding signs of resentment from consumers whose every interaction is exploited without their consent.
  • The U.S. is the Canary Islands of privacy. The United States remains the only major economy without broad-based information privacy laws.
  • Privacy is more about politics than technology. Privacy can be seen as a power play between individual rights and the interests of governments and businesses.
  • The land grab for "public" data accelerates. Data is an immensely valuable raw material. More than data mining, Big Data is really about data refining. And unlike the stuff of traditional extraction industries, data seems inexhaustible, and the cost of extraction is near zero. Something akin to land rights for privacy may be the future.
  • Data literacy will be key to digital safety. Computer literacy is one thing, but data literacy is different and less well defined so far. When we go online, we don’t have the familiar social cues, so now we need to develop new ones. And we need to build up a common understanding of how data flows in the digital economy. Data literacy is more than being able to work an operating system, a device and umpteen apps: it means having meaningful mental models of what goes on in computers.
  • Privacy will get worse before it gets better. Privacy is messy, even in jurisdictions where data protection rules are well entrenched. Consider the controversial new Right to Be Forgotten ruling of the European Court of Justice, which resulted in plenty of unintended consequences, and collisions with other jurisprudence, namely the United States' protection of free speech.

My report "Privacy Enters Adolescence" can be downloaded here. It expands on the points above, and sets out recommendations for improving awareness of how personal data flows in the digital economy, negotiating better deals in the data-for-value bargain, and the conduct of Privacy Impact Assessments.

Posted in Social Media, Privacy, Cloud, Big Data

The State of the State of Privacy

Constellation Research analysts are wrapping up a very busy 2014 with a series of "State of the State" reports. For my part I've looked at the state of privacy, which I feel is entering its adolescent stage.

Here's a summary.

1. Consumers have not given up privacy - they've been tricked out of it.
The impression is easily formed that people just don’t care about privacy anymore, but in fact people are increasingly frustrated with privacy invasions. They’re tired of social networks mining users’ personal lives; they are dismayed that video game developers can raid a phone’s contact lists with impunity; they are shocked by the deviousness of Target analyzing women’s shopping histories to detect pregnant customers; and they are revolted by the way magnates help themselves to operational data like Uber’s passenger movements for fun or allegedly for harassment – just because they can.

2. Private sector surveillance is overshadowed by government intrusion, but is arguably just as bad.
Edward Snowden’s revelations of a massive military-industrial surveillance effort were of course shocking, but they should not steal all the privacy limelight. In parallel with and well ahead of government spy programs, the big OSNs and search engine companies have been gathering breathtaking amounts of data, all in the interests of targeted advertising. These data stores have come to the attention of the FBI and CIA who must be delighted that someone else has done so much of their spying for them. These businesses boast that they know us better than we know ourselves. That’s chilling. We need to break through into a post-Snowden world.

3. The U.S. is the Canary Islands of privacy.
The United States remains the only major economy without broad-based information privacy laws. There are almost no restraints on what American businesses may do with personal information they collect from their customers, or synthesize from their operations. In the rest of the world, most organizations must restrict their collection of data, limit the repurposing of data, and disclose their data handling practices in full. Individuals may want to move closer to European-style privacy protection, while many corporations prefer the freedom they have in America to hang on to any data they like while they figure out how to make money out of it. Digital companies like to call this “innovation” and grandiose claims are made about its criticality for the American economy, but many consumers would prefer the sort of innovation that respects their privacy while delivering value-for-data.

4. Privacy is more about politics than technology.
Privacy can be seen as a power play between individual rights and the interests of governments and businesses. Most of us actually want businesses to know quite a lot about us, but we expect them to respect what they know and to be restrained in how they use it. Privacy is less about what organizations do with information than what they choose not to do with it. Hence, privacy cannot be a technology issue. It is not about keeping things secret but rather, keeping them close. Privacy is actually the protection we need when things are not secret.

5. Land grab for “public” data accelerates.

[Image: Analysis as a cracking tower]
Data is an immensely valuable raw material. We should re-frame unstructured data as “information ore”. More than data mining, Big Data is really about data refining. But unlike the stuff of traditional extractive industries, data seems inexhaustible, and the cost of extraction is near zero. A huge amount of Big Data activity is propelled by the misconception that data in the public domain is free for all. The reality is that many data protection laws govern the collection and use of personal data regardless of where it comes from. That is, personal data in the “public domain” is in fact encumbered. This is counter-intuitive to many, yet many public resources are regulated - including minerals, electromagnetic spectrum and intellectual property.

6. Data literacy will be key to digital safety.
Computer literacy is one thing, but data literacy is different and less tangible. We have strong privacy intuitions that have evolved over centuries but in cyberspace we lose our bearings. We don’t have the familiar social cues when we go online, so now we need to develop new ones. And we need to build up a common understanding of how data flows in the digital economy. Today we train kids in financial literacy to engender a first-hand sense of how commerce works; data literacy may become even more important as a life skill. It's more than being able to work an operating system, a device and umpteen apps. It means having meaningful mental models of what goes on in computers. Without understanding this, we can’t construct effective privacy policies or privacy labeling.

7. Privacy will get worse before it gets better.
Privacy is messy, even where data protection rules are well entrenched. Consider the controversial Right To Be Forgotten in Europe, which requires search engine operators to provide a mechanism for individuals to request removal of old, inaccurate and harmful reports from results. The new rule has been derived from existing privacy principles, which treat the results of search algorithms as a form of synthesis rather than a purely objective account of history, and therefore hold the search companies partly responsible for the offense their processes might produce. Yet, there are plenty of unintended consequences, and collisions with other jurisprudence. The sometimes urgent development of new protections for old civil rights is never plain sailing.

My report "Privacy Enters Adolescence" can be downloaded here. It expands on the points above, and sets out recommendations for improving awareness of how Personal Data flows in the digital economy, negotiating better deals in the data-for-value bargain, the conduct of Privacy Impact Assessments, and developing a "Privacy Bill of Rights".

Posted in Social Networking, Social Media, Internet, Constellation Research, Big Data

The Constellation Research Disruption Checklist for 2015

The Constellation Research analyst team has assembled a "year end checklist", offering suggestions designed to enable you to take better control of your digital strategy in 2015. We offer these actions to help you dominate "digital disruption" in the new year.

1. Matrix Commerce: Scrub your data

Guy Courtin

When it comes to Matrix Commerce, companies need to focus on the basics first. What are the basics? Cleaning up and getting your data in order. Much is discussed about the evolution of supply chains and the surrounding technologies. However, these solutions are only as useful as the data that feeds them. Many CxOs we have spoken to have discussed the need to focus on cleaning up their data. First, work on a data audit to identify the most important sources of data for your Matrix Commerce efforts. Second, focus on the systems that can process and make sense of this data. Finally, determine the systems and business processes that will be optimized with these improvements. Matrix Commerce starts with the right data; the systems and business processes layered on top are only as useful as the data beneath them. CxOs must continue to organize and clean their data house.

2. Safety and Privacy - Create your Enterprise Information Asset Inventory

Steve Wilson

In 2015, get on top of your information assets. When information is the lifeblood of your business, make sure you understand what really makes it valuable. Create (or refresh) your Enterprise Information Asset Inventory, and then think beyond the standard security dimensions of Confidentiality, Integrity and Availability. What sets your information apart from your competitors'? Is it more complete, more up-to-date, more original or harder to acquire? To maximise the value of information, innovative organisations are gauging it in terms of utility, currency, jurisdictional certainty, privacy compliance and whatever other facets matter most in their business environment. These organisations structure their information technology and security functions not merely to protect the enterprise against threats, but to deliver the right data when and where it's needed most. Shifting from defensive security to strategic informatics is the key to success in the digital economy. Learn more about creating an information asset inventory.
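As a sketch of what a machine-readable inventory entry might capture (the field names are my illustrative assumptions, not a Constellation template):

```python
# Sketch of an information asset inventory entry: the classic CIA security
# dimensions plus the value facets discussed above.
from dataclasses import dataclass, field

@dataclass
class InformationAsset:
    name: str
    owner: str
    confidentiality: str                 # e.g. "public" | "internal" | "secret"
    integrity: str
    availability: str
    utility: str = "unknown"             # what decisions does it enable?
    currency: str = "unknown"            # how fresh is it?
    jurisdiction: str = "unknown"        # where may it legally reside?
    privacy_compliant: bool = False      # collected for an express purpose?
    differentiators: list = field(default_factory=list)

inventory = [
    InformationAsset(
        name="customer transaction history", owner="CDO",
        confidentiality="internal", integrity="high", availability="24x7",
        utility="demand forecasting", currency="daily", jurisdiction="AU",
        privacy_compliant=True,
        differentiators=["more complete than competitors'", "hard to acquire"],
    ),
]
```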

3. Data to Decisions - Create your Big Data Plan of Action

Andy Mulholland

Big Data is arriving at the end of the hype cycle. In 2015, real-time decision support using ‘smart data’ extracted from Big Data will manifest as a requirement for competitiveness. Digital businesses, and even just online sellers, are all reducing reaction and response times. Enterprises have huge business and technology investments in data that need to support their daily activities better, so it's time to pivot from using Big Data for analysis and start examining how to deliver Smart Data to users and automated online systems. What is Smart Data? Well, let's say creating your organization's definition of Smart Data is priority number one in your Big Data strategy. Transformation in digital markets requires a transformation in the competitive use of Big Data. Request a meeting with Constellation's CTO in residence, Andy Mulholland.

4. Next Gen CXP - Make Customer Experience Instinctual

Natalie Petouhoff

Stop thinking of Customer Experience as a functional or departmental initiative and start thinking about experience from the customer’s point of view.

Customers don’t distinguish between departments when they require service from your organization. Customer Experience is a responsibility shared amongst all employees. However, the division of companies into functional departments with separate goals means that customer experience is often fractured. Rid your company of this ethos in 2015 by using design thinking to create a culture of cohesive customer experience.

Ensure all employees live your company mythology, employ the right customer and internal-facing technologies, collect the right data, and make changes to your strategy and products as soon as possible. Read "Five Approaches to Drive Customer Loyalty in a Digital World".

5. Future of Work - Take Advantage of Collaboration

Alan Lepofsky

Over the last few years, there has been a growing movement in the way people communicate and collaborate with their colleagues and customers, shifting from closed systems like email and chat, to more transparent tools like social networks and communities. That trend will continue in 2015 as people become more comfortable with sharing and as collaboration tools become more integrated with the business software they use to get their jobs done. Employees should familiarize themselves with the tools available to them, and learn how to pick the right tool for each of the various scenarios that make up their work day. Read "Enterprise Collaboration: From Simple Sharing to Getting Work Done".

6. Future of Work - Prepare for Demographic Shifts

Holger Mueller

In the next ten years, 10% to 20% of the North American and European workforce will retire. Leaders need to understand and prepare for this tremendous shift so that performance remains steady as many of the workforce's highly skilled workers retire.

To ensure a smooth transition, make sure your HCM software systems can accommodate a massive number of retirements, successions and career path developments, and new hires from external recruiting.

Constellation fully expects employment to be a sellers' market going forward. People leaders should ensure their HCM systems facilitate employee motivation, engagement and retention, lest they lose their best employees to competitors. Read "Globalization, HR, and Business Model Success". Additional cloud HR case studies here and here.

7. Digital Marketing Transformation - Brand Priorities Must Convey Authenticity

Ray Wang

Brand authenticity must dominate digital and analog channels in 2015. Digital personas must not only reflect the brand, but also expand upon the analog experience. Customers love the analog experience, so deliver the same experience digitally. Brand conscious leaders must invest in the digital experience with an eye towards mass personalization at scale. While advertising plays a key role in distributing the brand message, investment in the design of digital experiences presents itself as a key area of investment for 2015. Download free executive brief: Can Brands Keep Their Promise?

8. Consumerization of IT: Use Mobile as the Gateway to Digital Transformation Projects

Ray Wang

Constellation believes that mobile is more than just the device. While smartphones and other devices are key enablers of 'mobile', design in digital transformation should take into account how these technologies address the business value and business model transformation required to deliver on breakthrough innovation. If you have not yet started your digital transformation, or are considering using mobile as an additional digital transformation point, Constellation recommends that clients assess how a new generation of enterprise mobile apps can change the business by identifying a cross-functional business problem that cannot be solved with linear thinking, articulating the business problem and benefit, showing how the solution orchestrates new experiences, identifying how analytics and insights can fuel the business model shift, exploiting full native device features, and seeking frictionless experiences. You'll be digital before you know it. Read "Why the Third Generation of Enterprise Mobile is Designed for Digital Transformation".

9. Technology Optimization & Innovation - Prepare Your Public Cloud Strategy

Holger Mueller

In 2015 technology leaders will need to create, adjust and implement their public cloud strategy. Considering estimates pegging Amazon AWS at 15-20% of virtualized servers worldwide, CIOs and CTOs need to actively plan and execute their enterprise’s strategy vis-a-vis the public cloud. Reducing technical debt and establishing next generation best practices to leverage the new ‘on demand’ IT paradigm should be a top priority for CIOs and CTOs seeking organizational competitiveness, greater job security and fewer budget restrictions.

Posted in Social Media, Security, Privacy, Constellation Research, Cloud, Big Data