Lockstep

Mobile: +61 (0) 414 488 851
Email: swilson@lockstep.com.au

Correspondence in Nature magazine

I had a letter to the editor published in Nature on big data and privacy.

Data protection: Big data held to privacy laws, too

Stephen Wilson
Nature 519, 414 (26 March 2015) doi:10.1038/519414a
Published online 25 March 2015

Letter as published

Privacy issues around data protection often inspire over-engineered responses from scientists and technologists. Yet constraints on the use of personal data mean that privacy is less about what is done with information than what is not done with it. Technology such as new algorithms may therefore be unnecessary (see S. Aftergood, Nature 517, 435–436; 2015).

Technology-neutral data-protection laws afford rights to individuals with respect to all data about them, regardless of the data source. More than 100 nations now have such data-privacy laws, typically requiring organizations to collect personal data only for an express purpose and not to re-use those data for unrelated purposes.

If businesses come to know your habits, your purchase intentions and even your state of health through big data, then they have the same privacy responsibilities as if they had gathered that information directly by questionnaire. This is what the public expects of big-data algorithms that are intended to supersede cumbersome and incomplete survey methods. Algorithmic wizardry is not a way to evade conventional privacy laws.

Stephen Wilson
Constellation Research, Sydney, Australia.
steve@constellationr.com

Posted in Science, Privacy, Big Data

The Google Advisory Council

In May 2014, the European Court of Justice (ECJ) ruled that under European law, people have the right to have certain information about them delisted from search engine results. The ECJ ruling was called the "Right to be Forgotten", despite it having little to do with forgetting (c'est la vie). Shortened as RTBF, it is also referred to more clinically as the "Right to be Delisted" (or simply as "Google Spain" because that was one of the parties in the court action). Within just a few months, the RTBF has triggered conferences, public debates, and a TEDx talk.

Google itself did two things very quickly in response to the RTBF ruling. First, it mobilised a major team to process delisting requests. This is no mean feat: over 200,000 requests have been received to date (see Google's transparency report). It is not surprising that Google got going so quickly, however, as it already had well-practiced processes for handling take-down notices for copyright and unlawful material.

Secondly, the company convened an Advisory Council of independent experts to formulate strategies for balancing the competing rights and interests bound up in RTBF. The Advisory Council delivered its report in January; it's available online here.

I declare I'm a strong supporter of RTBF. I've written about it here and here, and participated in an IEEE online seminar. I was impressed by the intellectual and eclectic make-up of the Council, which includes a past European Justice Minister, law professors, and a philosopher. And I do appreciate that the issues are highly complex. So I had high expectations of the Council's report.

Yet I found it quite barren.

Recap - the basics of RTBF

EU Justice Commissioner Martine Reicherts in a speech last August gave a clear explanation of the scope of the ECJ ruling, and acknowledged its nuances. Her speech should be required reading. Reicherts summed up the situation thus:

    • What did the Court actually say on the right to be forgotten? It said that individuals have the right to ask companies operating search engines to remove links with personal information about them – under certain conditions - when information is inaccurate, inadequate, irrelevant, outdated or excessive for the purposes of data processing. The Court explicitly ruled that the right to be forgotten is not absolute, but that it will always need to be balanced against other fundamental rights, such as the freedom of expression and the freedom of the media – which, by the way, are not absolute rights either.

High tension

Everyone concerned acknowledges there are tensions in the RTBF ruling. The Google Advisory Council Report mentions these tensions (in Section 3) but sadly spends no time critically exploring them. In truth, all privacy involves conflicting requirements, and to that extent, many features of RTBF have been seen before. At p5, the Report mentions that "the [RTBF] Ruling invokes a data subject’s right to object to, and require cessation of, the processing of data about himself or herself" (emphasis added); the reader may conclude, as I have, that the computing of search results by a search engine is just another form of data processing.

One of the most important RTBF talking points is whether it's fair that Google is made to adjudicate delisting requests. I have some sympathy for Google here, and yet this is not an entirely novel situation in privacy. A standard feature of international principles-based privacy regimes is the right of individuals to have erroneous personal data corrected (this is, for example, OECD Privacy Principle No. 7 - Individual Participation, and Australian Privacy Principle No. 13 - Correction of Personal Information). And at the top of p5, the Council Report cites the right to have errors rectified. So it is standard practice that a data custodian must have means for processing access and correction requests. Privacy regimes expect there to be dispute resolution mechanisms too, operated by the company concerned. None of this is new. What seems to be new to some stakeholders is the idea that the results of a search engine are just another type of data processing.

A little rushed

The Council explains in the Introduction to the Report that it had to work "on an accelerated timeline, given the urgency with which Google had to begin complying with the Ruling once handed down". I am afraid that the Report shows signs of being a little rushed.


  • There are several spelling errors.
  • The contributions from non-English speakers could have done with some editing.
  • Less trivially, many of the footnotes need editing; it's not always clear how a person's footnoted quote supports the text.
  • More importantly, the Advisory Council surely operated with Terms of Reference, yet there is no clear explanation of what those were. At the end of the introduction, we're told the group was "convened to advise on criteria that Google should use in striking a balance, such as what role the data subject plays in public life, or whether the information is outdated or no longer relevant. We also considered the best process and inputs to Google’s decision making, including input from the original publishers of information at issue, as potentially important aspects of the balancing exercise." I'm surprised there is not a more complete and definitive description of the mission.
  • It's not actually clear what sort of search we're all talking about. Not until p7 of the Report does the qualified phrase "name-based search" first appear. Are there other types of search for which the RTBF does not apply?
  • Above all, it's not clear that the Council has reached a proper conclusion. The Report makes a number of suggestions in passing, and there is a collection of "ideas" at the back for improving the adjudication process, but there is no cogent set of recommendations. That may be because the Council didn't actually reach consensus.

And that's one of the most surprising things about the whole exercise. Of the eight independent Council members, five of them wrote "dissenting opinions". The work of an expert advisory committee is not normally framed as a court-like determination, from which members might dissent. And even if it were, to have the majority of members "dissent" casts doubt on the completeness or even the constitution of the process. Is there anything definite to be dissented from?

Jimmy Wales, the Wikipedia founder and chair, was especially strident in his individual views at the back of the Report. He referred to "publishers whose works are being suppressed" (p27 of the Report), and railed against the report itself, calling its recommendation "deeply flawed due to the law itself being deeply flawed". Can he mean the entire Charter of Fundamental Rights of the EU and European Convention on Human Rights? Perhaps Wales is the sort of person who denies there are any nuances in privacy, because "suppressed" is an exaggeration if we accept that RTBF doesn't cause anything to be forgotten. In my view, it poisons the entire effort when unqualified insults are allowed to be hurled at the law. If Wales thinks so little of the foundation of both the ECJ ruling and the Advisory Council, he might have declined to take part.

A little hollow

Strangely, the Council's Report is altogether silent on the nature of search. It's such a huge part of their business that I have to think the strength of Google's objection to RTBF is energised by some threat it perceives to its famously secret algorithms.

The Google business was founded on its superior PageRank search method, and the company has spent fantastic sums on R&D, allowing it to keep a competitive edge for a very long time. And the R&D continues. Curiously, just as everyone is debating RTBF, Google researchers published a paper about a new "knowledge based" approach to evaluating web pages. Surely if page ranking were less arbitrary and more transparent, a lot of the heat would come out of RTBF.
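For readers who haven't met it, the original PageRank idea can be sketched in a few lines of Python. This is strictly an illustrative toy of the published 1998 method, not Google's production algorithm (which, as noted, is famously secret): a page's rank is the share of rank flowing to it from the pages that link to it, plus a small "random surfer" baseline.

```python
# Toy PageRank via power iteration. An illustrative sketch of the
# published idea only; Google's real ranking is secret and far richer.
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links out to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}          # start uniformly
    for _ in range(iterations):
        # Every page keeps a small baseline (the "random surfer")...
        new_rank = {p: (1.0 - damping) / n for p in pages}
        # ...and receives a damped share of rank from its inbound links.
        for page, outlinks in links.items():
            share = rank[page] / len(outlinks) if outlinks else 0.0
            for target in outlinks:
                new_rank[target] += damping * share
        rank = new_rank
    return rank

# Three-page web: A and C both link to B; B links back to A.
ranks = pagerank({"A": ["B"], "B": ["A"], "C": ["B"]})
# B, with two inbound links, ends up with the highest rank.
```

The point of the sketch is how mechanical the core idea is; the arbitrariness critics complain about lives in the hundreds of secret signals layered on top.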

Of all the interests to balance in RTBF, Google's business objectives are actually a legitimate part of the mix. Google provides marvelous free services in exchange for data about its users which it converts into revenue, predominantly through advertising. It's a value exchange, and it need not be bad for privacy. A key component of privacy is transparency: people have a right to know what personal information about them is collected, and why. The RTBF analysis seems a little hollow without frank discussion of what everyone gets out of running a search engine.

Further reading

Posted in Social Media, Privacy, Internet, Big Data

Free search.

Search engines are wondrous things. I myself use Google search umpteen times a day. I don't think I could work or play without it anymore. And yet I am a strong supporter of the contentious "Right to be Forgotten". The "RTBF" is hotly contested, and I am the first to admit it's a messy business. For one thing, it's not ideal that Google itself is required for now to adjudicate RTBF requests in Europe. But we have to accept that all of privacy is contestable. The balance of rights to privacy and rights to access information is tricky. RTBF has a long way to go, and I sense that European jurists and regulators are open and honest about this.

One of the starkest RTBF debating points is free speech. Does allowing individuals to have irrelevant, inaccurate and/or outdated search results blocked represent censorship? Is it an assault on free speech? There is surely a technical-legal question about whether the output of an algorithm represents "free speech", and as far as I can see, that question remains open. Am I the only commentator surprised by this legal blind spot? I have to say that such uncertainty destabilises a great deal of the RTBF dispute.

I am not a lawyer, but I have a strong sense that search outputs are not the sort of thing that constitutes speech. Let's bear in mind what web search is all about.

Google search is core to its multi-billion dollar advertising business. Search results are not unfiltered replicas of things found in the public domain, but rather the subtle outcome of complex Big Data processes. Google's proprietary search algorithm is famously secret, but we do know how sensitive it is to context. Most people will have noticed that search results change day by day and from place to place. But why is this?

When we enter search parameters, the result we get is actually Google's guess about what we are really looking for. Google in effect forms a hypothesis, drawing on much more than the express parameters, including our search history, browsing history, location and so on. And in all likelihood, search is influenced by the many other things Google gleans from the way we use its other properties -- gmail, maps, YouTube, hangouts and Google+ -- which are all linked now under one master data usage policy.

And here's the really clever thing about search. Google monitors how well it's predicting our real or underlying concerns. It uses a range of signals and metrics, to assess what we do with search results, and it continuously refines those processes. This is what Google really gets out of search: deep understanding of what its users are interested in, and how they are likely to respond to targeted advertising. Each search result is a little test of Google's Artificial Intelligence, which, as some like to say, is getting to know us better than we know ourselves.
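The feedback loop described above can be made concrete with a grossly simplified sketch. Everything here is invented for illustration (the class name, the signals, the 0.5 weighting); the real systems are proprietary and draw on vastly richer data. But the shape of the loop — rank, observe, refine — is the point.

```python
# A toy sketch of search personalisation as a feedback loop.
# All names, weights and signals are hypothetical illustrations;
# real search personalisation is secret and far more elaborate.
from collections import Counter

class ToySearchEngine:
    def __init__(self):
        self.click_history = Counter()   # topic -> clicks observed so far

    def rank(self, results):
        """Each result is (title, topic, base_score). Base relevance is
        blended with what this user has clicked on before."""
        def score(result):
            _, topic, base = result
            return base + 0.5 * self.click_history[topic]
        return sorted(results, key=score, reverse=True)

    def record_click(self, topic):
        # The "little test": every click refines the model of the user.
        self.click_history[topic] += 1

engine = ToySearchEngine()
results = [("Hotel reviews", "travel", 1.0),
           ("Jaguar the cat", "animals", 1.1)]
engine.record_click("travel")
engine.record_click("travel")
top_title, _, _ = engine.rank(results)[0]
# After two travel clicks, the travel result outranks the result
# with the higher base score.
```

Even this crude loop shows why two users typing identical queries see different results, and why each result page doubles as an experiment on the user.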

As important as they are, it seems to me that search results are really just a by-product of a gigantic information business. They are nothing like free speech.

Posted in Privacy, Internet, Big Data

The Creepy Test

I'm going to assume readers know what's meant by the "Creepy Test" in privacy. Here's a short appeal to use the Creepy Test sparingly and carefully.

The most obvious problem with the Creepy Test is its subjectivity. One person's "creepy" can be another person's "COOL!!". For example, a friend of mine thought it was cool when he used Google Maps to check out a hotel he was going to, and the software usefully reminded him of his check-in time (evidently, Google had scanned his gmail and mashed up the registration details the next time he searched for the property). I actually thought this was way beyond creepy; imagine if it wasn't a hotel but a mental health facility, and Google was watching your psychiatric appointments.

In fact, for some people, creepy might actually be cool, in the same way as horror movies or chilli peppers are cool. There's already an implicit dare in the "Nothing To Hide" argument. Some brave souls seem to brag that they haven't done anything they don't mind being made public.

Our sense of what's creepy changes over time. We can get used to intrusive technologies, and that suits the agendas of the infomopolists who make fortunes from personal data, hoping that we won't notice. On the other hand, objective and technology-neutral data privacy principles have been with us for over thirty years, and by and large, they work well to address contemporary problems like facial recognition, the cloud, and augmented reality glasses.

Using folksy terms in privacy might make the topic more accessible to laypeople, but it tends to distract from the technicalities of data privacy regulations. These are not difficult matters in the scheme of things; data privacy is technically about objective and reasonable controls on the collection, use and disclosure of personally identifiable information. I encourage anyone with an interest in privacy to spend time familiarising themselves with common Privacy Principles and the definition of Personal Information. And then it's easy to see that activities like Facebook's automated face recognition and Tag Suggestions aren't merely creepy; they are objectively unlawful!

Finally, and most insidiously, when emotive terms like creepy are used in debating public policy, they actually disempower the critical voices. If "creepy" is the worst thing you can say about a given privacy concern, then you're marginalised.

We should avoid being subjective about privacy. By all means, let's use the Creepy Test to help spot potential privacy problems, and kick off a conversation. But as quickly as possible, we need to reduce privacy problems to objective criteria and, with cool heads, debate the appropriate responses.

See also A Theory of Creepy: Technology, Privacy and Shifting Social Norms by Omer Tene and Jules Polonetsky.

      • "Alas, intuitions and perceptions of 'creepiness' are highly subjective and difficult to generalize as social norms are being strained by new technologies and capabilities". Tene & Polonetsky.

Posted in Privacy, Biometrics, Big Data

The state of the state: Privacy enters Adolescence

Constellation Research recently launched the "State of Enterprise Technology" series of research reports. The series assesses the current state of the enterprise technologies Constellation considers crucial to digital transformation, and provides snapshots of the future usage and evolution of these technologies. Constellation will continue to publish reports in our State of Enterprise Technology series throughout Q1.

My first contribution to this series, "Privacy Enters Adolescence", focuses on Safety and Privacy. I've looked at data privacy in 2015, and identified seven trends of which you should be aware in order to protect your customers' information.

Here's an excerpt from the report:

Digital Safety and Privacy

Constellation's business theme of Digital Safety and Privacy is all about the art and science of maximizing the information assets of a business, including its most important assets – its people. Our research in this theme enables clients to capitalize on cloud, mobility, Big Data and the Internet of Things, without compromising the digital safety of the business, and the privacy and trust of their end users.

Seven Digital Safety and Privacy Trends for 2015


  • Consumers have not given up privacy - they've been tricked out of it. The impression is easily formed that people just don’t care about privacy anymore. Yet there is no proof that privacy is dead. In fact, a robust study of young adults has shown no major difference between them and older people on the importance of privacy.
  • Private sector surveillance is overshadowed by government intrusion, but is arguably just as bad. There is nothing inevitable about private sector surveillance. Consumers are waking up to the fact that digital business models are generating unprecedented fortunes on the back of the personal data they are giving away in loyalty programs, social networks, search, cloud email, and fitness trackers. Most people remain blissfully ignorant of what's being done with all that data, but we see budding signs of resentment from consumers whose every interaction is exploited without their consent.
  • The U.S. is the Canary Islands of privacy. The United States remains the only major economy without broad-based information privacy laws.
  • Privacy is more about politics than technology. Privacy can be seen as a power play between individual rights and the interests of governments and businesses.
  • The land grab for "public" data accelerates. Data is an immensely valuable raw material. More than data mining, Big Data is really about data refining. And unlike the stuff of traditional extraction industries, data seems inexhaustible, and the cost of extraction is near zero. Something akin to land rights for privacy may be the future.
  • Data literacy will be key to digital safety. Computer literacy is one thing, but data literacy is different and less well defined so far. When we go online, we don’t have the familiar social cues, so now we need to develop new ones. And we need to build up a common understanding of how data flows in the digital economy. Data literacy is more than being able to work an operating system, a device and umpteen apps: it means having meaningful mental models of what goes on in computers.
  • Privacy will get worse before it gets better. Privacy is messy, even in jurisdictions where data protection rules are well entrenched. Consider the controversial new Right to Be Forgotten ruling of the European Court of Justice, which resulted in plenty of unintended consequences, and collisions with other jurisprudence, namely the United States' protection of free speech.

My report "Privacy Enters Adolescence" can be downloaded here. It expands on the points above, and sets out recommendations for improving awareness of how personal data flows in the digital economy, negotiating better deals in the data-for-value bargain, and the conduct of Privacy Impact Assessments.

Posted in Social Media, Privacy, Cloud, Big Data

The State of the State of Privacy

Constellation Research analysts are wrapping up a very busy 2014 with a series of "State of the State" reports. For my part I've looked at the state of privacy, which I feel is entering its adolescent stage.

Here's a summary.

1. Consumers have not given up privacy - they've been tricked out of it.
The impression is easily formed that people just don’t care about privacy anymore, but in fact people are increasingly frustrated with privacy invasions. They’re tired of social networks mining users’ personal lives; they are dismayed that video game developers can raid a phone’s contact lists with impunity; they are shocked by the deviousness of Target analyzing women’s shopping histories to detect pregnant customers; and they are revolted by the way magnates help themselves to operational data like Uber’s passenger movements for fun or allegedly for harassment – just because they can.

2. Private sector surveillance is overshadowed by government intrusion, but is arguably just as bad.
Edward Snowden’s revelations of a massive military-industrial surveillance effort were of course shocking, but they should not steal all the privacy limelight. In parallel with and well ahead of government spy programs, the big OSNs and search engine companies have been gathering breathtaking amounts of data, all in the interests of targeted advertising. These data stores have come to the attention of the FBI and CIA who must be delighted that someone else has done so much of their spying for them. These businesses boast that they know us better than we know ourselves. That’s chilling. We need to break through into a post-Snowden world.

3. The U.S. is the Canary Islands of privacy.
The United States remains the only major economy without broad-based information privacy laws. There are almost no restraints on what American businesses may do with personal information they collect from their customers, or synthesize from their operations. In the rest of the world, most organizations must restrict their collection of data, limit the repurposing of data, and disclose their data handling practices in full. Individuals may want to move closer to European-style privacy protection, while many corporations prefer the freedom they have in America to hang on to any data they like while they figure out how to make money out of it. Digital companies like to call this “innovation” and grandiose claims are made about its criticality for the American economy, but many consumers would prefer the sort of innovation that respects their privacy while delivering value-for-data.

4. Privacy is more about politics than technology.
Privacy can be seen as a power play between individual rights and the interests of governments and businesses. Most of us actually want businesses to know quite a lot about us, but we expect them to respect what they know and to be restrained in how they use it. Privacy is less about what organizations do with information than what they choose not to do with it. Hence, privacy cannot be a technology issue. It is not about keeping things secret but rather, keeping them close. Privacy is actually the protection we need when things are not secret.

5. Land grab for “public” data accelerates.

[Image: "Image Analysis as Cracking Tower"]
Data is an immensely valuable raw material. We should re-frame unstructured data as “information ore”. More than data mining, Big Data is really about data refining, but unlike the stuff of traditional extractive industries, data seems inexhaustible, and the cost of extraction is near zero. A huge amount of Big Data activity is propelled by the misconception that data in the public domain is free for all. The reality is that many data protection laws govern the collection and use of personal data regardless of where it comes from. That is, personal data in the “public domain” is in fact encumbered. This is counter-intuitive to many, yet many public resources are regulated, including minerals, electromagnetic spectrum and intellectual property.

6. Data literacy will be key to digital safety.
Computer literacy is one thing, but data literacy is different and less tangible. We have strong privacy intuitions that have evolved over centuries but in cyberspace we lose our bearings. We don’t have the familiar social cues when we go online, so now we need to develop new ones. And we need to build up a common understanding of how data flows in the digital economy. Today we train kids in financial literacy to engender a first-hand sense of how commerce works; data literacy may become even more important as a life skill. It's more than being able to work an operating system, a device and umpteen apps. It means having meaningful mental models of what goes on in computers. Without understanding this, we can’t construct effective privacy policies or privacy labeling.

7. Privacy will get worse before it gets better.
Privacy is messy, even where data protection rules are well entrenched. Consider the controversial Right To Be Forgotten in Europe, which requires search engine operators to provide a mechanism for individuals to request removal of old, inaccurate and harmful reports from results. The new rule has been derived from existing privacy principles, which treat the results of search algorithms as a form of synthesis rather than a purely objective account of history, and therefore hold the search companies partly responsible for the offense their processes might produce. Yet, there are plenty of unintended consequences, and collisions with other jurisprudence. The sometimes urgent development of new protections for old civil rights is never plain sailing.

My report "Privacy Enters Adolescence" can be downloaded here. It expands on the points above, and sets out recommendations for improving awareness of how Personal Data flows in the digital economy, negotiating better deals in the data-for-value bargain, the conduct of Privacy Impact Assessments, and developing a "Privacy Bill of Rights".

Posted in Social Networking, Social Media, Internet, Constellation Research, Big Data

The Constellation Research Disruption Checklist for 2015

The Constellation Research analyst team has assembled a "year end checklist", offering suggestions designed to enable you to take better control of your digital strategy in 2015. We offer these actions to help you dominate "digital disruption" in the new year.

1. Matrix Commerce: Scrub your data

Guy Courtin

When it comes to Matrix Commerce, companies need to focus on the basics first. What are the basics? Cleaning up and getting your data in order. Much is discussed about the evolution of supply chains and the surrounding technologies, yet these solutions are only as useful as the data that feeds them. Many CxOs we have spoken to have discussed the need to clean up their data. First, conduct a data audit to identify the most important sources of data for your Matrix Commerce efforts. Second, focus on the systems that can process and make sense of this data. Finally, determine the systems and business processes that will be optimized with these improvements. Matrix Commerce starts with the right data; CxOs must continue to organize and clean their data house.

2. Safety and Privacy - Create your Enterprise Information Asset Inventory

Steve Wilson

In 2015, get on top of your information assets. When information is the lifeblood of your business, make sure you understand what really makes it valuable. Create (or refresh) your Enterprise Information Asset Inventory, and then think beyond the standard security dimensions of Confidentiality, Integrity and Availability. What sets your information apart from your competitors? Is it more complete, more up-to-date, more original or harder to acquire? To maximise the value of information, innovative organisations are gauging it in terms of utility, currency, jurisdictional certainty, privacy compliance and whatever other facets matter the most in their business environment. These innovative organisations structure their information technology and security functions to not merely protect the enterprise against threats, but to deliver the right data when and where it's needed most. Shifting from defensive security to strategic informatics is the key to success in the digital economy. Learn more about creating an information asset inventory.

3. Data to Decisions - Create your Big Data Plan of Action

Andy Mulholland

Big Data is arriving at the end of the hype cycle. In 2015, real-time decision support using ‘smart data’ extracted from Big Data will manifest as a requirement for competitiveness. Digital businesses, and even simple online sellers, are all reducing reaction and response times. Enterprises have huge business and technology investments in data that need to support their daily activities better, so it's time to pivot from using Big Data for analysis and start examining how to deliver Smart Data to users and automated online systems. What is Smart Data? Well, let's say creating your organization's definition of Smart Data is priority number one in your Big Data strategy. Transformation in digital markets requires a transformation in the competitive use of Big Data. Request a meeting with Constellation's CTO in residence, Andy Mulholland.

4. Next Gen CXP - Make Customer Experience Instinctual

Natalie Petouhoff

Stop thinking of Customer Experience as a functional or departmental initiative and start thinking about experience from the customer’s point of view.

Customers don’t distinguish between departments when they require service from your organization. Customer Experience is a responsibility shared amongst all employees. However, the division of companies into functional departments with separate goals means that customer experience is often fractured. Rid your company of this ethos in 2015 by using design thinking to create a culture of cohesive customer experience.

Ensure all employees live your company mythology, employ the right customer and internal-facing technologies, collect the right data, and make changes to your strategy and products as soon as possible. Read "Five Approaches to Drive Customer Loyalty in a Digital World".

5. Future of Work - Take Advantage of Collaboration

Alan Lepofsky

Over the last few years, there has been a growing movement in the way people communicate and collaborate with their colleagues and customers, shifting from closed systems like email and chat, to more transparent tools like social networks and communities. That trend will continue in 2015 as people become more comfortable with sharing and as collaboration tools become more integrated with the business software they use to get their jobs done. Employees should familiarize themselves with the tools available to them, and learn how to pick the right tool for each of the various scenarios that make up their work day. Read "Enterprise Collaboration: From Simple Sharing to Getting Work Done".

6. Future of Work - Prepare for Demographic Shifts

Holger Mueller

In the next ten years 10% to 20% of the North American and European workforce will retire. Leaders need to understand and prepare for this tremendous shift so performance remains steady as many of the workforce's highly skilled workers retire.

To ensure a smooth transition, make sure your HCM software systems can accommodate a massive number of retirements, successions and career path developments, and new hires from external recruiting.

Constellation fully expects employment to be a sellers' market going forward. People leaders should ensure their HCM systems facilitate employee motivation, engagement and retention, lest they lose their best employees to competitors. Read "Globalization, HR, and Business Model Success". Additional cloud HR case studies here and here.

7. Digital Marketing Transformation - Brand Priorities Must Convey Authenticity

Ray Wang

Brand authenticity must dominate digital and analog channels in 2015. Digital personas must not only reflect the brand, but also expand upon the analog experience. Customers love the analog experience, so deliver the same experience digitally. Brand-conscious leaders must invest in the digital experience with an eye towards mass personalization at scale. While advertising plays a key role in distributing the brand message, the design of digital experiences is a key area of investment for 2015. Download free executive brief: Can Brands Keep Their Promise?

8. Consumerization of IT: Use Mobile as the Gateway to Digital Transformation Projects

Ray Wang

Constellation believes that mobile is more than just the device. While smartphones and other devices are key enablers of 'mobile', design in digital transformation should take into account how these technologies address the business value and business model transformation required to deliver on breakthrough innovation. If you have not yet started your digital transformation, or are considering using mobile as an additional digital transformation point, Constellation recommends that clients assess how a new generation of enterprise mobile apps can change the business. Start by identifying a cross-functional business problem that cannot be solved with linear thinking. Then articulate the business problem and benefit, show how the solution orchestrates new experiences, identify how analytics and insights can fuel the business model shift, exploit full native device features, and seek frictionless experiences. You'll be digital before you know it. Read "Why the Third Generation of Enterprise Mobile is Designed for Digital Transformation".

9. Technology Optimization & Innovation - Prepare Your Public Cloud Strategy

Holger Mueller

In 2015 technology leaders will need to create, adjust and implement their public cloud strategy. Considering estimates pegging Amazon AWS at 15-20% of virtualized servers worldwide, CIOs and CTOs need to actively plan and execute their enterprise’s strategy vis-a-vis the public cloud. Reducing technical debt and establishing next generation best practices to leverage the new ‘on demand’ IT paradigm should be a top priority for CIOs and CTOs seeking organizational competitiveness, greater job security and fewer budget restrictions.

Posted in Social Media, Security, Privacy, Constellation Research, Cloud, Big Data

Too smart?

An Engadget report today, "Hangouts eavesdrops on your chats to offer 'smart suggestions'" describes a new "spy/valet" feature being added to Google's popular video chat tool.

  • "Google's Hangouts is gaining a handy, but slightly creepy new feature today. The popular chat app will now act as a digital spy-slash-valet by eavesdropping on your conversations to offer 'smart suggestions.' For instance, if a pal asks 'where are you?' it'll immediately prompt you to share your location, then open a map so you can pin it precisely."

It's sad that this sort of thing still gets meekly labeled as "creepy". The privacy implications are serious and pretty easy to see.

Google is evidently processing the text of Hangouts messages as they fly through its system, extracting linguistic cues, interpreting what's being said using Artificial Intelligence, deriving new meaning and insights, and offering suggestions.
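To make concrete what "extracting linguistic cues" entails, here is a deliberately naive sketch in Python. The rule names and patterns are my own illustration, and Google's actual implementation is not public; the point is simply that even the crudest suggestion engine must read every message in order to decide whether to fire.

```python
import re

# Hypothetical illustration only: a toy version of the kind of intent
# detection a 'smart suggestions' feature implies. A real system would
# use trained language models, not keyword rules.
SUGGESTION_RULES = [
    (re.compile(r"\bwhere are you\b", re.IGNORECASE), "share_location"),
    (re.compile(r"\bwhat time\b.*\bmeet\b", re.IGNORECASE), "propose_time"),
]

def suggest(message: str) -> list[str]:
    """Return suggested actions triggered by a chat message."""
    return [action for pattern, action in SUGGESTION_RULES
            if pattern.search(message)]

print(suggest("Hey, where are you?"))  # ['share_location']
```

However simple or sophisticated the rules, the precondition is the same: the service provider's software inspects the content of the conversation.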

We need some clarification about whether any covert tests of this technology have been undertaken during the R&D phase. A company obviously doesn't launch a new product like this without a lot of research, feasibility testing, prototyping and testing. Serious work on 'smart suggestions' would not start without first testing how it works in real life. So I wonder if any of this evaluation was done covertly on live data? Are Google researchers routinely eavesdropping on hangouts to develop the 'smart suggestions' technology?

If so, is such data usage covered by their Privacy Policy (you know, under the usual "in order to improve our service" justification)? And is usage sanctioned internationally in the stricter privacy regimes?

Many people have said to me that I'm jumping the gun, and that Google would probably test the new Hangouts feature on its own employees. Perhaps, but given that scanning Gmail is situation normal for Google, and that their "privacy" culture joins up all their business units so that data may be re-purposed almost without limit, I feel sure that running AI algorithms on text without telling people would be par for the course.

In development and in operation, we need to know what steps are taken to protect the privacy of Hangouts data. What personally identifiable data and metadata are retained for other purposes? Who inside Google is granted access to the data, and especially to the synthesised insights? How long does any secondary usage persist? Are particularly sensitive matters (like health data, financial details and corporate intellectual property) filtered out?

This is well beyond "creepy". Hangouts and similar video chat are certainly wonderful technologies. We're using them routinely for teaching, video conferencing, collaboration and consultation. The tools may become entrenched in corporate meetings, telecommuting, healthcare and the professions. But if I am talking with my doctor, or discussing patents with my legal team, or having a clandestine chat with a lover, I clearly do not want any unsolicited contributions from the service provider. More fundamentally, I want assurance that no machine is ever tapping into these sorts of communications, running AI algorithms and creating new insights. If I'm wrong about covert testing on live data, then Google could do what Apple did and publish an Open Letter clarifying their data usage practices and strategies.

Come to think of it, if Google is running natural language processing algorithms over the Hangouts stream, might they be augmenting their Gmail scanning the same way? Their business model is to extract insights about users from any data they get their hands on. Until now it's been a crude business of picking out keywords and using them to profile users' interests and feed targeted advertising. But what if they could get deeper information about us through AI? Is there any sign from their historical business practices that Google would not do this? And what if they can extract sensitive information like mental health indications? Even with good intent and transparency, inferring health status from social media is highly problematic, as shown by the "Samaritans Radar" experience.
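The keyword-profiling business described above can be sketched in a few lines. The categories and keywords here are hypothetical, chosen to show how easily sensitive inferences fall out of the very same mechanism that drives innocuous ones:

```python
from collections import Counter

# Illustrative only: naive keyword-to-interest profiling. The actual
# categories and methods used by any ad platform are assumptions here.
INTEREST_KEYWORDS = {
    "flight": "travel", "hotel": "travel",
    "mortgage": "finance", "loan": "finance",
    "insomnia": "health", "therapist": "health",
}

def profile(messages: list[str]) -> Counter:
    """Tally interest categories across a user's messages."""
    interests = Counter()
    for msg in messages:
        for word in msg.lower().split():
            category = INTEREST_KEYWORDS.get(word.strip(".,?!"))
            if category:
                interests[category] += 1
    return interests
```

Note that the "health" bucket emerges from exactly the same code path as "travel"; the sensitivity of an inference is invisible to the algorithm unless someone deliberately filters for it.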

Artificial Intelligence is one of the new frontiers. Hot on the heels of the successes of IBM Watson, we're seeing Natural Language Processing and analytics rapidly penetrate business and now consumer applications. Commentators are alternately telling us that AI will end humanity, and not to worry about it. For now, I call on people to simply think clearly through the implications, such as for privacy. If AI programs are clever enough to draw deep insights about us from what we say, then the "datapreneurs" in charge of those algorithms need to remember they are just as accountable for privacy as if they had asked us to reveal all by filling out a questionnaire.

Posted in Social Networking, Social Media, Privacy, Internet, Big Data

Thinking creatively about information assets in retail and hospitality

In my last blog Improving the Position of the CISO, I introduced the new research I've done on extending the classic "Confidentiality-Integrity-Availability" (C-I-A) frame for security analysis, to cover all the other qualities of enterprise information assets. The idea is to build a comprehensive account of what it is that makes information valuable in the context of the business, leveraging the traditional tools and skills of the CISO. After all, security professionals are particularly good at looking at context. Instead of restricting themselves to defending information assets against harm, CISOs can be helping to enhance those assets by building up their other competitive attributes.

Let's look at some examples of how this would work, in some classic Big Data applications in retail and hospitality.

Companies in these industries have long been amassing detailed customer databases under the auspices of loyalty programs. Supermarkets have logged our every purchase for many years, so they can for instance put together new deals on our favorite items, from our preferred brands, or from competitors trying to get us to switch brands. Likewise, hotels track when we stay and what we do, so they can personalise our experience, tailor new packages for us, and try to cross-sell services they predict we'll find attractive. Behind the scenes, the data is also used for operations to plan demand, fine tune their logistics and so on.

Big Data techniques amplify the value of information assets enormously, but they can take us into complicated territory. Consider for example the potential for loyalty information to be parlayed into insurance and other financial services products. Supermarkets find they now have access to a range of significant indicators of health & lifestyle risk factors which are hugely valuable in insurance calculations ... if only the data is permitted to be re-purposed like that.

The question is, what is it about the customer database of a given store or hotel that gives it an edge over its competitors? There are many more attributes to think creatively about beyond C-I-A!

  • Utility
    • It's important to rigorously check that the raw data, the metadata and any derived analytics can actually be put to different business purposes.
    • Are data formats well-specified, and technically and semantically interoperable?
    • What would it cost to improve interoperability as needed?
    • Is the data physically available to your other business systems?
    • Does the rest of the business know what's in the data sets?
  • Completeness
    • Do you know more about your customers than your competitors do?
    • Do you supplement and enrich raw customer behaviours with questionnaires, or linked data?
    • How far back in time do the records go?
    • Do you understand the reasons for any major gaps? Do the gaps themselves tell you anything?
    • What sort of metadata do you have? For example, do you retain time & location, trend data, changes, origins and so on?
  • Currency & Accuracy
    • Is your data up to date? Remember that accuracy can diminish over time, so the sheer age of a long term database can have a downside.
    • What mechanisms are there to keep data up to date?
  • Permissions & Consent
    • Have customers consented to secondary usage of data?
    • Is the consent specific, blanket or bundled?
    • Might customers be surprised and possibly put off to learn how their loyalty data is utilised?
    • Do the terms & conditions of participation in a loyalty program cover what you wish to do with the data?
    • Do the Ts&Cs (which might have been agreed to in the past) still align with the latest plans for data usage?
    • Are there opportunities to refresh the Ts&Cs?
    • Are there opportunities for customers to negotiate the value you can offer for re-purposing the data?

  • Compliance
    • When businesses derive new insights from data, it is possible that they are synthesising brand new Personal Information, and non-obvious privacy obligations can go along with that. The competitive advantage of Big Data can be squandered if regulations are overlooked, especially in international environments.
    • So where is the data held, and where does it flow?
    • Are applications for your data compliant with applicable regulations?
    • Is health information or similar sensitive Personal Information extracted or synthesised, and do you have specific consent for that?
    • Can you meet the Access & Correction obligations found in many data protection regulations?
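One way to operationalise this checklist is to extend the asset register itself, recording value attributes alongside the classic security ratings. A minimal sketch in Python (the field names and example values are my own, not taken from the report):

```python
from dataclasses import dataclass, field

# A hypothetical asset register entry extending C-I-A with the
# value attributes discussed above.
@dataclass
class InfoAsset:
    name: str
    # Classic C-I-A security ratings, e.g. "high" / "medium" / "low"
    confidentiality: str
    integrity: str
    availability: str
    # Extended value attributes
    utility: str = "unassessed"
    completeness: str = "unassessed"
    currency: str = "unassessed"
    consent_basis: str = "unknown"  # e.g. specific / blanket / bundled
    compliance_notes: list[str] = field(default_factory=list)

# Example: a loyalty database whose consent basis limits re-purposing.
loyalty_db = InfoAsset(
    name="Supermarket loyalty database",
    confidentiality="high", integrity="high", availability="medium",
    completeness="rich, 10+ years of history",
    consent_basis="bundled",
    compliance_notes=["Health-related inferences need specific consent"],
)
```

Recording the consent basis next to the C-I-A ratings makes it immediately visible which assets can be re-purposed, say for insurance products, and which cannot without refreshed Ts&Cs.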

For more detail, my new report, "Strategic Opportunities for the CISO", is available now.

Posted in Big Data, Constellation Research, Management theory, Privacy, Security

Improving the position of the CISO

Over the years, we security professionals have tried all sorts of things to make better connections with other parts of the business. We have broadened our qualifications, developed new Return on Security Investment tools, preached that security is a "business enabler", and strived to talk about solutions and not boring old technologies. But we've had mixed success.

Once when I worked as a principal consultant for a large security services provider, a new sales VP came in to the company with a fresh approach. She was convinced that the customer conversation had to switch from technical security to something more meaningful to the wider business: Risk Management. For several months after that I joined call after call with our sales teams, all to no avail. We weren't improving our lead conversions; in fact with banks we seemed to be going backwards. And then it dawned on me: there isn't much anyone can tell bankers about risk they don't already know.

Joining the worlds of security and business is easier said than done. So what is the best way for security line managers to engage with their peers? How can they truly contribute to new business instead of being limited to protecting old business? In a new investigation I've done at Constellation Research I've been looking at how classical security analysis skills and tools can be leveraged for strategic information management.

Remember that the classical frame for managing security is "Confidentiality-Integrity-Availability" or C-I-A. This is how we conventionally look at defending enterprise information assets; threats to security are seen in terms of how critical data may be exposed to unauthorised access, or lost, damaged or stolen, or otherwise made inaccessible to legitimate users. The stock-in-trade for the Chief Information Security Officer (CISO) is the enterprise information asset register and the continuous exercise of Threat & Risk Assessment around those assets.

I suggest that this way of looking at assets can be extended, shifting from a defensive mindset to a strategic, forward outlook. When the CISO has developed a bird's-eye view of their organisation's information assets, they are ideally positioned to map the value of those assets more completely. What is it that makes information valuable, exactly? It depends on the business - and security professionals are very good at looking at context. So for example, in financial services or technology, companies can compete on the basis of their original research, so it's the lead time to discovery that sets them apart. On the other hand, in healthcare and retail, the completeness of customer records is a critical differentiator, because it allows better-quality relationships to be created. And when dealing with sensitive personal information, as in the travel and hospitality industry, the consent and permissions attached to data records determine how they may be leveraged for new business. These are the sorts of things that make different data valuable in different contexts.

CISOs are trained to look at data through different prisms and to assess data in different dimensions. I've found that CISOs are therefore ideally qualified to bring a fresh approach to building the value of enterprise information assets. They can take a more pro-active role in information management, and carve out a new strategic place for themselves in the C-suite.

My new report, "Strategic Opportunities for the CISO", is available now.

Posted in Big Data, Constellation Research, Management theory