If data is the new black (gold), how do we protect the supply?

Oil wells at Huntington Beach

Millions of people, and whole slices of society, are falling foul of bad data.

Data is now a defining utility, as important as clean drinking water and stable electricity, yet most of the information we need every day is sourced, consumed, acted on and passed on without any standards or controls. Apart from academics and a few other professionals, most people are completely on their own when it comes to deciding who to trust and what to believe.

Bad data operates on multiple levels. Consider online fraud: one way or another, most fraudsters make use of erroneous data.

Card Not Present fraud is powered by stolen credit card numbers which can be replayed against unwitting merchants. The latest Authorised Push Payment fraud, now booming in the UK and US, exploits the reduced friction of fast payments platforms and the lack of verification of destination accounts. More elaborate impersonations use rich personal data, often harvested from social media profiles, to fool Knowledge Based Authentication and account opening processes. And entirely synthetic identities can be created by skilful criminals who know the blind spots of financial security systems.

All these frauds take advantage of the difficulty we have detecting digital lies.  

With consumers relying more and more on information, and businesses and governments too switching en masse from physical to digital experiences, we urgently require assurance of the accuracy of all data — be they credit card numbers, current affairs, emails from the boss, SMS messages from the bank, or videos of Tom Cruise.

Instead of using data on an entirely ad hoc basis, we need to treat it as seriously as any critical utility, and protect it accordingly. To deliver reliable data in real time, wherever it’s needed, we need a blend of technologies, services, standards, rules and eventually legislation that all work together.

We need what I’m now calling an “infostructure”.

Infostructure is broader than traditional critical infrastructure such as poles and wires, telecoms and pipelines. It’s a comprehensive system for protecting data itself as a utility. Infostructure is how we are going to organise the cloud in future to safeguard our digitised society, and it’s becoming clear what this will look like.

It’s popular to liken data to crude oil. The comparison is contested by many, but I find it useful on several fronts because it points to regulatory precedents and powerful historical lessons. There’s certainly no implication that data should be like the oil industry. The metaphor points to dangers as much as opportunities.

Contrast Huntington Beach, California, in the early 1920s with how it is today. Before the oil industry was regulated, drillers were laying waste to vast tracts of land, running roughshod over landowners. Now it’s all suburban housing and a clean beach.

Huntington Beach in 2020
Top: An oil field on former farmland near Alabama and Clay streets in Huntington Beach, California, during the oil boom of the early 1920s. This field has since been cleaned up and is now suburban housing. (Photo: J F Hoyer/City of Huntington Beach archives) Here: Suburban housing in Huntington Beach, California, in 2020 with not a single oil well to be seen. (Photo: Mike Fox/Unsplash)

We haven’t got rid of oil today; tragically, we’re digging up more than ever. But we have domesticated it. We regulate the extraction, refining, distribution, consumption and disposal of petroleum products. The rules are especially tight at the consumer end of the supply chain, with rigorous licensing and certifying of gas stations. The community wouldn’t have it any other way. Nevertheless, the petrochemical and automobile industries have still seen a long history of innovation and competition within these tight regulations.

Data supply today is a lot like the oil industry 100 years ago.

The embryonic information economy is a free-for-all. Stupendous wealth is being generated by a powerful new class of technocrats. These new magnates exploit the apparently free and endless availability of raw data, harvested through the services they roll out “for free”.

But now the reality of Surveillance Capitalism is widely accepted. Huge asymmetries are emerging between consumers who, unwittingly for the most part, provide endless raw data to search engines, social networks, messaging and media services, and highly skilled technologists who monetise the higher-order insights they extract using proprietary and often opaque algorithms.

Covert surveillance is only the tip of this iceberg. The underlying analytics are what is creating the richest billionaires the world has ever seen.

The oil rush triggered societal changes and new rules to deal with harms that until then had never been contemplated. It even led to new jurisprudence in land rights, and new asset classes such as exploration licences. Similarly, today’s digital free-for-all won’t last forever. We should expect that data will be subject to brand new types of laws, despite it being the most intangible of assets.

A practical infostructure is starting to take shape, built in part around exciting new models of managed cryptography and Data Protection as a Service, but these are stories for another time.