Lockstep

Mobile: +61 (0) 414 488 851
Email: swilson@lockstep.com.au

The birthday paradox and biometrics

The inventor of forensic DNA testing, Dr Alec Jeffreys, has cautioned that once millions of DNA samples are collected in population databases, false matches rise significantly.

DNA testing is not an infallible proof of identity. While Jeffreys' original technique compared scores of markers to create an individual "fingerprint," modern commercial DNA profiling compares a number of genetic markers - often 5 or 10 - to calculate a likelihood that the sample belongs to a given individual.

Jeffreys estimates the probability of two individuals' DNA profiles matching in the most commonly used tests at between one in a billion or one in a trillion, "which sounds very good indeed until you start thinking about large DNA databases." In a database of 2.5 million people, a one-in-a-billion probability becomes a one-in-400 chance of at least one match.

[Ref: DNA Fingerprint Privacy Concerns Jill Lawless, CBS News.]

Dr Jeffreys is alluding to the Birthday Paradox, where the chance of any pair of people being matched on a random trait rises dramatically and counter-intuitively in groups of people. At a gathering of just 25 people, the chances are better than 50:50 that a pair of people in the group will have the same birthday. The implication for forensic databases is that it's highly likely that somewhere in the set, there will be pairs of different people that happen to have biometric data that fall within the tolerance of the matching algorithm. In other words, the matching software will confuse them. The designers of driver licence and immigration databases need to put protocols in place that double-check automatic matches so as to avoid impugning innocent people. By-and-large, the protocols I have seen in practice work well, but these practicalities are glossed over by biometrics vendors who continue to over-hype their technologies.

In the context of population databases, we see once again why the adjective "unique" is a wrong and misleading way of describing biometrics. No biometric trait has a zero probability of a false match, so none of them can be described as "unique". And even the highly distinctive traits like DNA can lead to surprisingly frequent false detects in large databases.

So it bears repeating, biometrics don't work as well as suggested by science fiction movies.

Posted in Science, Biometrics

Science is more than the books it produces

These days it’s common to hear the modest disclaimer that there are some questions science can’t answer. I most recently came across such a show of humility by Dr John Kirk speaking on ABC Radio National’s Ockham’s Razor [1]. Kirk says that “science cannot adjudicate between theism and atheism” and insists that science cannot bridge the divide between physics and metaphysics. Yet surely the long history of science shows that divide is not hard and fast.

Science is not merely about the particular answers; it’s about the steady campaign on all that is knowable.

Science demystifies. Way before having all the detailed answers, each fresh scientific wave works to banish the mysterious, that which previously lay beyond human comprehension.

Textbook examples are legion where new sciences have rendered previously fearsome phenomena as firstly explicable and then often manageable: astronomy, physiology, meteorology, sedimentology, seismology, microbiology, psychology and neurology, to name a few.

It's sometimes said that in science, the questions matter more than the answers. Good scientists find a way to ask good questions. Great scientists show where there is no question anymore.

Once something profound is no longer beyond understanding, that state of affairs permeates society. Each wave of scientific advance is usually signalled by beneficial new technologies, but more importantly, deep down, what science does for the human condition is it imparts confidence. In an enlightened society, those with no scientific training still appreciate that science gets how the world itself works. And over time this vital communal confidence has supplanted astrologers, shamans, witch doctors, and even the churches. Laypeople may not know how televisions work, nor nuclear medicine, semiconductors, anaesthetics, antibiotics or fibre optics, but they sure know it’s not by magic.

The arc of science ever parts mystery’s curtain. Contrary to Dr Kirk's partitions, science frequently renders the metaphysical as natural and empirically knowable. My favorite example: To the pre-Copernican mind, the Sun was perfect and ethereal, but when Galileo trained his new telescope upon it, he saw spots. These imperfections were shocking enough, but the real paradigm shift came when Galileo observed the sunspots to move across the face, disappear and then return hours later on the other limb. Thus the Sun was shown―in what must have truly been a heart-stopping epiphany―to be a sphere turning on its axis: geometric, humble, altogether of this world, and very reasonably the centre of a solar system as Copernicus had reasoned a few decades earlier. This was science exercising its most profound power, titrating the metaphysical.

An even more dramatic turn was Darwin's discovery that all the world’s living complexity was explicable without god. He thus dispelled teleology (the search for ultimate reason). He not only neutralised the Argument from Design for the existence of god, but also the very need for god. The deepest lesson of Darwinism is that there is simply no need to ask "What am I doing here?" because the wonderous complexity of all of biology, including humanity's own existence are seen to have arisen through natural selection, without a designer, and moreover, without a reason. Darwin himself felt keenly the gravity of this outcome and what it would mean to his deeply religious wife, and for that reason he kept his work secret for so long. It seems philosophers appreciate the deep lessons of Darwinism more than our modest scientists: Karl Marx saw that evolution “deals the death-blow to teleology” and Frederich Nietzsche claimed “God is dead ... we have killed him”.

So why shouldn’t we expect science to continue? Why should we doubt―or perhaps fear―its power to remove all mystery? Of course many remaining riddles are very hard indeed, and I know there’s no guarantee science will be able to solve them. But I don't see the logic of rejecting the possibility that it will. Some physicists feel they’re homing in why the physical constants should have their special values. And many cognitive scientists and philosophers of the mind suspect a theory of consciousness is within reach. I’m not saying anyone yet gets it, but surely most would agree that consciousness just doesn’t feel like a total enigma anymore.

Science is more than the books it produces; it’s the optimism we will keep writing new ones.

References

[1]. “Why is science such a worry?” Ockham's Razor 18 December 2011 http://www.abc.net.au/radionational/programs/ockhamsrazor/ockham27s-razor-18-december-2011/3725968

Posted in Science, Culture

An improved frame for understanding Digital Identity

I’ve always been uneasy about the term “ecosystem” being coopted when what is really meant is “marketplace”. There’s nothing wrong with marketplaces! They are where we technologists study problems and customer needs, launch our solutions, and jockey for share. The suddenly-orthodox “Identity Ecosystem” as expressed in the NSTIC is an elaborate IT architecture, that defines specific roles for users, identity providers, relying parties and other players. And the proposed system is not yet complete; many commentators have anticipated that NSTIC may necessitate new legislation to allocate liability.

So it’s really not an "ecosystem”. True ecosystems evolve; they are not designed. If NSTIC isn’t even complete, then badging it an “ecosystem” seems more marketing than ecology; it’s an effort to raise NSTIC as a policy initiative above the hurly burly of competitive IT.

My unease about “ecosystem” got me thinking about ecology, and whether genuine ecological thinking could be useful to analyse the digital identity environment. I believe there is potential here for a new and more powerful way to frame the identity problem.

We are surrounded by mature business ecosystems, in which different sectors and communities-of-interest, large and small, have established their own specific arrangements for managing risk. A special part of risk management is the way in which customers, users, members or business partners are identified. There is almost always a formal protocol by which an individual joins one of these communities-of-interest; that is, when they join a company, qualify for a professional qualification, or open a credit account. Some of these registration protocols are set freely by employers, merchants, associations and the like; others have a legislated element, in regulated industries like aviation, healthcare and finance; and in some cases registration is almost trivial, as when you sign up to a blog site. The conventions, rules, professional charters, contracts, laws and regulations that govern how people do business in different contexts are types of memes. They have evolved in different contexts to minimise risk, and have literally been passed on from one generation to another.

As business environments change, risk management rules in response change too. And so registration processes are subject to natural selection. An ecological treatment of identity recognises that “selection pressures” act on those rules. For instance, to deal with increasing money laundering and terrorist financing, many prudential regulators have tightened the requirements for account opening. To deal with ID theft and account takeover, banks have augmented their account numbers with Two Factor Authentication. The US government’s PIV-I rules for employees and contractors were a response to Homeland Security Presidential Directive HSPD-12. Cell phone operators and airlines likewise now require extra proof of ID. Medical malpractice in various places has led hospitals to tighten their background checks on new staff.

It's natural and valuable, up to a point, to describe "identities" being provided to people acting in these different contexts. This abstraction is central to the Laws of Identity. Unfortunately the word identity is suggestive of a sort of magic property that can be taken out of one context and used in another. So despite the careful framing of the Laws of Identity, it seems that people still carry around a somewhat utopian idea of digital identity, and a tacit belief in the possibility of a universal digital passport (or at least a greatly reduced number of personal IDs). I have argued elsewhere that the passport is actually implausible. If I am right about that, then what is sorely needed next is a better frame for understanding digital identity, a perspective that helps people stay away of the dangerous temptation of a single passport, and to understand that a plurality of identities is the natural state of being.

All modern identity thinking recognises that digital identities are context dependent. Yet it seems that federated identity projects have repeatedly underestimated the strength of that dependence. The federated identity movement is based on an optimism that we can change context for the IDs we have now, and still preserve some recognisable and reusable core identity. Or alternatively, create a new smaller set of IDs that will be useful for transacting with a superset of services. Such “interoperability” has only been demonstrated to date in near-trivial use cases like logging onto blog sites with unverified OpenIDs, or Facebook or Twitter handles. More sophisticated re-use of identities across context – such as the Australian banking sector’s ill-fated Trust Centre project – have foundered, even when there is pre-existing common ground in identification protocols.

The greatest challenge in federated identity is getting service and identity providers that are accustomed to operating in their own silos, to accept risks arising from identification and/or authentication performed on/by their members in other silos. This is where the term “identity” can be distracting. It is important to remember that the “identities” issued by banks, government agencies, universities, cell phone companies, merchants, social networks and blog sites are really proxies for the arrangements to which members have signed up.

This is why identity is so very context dependent, and so why some identities are so tricky to federate.

If we think ecologically, then a better word for the “context” that an identity operates in may be niche. This term properly evokes the tight evolved fit between an identity and the setting in which it is meaningful. In most cases, if we want to understand the natural history of identities, we can look to the existing business ecosystem from where they came. The environmental conditions that shaped the particular identities issued by banks, credit card companies, employers, governments, professional bodies etc. are not fundamentally changed by the Internet. As such, we should expect that when these identities transition from real world to digital, their properties – especially their “interoperability” and liability arrangements -- cannot change a great deal. It is only the pure cyber identities like blogger names, OSN handles and gaming avatars that are highly malleable, because their environmental niches are not so specific.

As an aside, noting how they spread far and wide, and too quickly for us to predict the impacts, maybe it's accurate to liken OpenID and Facebook Connect to weeds!

A lot more work needs to be done on this ecological frame, but I’m thinking we need a new name, distinct from "federated" identity. It seems to me that most business, whether it be online or off, turns on Evolved Identities. If we appreciate the natural history of identities as having evolved, then we should have more success with digital identity. It will become clearer which identities can federate easily, and which cannot, because of their roots in real world business ecosystems.

Taking a digital identity (like a cell phone account) out of its natural niche and hoping it will interoperate in another niche (like banking) can be compared to taking a tropical salt water fish and dropping it into a fresh water tank. If NSTIC is an ecosystem, it is artificial. As such it may be as fragile as an exotic botanic garden or tropical aquarium. I fear that full blown federated identity systems are going to need constant care and intervention to save them from collapse.

Posted in Science, Language, Identity, Federated Identity, Culture, Security

What's the point of astronomy?

Astronomy is the archetypal 'royal science'. It can seem pointless, especially when the projects like the Hubble Space Telescope cost a billion dollars or more. Some justify astronomy on the grounds of the spinoffs, like better radio antenna technology. Some say our destiny is to emigrate from the planet, so we'd better start somewhere. Others simply assert unapologetically that peering into the heavens is what homo sapiens does, at any cost.

Here's a different rationale. To my mind (as an ex astronomer) the deepest practical value of astronomy is it enables us to perform experiments that are just too grand to ever be done on earth. The limits of earth-bound experiments of course change over time, but during each era, astronomy has furnished answers to questions that elude terrestrial investigation.

For example, astronomers were the first to:

– measure the diameter of the earth (by observing the angle of the sun in the sky at noon at different latitudes, in ancient Greece and Egypt)

– measure the speed of light (by observing the transit of the moons of Jupiter)

– demonstrate the General Theory of Relativity (by observing the precession of the orbit of Mercury, and measuring during a solar eclipse the bending of star light by the Sun's gravity )

– detect gravity waves (by observing the slowing of a binary millisecond pulsar).

There must be other examples.

And here's another thing. Obviously it was a vital that humankind debunk the early mysticism about the perfection and mystery of the Sun, and the misconception that the Earth is the centre of the universe. It was when we philosophically grew up and left home. The Coppernican revolution was much more than star gazing.

So ... one of my all-time favorite gooose-bump moments in the history of science is Galileo observing sunspots for the first time, by projecting the Sun through his telecope onto a white sheet. Until then, they all thought the Sun was a perfect unchanging disc of light. But the sunspots represented blemishes and nullified that perfection! Yet more importantly, Galileo saw that the spots moved across the face of the disc over a few hours, disappeared and came back later back at the start. Gadzooks! The sun turns out to be a turning ball!

It was a critical moment of unification, evidence contributing to the realisation that all of the things and all of the stuff in the universe are fundamentally the same.

It's not the only time an intellectual watershed -- a radical, instant, disruptive re-framing of our place in the world -- has been delivered by astronomy.

For example -- and it wasn't really so long ago -- in the early 20th century, Edwin Hubble and his peers established that the 'nebulae' were actually galaxies just like the Milky Way, and that the universe therefore had to be tens of millions of times bigger than previously thought. This kind of revelation about our place in the scheme of things only comes from astronomy. And so astronomy is therefore a very practical pursuit after all.

Posted in Science, Culture

We're not ready for genetic engineering

As a software engineer years ago I developed a deep unease about genetic engineering and genetically modified organisms (GM). The software experience suggests to me that GM products cannot be verifiable given the state of our knowledge about how genes work. I’d like to share my thoughts.

Personally, I'm not even slightly queasy about eating the stuff; after all, I smoked when I was young and I still drink too much sometimes, so I'm not inclined to whinge non-specifically about poisons entering my body. I'm sure GM foods are not going to hurt me. Yet some GM proponents would have us believe that the entire proof of this pudding is in the eating. That is, if trials show that GM food is not toxic, then it must be safe, and there isn't anything else to worry about. The lesson I want others to draw from the still new discipline of software engineering is that verification of correctness in complex systems is much much harder than tesing the end product.

Recently I’ve come across an Australian government-sponsored FAQ Arguments for and against gene technology (May 2010) that supposedly provides a balanced view of both sides of the GM debate. Yet it sweeps important questions under the rug.

[At one point the paper invites readers to think about whether agriculture is natural. It’s a soothing question grounded in the cool proposition that GM is simply an extension of the age old artificial selection that gave us wheat, Merinos and all those different potatoes. Yet the question glosses over the fact that when genes recombine under normal sexual reproduction, cellular mechanisms constrain where each gene can end up, and most mutations are still-born. GM is not constrained; it jumps levels. It is quite unlike any breeding that has gone before.]

Genes are very frequently compared with computer software, for good reason. I urge that the comparison be examined more closely, so that lessons can be drawn from the long standing “Software Crisis”.

Each gene codes for a specific protein. That much we know. Less clear is how relatively few genes -- 20,000 for a nematode; 25,000 for a human being -- can specify an entire complex organism. Science is a long way from properly understanding how genes specify bodies, but it is clear that each genome is an immensely intricate ensemble of interconnected biochemical short stories. We know that genes interact with each other, turning each other on and off, and more subtly influencing how each is expressed. In software parlance, genetic codes are executed in a massively parallel manner. This combinatorial complexity is probably why I can share fully half of my genes with a turnip, and have an “executable file” in DNA that is only 20% longer than that of a worm, and yet I can be so incredibly different from those organisms.

If genomes are like programs then let’s remember they have been written achingly slowly over eons, to suit the circumstances of a species. Genomes are revised in a real world laboratory over billions of iterations and test cases, to a level of confidence that software engineers can’t even dream of. Brassica napus.exe (i.e. canola) is at v1000000000.1. Tinkering with isolated parts of this machinery, as if it were merely some sort of wiki with articles open to anyone to edit, could have consequences we are utterly unable to predict.

In software engineering, it is received wisdom that most bugs result from imprudent changes made to existing programs. Furthermore, editing one part of a program can have unpredictable and unbounded impacts on any other part of the code. Above all else, all but the very simplest software in practice is untestable. So mission critical software (like the implantable defibrillator code I used to work on) is always verified by a combination of methods, including unit testing, system testing, design review and painstaking code inspection. Because most problems come from human error, software excellence demands formal design and development processes, and high level programming languages, to preclude subtle errors that no amount of testing could ever hope to find.
How many of these software quality mechanisms are available to genetic engineers? Code inspection is moot when we don’t even know how genes normally interact with one another; how can we possibly tell by inspection if an artificial gene will interfere with the “legacy” code?

What about the engineering process? It seems to me that GM is akin to assembly programming circa 1960s. The state-of-the-art in genetic engineering is nowhere near even Fortran, let alone modern object oriented languages.

Can today’s genetic engineers demonstrate a rigorous verification regime, given the reality that complex software programs are inherently untestable?

We should pay much closer attention to the genes-as-software analogy. Some fear GM products because they are unnatural; others because they are dominated by big business and a mad rush to market. I simply say let’s slow down until we’re sure we know what we're doing.

Posted in Software engineering, Science

There is no algorithm for good management

There is a malaise in security. One problem is that as a "profession", we’ve tried to mechanise security management, as if it were just like generic manufacturing, ammenable to ISO 9000-like management standards. We use essentially the same security processes and policy templates for all businesses. Don’t get me wrong: process is important, and we do want our security responses to be repeatable and uniform. But not robotic. The truth is, there is no algorithm for doing the right thing.

An algorithm is a repeatable set of instructions or recipe that can be followed to automatically perform some task or solve some structured problem. Given the same conditions and the same inputs, an algorithm will always produce the same results. But no algorithm can cope with unexpected inputs or events; an algorithm’s designer needs to have a complete view of all circumstances in advance.

Mathematicians have long known that some surprisingly simple tasks cannot be done algorithmically. The classic ‘travelling salesman’ problem, of how to plot the shortest course through multiple connected towns, has no single recipe for success. There is no way to trisect an angle using a compass and a ruler. There is no consistent way to tell if any given computer program is ever going to stop.

So when security is concerned so much of the time with the unexpected, we should be doubly careful about formulaic management approaches, especially template policies and checklist-based security audits!

Ok, but what's the alternative? This is extremely challenging, but we need to think outside the check box.

Like any complex management field, security is all about problem solving. There’s never going to be a formula for it. Rather, we need to put smart people on the job and let them get on with it, using their experience and their wits. Good security like good design frankly involves a bit of magic. We can foster security excellence through genuine expertise, teamwork, research, innovation and agility. We need security leaders who have the courage to treat new threats and incidents on their merits, trust their professional instincts, try new things, break the mould, and have the sense to avoid management fads.

I have to say I remain pessimistic. These are not good times for couragous managers. For the first rule of career risk management is to make sure everyone agrees in advance to whatever you plan to do, so the blame can be shared when something goes wrong. This is probably the real reason why people are drawn to algorithms in management: they can be documented, reviewed, signed off, and put on back the shelf in wait for a disaster and the inevitable audit. So long as everyone did what the said they were going to do in response to an incident, nobody is to blame.

So I'd like to see a law suit against a company with a perfect ISO 27001 record which still got breached, where the lawyers's case is that it is unreasonable to rely on algorithms to manage in the real world.

Posted in Science, Management theory, Security

Generic verisimilitude

"Generic verisimilitude" is a nice big word! It means the accepted visual language that conveys reality in genre movies. In other words, cinematic cliches.

I've always been bemused by the sideways figure-of-eight black frame that tells us when a movie character is looking through binoculars. Have movie makers ever actually used binoculars? You don't get the sideways "8", instead you just see a nice black circle. But it's not the worst example.

I saw "Rachel getting married" a year or two ago, and thought it was pretty good except for the madly excessive handycam wobble. And I got thinking about that and realised what a terrible artifice it is. Ironically, handicam wobble has become the leading sign of generic verisimilitude in 'gritty' moviemaking, yet the wobble is entirely fictional.

One of the marvels of the human brain is the way it produces a steady image as we move around. We can walk, run, jump up and down even on a trampoline, and our steadfast perception of the world is that it stands still. This complicated feat of cognition is thought to involve feedback mechanisms that allow the brain to compensate for the visual field shifting around on our retinas as the skull moves, sorting out which movements are apparent because we're moving, and which movements are really out there. It's a really vital survival tool; you couldn't chase down a gazelle on the savanna if your cognition was confused by your own mad dashing about.

So, if the world doesn't actually look to me like it shifts when I move, then WTF is the point of a film maker foisting this jerkiness upon us? If I was really in the place of the cinematographer, no matter how much I dance about, I would't see any wobble.

It's odd that in the face of suspension-of-disbelief, when the audience is already putty in their hands, filmmakers inject these falsehoods into the visual language of otherwise hyper-realistic movies.

Posted in Science, Popular culture