The birthday paradox and biometrics

The inventor of forensic DNA testing, Dr Alec Jeffreys, has cautioned that once millions of DNA samples are collected in population databases, false matches rise significantly.

DNA testing is not an infallible proof of identity. While Jeffreys’ original technique compared scores of markers to create an individual “fingerprint,” modern commercial DNA profiling compares a number of genetic markers – often 5 or 10 – to calculate a likelihood that the sample belongs to a given individual.

Jeffreys estimates the probability of two individuals’ DNA profiles matching in the most commonly used tests at between one in a billion and one in a trillion, “which sounds very good indeed until you start thinking about large DNA databases.” In a database of 2.5 million people, a one-in-a-billion probability per comparison becomes roughly a one-in-400 chance that a single sample searched against the database produces at least one false match.
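
The one-in-400 figure can be checked with a few lines of Python. This is only a sketch: it assumes every comparison is independent and carries the same one-in-a-billion false match probability, which real DNA profiles (relatives, for instance) will not strictly satisfy. The function name is my own, chosen for illustration.

```python
import math

def prob_at_least_one_false_match(p_single, n_profiles):
    """Chance that one sample falsely matches at least one of n_profiles.

    Assumes (for illustration) independent comparisons, each with the
    same false match probability p_single.
    Computes 1 - (1 - p_single) ** n_profiles in a numerically stable way.
    """
    return -math.expm1(n_profiles * math.log1p(-p_single))

# Figures quoted in the text: one in a billion, 2.5 million profiles.
p = prob_at_least_one_false_match(1e-9, 2_500_000)
print(f"probability = {p:.6f}, or about 1 in {1 / p:.1f}")
```

For probabilities this small the answer is very close to simply multiplying the per-comparison probability by the database size, which is why the odds shorten so quickly as databases grow.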

Dr Jeffreys is alluding to the Birthday Paradox: the chance that some pair of people in a group matches on a random trait rises dramatically, and counter-intuitively, as the group grows. At a gathering of just 25 people, the chances are better than 50:50 that a pair of people in the group will have the same birthday. The reason is that the number of possible pairs grows much faster than the number of people; 25 people form 300 distinct pairs. The implication for forensic databases is that it’s highly likely that somewhere in the set there will be pairs of different people whose biometric data happen to fall within the tolerance of the matching algorithm. In other words, the matching software will confuse them.

The designers of driver licence and immigration databases need to put protocols in place that double-check automatic matches so as to avoid impugning innocent people. By and large, the protocols I have seen in practice work well, but these practicalities are glossed over by biometrics vendors who continue to over-hype their technologies.
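
For readers who want to see the Birthday Paradox in numbers, here is a rough sketch. The birthday calculation assumes 365 equally likely birthdays and no leap years. The second function applies the same pair-counting idea to a database, under my own simplifying assumption that every pair of profiles has the same independent false match probability; the function names are hypothetical and the result is an illustration, not a forensic estimate.

```python
import math

def prob_shared_birthday(n_people, days=365):
    """Chance that at least two of n_people share a birthday.

    Assumes birthdays are independent and uniform over `days`.
    """
    p_all_distinct = 1.0
    for k in range(n_people):
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct

def expected_matching_pairs(n_profiles, p_pair):
    """Expected number of coincidentally matching pairs in a database.

    Assumes (for illustration) that every pair matches with the same
    probability p_pair; this is the number of pairs times p_pair.
    """
    return math.comb(n_profiles, 2) * p_pair

print(f"25 people: {prob_shared_birthday(25):.3f}")
print(f"expected matching pairs: {expected_matching_pairs(2_500_000, 1e-9):.0f}")
```

With the figures quoted earlier (2.5 million profiles, one in a billion per pair), the expected number of coincidental pairs runs into the thousands under these assumptions, even though any one comparison is overwhelmingly unlikely to match.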

In the context of population databases, we see once again why the adjective “unique” is a wrong and misleading way of describing biometrics. No biometric trait has a zero probability of a false match, so none of them can be described as “unique”. And even highly distinctive traits like DNA can lead to surprisingly frequent false matches in large databases.

So it bears repeating: biometrics don’t work as well as science fiction movies suggest.