Watson the Doctor is no laughing matter

For the past year, oncologists at the Memorial Sloan Kettering Cancer Center in New York have been training IBM’s Watson – the artificial intelligence tour de force that beat all comers on Jeopardy – to help personalise cancer care. The Center explains that “combining [their] expertise with the analytical speed of IBM Watson, the tool has the potential to transform how doctors provide individualized cancer treatment plans and to help improve patient outcomes”. Others are already speculating that Watson could “soon be the best doctor in the world”.

I have no doubt that when Watson and things like it are available online to doctors worldwide, we will see overall improvements in healthcare outcomes, especially in parts of the world now under-serviced by medical specialists [having said that, the value of diagnosing cancer in poor developing nations is questionable if patients cannot then be treated]. As with Google’s self-driving car, we will probably get significant gains eventually, averaged across the population, from replacing humans with machines. Yet some of the foibles of computing are not well known, and I think they will lead to surprises.

For all the wondrous gains made in Artificial Intelligence, of which Watson is now the state of the art, AI remains algorithmic, and as such it has inherent limitations that don’t get enough attention. Computer scientists and mathematicians have known for generations that some surprisingly straightforward problems resist algorithmic solution. The classic case is the Halting Problem, which is provably undecidable: no universal step-by-step codified procedure can determine, for every program and input, whether that program will eventually stop. (The Travelling Salesperson Problem, often mentioned in the same breath, is decidable but believed intractable: no known algorithm solves large instances efficiently.) If such simply stated challenges defeat algorithms, then surely we need to be more sober in our expectations of computerised intelligence.
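To make the Halting Problem concrete, here is a minimal Python sketch of the classic diagonalisation argument. The `halts` oracle is hypothetical, which is the whole point: no real implementation of it can exist.

```python
def halts(program, data):
    """Hypothetical oracle deciding whether program(data) ever stops.
    The Halting Problem theorem says no such function can exist for
    all programs and inputs; this stub only marks the assumption."""
    raise NotImplementedError("provably impossible in general")

def paradox(program):
    # Do the opposite of whatever the oracle predicts about
    # running `program` on its own source.
    if halts(program, program):
        while True:   # oracle says "halts", so loop forever
            pass
    # oracle says "loops forever", so halt immediately

# Feeding paradox to itself is the contradiction: if paradox(paradox)
# halts, the oracle said it halts, so it loops; if it loops, the oracle
# said it loops, so it halts. Hence no correct `halts` can be written.
```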

A key limitation of any programmed algorithm is that it must make its decisions using a fixed set of inputs that are known and fully characterised (by the programmer) at design time. If you spring an unexpected input on any computer, it can fail, and yet that’s what life is all about — surprises.
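A toy illustration (entirely hypothetical, and nothing to do with Watson’s actual internals) shows the shape of the problem: a rule-based decision function can only ever act on the inputs its programmer enumerated in advance.

```python
def triage(symptom: str) -> str:
    """Map a reported symptom to an action using rules fixed at design time."""
    rules = {
        "chest pain": "urgent cardiology referral",
        "fever": "standard infection workup",
        "headache": "neurological screen",
    }
    # Anything the programmer never anticipated falls straight through.
    return rules.get(symptom, "UNHANDLED: no decision possible")

print(triage("chest pain"))              # designed-for input: sensible output
print(triage("a vague sense of dread"))  # surprise input: the system fails
```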

Now, I don’t think what humans do is magic; I do believe we are computers made out of meat. Nevertheless, when fundamental limits like the Halting Problem abound, we can be sure that computing and cognition are not what they seem. We should hope these conundrums are better understood before we put too much faith in computers doing deep human work.

And yet, predictably, futurists are jumping ahead to imagine “Watson apps” in which patients access the supercomputer for themselves. Even if there were reliable algorithms for doctoring, I reckon the “Watson app” would be a much bigger step than winning at Jeopardy, because of the complex way another person’s condition is assessed and the data gathered for diagnosis.

Consider the taking of a medical history. In these days of billion-dollar investments in electronic health records (EHRs), we tend to think that medical decisions are all about the data. When politicians announce EHR programs they often boast that patients won’t have to go through the rigmarole of giving their history over and over again to multiple doctors as they move through a healthcare journey. This actually reflects a serious misunderstanding of how much clinical decision-making depends on the interaction between medico and patient while the history is taken. It’s subtle. The things a patient chooses to tell, the way they tell those things, the things they seem reticent about, and the questions that make them anxious all guide an experienced medico when taking a history, and provide extra cues — metadata if you will — about the patient’s condition.

Now, Watson may well develop the ability to navigate this complexity and conduct a very sophisticated Q&A. It will certainly have a vastly bigger and more reliable memory of cases than any doctor, and with that it could steer a dynamic patient questionnaire.

But will Watson be good enough to be made available directly to patients through an app, with no expert human mediation? Or will a host of new input errors result from patients typing their answers into a smartphone or speaking into a microphone, without any face-to-face subtlety (let alone human warmth)? It was true of mainframes and it’s just as true of the best AI: bullshit in, bullshit out.

Finally, Watson’s existing linguistic limitations are not to be underestimated. It is surely not trivial that Watson struggles with puns and humour.

Futurist Mark Pesce, discussing Watson, remarked in passing that scientists don’t understand the “quirks of language and intelligence” that create humour. The question of what makes us laugh in fact occupies some of the finest minds in cognitive and social science. So we are a long way from being able to mechanise humour. And this matters because, for the foreseeable future, it puts a great deal of social intercourse beyond AI’s reach.

In between the extremes of laugh-out-loud comedy and a doctor’s dry written notes lies a spectrum of expressive subtleties, like a blush, an uncomfortable laugh, and the humiliation that goes with some patients’ lived experience of illness. Even if Watson understands the English language, does it understand people?

Watson can answer questions, but good doctors ask a lot of questions too. When will this amazing computer be able to hold the sort of two-way conversation that we would call a decent “bedside manner”?