Update September 2012
The recent discovery that junk DNA is not actually junk rather reinforces my long-standing thesis, espoused below, that we don't know enough about how genes work to be able to validate genetic engineering artifacts by empirical testing alone. I point out that computer programs are only validated by a coordinated mixture of testing, code inspection and theory, all of which are based on knowing how the code works at the instruction level. But we don't have a terribly complete picture of how genes interact. We always knew they were 'massively parallel', and now it turns out that junk DNA has some sort of role in gene expression across the whole of the genome, raising the combinatorial complexity even further. This tells me we have little idea how modifications at one point in the genome can impact the functioning at any number of other points (but it hints at an explanation as to why human beings are so much more complex than nematodes despite having only a modestly larger genome).
And now there is news that a cow in New Zealand, genetically engineered in respect of one allergenic protein, was born with no tail. It's too early to blame the GM for this oddity, but equally, the junk DNA finding surely undermines the confidence that any genetic engineer can have in predicting that their changes cannot have had unexpected and really unpredictable side effects.
Original post, 15 Jan 2011
Years ago, as a software engineer, I developed a deep unease about genetic engineering and genetically modified organisms (GM). The software experience suggests to me that GM products cannot be verified given the state of our knowledge about how genes work. I’d like to share my thoughts.
Genetic engineering proponents seem to believe the entire proof of a GM pudding is in the eating. That is, if trials show that GM food is not toxic, then it must be safe, and there isn't anything else to worry about. The lesson I want others to draw from the still new discipline of software engineering is that there is more to the verification of correctness in complex programs than testing the end product.
Recently I’ve come across an Australian government-sponsored FAQ, "Arguments for and against gene technology" (May 2010), that supposedly provides a balanced view of both sides of the GM debate. Yet it sweeps important questions under the rug. [At one point the paper invites readers to think about whether agriculture is natural. It’s a highly loaded question grounded in the soothing proposition that GM is simply an extension of the age-old artificial selection that gave us wheat, Merinos and all those different potatoes. The question glosses over the fact that when genes recombine under normal sexual reproduction, cellular mechanisms constrain where each gene can end up, and most mutations are stillborn. GM is not constrained; it jumps levels. It is quite unlike any breeding that has gone before.]
Genes are very frequently compared with computer software, for good reason. I urge that the comparison be examined more closely, so that lessons can be drawn from the long standing “Software Crisis”.
Each gene codes for a specific protein. That much we know. Less clear is how relatively few genes -- 20,000 for a nematode; 25,000 for a human being -- can specify an entire complex organism. Science is a long way from properly understanding how genes specify bodies, but it is clear that each genome is an immensely intricate ensemble of interconnected biochemical short stories. We know that genes interact with each other, turning each other on and off, and more subtly influencing how each is expressed. In software parlance, genetic codes are executed in a massively parallel manner. This combinatorial complexity is probably why I can share fully half of my genes with a turnip, and have an “executable file” in DNA that is only 20% longer than that of a worm, and yet I can be so incredibly different from those organisms.
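To put a rough number on that combinatorial complexity, here is a minimal Python sketch. It assumes, purely for illustration, that each gene is either fully on or fully off and that every combination is possible; real gene expression is far subtler than that. The function name and constants are my own, using the gene counts quoted above.

```python
import math

# Gene counts quoted in the post.
NEMATODE_GENES = 20_000
HUMAN_GENES = 25_000


def digits_in_state_count(gene_count):
    """Number of decimal digits in 2**gene_count.

    Assumption: each gene is a simple on/off switch, so a genome with
    gene_count genes has 2**gene_count possible on/off combinations.
    The digit count is computed with logarithms so the huge number
    itself never has to be printed.
    """
    return math.floor(gene_count * math.log10(2)) + 1


for name, genes in (("nematode", NEMATODE_GENES), ("human", HUMAN_GENES)):
    digits = digits_in_state_count(genes)
    print(f"{name}: 2^{genes} on/off combinations, a number with {digits} digits")

# Every extra gene doubles the number of combinations.
extra_genes = HUMAN_GENES - NEMATODE_GENES
print(f"{extra_genes} extra genes multiply the combinations by 2^{extra_genes}")
```

Under this crude model a modest increase in the number of genes multiplies the space of possible states by an astronomically large factor, which is the sense in which a slightly longer "executable" can describe a vastly more complex organism.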
If genomes are like programs then let’s remember they have been written achingly slowly over eons, to suit the circumstances of a species. Genomes are revised in a real-world laboratory over billions of iterations and test cases, to a level of confidence that software engineers can’t even dream of. Brassica napus.exe (i.e. canola) is at v1000000000.1. Tinkering with isolated parts of this machinery, as if it were merely some sort of wiki with articles open to anyone to edit, could have consequences we are utterly unable to predict.
In software engineering, it is received wisdom that most bugs result from imprudent changes made to existing programs. Furthermore, editing one part of a program can have unpredictable and unbounded impacts on any other part of the code. Above all else, all but the very simplest software is, in practice, impossible to test exhaustively. So mission-critical software (like the implantable defibrillator code I used to work on) is always verified by a combination of methods, including unit testing, system testing, design review and painstaking code inspection. Because most problems come from human error, software excellence demands formal design and development processes, and high-level programming languages, to preclude subtle errors that no amount of testing could ever hope to find.
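A toy example may make the testing problem concrete. The sketch below is entirely hypothetical: a "program" whose output depends on eight on/off switches, with a hidden interaction that only fires when three particular switches are on together. All names and the choice of switches are mine, for illustration only.

```python
from itertools import product

SWITCH_COUNT = 8


def expressed_level(switches):
    """Hypothetical program: the intended output is the number of switches on."""
    level = sum(switches)
    # Hidden interaction (the "bug"): it only shows up when
    # switches 1, 4 and 6 all happen to be on at the same time.
    if switches[1] and switches[4] and switches[6]:
        level = -1
    return level


def one_at_a_time_tests_pass(n):
    """Turn each switch on in isolation and check the output is 1."""
    for i in range(n):
        case = tuple(1 if j == i else 0 for j in range(n))
        if expressed_level(case) != 1:
            return False
    return True


def exhaustive_failures(n):
    """Try every one of the 2**n combinations and collect the wrong ones."""
    return [case for case in product((0, 1), repeat=n)
            if expressed_level(case) != sum(case)]


print("one-at-a-time tests pass:", one_at_a_time_tests_pass(SWITCH_COUNT))
failures = exhaustive_failures(SWITCH_COUNT)
print(f"exhaustive testing finds {len(failures)} bad cases out of {2 ** SWITCH_COUNT}")
```

Testing each switch on its own passes cleanly, yet the program is wrong for every combination that triggers the interaction. Exhaustive testing catches it here only because eight switches give just 256 cases; each added switch doubles the cost, which is why inspection and design discipline have to carry part of the load.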
How many of these software quality mechanisms are available to genetic engineers? Code inspection is moot when we don’t even know how genes normally interact with one another; how can we possibly tell by inspection if an artificial gene will interfere with the “legacy” code?
What about the engineering process? It seems to me that GM is akin to assembly programming circa the 1960s. The state of the art in genetic engineering is nowhere near even Fortran, let alone modern object-oriented languages.
Can today’s genetic engineers demonstrate a rigorous verification regime, given the reality that complex software programs are inherently untestable?
We should pay much closer attention to the genes-as-software analogy. Some fear GM products because they are unnatural; others because they are dominated by big business and a mad rush to market. I simply say let’s slow down until we’re sure we know what we're doing.
Maybe genetic engineering experts (I admit I am not one) could comment on ways that GM products might be made safe-by-design, to provide deeper assurance of safety than mere testing provides.
I read about GM bananas in The New Yorker recently. Researchers assure us that because bananas are sterile, even if engineering does introduce a fault into the genome, it won't be able to leave the plant and get into any progeny. But that argument assumes that the entire plant is still behaving normally. Because every gene potentially touches every other gene, who's to say that the GM organism is still perfectly predictable?
Again, I am not being paranoid here; I'm just saying that the job of verifying software is really tough, and the software profession has learned the hard way to be very careful with the assumptions it makes about program correctness and how to verify it.
Point taken about the lack of assurance that GMOs are safe, since we still don't fully understand how any genome works.
But what is "so incredibly different" between you and a worm??!!!?? You both have social lives, sex lives, private lives, toileting requirements, health considerations, dietary needs, sexual reproduction, jobs, emotional responses, energy levels. I suppose worms don't write blogs, drive Subarus, shop at Woolworths, pray to Jesus, speak English with an accent, or listen to One Direction, but those are all recent developments in evolutionary terms. None of them mean anything more than a peacock's feathers mean to a peacock, in the context of the genome.
And the turnip... 50% similar? That sounds about right. You share a common ancestor. You share millions of years of grandparents. Turnips don't dance or write with ballpoint pens, but they have families and they have aspirations. They have the will to power. Compared to a piece of granite, a turnip is an amazing thing.
Certainly One Direction is all peacock feathers; their appeal is a case study in sexual selection.
But from a software developer's perspective, are you not surprised that with just 20% extra lines of code, the nematode was upgraded to be able to speak, write, drive, pray, shop and bop?
If the explanation lies in the newly discovered switching role of junk DNA, and if genes are switched on and off by bits of code spread across the genome, then I don't know how genetic engineers are able to predict the effects of gene splicing. And predict they must. My thesis is that, like software, black box testing of GMOs cannot be sufficient; we need to also perform the equivalent of code inspection, yet the fundamentals of the programming language are not yet understood.
Who's to say that arbitrary changes to a turnip's genome might not change its will to power into something more? Turnip the volume!