A software engineer's memoir (work in progress)

I’m an ex-software “engineer” [I have reservations about that term] with some life experience of ultra-high-reliability development practices. It’s fascinating how much of what I learned about software quality in the 1980s and 90s is relevant to infosec today.

I’ve had a trip down memory lane triggered by Karen Sandler’s presentation at LinuxConf12 in Ballarat http://t.co/xvUkkaGl and her paper “Killed by code”.

The software in implantable defibrillators

I’m still working my way through Karen Sandler’s materials, so this post is a work in progress.

What’s really stark on first viewing of Karen’s talk is the culture she experienced and how it differs from the implantable defib industry I knew in its beginnings 25 years ago.

Karen had an incredibly hard and very off-putting time getting the company that made her defib to explain their software. But when we started in this field, every single person in the company (and many of our doctors) would have been able to answer the question “What software does this defib run on?” The answer was “ours”. Moreover, the FDA were highly aware of software quality issues. The whole medical device industry was still on edge from the notorious Therac-25 episode, a watershed in software verification.

A personal story

I was part of the team that wrote the code for the world’s first software-controlled implantable cardioverter/defibrillator (ICD).

In 1990 Telectronics (a tragic legend of Australian technology) released the model 4210, which was just the fourth or fifth ICD on the market (the first few being hard-wired devices from CPI Inc. and Telectronics). The computing technology was severely restricted by several design factors, most especially ultra-low power consumption and the very limited number of microprocessor vendors that would warrant their chips for use in medical devices. The 4210 defib used a semi-customised 8-bit micro-controller based on the 6502, and a 32 KB byte-organised SRAM chip that held the entire executable. The micro clocked at 128 kHz, fully eight times slower than the near-identical micro in the Apple II a decade earlier. The software had to be efficient, not only to ensure it could make some very tough real-time rendezvous, but also to keep the power consumption down; the micro consumed about 30% of the device’s power over its nominal five-year lifetime.

Software development

We wrote mostly in C, with some assembly coding for the kernel and some performance-sensitive routines. The kernel was of our own design, multi-tasking, with hard real-time performance requirements (in particular, for obvious reasons, the system had to respond within tight specs to heartbeat interrupts, and we had to show we weren’t ever going to miss an interrupt!). We also wrote the C compiler.
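To give a feel for what “never miss an interrupt” means at these clock speeds, here is a minimal sketch of the kind of cycle-budget check involved. It is not the 4210’s code. Only the 128 kHz clock comes from this post; the function names, and any interval or handler cost you feed in, are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* From the post: the micro clocked at 128 kHz, i.e. 128 cycles per millisecond. */
#define CLOCK_HZ 128000u

/* CPU cycles available within an interval of `interval_ms` milliseconds. */
uint32_t cycles_in_interval(uint32_t interval_ms)
{
    return (CLOCK_HZ / 1000u) * interval_ms;
}

/* Returns 1 if a handler whose worst-case cost is `worst_case_cycles`
 * (latency included) always finishes before the next interrupt can arrive,
 * given the shortest possible spacing between interrupts; 0 otherwise.
 * Both arguments are hypothetical inputs, not figures from the real device. */
int handler_meets_deadline(uint32_t worst_case_cycles, uint32_t min_interval_ms)
{
    assert(min_interval_ms > 0u);  /* a zero interval is meaningless */
    return worst_case_cycles <= cycles_in_interval(min_interval_ms);
}
```

As an illustration, a 200 ms gap between interrupts leaves 25,600 cycles at 128 kHz, so a handler with a worst case of 5,000 cycles fits and one with a worst case of 30,000 does not.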

The 4210’s software was 40,000 lines of C, developed by a team of 5-6 over several years; the total effort was 25 person-years. Some of the testing and pre-release validation is described in my blog post about coding being like play writing. The final code inspection involved a team of five working five-to-six hour days for two months, reading aloud and understanding every single line. When occasion called for checking assembly instructions, sometimes we took turns with pencil and paper pretending to be the accumulators, the index registers, the program counter and so on. No stone was left unturned.

The final walk-through was quite a personnel management challenge. One of the senior engineers (a genius who also wrote our kernel and compiler) lobbied for inspecting the whole executable because he didn’t want to rely on the correctness of the compiler — but that would have taken six months. So we compromised by walking through only the assembly code for the critical modules, like the tachycardia detector and the interrupt handlers.

I mentioned that the kernel and compiler were home-grown. This meant that the company controlled literally every single bit of code running in its defibs. And I reiterate: we had several individuals who knew the source code end to end.

By the way, these days I come across developers in the smartcard industry who find it hard working on applets that are 5 or 10 kilobytes small. Compare, say, a digital signing applet with an ICD, with its cardiac monitoring algorithms, treatment algorithms, telemetry controller, data logging and operating system all squeezed into 32 KB.

Reliability

We amassed several thousand implant-years of experience with the 4210 before it was superseded. After release, we found two or three minor bugs, which we fixed with software upgrades. None would have caused a misfire, neither a false positive nor a false negative.

Yes, for the upgrade we could write into the RAM over a proprietary telemetry protocol. In fact the main reason for the one major software upgrade in the field was to add error correction, because after hundreds of device-years we noticed a higher-than-expected rate of bit flips from natural background radiation. That’s a helluva story in itself. It was observed that had the code been in ROM, we couldn’t have changed it, but then we wouldn’t have needed to change it for bit flips either.
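This post doesn’t say which error-correcting scheme went into that upgrade, so the following is only an illustration of the general idea, not the 4210’s code: a minimal C sketch of the classic Hamming(7,4) code, which stores three parity bits alongside every four data bits and can repair any single flipped bit among the seven. The function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Encode the low 4 bits of `nibble` into a 7-bit Hamming(7,4) codeword.
 * Bit i of the result (i = 0..6) holds codeword position i + 1.
 * Positions 1, 2 and 4 are parity bits; positions 3, 5, 6 and 7 are data. */
uint8_t hamming74_encode(uint8_t nibble)
{
    assert(nibble <= 0x0Fu);

    uint8_t d1 = (nibble >> 0) & 1u;  /* goes to position 3 */
    uint8_t d2 = (nibble >> 1) & 1u;  /* goes to position 5 */
    uint8_t d3 = (nibble >> 2) & 1u;  /* goes to position 6 */
    uint8_t d4 = (nibble >> 3) & 1u;  /* goes to position 7 */

    uint8_t p1 = d1 ^ d2 ^ d4;        /* parity over positions 3, 5, 7 */
    uint8_t p2 = d1 ^ d3 ^ d4;        /* parity over positions 3, 6, 7 */
    uint8_t p4 = d2 ^ d3 ^ d4;        /* parity over positions 5, 6, 7 */

    return (uint8_t)(p1 | (p2 << 1) | (d1 << 2) | (p4 << 3) |
                     (d2 << 4) | (d3 << 5) | (d4 << 6));
}

/* Decode a 7-bit codeword, correcting at most one flipped bit,
 * and return the original 4 data bits. */
uint8_t hamming74_decode(uint8_t code)
{
    uint8_t b[8] = {0};               /* b[1]..b[7] mirror positions 1..7 */
    for (int i = 1; i <= 7; i++) {
        b[i] = (code >> (i - 1)) & 1u;
    }

    /* Each syndrome bit re-checks one parity group. Read together as a
     * binary number they give the position of the flipped bit (0 = none). */
    uint8_t s1 = b[1] ^ b[3] ^ b[5] ^ b[7];
    uint8_t s2 = b[2] ^ b[3] ^ b[6] ^ b[7];
    uint8_t s4 = b[4] ^ b[5] ^ b[6] ^ b[7];
    uint8_t syndrome = (uint8_t)(s1 | (s2 << 1) | (s4 << 2));

    if (syndrome != 0u) {
        b[syndrome] ^= 1u;            /* flip the faulty bit back */
    }

    return (uint8_t)(b[3] | (b[5] << 1) | (b[6] << 2) | (b[7] << 3));
}
```

For instance, hamming74_encode(0x0B) gives 0x55, and decoding 0x55 with any one of its seven bits flipped still returns 0x0B. Protecting whole bytes or words works the same way, just with more parity bits.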

Morals of the story

Anyway, some of the morals of the story so far:

Software then was cool and topical, and the whole company knew how to talk about it. The real experts, the dozen or so people in Sydney directly involved in the development, were all well known worldwide to the executives, the sales reps, the field clinical engineers, and regulatory affairs. And we got lots of questions (in contrast to Karen Sandler’s experience, where all the cardiologists and company people said nobody ever asked about the code).

Everything about the software was controlled by the company: the operating system, the chip platform, the compiler, the telemetry protocol.

We had a team of people that knew the code like the backs of their hands. Better in fact. It was reliable and, in hindsight, impregnable. Not that we worried about malware back in 1987-1990.

Where has software development gone?

So the sorts of issues that Karen Sandler is raising now, over two decades on, are astonishing to me on so many levels.

Why would anyone decide to write life support software on someone else’s platform?

Why would they use wifi or Bluetooth for telemetry?

And if the medical device companies cut corners in software development, one wonders what the defense industry is doing with their drone flight controllers and other “smart” weaponry with its countless millions of lines of opaque software.
