Modern aircraft incorporate extensive networks of on-board computer chips and software. Electronic “fly-by-wire” systems now interpret and modify every pilot command. These systems have led to new benefits, but they’ve also resulted in unprecedented challenges that the FAA must meet in its certification of safety systems. In light of the failure of a critical software-based system that led to two recent Boeing 737 Max crashes, there are questions about the ability of companies and regulators to maintain the benefits of software without compromise to safety.
Have you been annoyed by your cell phone’s insistent demands to update its operating system software? Or perhaps its alerts to update its application software, which, after all, must be consistent with that operating system update? Imagine the pressure on Boeing to update its 737 Max airliner’s Maneuvering Characteristics Augmentation System (MCAS) – the anti-stall software that led to the Lion Air Flight 610 catastrophe, in October 2018, and then to the Ethiopian Airlines Flight 302 disaster, in March 2019, together causing 346 fatalities.
Boeing 737 crash in Ethiopia
Commercial aircraft, whether made by Boeing or Airbus, now incorporate millions of lines of software code that require regular updates. This is also true for military aircraft. When the Pentagon’s office of the Director, Operational Test and Evaluation (DOT&E) said, in 2017, that Lockheed Martin’s F-35 stealth fighter jet – the most expensive weapon system in history – would be delayed once again because “more software patches will likely be needed,” the taxpayer’s appropriate comment should have been “Duh!” When it comes down to it, the F-35 is little more than flying titanium and software!
How did we get to this point and what lies ahead? To answer that question, we need first to understand why the computer chip has proliferated into aviation.
Last year, over 20 billion computer chips were shipped. This number should astound you because there were “only” 300 million PCs shipped in 2018 and “only” two billion cell phones shipped that year. So where are the other 18 billion or so computer chips hiding?
Computer chips can be used in two types of systems: “open” and “embedded.” Open systems like PCs and Macs can be programmed by you, the end customer. By contrast, embedded systems can only be programmed by the product developers themselves.
Embedded computer chips are used in home TV sets, in car navigation systems and, well, in the 737 Max MCAS. The ones in Boeing’s airliner cannot be programmed by United Airlines, and certainly not by you, the airplane passenger. They can only be programmed by Collins Aerospace, a Boeing supplier.
So, now that we’ve found the “missing” computer chips, we are ready to ask “why?” – why did 18 billion embedded computer chips ship last year? Wouldn’t one or two billion have been enough?
Hand in hand with the dramatically increasing transistor density predicted by Moore’s Law came increasing computer chip design and fabrication costs. Today a state-of-the-art fabrication facility (a “fab”) costs over $10 billion. Furthermore, the engineering cost to design a computer chip prior to fab often exceeds $100 million.
In order to complete complicated chips on schedule, designers resorted to incorporating more and more pre-tested computer circuits into their chip designs. In this way, the function of the chip could be finalized later, when its software is loaded, rather than being intrinsic to the transistor design itself. As a result, almost all chips became computer chips.
This development delivered another advantage: The same chip could be programmed in different ways, for different applications, amortizing its design and fabrication costs over more sales.
Boeing 737 MAX airliner
Turning every chip into a computer chip certainly had huge advantages for the developers. But where is the customer, the end user, in this story? Simply put, computer chips, whether in open systems or embedded systems, often enable a spectrum of applications for the customer that would otherwise be impossible.
Will you be watching a MagellanTV series or documentary tonight? If so, your TV computer chips will be using a complex digital video decoder that would not otherwise be possible. Streaming video services without computer chips would be like trying to put out a fire by blowing water through a straw!
But with this flexibility comes the dark side: software updates. Not all software updates are ugly – some do deliver new capabilities. For instance, Tesla can increase a car’s driving range with a software update. However, in a high percentage of instances the update is required because the original software isn’t perfect; in fact, it’s too complicated to be perfect. These updates patch bugs in the features, patch security holes uncovered by new viruses (please keep your Windows software up to date!) or, on occasion, patch safety problems. Boeing, are you still there?
Modern aircraft are “fly-by-wire,” which means there are no direct mechanical linkages between the pilot’s controls and the airplane’s flaps and rudder. Instead, the pilot’s actions cause a computer chip to run its software program, then signal the next computer chip to run its program, and so on until the last computer chip in the chain tells the flaps to move.
The trend to fly-by-wire started long ago but matured in the 21st century to the point where some aircraft like the Airbus A380 have no mechanical backup systems.
Fly-by-wire has a number of benefits: 1) aircraft weight is saved, reducing acquisition and fuel costs; 2) pilot information displays are more extensive; and 3) component reliability against wear-out failure is improved. Fundamentally, the fly-by-wire software that processes the pilot’s commands can ensure that the plane flies within its optimal safety and fuel economy envelope; the fly-by-wire computer chips do not merely pass the pilot’s commands along unchanged, they actually analyze the commands – and can massage them or even override them.
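To make the idea of “massaging” a pilot’s command concrete, here is a minimal sketch in Python of one way software could keep a pitch request inside a safe range. Everything in it is an assumption made for illustration: the names PitchEnvelope and process_pitch_command are hypothetical, the limits are invented numbers, and real flight-control laws are far more elaborate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PitchEnvelope:
    # Hypothetical limits in degrees; illustrative values, not from any real aircraft.
    max_nose_up: float = 30.0
    max_nose_down: float = -15.0


def process_pitch_command(requested, envelope=PitchEnvelope()):
    """Return (command sent onward, True if the pilot's input was modified)."""
    # Clamp the request so it never leaves the envelope.
    limited = min(max(requested, envelope.max_nose_down), envelope.max_nose_up)
    return limited, limited != requested


print(process_pitch_command(10.0))  # inside the envelope: passed along unchanged
print(process_pitch_command(45.0))  # outside the envelope: limited to the maximum
```

A 10-degree request passes through untouched, while a 45-degree request is cut back to the 30-degree limit and flagged as modified. The flag matters: software that overrides a pilot should at least be able to report that it did so.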
Today there is no going back; modern commercial aviation simply can’t exist without fly-by-wire.
As the 737 Max tragedies demonstrated, the complexity of software poses a new kind of risk. Back in 1980, Earl Wiener, an aviation guru, wrote, “Digital devices tune out small errors while creating opportunities for large errors.”
To manage the digital device complexity, aircraft computer chip networks are partitioned into four domains, in order of decreasing risk:
Data transfers between domains are controlled. After all, we don’t want that guy in seat 12A to take over the aircraft controls.
Within each domain, the FAA applies de facto safety standards to software (DO-178C) and to hardware (DO-254). The standards attempt to control and document all aspects of airborne system development: planning, design, testing, configuration management, quality assurance. Potential hazards are identified by a System Safety Analysis and, as a result, systems must conform to one of five assurance levels.
Level A (catastrophic) classification means that failure may cause deaths, usually with the loss of the airplane. Level B (hazardous) means that failure has a large negative impact but is less likely to cause a crash. And so on, down to Level E where failures are analyzed to have no safety impact.
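For readers who think in code, the five levels can be written down as a simple ordered table. This is only a sketch: the names AssuranceLevel and is_more_stringent are hypothetical, and the “major” and “minor” labels for Levels C and D come from the DO-178C standard rather than from the description above.

```python
from enum import Enum


class AssuranceLevel(Enum):
    # Each level is paired with the failure condition it is associated with.
    A = "catastrophic"
    B = "hazardous"
    C = "major"
    D = "minor"
    E = "no safety effect"


def is_more_stringent(first, second):
    """True if `first` demands more rigor than `second`."""
    # Enum members keep their definition order, so earlier means more stringent.
    order = list(AssuranceLevel)
    return order.index(first) < order.index(second)


print(is_more_stringent(AssuranceLevel.A, AssuranceLevel.B))
```

The point of the ordering is that everything downstream, from the amount of testing to the amount of review, scales with the level chosen at the start.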
Boeing’s MCAS anti-stall system sits in Domain #1, flight control, where failures can often become catastrophic. Unfortunately, in March 2019, investigative reports by the Seattle Times revealed that Boeing employees, working under the auspices of the FAA, classified MCAS failure as “hazardous” (Level B) rather than “catastrophic” (Level A). In general, Level A classification requires measures such as system redundancy to reduce the likelihood of failure. But meeting these requirements would have delayed the 737 MAX introduction.
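What does redundancy look like in software? One generic textbook approach is to read the same quantity from several independent sensors and let them vote. The Python sketch below picks the middle value of the readings and flags any sensor that strays too far from it. It is an illustration under stated assumptions, not a description of MCAS or of any certified system; the function name vote and the tolerance value are hypothetical.

```python
from statistics import median


def vote(readings, tolerance):
    """Return (selected value, indexes of sensors that disagree with it)."""
    if len(readings) < 3:
        raise ValueError("need at least three redundant readings to outvote one fault")
    # The median ignores a single wild reading, unlike an average.
    selected = median(readings)
    suspects = [i for i, r in enumerate(readings) if abs(r - selected) > tolerance]
    return selected, suspects


# Three hypothetical angle readings in degrees; the third sensor has failed high.
print(vote([4.9, 5.1, 22.0], tolerance=2.0))
```

With these inputs the voter selects 5.1 degrees and reports the third sensor as suspect, so one bad reading cannot steer the airplane on its own.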
If, as Boeing’s CEO Dennis Muilenburg implied at his shareholders’ meeting, the pilots should have been able to override the MCAS failures to save flights 610 and 302, then Level B was the correct classification. But most outsiders believe that it was not reasonable to expect the pilots to be able to override the MCAS failures, so such failures should have been classified as “catastrophic” (Level A). In fact, many argue that in the rush to compete against the Airbus A320neo airliner, the MCAS risk was downplayed. On the other hand, Boeing insists that MCAS was certified to FAA requirements; if so, it was certified to a Level B rather than a Level A safety standard.
Fundamentally, safety standards for airborne systems rely on extensive documentation and reviews. Even when systems are properly classified, their computer chips and software are highly complex.
Aircraft suppliers rely on vast computer-aided design (CAD) resources to develop new fly-by-wire technology. Are documentation and review sufficient to ensure the safety of these systems? You might say, “Of course not!” However, it is important to note that safety standards for the most sensitive systems constrain their technical complexity in order to help ensure a safe outcome. The fact is that more complex computer chips create more pitfalls for software. Where is the right balance in aircraft design between unleashing computer chip technology and throttling it for safety?
Long after the 737 Max is returned to flight, aircraft design will continue to be driven by demands for more fuel economy, more reliability, and more features – as well as safety. You, the informed public, must ask whether an inflection point has been reached where the siren song of computer chips and their increasingly complex software has overcome the resources of Boeing, Airbus, and the regulatory agencies to protect passenger safety.
Have a nice flight!
W. Patrick Hays spent 40 years as a designer, manager, and executive responsible for new computer chip development. He co-founded Lexra, Inc., and, from 2013 to 2018, was a manager and chief engineer at Boeing Defense, Space & Security.