NZSM Online

Feature

Risk, Failure and Mercury Energy

The Auckland power crisis provides a demonstration of the importance of assessing risk and understanding the consequences of failure.

Craig Webster and Alan Merry

The report of the inquiry into the Auckland power crisis has now been released and discounts claims that the failure was caused by last summer's high temperatures. Despite the high air temperatures, soil temperatures and moisture levels were not outside the acceptable range specified by the manufacturers of the power cables which failed. Actual power use did not exceed Mercury's official supply capacity, and was considerably less than recent power demand forecasts for the period.

The weather theory, promulgated by Mercury Energy and featured in a recent NZSM article [Blame the Weather!, April], focuses on a single cause for the failure. However, the nature of system failures is such that in any reasonably complex system, the chance of a failure being attributable to a single cause tends to be small, and decreases sharply with increasing complexity of the system in question.

Complex systems, such as electricity delivery networks, typically fail through an accumulation of many small latent defects combined with regular minor violations or breaches of safety standards, none of which would be large enough to bring down the system on its own. It is a mistake to stop the search for underlying causes at the first factor found, which is usually the last, and often the least serious, in a series of compounding problems. Most single-cause theories of failure are incomplete at best and scapegoat-seeking at worst. Either way the complete set of underlying causes is not addressed, which means that the likelihood of recurrent failure remains high.

Figure A: When latent defects in a system's defences coincide, failure occurs.

Good design, safety devices, tolerance limits, and recommended operating procedures are all defences against system failure. When represented schematically, these defences form overlapping layers between the system and failure. Defects in design, violations such as not following correct procedures, a lack of proper maintenance, or simple human error may be represented as holes in the layers of defences, and the size of these holes depends on the severity of the violation. A hole (or latent defect) in one layer is usually compensated for by an intact barrier at another. However, when a set of holes lines up, an accident trajectory exists through all layers of the system's defences and a system failure occurs. It is very unusual for a single factor to be solely responsible for a failure.
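The layered-defence picture can be given a rough numerical feel. The following is a minimal sketch, not a model of the Auckland network: it assumes each layer has a hole with some probability and that the layers fail independently of one another, so the chance of an accident trajectory through every layer is the product of the individual probabilities. The function name and all figures are hypothetical and chosen only for illustration.

```python
from math import prod


def breach_probability(hole_probabilities):
    """Chance that every layer of defence has a hole at the same moment.

    Assumes (for illustration only) that the layers fail independently,
    so the joint probability is simply the product of the per-layer values.
    """
    for p in hole_probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each probability must lie between 0 and 1")
    return prod(hole_probabilities)


# Hypothetical per-layer hole probabilities: design, procedures,
# maintenance, monitoring. These are not measured values.
layers = [0.1, 0.05, 0.2, 0.1]

print("single layer breached:", layers[0])
print("all layers breached:  ", breach_probability(layers))
```

Under these assumptions each extra intact layer sharply reduces the chance of failure, and each layer left permanently open (a probability of 1) removes that protection entirely. The independence assumption is the weak point of the sketch: defects that share a cause will line up far more often than the product suggests.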

In the Auckland power system failure, the first system violation dates as far back as 1959 when the first of the four cables that failed was installed with backfill material which did not meet the specifications of the cable manufacturers. It did not allow heat produced by the cables to dissipate into the surrounding ground at the required rate. Under these conditions the cables were prone to over-heating even under normal loading, and would exceed assigned operating temperatures, which is known to shorten the life of cables.

This created the first hole in the system's defences -- a latent defect waiting for an opportunity to manifest itself. Other latent failures included poor maintenance and a lack of proper investigation into subsequent minor cable faults. Cable faults were seen as unlikely independent events, when clearly systemic problems meant they were not. In addition, a lack of cable markers and poor records of cable location meant that construction work in the city area caused damage to cables, and also hampered attempts by emergency repair crews to find the cables after the blackout occurred.
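The difference between treating faults as independent and recognising a shared underlying cause can be sketched with a little arithmetic. The example below is an illustration under stated assumptions, not an analysis of the actual cables: it supposes two cables, a shared condition (such as unsuitable backfill) present with probability q, and a much higher fault probability for each cable when that condition is present. All names and numbers are hypothetical.

```python
def marginal_fault_probability(q, p_degraded, p_healthy):
    """Overall chance that one cable faults, averaged over the shared condition."""
    return q * p_degraded + (1.0 - q) * p_healthy


def joint_if_independent(p_single):
    """Chance that both cables fault if their faults were truly independent."""
    return p_single * p_single


def joint_with_common_cause(q, p_degraded, p_healthy):
    """Chance that both cables fault when a shared condition affects both.

    Faults are treated as independent only once the shared condition is known.
    """
    return q * p_degraded ** 2 + (1.0 - q) * p_healthy ** 2


# Hypothetical figures, chosen only to show the size of the effect.
q, p_degraded, p_healthy = 0.1, 0.5, 0.01

p_single = marginal_fault_probability(q, p_degraded, p_healthy)
naive = joint_if_independent(p_single)
actual = joint_with_common_cause(q, p_degraded, p_healthy)

print("naive estimate of a double fault:    ", naive)
print("estimate allowing for a common cause:", actual)
```

With these made-up figures the double fault is several times more likely than the independence calculation suggests, even though each cable on its own looks exactly as reliable in both calculations.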

Risk and Reliability

Risk is often viewed simply as the probability of an undesirable outcome. A more useful conception of risk takes into account both the probability of an outcome and the consequences of that outcome. If the consequences are dire, the importance of the risk is greater even if the probability of the event remains the same. Thus even a very low chance of failure should be considered unacceptably risky if the consequences of that failure are disastrous.
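One common way to combine the two ingredients, used here as an assumption rather than as the inquiry's method, is to multiply the probability of an outcome by a measure of its consequence. The sketch below compares a frequent minor fault with a rare but disastrous one; the function name, probabilities and cost figures are all hypothetical.

```python
def risk_score(probability, consequence):
    """Probability-weighted consequence of a single outcome.

    `probability` is the chance of the outcome over some period and
    `consequence` is its cost in any consistent unit (dollars here).
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    if consequence < 0:
        raise ValueError("consequence must not be negative")
    return probability * consequence


# Hypothetical events: a likely nuisance outage and an unlikely
# prolonged blackout of a business district.
minor_outage = risk_score(probability=0.2, consequence=10_000)
major_blackout = risk_score(probability=0.001, consequence=50_000_000)

print("minor outage risk:  ", minor_outage)
print("major blackout risk:", major_blackout)
```

On these made-up figures the rare event carries by far the larger risk, even though it is two hundred times less likely. A simple product is only one possible weighting; where the consequences are catastrophic or irreversible, a more cautious rule may be justified.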

Even though power has now been restored, the risks of further failure remain. The overhead power cables supplying the majority of the city's power are temporary and, as the inquiry report makes clear, have a number of potentially weak points in their design. Not the least of these is the complete lack of any backup system should the cables fail, in which case repairs would take two to four days.

A reliable power delivery system depends on closing all identifiable holes in the system's defences. The inquiry concluded that the reliability of the Central Business District's power supply is currently unsatisfactory. Mercury Energy is presently implementing initiatives which it claims will ensure an appropriate security of power supply by December 1998.

In reality, it is doubtful that absolute reliability can ever be achieved in a complex system. The unpredictability of human error, combined with the limitations of design and the enormous variation in the combination of events that may unexpectedly occur, has led systems experts to propose the concept of "normal accidents". A balance must be struck between expenditure on defences against failure and the need to produce a product at a reasonable price.

Nevertheless, given the very serious consequences of the recent power failure, it seems reasonable to argue that greater investment is warranted in the future to prevent a recurrence, including the adoption of a more sophisticated approach to system safety. Even if this is accepted and treated as a priority, the process will occupy Mercury Energy for many years to come.

Alan Merry has research interests in risk and safety in complex systems, and works in the Anaesthesia Department of Auckland's Green Lane Hospital.
Craig Webster is currently a clinical researcher in the Anaesthetics Department at Auckland's Green Lane Hospital.