
If it can go wrong……

One derivative of Murphy’s Law is: if it can go wrong, it will go wrong, usually at the most inconvenient moment!


This post may be old news to many Europeans, but in November 2009 the 27-kilometer (16.8 mile) Large Hadron Collider (LHC), buried under fields on the French/Swiss border, suffered serious overheating in several sections after a small piece of baguette landed in a piece of equipment above the accelerator ring. Dr Mike Lamont, the LHC’s Machine Coordinator, said that “a bit of baguette”, believed to have been dropped by a passing bird (other sources suggest a malicious pigeon), caused the superconducting magnets to heat up from 1.9 Kelvin (-271.1C) to around 8 Kelvin (-265C), close to the level where they stop superconducting.

In theory, had the LHC been fully operational, this could have caused a catastrophic breakdown similar to the one that occurred shortly after it was first switched on. Fortunately, the machine has several fail-safes which would have shut it down before the temperature rose too high.

Part of the LHC

The total cost to build and commission the accelerator is of the order of 4.6 billion Swiss francs (approx. $4400M, €3100M, or £2800M as of Jan 2010), with an overall budget of 9 billion US dollars (approx. €6300M or £5600M as of Jan 2010), making the LHC the most expensive scientific experiment in human history. Politicians are probably asking how a bungling bird could target a critical part of the machine with a small piece of bread and shut the whole system down.

A more realistic question for project practitioners is: how could design engineers and risk managers be expected to foresee this type of problem? Failure Mode Analysis (FMA) may help, but I can just see the reaction to someone in a risk workshop hypothesising that a bird would fly over the machine and drop its dinner precisely on target to cause the maximum damage. ‘Theoretically possible, but hardly plausible’ would be the polite reaction……until after it happened.
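As a rough illustration of the kind of scoring an FMA-style workshop produces, here is a minimal sketch in Python. The failure modes, 1–10 rating scales and review threshold below are illustrative assumptions, not anything from the LHC’s actual risk register:

```python
# Minimal Failure Mode Analysis (FMEA-style) scoring sketch.
# Each failure mode is rated 1-10 for severity, occurrence and detection;
# the Risk Priority Number (RPN) is the product of the three ratings.
# Entries and threshold are illustrative assumptions only.

failure_modes = [
    {"mode": "utility substation trips",          "severity": 7, "occurrence": 6, "detection": 2},
    {"mode": "cryogenic cooling loss",            "severity": 9, "occurrence": 3, "detection": 3},
    {"mode": "bird drops bread on switch gear",   "severity": 7, "occurrence": 1, "detection": 8},
]

REVIEW_THRESHOLD = 80  # RPNs above this get a mitigation action

for fm in failure_modes:
    fm["rpn"] = fm["severity"] * fm["occurrence"] * fm["detection"]

for fm in sorted(failure_modes, key=lambda f: f["rpn"], reverse=True):
    flag = "review" if fm["rpn"] > REVIEW_THRESHOLD else "accept"
    print(f'{fm["mode"]:40s} RPN={fm["rpn"]:3d} -> {flag}')
```

Even a diligent workshop would score the bird scenario low on occurrence, which is exactly the point: the scoring can only rank the failure modes people think to list.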

One of the messages from books like ‘The Black Swan’ and from complexity theory is that the future is inherently unpredictable. This is probably as good an example of a ‘Black Swan’ as any I’ve heard of.

For more on the LHC see: http://en.wikipedia.org/wiki/Large_Hadron_Collider

16 responses to “If it can go wrong……”

  1. Pat,

    Great story, too bad it’s not true. “Just the facts, ma’am,” as Joe Friday of the TV show Dragnet would have said.

    The magnets – operating at 8.36 Tesla (a huge value) – are cooled by cryogenic technology using superfluid helium. The power for the generation and handling of the helium comes from a “top side” source, which is powered by a substation and switch gear in the standard electric utility manner.

    After the substation tripped (our neighborhood one trips all the time during the summer from lightning), the next failure mode of the cooling system is to shut the magnets down by removing power.

    This trip system worked and triggered an “uncommanded” shutdown of the entire LHC – in exactly the same way that, when the power goes off at a home with gas appliances, the oven and the water heater shut off and ensure that natural gas does not reach the kitchen or basement.

    Now a critical “fact”: when the electrical switch gear above ground was inspected, bird feathers and bread were found inside the fencing surrounding the substation switch gear.

    From the official report…

    “Of course, no such thing happened,” (bird dropping bread) says Gillies. But he did admit that engineers at CERN do not fully understand how the heating occurred in the two sectors. “To this day, we do not know what caused the power cut,” he says.

    I have a great interest in CERN, having written software as a graduate student on another accelerator before attempting to complete my PhD. Having no really good original ideas, I switched to the dark side after being recruited by TRW to take my algorithms and apply them to radar and sonar signal processing.

    This is a case of the popular press having a wonderful time “making up” the situation – not uncommon in the US.

    See the CERN site, then look for the CERN Bulletin in the lower left corner.

    • Thanks for the update, Glen – we did check several sources but obviously not deeply enough. The reported direct quotation from Dr. Mike Lamont was what convinced me……

    • The real news from these links is definitely not as attention-grabbing as the UK press headlines we based the post on, but it does support the thrust of this post, which is focused on the question: how could design engineers and risk managers be expected to foresee this type of problem? Obviously some good FMA work was done to handle the power outage, but the root cause is still a mystery.

  2. Pat,
    The tripping of the substation – a not uncommon occurrence – worked as designed.

    Not sure how that can be defined as unforeseen. The power supply for any industrial site is never assumed to be 100% available.

    A bird flying into the switch gear would be rare for sure, but the probability is not 0%.

    The design of the system allows for unforeseen events in its emergency shutdown processes – that’s how engineers deal with the unforeseen.

    Take a look at http://www.niwotridge.com/PDFs/Fault%20Coverage.zip to see how fault tolerance is “assured” in the control systems used to drive the “emergency shutdown” of much industrial equipment.

    In the end the root cause is not very useful, since the shutdown system worked. I say this from experience with platform shutdown systems and turbine-powered compressors. It would be logged as a “source unknown” but fail-to-safe event (see the sketch below).

    Once it was conjectured that a bird flew into the switch gear, the primal source could be logged as a “speculative source,” and everyone went back to work.
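A minimal sketch of that fail-to-safe default, assuming a simple interlock that trips to a safe shutdown state on any fault, known or unknown. The states, fault labels and log messages are illustrative assumptions, not the LHC’s (or any platform’s) actual control logic:

```python
# Illustrative fail-to-safe interlock: any reported fault, including one whose
# source is unknown, drives the system to its safe (shut down) state first;
# root-cause analysis can happen later. States and messages are assumptions.
from enum import Enum


class State(Enum):
    RUNNING = "running"
    SAFE_SHUTDOWN = "safe shutdown"


class Interlock:
    def __init__(self) -> None:
        self.state = State.RUNNING
        self.log: list[str] = []

    def report_fault(self, source: str = "source unknown") -> None:
        # Fail to safe: act on the fault before the root cause is known.
        self.log.append(f"fault logged: {source}")
        self.trip()

    def trip(self) -> None:
        if self.state is not State.SAFE_SHUTDOWN:
            self.state = State.SAFE_SHUTDOWN
            self.log.append("power removed; equipment driven to safe shutdown")


interlock = Interlock()
interlock.report_fault()  # e.g. a substation trip with no known cause
assert interlock.state is State.SAFE_SHUTDOWN
print("\n".join(interlock.log))
```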

  3. This is a highly advanced and innovative project – was it realistic to expect it to run perfectly without any problems? Not really, given that a shutdown was planned after a short running-in period. I think the real lesson from this highly complex project is to expect the unexpected, and to recognise that it was always a “high risk” project.

  4. I have ordered the book you recommend. I have added you to our blog roll. Would you be interested in adding us to your blog roll? We are at http://blog.parallelprojecttraining.com.

  5. Pingback: Seconds From Disaster « Aavssitedev’s Blog

  6. Pingback: The powerful illusion of control « Stakeholder Management's Blog

  7. Pat, Paul, and anyone else who will listen: books like “The Illusion of Control” and “The Black Swan” are popular press accounts of processes and events, written with little understanding of or insight into the actual facts.

    They are entertainment. I’m not saying don’t read them. Do. But imagine you have broad and deep knowledge and experience in a topic. Along comes an author who visits you, interviews a few people around you, and then writes a newspaper story or even a book about how you and your colleagues make mistakes and behave like “boneheads” after an incident.

    There are unlimited examples of this type of behavior, ranging from “raging against the machine” over government contracts – one of the favorite pastimes of a VERY well-known PM in SE Asia – to the uninformed posting on professional websites about the use or misuse of one or another process – EVM, for example.

    Even in our NDIA (National Defense Industrial Association) sponsored conferences (with PMI), there are usually one or two speakers who have it “completely wrong.”

    Great care is needed after reading almost anything to assure that the narrative in the story is actually factual and appropriate for the context and domain. The notion that “general knowledge” is dangerous is applicable here.

  8. I live in three ‘domains’: aircraft maintenance, construction/engineering, and IT. The differences in approach are quite remarkable.

  9. Clearly many books and case studies simplify the reality of complex and challenging projects in order to fit (or even justify) a narrative. However, it is understandable that project managers outside these projects seek to understand what went wrong. You could see this as gloating over someone else’s disaster, or as genuine interest in learning the lessons from the project. The sadness is that the participants in the project themselves often remain silent. Even within the confines of an organisation, project teams are often reluctant to reveal all the lessons learned. I often use a case study based on Heathrow T5. It has lots of very strong points and lessons to learn. I compiled it from research into public domain material, very little of which is first hand from the participants, so it is very difficult (almost impossible) to get anywhere near the truth. All an external observer of these projects can do is make sure the sources are reliable and quoted, and then the reader can judge for themselves.

    • T5 (Terminal 5 at Heathrow) is a fascinating study – a largely successful construction project with really innovative delivery strategies and risk management. Having passed through several times now, I can say it is a really effective and efficient terminal – one of the world’s best.

      And in the middle of it all, the total disaster of the opening days. T5 is used by Dr. Lynda Bourne in the Introduction to her book, Stakeholder Relationship Management: A Maturity Model for Organisational Implementation; the Intro can be downloaded from http://www.mosaicprojects.com.au/Book_Sales.html.

      What seems to be emerging from the T5 opening, and is really significant, is the effect a few small errors and relatively short delays can have in a closely coupled complex system (a ‘Normal Accident’). I hope to post on this in the next week or two.

  10. Pingback: Pigeons 1 – Humans O

  11. Pingback: Volcanoes and Showing Mr. Murphy the Door « Perspectives on Technology & Business
