Want to Improve Your Investigation Results?

Keep an open mind well into the investigation. Resist the temptation to reject proposed scenarios until there is sufficient evidence to do so.

Small changes in investigation technique can sometimes produce large improvements in incident investigation results. This article discusses six common weaknesses found in investigation management systems. These weaknesses can be viewed as low-hanging fruit that, if present and harvested, can significantly enhance the effectiveness of incident investigations. They also can be viewed as avoidable and possibly deleterious mistakes that can poison an otherwise well-developed investigation system.

Although there are other important aspects to incident investigation, these six items in particular represent typical avoidable defects that are correctable using internal resources. If any of these weaknesses is present in your investigation management system or practices, consider it an opportunity to further improve investigation results.

1. Inadequate Target. If the performance target for the investigation is set too low, the investigation may stop before it reaches the root cause level, which creates an opportunity for a repeat incident. An effective investigation policy clearly defines expected objectives in language understandable to all levels of the organization, accompanied by written examples (both good and bad). Investigators need to know when they have reached the intended root cause level. Identifying and repairing system defects has broad-reaching benefits: underlying system weaknesses that remain uncorrected have the potential to affect other activities and may generate similar incidents in other departments.

Once a clear target is established, investigators and managers need a sufficient and consistent understanding of what is considered an adequate level of performance in identifying root causes. Achieving that understanding is a function of the training, auditing, and quality assurance aspects of the investigation management system.

2. Premature Stopping Point--Improper Screening of Information and Evidence. It is a potential mistake to gather only information related to the cause scenario that is felt to be most likely. If alternate scenarios are rejected prematurely, evidence can be irrevocably lost and ultimate identification of root causes may be incomplete or inaccurate. Seasoned investigators try to keep an open mind well into the investigation. They resist the temptation to reject proposed scenarios until there is sufficient evidence to do so. Inexperienced investigators often decide on the root cause(s) before starting the investigation and therefore selectively screen out information and evidence that is ultimately found to be relevant and important.

In well-implemented investigation management systems, the concept of "delayed selection decision of the most likely cause scenario" is incorporated into investigator training, written protocol, and auditing/monitoring activities.

3. Premature Stopping Point--Event Level. Events are results of underlying root causes and conditions. Events themselves should not be viewed as root causes. If a person experiences a flat tire and subsequently loses control of the vehicle, the flat tire is an event. The investigation team should extend its activities to pursue the reasons the tire went flat. If the investigation team mistakes an event for a root cause, the investigation may not discover the underlying causes of the event.

4. Premature Stopping Point--"Failure to Follow Procedure." In most instances, failure to follow established procedure is an event that is the result of underlying root causes. If the investigation stops at the "failure to follow procedure" level, underlying causes for the failure often will not be identified and corrected.

Most of the time, there are correctable reasons why the procedure was not followed. These reasons can be discovered, and preventive actions can be implemented to minimize the occurrence of similar incidents in the future. In many cases, the investigation uncovers the fact that this was not the first and only time the procedure was not followed, although it may be the first time adverse consequences resulted from the deviation.

5. Premature Stopping Point--Single Root Cause. In most instances, there are multiple root causes and correspondingly multiple points in the scenario where the accident sequence could have been arrested. If the investigation finds only one root cause, it is likely that the other root causes will remain in place (and uncorrected) to contribute to future incidents. In virtually all major accidents, there are multiple adverse events in the accident sequence. If any of these events were missing, the chain of events would be broken and the actual outcome of the scenario could be significantly different. Multiple failures of safeguards occurred in the Challenger Space Shuttle disaster, the aborted Apollo 13 moon mission, the Three Mile Island nuclear power plant incident, and the Bhopal, India, toxic chemical release (references 1, 2, 3). In the Apollo 13 incident, there were at least four opportunities where safeguard features failed, including:

  • Inadequate management-of-change activities when addressing voltage specification changes from the original 28 volts to 65 volts resulted in failure of a thermal protection temperature-limiting safety device. The failure of this device is believed to have allowed internal temperature to reach excessive and damaging levels during tank heating that occurred 17 days prior to launch.
  • Improper rigging during assembly resulted in the tank being dropped and damaged several years prior to launch.
  • After the tank was dropped, inadequate testing and inspection failed to detect internal damage caused by being dropped.
  • An inadequate temporary procedure was developed and executed 17 days before the mission when it was discovered that liquid oxygen could not be removed in the normal manner following a dress rehearsal. This inadequate procedure, coupled with the failed temperature-limiting device, generated high internal temperatures (on the order of 1,000 degrees F), which damaged the internal wiring insulation. This damage provided the third and missing fuel leg of the fire triangle when the internal tank-stirring device was activated 56 hours into the mission.

In the absence of any of these four adverse events, the outcome of the incident scenario probably would have been much different. The lesson is that multiple root causes are present in almost every incident. It would be a mistake for the investigation team to stop after identifying a single root cause.
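The "break any link and the chain is broken" reasoning can be sketched in a few lines of Python. This is only an illustration under a simplifying assumption: the incident is modeled as occurring only when all four safeguard failures listed above are present, whereas the real scenario is less clear-cut. The function names `incident_occurs` and `chain_breakers` are hypothetical and introduced here for illustration.

```python
# Illustrative model only (an assumption, not a finding): the incident is
# treated as occurring only when every listed safeguard failure is present.
SAFEGUARD_FAILURES = [
    "inadequate management of change for the 28 V to 65 V specification",
    "tank dropped because of improper rigging",
    "post-drop testing and inspection missed internal damage",
    "inadequate temporary procedure for removing liquid oxygen",
]


def incident_occurs(present):
    """Return True only if every safeguard failure is in `present`."""
    return all(failure in present for failure in SAFEGUARD_FAILURES)


def chain_breakers(failures):
    """Return the failures whose absence alone would break the chain."""
    return [
        failure
        for failure in failures
        if not incident_occurs(set(failures) - {failure})
    ]


breakers = chain_breakers(SAFEGUARD_FAILURES)
print(f"{len(breakers)} of {len(SAFEGUARD_FAILURES)} failures are chain breakers")
```

Under this model every one of the four failures is a chain breaker, which is why an investigation that stops at the first root cause it finds leaves the other three uncorrected.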

6. Marginal Quality Witness Interviews. Witness interviews present a wide range of opportunities for success or failure. High-quality incident investigation management systems often address the interview skill set of investigation team members and provide corresponding training and written interview guidelines. It is easy for a witness's perception of the event to be altered by interaction and discussion with others prior to the interview. This can happen both consciously and subconsciously. It is a recognized good practice to conduct the initial interview in a private location as soon as practical. One important phase of the interview is the "uninterrupted narrative," where the witness is asked to tell what happened from his or her perspective. Experienced interviewers allow this narrative to proceed without interruption, despite the strong temptation to ask questions and clarify points along the way.

The ultimate purpose of incident investigation is prevention of a repeat event. Accurately identifying and correcting the multiple underlying root causes are keys to success. Recognizing and managing these six critical aspects of incident investigation can significantly affect the results.

This column appeared in the January 2006 issue of Occupational Health & Safety.


References

1. Apollo 13--Godwin, Robert, Apollo 13--The NASA Mission Reports, 2000, Apogee Books, Burlington, Ontario, Canada, ISBN 1-896522-55-6.

2. Bhopal--Mannan, Sam, Lees' Loss Prevention in the Process Industries, Third Edition, 2005, Elsevier Butterworth Publishing, NY, NY, ISBN 0-7506-7589-3.

3. Three Mile Island--Center for Chemical Process Safety, Guidelines for Investigating Chemical Process Incidents, Second Edition, 2003, American Institute of Chemical Engineers, NY, NY, ISBN 0-8169-0897-4.
