
Why Human Error Demands a Systems-Based Safety Approach

Organizations can reduce accidents and improve operational performance by designing resilient systems, strengthening leadership and focusing on the root causes of human error rather than blaming workers.

Introduction

Operationally, most organizations can be reduced to three key elements. The organization produces an output (a product or service). Management secures plant and equipment and creates systems, such as policies, processes, and procedures, with which to create the output. The organization employs workers who activate, control, and manage the systems in order to produce the output.

Performance Management

Some of the output may end up being defective or unacceptable, creating a financial loss. Field operations management must be constantly vigilant to ensure work is put in place in accordance with plans and specifications. While performing their tasks, workers may engage in unsafe behavior, potentially causing an accident that may result in an injury. Over a century ago, a significant research study determined that some action by the worker invariably led to that accident.

So, the typical interventions were providing the worker with training, possibly retraining, counseling, or possibly punishment. Unfortunately, this produced only a temporary improvement, with more accidents following in the future. There are a multitude of reasons for an unintended unsafe act to occur:

  • A slip resulting from worker error, such as inattention.
  • A lapse due to memory failure, which could be system or knowledge driven.

Or the unsafe act may be intended:

  • A mistake resulting from an error in practice or knowledge.
  • A violation, intentionally performed.

This distinction among causes is important because it helps determine why the worker did what they did, which in turn enables specific solutions to improve safety outcomes. It also calls for a closer look at human error.

The Human Error Factor

Human error negatively affects organizational performance in productivity, quality, customer service, and accidents causing injury and loss. The only category listed above for which there are statistics is accidents. In the most serious accidents of the last 50 years, almost all initial findings attributed the failures primarily to human error:

  • 1978, West Virginia power plant construction cooling tower collapse, killed 51 workers.
  • 1984, Bhopal, India, Union Carbide plant gas leak released methyl isocyanate, killing as many as 20,000 by some estimates.
  • 1988, Piper Alpha oil platform explosion killed 167 and resulted in a major oil spill.
  • 1991, Hamlet Chicken processing plant fire in North Carolina killed 25 workers.
  • 2005, the Texas City BP refinery explosion killed 15 workers.
  • 2008, sugar refinery explosion in Georgia killed 14 workers.
  • 2010, BP Deepwater Horizon explosion and oil spill in the Gulf of Mexico killed 11 workers and injured 17.

Management devises the systems and, being human and fallible, may create systems with latent risks. Latent conditions are deficiencies in the systems that facilitate error on the part of the producer (worker). The workforce has to function within these systems, and latent conditions, combined with operator errors, may lead to failures. All these latent conditions influence the worker's choices and decision making.

Traditional Approaches to Combatting Human Error

The traditional approaches to managing human performance may themselves contribute to human error. Most of this is triggered by performance goals and metrics that disregard key operational factors. This mismatch may be due to a misunderstanding of the task, task demand, worker capability, knowledge, motivation, goals, information, communication, human dynamics, supervision, work climate, organizational culture, and leadership, to name a few. Human error is inevitable and seems to occur primarily because of the individual, the organization's systems, or the leader-member exchange.

Research has shown that humans do learn from their mistakes, so management may devise a process to bring mistakes to the workforce’s attention. It would seem that the way to address performance issues is to make the consequences of mistakes as inconsequential as possible. Therefore, a performance management strategy might include a number of elements, one of which might be designing out the error-producing elements of the systems, or at least reducing their frequency or their negative consequences, thereby returning the process to its unimpaired state.

Preventing Human Error

Prevention focuses on designing systems, processes, and procedures that keep mistakes from happening, rather than blaming individuals. Human error is often the consequence of flawed systems, confusing procedures, or unrealistic practices, so effective prevention targets the root causes.

Key Strategies for Human Error Prevention:

  • Standard Operating Procedures (SOPs): Ensure the workforce clearly understands and is thoroughly familiar with the operating procedures. Processes may change, so ensure SOPs stay aligned and up to date. Identify tribal knowledge and, if it is effective, align it with operations and incorporate it into the SOPs.
  • Engineered Controls (Poka-Yoke): Devise means and methods that make it impossible or very difficult to make an error, such as physical constraints, automation, robotics, or prefabrication, that force a specific, correct sequence or operation. Ensure involvement of the workforce in the design and implementation of standards.
  • Root Cause Analysis (RCA): RCA is a structured framework for identifying the underlying causes of adverse events and implementing corrective actions. When errors occur, investigate the underlying system failures rather than taking the easier route of addressing the symptoms.
  • Training & Education: Provide robust training to ensure employees have the understanding, skills, and knowledge necessary to perform assigned tasks correctly. This requires a combination of clear expectations, consistent training, active performance management, and a supportive environment. Key strategies focus on improving engagement, aligning individual goals with company objectives, and providing the necessary tools for success.

Developing Error Tolerance

Errors are going to be made, and error avoidance is not "foolproof," so the next step is critical in optimizing performance: minimizing the consequences of errors. Error tolerance can be achieved in a couple of ways.

  1. For systems where errors cannot be designed out or blocked, there should be a way to detect them early, along with mechanisms to recover from them without significant impairment of performance. An example is a checklist used before engaging in an activity. Pilots routinely go through a preflight checklist, which has helped make flying safer. Checklists can also be used after completion of an activity, such as maintenance, to ensure that the equipment is in good working order.
  2. Deviations or errors that go undetected, or are detected "late," are going to have consequences. These unexpected and undesired outcomes must be minimized effectively so as not to adversely impact performance. Such a process will keep an error from escalating into a major undesirable event. Examples might include routine maintenance or redundant systems.

Become Resilient

Making the organization’s systems resilient is the next element in managing human error. This means having a built-in mechanism to deal effectively with human error and changed conditions while recovering quickly from adverse effects and returning seamlessly to "normal" operations. Agile resilience has five elements: leadership, culture, people, systems, and the work environment.

Resilience begins with a leadership vision. The organization must select the right people, provide the resources to devise resilient systems, and establish the acceptable level of risk with the "right" balance between risk taking and avoidance. Leadership must create a climate where it is okay to make a mistake and, once one is made, ensure that lessons are learned and disseminated throughout the organization.

A resilient culture is built on four pillars: trust, purpose, empowerment, and accountability. Such organizations have a strong sense of purpose that flows to all the employees, encouraging self-directed teams to innovate and communicate cross-functionally. The four pillars bind the organization into a cohesive, innovative, purposeful group with a sense of commitment to action, problem resolution, and win-win thinking, with a passion for excellence.

The core of any organization is its people. The organization must select the "right" people, who are motivated, have the courage to challenge the process, are willing to work toward a common goal, share a common vision and purpose, and are willing to overcome obstacles and barriers. The organization must provide the timely information and resources which will facilitate effective decision making and problem solving.

Systems in a resilient organization have an open structure that allows for the flow of information and resources. Such systems foster innovation and agility. The systems and subsystems are integrated and aligned with the organization's goals and objectives, allowing for effective planning and strategy implementation. They support and reward innovation and cooperation, enhance flow, and create value.

The work environment in a resilient organization is flexible and conducive to learning (from one's mistakes). It is designed to minimize latent defects in the systems. The strategy, objectives, goals, and metrics are integrated so as to accomplish excellence.

Conclusion

Performance management has taken on urgency in the realities of the 21st century. The traditional business models and management approaches that have worked well in the past cannot be used to solve the problems of today (Einstein). It is these very tools and techniques that have gotten us to where we find ourselves now. The more productive approach is to identify the challenges, define the problems, face reality, stop treating the symptoms, dispel the myths, assess the organizational system and people constraints, foster integration, communicate a compelling vision, move away from command and control, foster trust, empower the people, and lead, lead, and lead.

This article originally appeared in the issue of Occupational Health & Safety.
