Medical Error, RCA & Quality Improvement
What medical error is, and why the systems view wins
A medical error is the failure of a planned action to be completed as intended, or the use of a wrong plan to achieve an aim. It is the process failure. An adverse event is the outcome: an injury caused by medical management rather than the underlying disease. It is preventable when it results from an error. A near miss (or close call) is an error that was caught before it reached the patient. These distinctions matter because they change the response: an error that causes no harm is still a signal that a defence failed, and it is the cheapest, safest event to learn from.
The single most important mental shift in patient safety is James Reason's distinction between the person approach and the system approach. The person approach treats error as a moral failure of an individual (name, blame, shame, retrain), and it feels natural and satisfying. The system approach accepts that humans are fallible and that errors are to be expected; the defences, tasks and organisation are redesigned so that the inevitable error is either prevented or caught before it harms anyone. Examiners are testing whether you have made this shift. Blaming the surgeon who marked the wrong leg fixes one surgeon; redesigning the site-marking and time-out process protects every future patient.
This is why the discipline examines latent conditions as much as active failures. A tired registrar, an ambiguous consent form, a theatre list that runs without a pause: these are latent conditions that wait, silently, for an active failure (the slip) to line up with them. The landmark To Err Is Human report (Institute of Medicine, 1999) put patient harm on the map by estimating that between 44,000 and 98,000 Americans were dying in hospital each year from preventable medical error. Those figures were comparable to deaths from motor-vehicle trauma, and the report was a deliberate call to move from blame to systems.
The error taxonomy: slips, lapses and mistakes
Errors fall into three families, and examiners expect you to name and separate them. A slip is an action failure: the plan was correct but the wrong action was executed, usually during a familiar, automatic task (a skill-based error of perception or motor control). An example is reaching for the left leg when the consent says the right. A lapse is a memory failure: the plan was correct but a step was omitted, again during a routine task. An example is forgetting to prescribe venous-thromboembolism prophylaxis after a hip fracture. A mistake is a planning or knowledge failure: the plan itself was wrong, either because the wrong rule was applied (rule-based, e.g. plating a fracture that the evidence supports nailing) or because knowledge was genuinely absent (knowledge-based, e.g. choosing an implant unsuited to the fracture pattern).
The cognitive control levels behind this come from Rasmussen's Skill-Rule-Knowledge (SRK) framework: slips and lapses occur at the skill-based level (automatic behaviour), while mistakes occur at the rule-based and knowledge-based levels (conscious problem-solving). This matters because the fixes differ: slips and lapses are best prevented by forcing functions, checklists and standardisation, whereas mistakes require training, supervision, decision support and evidence-based guidelines.
Separate from errors are violations: deliberate deviations from a rule or protocol. These are failures of motivation rather than of cognition, and they map onto just culture (see later). A violation may be routine (cutting a corner that has become the local norm), situational (forced by a poorly designed system), optimising (for personal convenience), or reckless (consciously disregarding a substantial risk).
| Type | Cognitive level | What goes wrong | Orthopaedic example | Best fix |
|---|---|---|---|---|
| Slip | Skill-based | Right plan, wrong action: perceptual or motor error during a routine task | Preparing the wrong leg when the consent says the opposite side | Forcing functions, standardised site-marking, mandatory time-out |
| Lapse | Skill-based | Right plan, step omitted: memory failure during a routine task | Forgetting to prescribe chemical DVT prophylaxis after a neck-of-femur fracture | Checklists, electronic prompts, pharmacy double-checks |
| Mistake | Rule-based or knowledge-based | Wrong plan: wrong rule applied or a genuine knowledge gap | Choosing plating for a diaphyseal fracture that the evidence supports nailing | Training, supervision, decision support, evidence-based guidelines |
| Violation | Deliberate | Intentional deviation from a rule or protocol | Skipping the time-out to keep the list moving | Just culture: coach at-risk behaviour, act on reckless behaviour |
The pattern to remember is that slips and lapses happen to the most careful, well-trained clinician and are expected in a fallible system; the response is to redesign the environment. Mistakes and violations point more toward knowledge, judgement and accountability.
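The taxonomy can be read as a short decision sequence. The sketch below is only an illustration of that reading: the `Event` fields and the `classify` function are hypothetical names introduced here, and the "best fix" strings are copied from the table above. It assumes each event can be answered with three yes/no questions, which real incidents rarely allow so cleanly.

```python
from dataclasses import dataclass
from enum import Enum


class ErrorType(Enum):
    SLIP = "slip"
    LAPSE = "lapse"
    MISTAKE = "mistake"
    VIOLATION = "violation"


@dataclass(frozen=True)
class Event:
    """Hypothetical three-question summary of an incident."""
    deliberate: bool    # was a rule or protocol knowingly bypassed?
    plan_correct: bool  # was the intended plan the right one?
    step_omitted: bool  # was a planned step forgotten?


def classify(event: Event) -> ErrorType:
    # Violations are separated first: they are about motivation, not cognition.
    if event.deliberate:
        return ErrorType.VIOLATION
    # A wrong plan is a mistake (rule-based or knowledge-based).
    if not event.plan_correct:
        return ErrorType.MISTAKE
    # Right plan: an omitted step is a lapse, a wrong action is a slip.
    return ErrorType.LAPSE if event.step_omitted else ErrorType.SLIP


# "Best fix" column of the table, keyed by error type.
BEST_FIX = {
    ErrorType.SLIP: "forcing functions, standardised site-marking, mandatory time-out",
    ErrorType.LAPSE: "checklists, electronic prompts, pharmacy double-checks",
    ErrorType.MISTAKE: "training, supervision, decision support, evidence-based guidelines",
    ErrorType.VIOLATION: "just culture: coach at-risk behaviour, act on reckless behaviour",
}

# Forgetting DVT prophylaxis: right plan, step omitted, not deliberate.
forgotten_prophylaxis = Event(deliberate=False, plan_correct=True, step_omitted=True)
print(classify(forgotten_prophylaxis).value, "->", BEST_FIX[classify(forgotten_prophylaxis)])
```

The order of the checks mirrors the text: intent is asked first, then the plan, and only then the kind of execution failure.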
The Swiss-cheese model and how defences fail
Reason's Swiss-cheese model is the examiner's favourite diagram for accident causation. Picture several slices of Swiss cheese lined up in sequence. Each slice is a layer of defence (the organisation's safety policy, supervision, the team, a checklist, the surgeon's own vigilance). Each slice has holes, representing latent weaknesses and active failures. Most of the time a hole in one slice is blocked by the next slice. A patient is harmed only when the holes in every slice line up, momentarily, and the trajectory of an error passes all the way through.
The power of the model is that it relocates blame. The "hole" nearest the patient is the active failure (the surgeon's slip). It is the most visible, but it is also the smallest and most transient contributor. The holes further back are latent conditions: an under-staffed theatre, an ambiguous clinic letter, a culture that discourages speaking up, a training gap. These latent conditions sit dormant for months or years, waiting for the active failure. A root-cause analysis that stops at the active failure has found the trigger, not the cause.
Reason's model, as adapted in frameworks such as the Human Factors Analysis and Classification System (HFACS), describes a sequence of failure levels (organisational influences, unsafe supervision, preconditions for unsafe acts, and the unsafe acts themselves), so that an investigator works backwards from the active failure to the latent conditions that made it possible. The practical message for a surgeon is plain: spend your energy strengthening the back slices (the system), because you cannot eliminate holes in the front slice (human fallibility).
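One way to make the "holes must line up" idea concrete is a back-of-envelope calculation. The sketch below is an illustration, not part of Reason's model: it assumes each layer fails independently, and the layer names and probabilities are invented for the example. Latent conditions are often shared across layers in practice, so the independence assumption is optimistic.

```python
import math


def alignment_probability(hole_probabilities):
    """Chance that an error passes every layer.

    ASSUMPTION: layers fail independently, so the probabilities multiply.
    """
    return math.prod(hole_probabilities)


# Hypothetical per-case chance that each defence fails to catch a wrong-side error.
layers = {
    "clinic confirmation": 0.10,
    "site-marking": 0.10,
    "time-out": 0.10,
}

three_layers = alignment_probability(layers.values())
# Adding one more (equally imperfect) layer shrinks the risk again.
four_layers = alignment_probability([*layers.values(), 0.10])

print(f"three layers: {three_layers:.4f}")
print(f"four layers:  {four_layers:.5f}")
```

Under these assumed numbers, no single layer needs to be perfect for the combined defence to be strong, which is the argument for investing in the system rather than in individual vigilance.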
Wrong-site surgery and the safeguards
Wrong-site, wrong-procedure and wrong-patient surgery are the prototypical preventable adverse events in orthopaedics, and they are classified as never events: Serious Reportable Events (catalogued by bodies such as the US National Quality Forum) that should never occur if the right barriers are in place. Because they are largely skill-based slips, often acting on wrong information that latent conditions set up upstream, they are the textbook application of everything above. You do not fix them by exhorting surgeons to "be more careful"; you fix them with forcing functions and a standardised process.
The global standard is the WHO Surgical Safety Checklist (2009), a short set of checks split into three pauses: Sign In (before anaesthesia: confirm identity, site, procedure, consent, allergy and airway risk), Time Out (before skin incision: the whole team confirms patient, site and procedure aloud, and checks that antibiotic prophylaxis and imaging are in place), and Sign Out (before the team leaves: instrument and swab counts are correct, the specimen is labelled, and key concerns for recovery are stated). In its landmark global study (Haynes 2009), introducing the checklist cut in-hospital complications by about a third (roughly 36% fewer) and deaths by close to a half, with no change in case mix.
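The "about a third" and "close to a half" figures are relative reductions, and it helps to see how they relate to the absolute rates summarised in the Evidence section (complications 11% to 7%, deaths 1.5% to 0.8%). The sketch below simply reproduces that arithmetic; the function names are introduced here for illustration.

```python
def absolute_risk_reduction(before_pct: float, after_pct: float) -> float:
    """Drop in the event rate, in percentage points."""
    return before_pct - after_pct


def relative_risk_reduction(before_pct: float, after_pct: float) -> float:
    """Drop in the event rate as a fraction of the baseline rate."""
    return (before_pct - after_pct) / before_pct


# Rates as quoted in the Evidence section for the checklist study.
complications_rrr = relative_risk_reduction(11.0, 7.0)
deaths_rrr = relative_risk_reduction(1.5, 0.8)

print(f"complications: {absolute_risk_reduction(11.0, 7.0):.1f} points absolute, "
      f"{complications_rrr:.0%} relative")
print(f"deaths: {absolute_risk_reduction(1.5, 0.8):.1f} points absolute, "
      f"{deaths_rrr:.0%} relative")
```

Quoting both forms matters in a viva: a 47% relative fall in deaths corresponds to an absolute fall of 0.7 percentage points.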
Orthopaedic specialty and national bodies layer process controls on top. The American Academy of Orthopaedic Surgeons runs the "Sign Your Site" campaign (the operating surgeon marks the site, with their initials, in the presence of the awake patient); the AO Foundation teaches site-marking and the formal pause; the Joint Commission's Universal Protocol bundles pre-procedure verification, site-marking and the time-out. The common thread is that the check is performed aloud, by the whole team, with the patient as a participant: a forcing function, not a formality.
Wrong-site surgery is a never event: it signals a defence failure, never an isolated personal lapse. The operating surgeon marks the site with their own initials while the patient is awake and confirms it, and the team performs a documented time-out before incision, every case, every time. A near miss (a wrong-site error caught at the time-out) is reported and investigated with the same rigour as a completed event, because the latent holes are identical.
Root-cause analysis
A root-cause analysis (RCA) is a structured, retrospective investigation used after a serious adverse event or sentinel event to identify the underlying system causes, not to apportion blame. It is triggered in many systems when a sentinel event occurs (the Joint Commission requires it of accredited organisations), and it deliberately takes a systems lens: the question is not "who is at fault?" but "why did the system let this happen, and how do we make it impossible next time?"
A typical RCA follows a sequence. First, gather the facts (interviews, the notes, the timeline) and separate what happened from why it happened. Then map the causal chain using tools that force you past the obvious cause. The two staples are the "5 Whys" (ask "why?" repeatedly until you reach a system cause you can act on) and the Ishikawa or fishbone diagram, which sorts candidate causes into contributory-factor categories. Charles Vincent's contributory-factors framework is the healthcare fishbone most examiners recognise: patient factors, task factors, individual and team factors, communication, equipment, environment, and organisation and management. The RCA then prioritises root causes and converts each into a specific, owned action, which is tracked to completion.
The example below shows why stopping at the first "why" is the classic RCA trap: the surgeon is the trigger, not the cause.
- Event: The surgeon operated on the right knee; the correct side was the left.
- Why 1? The consent form, clinic letter and site-marking all stated "right".
- Why 2? The clinic letter was generated from the referring GP letter, which was ambiguous, and was not re-confirmed against the patient.
- Why 3? The registrar saw the patient in a rushed pre-assessment clinic with no independent confirmation step.
- Why 4? The clinic pathway has no mandatory dual-identifier cross-check and no patient-confirmation pause.
- Why 5 (root cause)? The organisation has no standardised, audited sign-your-site process at the clinic stage, so the error was baked in long before theatre.
- Action: Implement a mandatory site-mark by the surgeon with the awake patient in clinic and again before theatre, a dual-identifier check, and a documented time-out, then audit compliance monthly.
A useful rule of thumb within RCA is the substitution test (the test at the heart of a just culture): if you replaced the individual with another competent person in the same role, would the same thing have happened? If the answer is yes, the cause is the system and the response is redesign; if no, the cause lies more with the individual and the response involves coaching or accountability. The common failure modes of RCA are stopping at the proximate cause, conducting it in a blame culture that drives staff underground, and producing an action list that is never closed out. The examiner can probe each of these.
Mortality and morbidity (M&M) review
The mortality and morbidity meeting is the recurring, case-based forum through which a department learns from its complications and deaths. Done well, it is anonymised, blame-free and systems-focused: it applies the RCA mindset at department scale. Each case is presented against the timeline of what happened, the discussion works back to why, and the meeting ends with explicit learning points and action items that carry an owner and a deadline. The discipline that separates a good M&M from a moan is the closed loop: every action is tracked to completion and revisited at a later meeting, so that review converts into improvement rather than regret.
For the exam candidate, the points that earn marks are that M&M is a protected, no-blame, systems discussion; that it must be embedded in a wider governance structure (the action items feed into quality-improvement work and, where appropriate, into incident reporting and the regulator); and that it deliberately includes the near miss and the good catch, not only the disaster. A department that reviews only its deaths learns slowly; one that reviews its near misses learns cheaply.
Quality improvement: PDSA cycles
If RCA is the reactive half of patient safety, quality improvement (QI) is the proactive half: the structured method by which a change is tested and embedded. The framework examiners expect is the Institute for Healthcare Improvement's Model for Improvement, which asks three questions before any change is made: What are we trying to accomplish? (a clear, measurable aim), How will we know a change is an improvement? (a measurement plan), and What change can we make that will result in improvement? (the idea). The change itself is then tested through PDSA cycles: Plan (state the prediction, who does what, by when, and what data you will collect), Do (run the test), Study (compare the result with the prediction), and Act (adopt, adapt or abandon).
PDSA cycles are deliberately small and fast: test on one patient, then a few, then a ward, scaling only what works. Measurement is by run charts or statistical process control (SPC) charts over time, which distinguish genuine change from random noise. A single before-and-after snapshot cannot tell you whether you improved the system or just got lucky. Related methods you should be able to name are Lean (eliminating waste and non-value-adding steps) and Six Sigma (reducing variation and defects), both widely used in theatre-list and pathway redesign.
A practical example: to improve DVT-prophylaxis prescription after hip-fracture admission, you set an aim (chemical prophylaxis within twenty-four hours in ninety percent of eligible patients), measure the baseline rate, introduce a pre-printed admission order set and an electronic prompt (the change), and run three PDSA cycles refining the prompt and the timing, tracking the rate on a run chart. The link back to error taxonomy is direct: you are building a defence (a forcing function) against the lapse of forgetting to prescribe.
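The run-chart step can be sketched in a few lines. Everything numeric below is invented for illustration (weekly percentages of eligible hip-fracture patients given prophylaxis within twenty-four hours), and the "shift" test uses one commonly taught run-chart rule, six or more consecutive points on the same side of the baseline median, which is an assumption added here rather than something the text specifies.

```python
from statistics import median


def longest_run_above(values, centre):
    """Length of the longest run of consecutive points above the centre line.

    Points exactly on the centre line neither extend nor break a run
    (a common run-chart convention, assumed here).
    """
    longest = current = 0
    for value in values:
        if value > centre:
            current += 1
            longest = max(longest, current)
        elif value < centre:
            current = 0
    return longest


# Hypothetical weekly compliance (%), before and after the order set and prompt.
baseline = [62, 58, 65, 60, 63, 59, 61, 64]
after_change = [70, 74, 79, 83, 88, 91, 92, 90]

centre = median(baseline)                                # baseline centre line
shift = longest_run_above(after_change, centre) >= 6     # assumed shift rule
aim_met = after_change[-1] >= 90                         # the ninety percent aim

print(f"baseline median {centre}%, shift detected: {shift}, aim met: {aim_met}")
```

The point of the run rule is the one made in the text: a sustained run is evidence of real change, whereas one good week could be noise.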
The boundary between QI and research matters for ethics. QI is a local, practice-improvement activity using existing data to improve care for the population already in the system; it does not usually require full research-ethics approval, but it must still respect the principles set out in the Declaration of Helsinki: minimise burden, protect confidentiality, and seek oversight where patients are exposed to anything beyond standard care. The Choosing Wisely campaign sits in this space too: reducing low-value care (tests and procedures that offer no benefit) is itself a patient-safety and quality intervention, because every unnecessary investigation carries its own risk of harm.
Just culture and the duty of candour
A safety culture that never holds anyone accountable is as broken as one that blames everyone. Just culture, developed by David Marx, is the balance: an organisation learns from system failure and manages individual accountability in a way that is fair and predictable. Its engine is the substitution test mentioned above, and it sorts human behaviour into three categories, each met with a different response rather than a uniform punishment.
A human error (an inadvertent slip or lapse) is consoled: the person is supported and the system is fixed. An at-risk behaviour (a shortcut in which the person did not perceive the risk, or judged the risk acceptable for a small gain) is coached: the person is counselled and the incentives that made the shortcut attractive are removed. A reckless behaviour (conscious disregard of a substantial and unjustifiable risk) is punished. The point is that the response follows the behaviour, not the outcome: a slip that happens to kill a patient is consoled and system-fixed, while a reckless act that happens to cause no harm is still addressed. This is what stops a just culture from collapsing back into an outcome-driven blame culture.
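The rule that the response follows the behaviour and not the outcome can be written down directly. The sketch below is a toy illustration with hypothetical names (`Behaviour`, `response_for`); the `patient_harmed` argument is accepted and then deliberately ignored, which is the whole point.

```python
from enum import Enum


class Behaviour(Enum):
    HUMAN_ERROR = "human error"   # inadvertent slip or lapse
    AT_RISK = "at-risk"           # shortcut, risk not perceived or judged acceptable
    RECKLESS = "reckless"         # conscious disregard of a substantial risk


# The three responses described in the text.
RESPONSE = {
    Behaviour.HUMAN_ERROR: "console the person and fix the system",
    Behaviour.AT_RISK: "coach the person and remove the incentive for the shortcut",
    Behaviour.RECKLESS: "punish",
}


def response_for(behaviour: Behaviour, patient_harmed: bool) -> str:
    # patient_harmed is intentionally unused: severity of outcome
    # must not change how the behaviour is handled.
    return RESPONSE[behaviour]


# A slip is consoled whether or not it caused harm.
print(response_for(Behaviour.HUMAN_ERROR, patient_harmed=True))
print(response_for(Behaviour.RECKLESS, patient_harmed=False))
```

An outcome-driven blame culture would be the version of this function that branches on `patient_harmed`.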
Closely tied to just culture is the duty of candour: the ethical and, in many jurisdictions, legal obligation to be open and honest with a patient (and their family) when they have been harmed. The expectation is prompt, clear communication: say what happened, apologise sincerely, explain what is known about why, describe the steps being taken to prevent recurrence, and offer support and written follow-up. The United Kingdom encodes this in Regulation 20 (Duty of Candour), enforced by the Care Quality Commission, and in the General Medical Council's Good Medical Practice ("be open and honest"); Australia operationalises it through the Australian Commission on Safety and Quality in Health Care Open Disclosure Framework; and many United States states have apology laws (often "I'm sorry" laws protecting expressions of sympathy from being used as evidence of liability). These differ in detail but converge on the same professional principle: the patient has a right to know.
Finally, do not forget the second victim. The clinician involved in a serious adverse event frequently suffers guilt, anxiety and loss of confidence, a recognised trauma in its own right. A just culture supports, rather than discards, the second victim; this is both humane and safety-promoting, because a clinician who fears blame hides the next error.
Consent and the law that frames error
The legal framework around consent sets the standard against which an error in communication (a failure to warn) is judged, and it has moved decisively toward the patient. The traditional common-law test was Bolam (1957): a doctor was not negligent if they acted in accordance with a practice accepted as proper by a responsible body of medical opinion skilled in that art. Bolam was refined by Bolitho (1997), which held that a court need not accept expert opinion merely because a body of doctors holds it. That opinion must be capable of withstanding logical analysis, so the court retains the final word on whether a practice is defensible.
The decisive shift in consent was Montgomery v Lanarkshire Health Board (UK Supreme Court, 2015). The court replaced the Bolam approach to disclosure with a patient-centred test: a clinician must take reasonable steps to ensure the patient is aware of any material risks of treatment and of reasonable alternatives. A risk is material if a reasonable person in the patient's position would be likely to attach significance to it, or the doctor is or should reasonably be aware that this particular patient would attach significance to it. The duty is to inform, not merely to answer questions. Montgomery aligned the United Kingdom with jurisdictions that had already reached the reasonable-patient standard (Australia's Rogers v Whitaker, 1992, and the United States' Canterbury v Spence, 1972), so the direction of travel is genuinely global.
Underpinning all of this are the four principles of biomedical ethics from Beauchamp and Childress: autonomy, beneficence, non-maleficence and justice. Autonomy drives the consent and candour duties; non-maleficence ("first, do no harm") drives the error-prevention and QI machinery; justice drives fair allocation and equal access. When an examiner asks you to reason through an ethical dilemma, naming the principles in tension (often autonomy versus beneficence) and stating how you would resolve them is the expected structure.
Communication failures are themselves a leading source of preventable harm, and two tools recur. ISBAR structures handover and escalation (see the mnemonic below). The WHO Surgical Safety Checklist structures team communication at the three peri-operative pauses. Both convert high-risk, interruption-prone conversations into a predictable script, a defence against the lapse and the slip.
Evidence
A surgical safety checklist to reduce morbidity and mortality in a global population
- Introducing the WHO Surgical Safety Checklist across eight hospitals in diverse global settings reduced the in-hospital complication rate from 11% to 7% (about a 36% relative reduction) and the death rate from 1.5% to 0.8% (about a 47% reduction), with no significant change in case mix
- Improvements were seen at every site regardless of baseline income level, demonstrating that a structured team pause is one of the highest-yield safety interventions in surgery
Human error: models and management
- Separated the person approach (blame the individual) from the system approach (redesign the environment), arguing that error is an inevitable product of fallible humans and must be managed at the system level
- Distinguished active failures (unsafe acts at the sharp end) from latent conditions (dormant weaknesses in defences, design and organisation), the conceptual basis of the Swiss-cheese model
- Argued that error management needs two strategies: prevention (reduce errors) and containment (build defences that catch them before harm)
Error in medicine
- Argued that the prevailing focus on individual fault was misguided, because most medical errors arise from system flaws, and that preventing the last error does nothing to prevent the next
- Made the case that accident-prevention techniques from other high-hazard industries (aviation, nuclear power), specifically the systems approach, should be adopted in medicine
Incidence of adverse drug events and potential adverse drug events: implications for prevention
- In two teaching hospitals, adverse drug events occurred at about 6.5 per 100 admissions and 'potential' adverse drug events (errors with the capacity to harm that did not, including those intercepted before reaching the patient) at about 5.5 per 100 admissions, so potential events were nearly as common as actual ones
- Concluded that most injuries from medication error are preventable through system changes (computerised order entry, pharmacist involvement and forcing functions) rather than exhorting clinicians to be careful
To Err Is Human: Building a Safer Health System
- Estimated that between 44,000 and 98,000 Americans died in hospitals each year as a result of preventable medical error, numbers comparable to deaths from motor-vehicle accidents
- Framed the majority of errors as system failures and recommended mandatory and voluntary event-reporting systems, safety standards, and a national focus on patient safety
Montgomery v Lanarkshire Health Board: informed consent and the material-risk test
- Replaced the Bolam test for what risks a clinician must disclose with a patient-centred standard: a doctor must take reasonable steps to ensure the patient is aware of any material risks of treatment and of reasonable alternatives
- A risk is 'material' if a reasonable person in the patient's position would be likely to attach significance to it, or the doctor is or should reasonably be aware that the particular patient would do so
Bolam v Friern Hospital Management Committee: the 'Bolam test'
- A doctor is not negligent if they act in accordance with a practice accepted as proper by a responsible body of medical opinion skilled in that particular art, even if other bodies of opinion hold a different view
Exam and revision
Everything below condenses the topic for revision and viva practice: the high-yield points, the memory hooks, two worked vivas, and a one-screen cheat sheet.
- Error taxonomy: slips and lapses are skill-based (right plan, wrong action or omission); mistakes are rule-based or knowledge-based (wrong plan); violations are deliberate. Name and separate them.
- Systems, not blame: most preventable harm is a Swiss-cheese alignment of latent conditions and one active failure. Find the latent cause; do not stop at the surgeon.
- Match the defence to the error: site-marking, the WHO three-pause checklist and the time-out are forcing functions that catch skill-based slips; checklists and prompts prevent lapses; training and decision support prevent mistakes.
- RCA asks "why", not "who": use the 5 Whys and the Vincent contributory-factors fishbone, run the substitution test, and close every action to completion.
- QI is PDSA: aim, measure, change, then Plan-Do-Study-Act cycles tracked on a run chart or SPC chart.
- Just culture: console human error, coach at-risk behaviour, punish reckless behaviour. The response follows the behaviour, not the outcome.
- Candour is a duty: tell the patient promptly, honestly and with an apology; support the second victim.
- Consent law: Bolam (1957) refined by Bolitho (1997) for negligence; displaced in consent by Montgomery (2015) and its reasonable-patient equivalents (Rogers v Whitaker, Canterbury v Spence).
Slip · Lapse · Mistake (and Violation): the error taxonomy
Hook: Slip and Lapse are Skill-level (the plan was fine); Mistake is the Mind (the plan was wrong).
Identify · Situation · Background · Assessment · Recommendation: ISBAR for safe handover and escalation
Hook: A structured handover is a defence against the lapse: it turns an interruption-prone call into a predictable script.
Console · Coach · Censure: just culture, the three responses
Hook: The response follows the behaviour, not the outcome. A slip that kills is still consoled and system-fixed.
Viva practice
Practise clinical reasoning and management decisions out loud
"A registrar on your firm prepared and draped the wrong leg for a total knee replacement. The error was caught at the time-out, no harm reached the patient, and the correct side was then operated on. Talk me through your immediate actions and how you would investigate the event."
"A sixty-eight-year-old man develops a compartment syndrome of the leg overnight after an uncomplicated tibial nailing. The escalating pain was documented but not escalated, and he is left with a permanent foot drop. Walk me through the clinical and medicolegal management, and how you would discuss this with him."
Error taxonomy and the systems view
- Slip (skill-based, wrong action) and lapse (skill-based, missed step): fix with forcing functions, checklists, prompts
- Mistake (rule-based or knowledge-based, wrong plan): fix with training, supervision, decision support, guidelines
- Violation (deliberate): managed by just culture, not retraining
- Most preventable harm is a Swiss-cheese alignment of latent conditions plus one active failure; find the latent cause
Investigating and improving
- RCA is systems and retrospective: 5 Whys plus the Vincent contributory-factors fishbone (patient, task, team, communication, equipment, environment, organisation)
- Substitution test: would another competent person have done the same? Yes means system, no means individual
- M&M is anonymised, no-blame, systems-focused, with a closed loop on every action
- QI is the IHI Model for Improvement (aim, measure, change), tested in PDSA cycles and tracked on a run or SPC chart
Safeguards, culture and law
- Never events (wrong-site surgery) are prevented by the WHO three-pause checklist (Sign In, Time Out, Sign Out) and sign-your-site marking; Haynes 2009: about 36 percent fewer complications
- Just culture: console human error, coach at-risk behaviour, censure reckless behaviour; response follows behaviour, not outcome
- Duty of candour / open disclosure: prompt, honest explanation with an apology; support the second victim
- ISBAR structures every handover and escalation
- Negligence standard: Bolam (1957) refined by Bolitho (1997); consent standard: Montgomery (2015), mirroring Rogers v Whitaker and Canterbury v Spence