© 2026 OrthoVellum. For educational purposes only.

Not affiliated with the Royal Australasian College of Surgeons.

Outcome Measures and PROMs


Comprehensive guide to outcome measures and Patient-Reported Outcome Measures (PROMs) in orthopaedic surgery including WOMAC, DASH, SF-36, and joint-specific scores.

Updated: 2025-12-24
High Yield Overview

OUTCOME MEASURES AND PROMs

Patient-Reported Outcomes | Measurement Properties | Clinical Application

  • PROM: Patient-Reported Outcome Measure
  • MCID: Minimal Clinically Important Difference
  • VAS: Visual Analog Scale (0-100 mm)
  • SF-36: Generic Health Status Measure

Outcome Measure Types

Generic PROMs
  • Examples: SF-36, EQ-5D (any condition)
  • Use: Compare across diseases, population norms
Region-Specific PROMs
  • Examples: DASH (arm), LEFS (leg) (anatomic region)
  • Use: Sensitive to regional pathology
Joint-Specific PROMs
  • Examples: WOMAC (hip/knee), ASES (shoulder) (single joint)
  • Use: Most sensitive to joint pathology
Disease-Specific PROMs
  • Examples: ODI (spine), FAAM (ankle) (specific condition)
  • Use: Tailored to disease features

Critical Must-Knows

  • PROM: Patient-Reported Outcome Measure - patient completes without clinician interpretation. Captures patient perspective.
  • MCID: Smallest change in score that patients perceive as meaningful benefit. Essential for clinical interpretation.
  • Validity: Does the measure assess what it claims to assess? (content, construct, criterion validity)
  • Reliability: Does the measure give consistent results? (test-retest, inter-rater, internal consistency)
  • Responsiveness: Can the measure detect clinically meaningful change over time? (ceiling/floor effects)

Examiner's Pearls

  • SF-36 has 2 components: Physical (PCS) and Mental (MCS) - scored 0-100, higher is better
  • WOMAC assesses 3 domains: Pain, Stiffness, Function - scored 0-96, lower is better (or normalized 0-100)
  • DASH measures upper extremity disability - 0-100 scale, 0 = no disability
  • Floor/ceiling effects over 15% indicate the measure may not detect worsening or improvement

Clinical Imaging

Imaging Gallery

Figure: SF12v2 PCS and HOOS threshold values (represented by dashed lines) are dependent on preoperative MCS score and demonstrate a linear relationship. Postoperative data are plotted in a binned fashion, wh... Credit: Berliner JL et al. via Clin. Orthop. Relat. Res. via Open-i (NIH) (Open Access (CC BY))

Figure: A diagrammatic representation of different alignment parameters based on The Knee Society Total Knee Arthroplasty Roentgenographic Evaluation and Scoring System (Viswanathan et al. 2008a). The Coronal... Credit: Hadi M et al. via Springerplus via Open-i (NIH) (Open Access (CC BY))

Figure: 1-year postoperative estimated point deficit in quality of life estimated by SF-36, in elderly and younger patients operated for LDH compared to a published age-matched reference data population (Sul... Credit: Open-i / NIH via Open-i (NIH) (Open Access (CC BY))

Figure: (A) A photograph of the electronic Patient-Reported Outcome Measures (ePROMs) portal being used on a tablet device in the outpatient setting. (B) A photograph of a patient completing an ePROMs quality... Credit: Malhotra K et al. via BMJ Open via Open-i (NIH) (Open Access (CC BY))

Critical PROM Concepts

Why PROMs Matter

Patient-Centered Care: Surgeon assessment may not match patient experience. PROMs capture what matters to patients - pain, function, quality of life. Required for value-based care.

MCID is Essential

Clinical Significance: A statistically significant change (p less than 0.05) may not matter to patients. MCID defines meaningful improvement. Compare observed change to MCID, not just p-value.

Generic vs Specific

Trade-off: Generic (SF-36) allows comparison across conditions but less sensitive. Specific (WOMAC) highly sensitive to joint pathology but cannot compare to other joints.

Measurement Properties

Quality Assessment: Valid (measures what it claims), Reliable (consistent results), Responsive (detects change). Poor properties = unreliable conclusions.

At a Glance

Patient-Reported Outcome Measures (PROMs) capture the patient's perspective on pain, function, and quality of life without clinician interpretation. The MCID (Minimal Clinically Important Difference) defines the smallest change that patients perceive as meaningful—compare observed change to MCID, not just p-values. PROMs are classified as generic (SF-36, EQ-5D—compare across conditions), region-specific (DASH for upper limb, LEFS for lower limb), or joint/disease-specific (WOMAC for hip/knee, ODI for spine—most sensitive to pathology). Key measurement properties are validity (measures what it claims), reliability (consistent results), and responsiveness (detects change over time). Floor and ceiling effects greater than 15% indicate the measure cannot detect deterioration or improvement respectively.

Mnemonic

VRR: Measurement Properties (PROM Quality)

  • V = Validity: Does it measure what it claims? (Content, Construct, Criterion)
  • R = Reliability: Consistent results? (Test-retest, Inter-rater, Internal consistency)
  • R = Responsiveness: Detects change over time? (Minimal floor/ceiling effects)

Memory Hook: VRR your PROMs - Validity, Reliability, Responsiveness ensure high-quality outcome measurement!

Mnemonic

SWANK: Common Orthopaedic PROMs by Region

  • S = Shoulder: ASES, Constant (ASES = American Shoulder and Elbow Surgeons score)
  • W = Wrist/Hand: DASH, QuickDASH (DASH = Disabilities of Arm, Shoulder, and Hand)
  • A = All Regions: SF-36, EQ-5D (generic health status measures)
  • N = kNee/Hip: WOMAC, OKS/OHS (WOMAC most common for hip/knee arthritis)
  • K = bacK/Spine: ODI, NDI (ODI = Oswestry Disability Index for lumbar spine)

Memory Hook: SWANK PROMs cover all major orthopaedic regions - memorize these for exams!

Overview and Introduction

What are PROMs?

Patient-Reported Outcome Measures (PROMs) are standardized, validated questionnaires that patients complete without clinician interpretation. They capture the patient perspective on health status, symptoms, function, and quality of life.

Why PROMs Matter:

  • Patient-Centered Care: Surgeon assessment may not match patient experience
  • Quantifies Subjective Outcomes: Pain, function, satisfaction cannot be objectively measured
  • Value-Based Care: Payers increasingly link reimbursement to patient-reported outcomes
  • Quality Improvement: Registries (AOANJRR) use PROMs to benchmark performance
  • Research: Essential for clinical trials to demonstrate treatment efficacy

PROM vs Clinician-Reported Outcomes:

  • PROMs capture what matters to patients (pain, daily activities, quality of life)
  • Clinician measures (ROM, strength) important but may not correlate with patient satisfaction
  • Best practice: Use both PROMs and objective measures

Principles of Outcome Measurement

Types of Outcome Measures

Generic PROMs

Purpose: Assess overall health status across any condition. Allow comparison between different diseases and populations.

SF-36 (Short Form-36 Health Survey)

Description: 36-item generic health status measure.

Domains (8 subscales):

  • Physical Functioning
  • Role Physical (work/activities due to physical health)
  • Bodily Pain
  • General Health
  • Vitality (energy/fatigue)
  • Social Functioning
  • Role Emotional (work/activities due to emotional problems)
  • Mental Health

Scoring:

  • Each subscale: 0-100 (higher = better health)
  • Physical Component Summary (PCS): Aggregate of physical domains
  • Mental Component Summary (MCS): Aggregate of mental domains

MCID: Approximately 5 points for PCS and MCS.

Advantages: Population norms available, allows cross-disease comparison.

Limitations: Less sensitive to specific musculoskeletal pathology than joint-specific measures.

SF-36 is the most widely used generic PROM in orthopaedic research.

EQ-5D (EuroQol-5 Dimensions)

Description: 5-item generic health status measure with utility score.

Domains:

  • Mobility
  • Self-care
  • Usual Activities
  • Pain/Discomfort
  • Anxiety/Depression

Scoring:

  • Each domain: 3 levels in the original EQ-5D-3L (no problems, some problems, extreme problems); the newer EQ-5D-5L uses 5 levels
  • Index Score: Utility value anchored at 0 (death) and 1 (perfect health), allows cost-effectiveness analysis
  • VAS: 0-100 self-rated health thermometer

MCID: Approximately 0.07-0.10 for index score.

Advantages: Brief (5 questions), generates utility for QALY calculation in health economics.

Limitations: Less responsive than disease-specific measures, ceiling effects in healthy populations.

EQ-5D is preferred for economic evaluations and cost-utility analyses.
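
The link between an EQ-5D utility and a QALY can be sketched as the area under the utility-time curve. This is a minimal illustration, assuming the common trapezoidal approach; the function name `qalys` and all utility values are hypothetical and not taken from any study.

```python
def qalys(timepoints_years, utilities):
    """Area under the utility-time curve using the trapezoidal rule."""
    if len(timepoints_years) != len(utilities) or len(utilities) < 2:
        raise ValueError("need matching lists with at least two measurements")
    total = 0.0
    for i in range(1, len(utilities)):
        interval = timepoints_years[i] - timepoints_years[i - 1]
        # Average utility over the interval multiplied by its duration
        total += interval * (utilities[i] + utilities[i - 1]) / 2
    return total


# Illustrative values only: utility rises from 0.50 to 0.80 by 6 months
# after surgery, versus staying at 0.50 without treatment.
with_surgery = qalys([0.0, 0.5, 1.0], [0.50, 0.80, 0.80])
without_surgery = qalys([0.0, 1.0], [0.50, 0.50])
qaly_gain = with_surgery - without_surgery
print(round(qaly_gain, 3))
```

Dividing the incremental cost of a treatment by a QALY gain of this kind gives the cost per QALY used in cost-utility analyses.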

Joint-Specific PROMs

WOMAC (Western Ontario and McMaster Universities Arthritis Index)

Description: Most widely used PROM for hip and knee osteoarthritis.

Domains (24 items):

  • Pain (5 items): Pain with various activities
  • Stiffness (2 items): Morning and later-day stiffness
  • Physical Function (17 items): Difficulty with daily activities

Scoring Options:

  • Likert Scale: 0-4 per item, total 0-96 (lower = better)
  • VAS: 0-100mm per item
  • Often normalized to a 0-100 scale; direction varies by version (higher = better in some, lower = better in others), so confirm the direction before comparing studies
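
The Likert scoring described above can be sketched as follows. This is an illustration only, assuming simple summation of the 24 items with no handling of missing answers; the function name `womac_likert` and the dictionary keys are hypothetical.

```python
def womac_likert(pain, stiffness, function):
    """Sum Likert WOMAC items (each 0-4) and normalize to a 0-100 scale."""
    expected = {"pain": 5, "stiffness": 2, "function": 17}
    given = {"pain": pain, "stiffness": stiffness, "function": function}
    for name, items in given.items():
        if len(items) != expected[name]:
            raise ValueError(f"{name}: expected {expected[name]} items")
        if any(not 0 <= score <= 4 for score in items):
            raise ValueError(f"{name}: each item must be scored 0-4")
    raw = sum(pain) + sum(stiffness) + sum(function)
    return {
        "pain": sum(pain),            # range 0-20
        "stiffness": sum(stiffness),  # range 0-8
        "function": sum(function),    # range 0-68
        "raw_total": raw,             # range 0-96, lower = better
        "normalized_lower_better": raw * 100 / 96,
        "normalized_higher_better": 100 - raw * 100 / 96,
    }


# A patient answering "moderate" (2) to every item
result = womac_likert([2] * 5, [2] * 2, [2] * 17)
print(result["raw_total"], result["normalized_lower_better"])
```

Returning both normalized directions makes the version difference explicit, which matters when pooling scores from different studies.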

MCID: Approximately 10-15 points (on 100-point scale).

Advantages: Excellent validity and reliability for hip/knee OA, widely used in arthroplasty research.

Limitations: Designed for arthritis - less applicable to ligament injuries, fractures.

WOMAC is the gold standard for hip and knee arthroplasty outcome assessment.

DASH (Disabilities of Arm, Shoulder and Hand)

Description: Region-specific measure for entire upper extremity.

Domains (30 items):

  • Physical function with various activities (lifting, writing, dressing, etc.)

Scoring: 0-100 scale (0 = no disability, 100 = severe disability)

QuickDASH: 11-item shortened version, faster to complete.

MCID: Approximately 10-15 points.

Advantages: Covers entire upper limb, well-validated, widely used.

Limitations: Less sensitive than joint-specific measures for isolated shoulder or elbow pathology.

DASH is versatile for any upper extremity condition.

ASES (American Shoulder and Elbow Surgeons Score)

Description: Joint-specific measure for shoulder pathology.

Domains:

  • Pain (1 VAS 0-10): Weighted 50%
  • Function (10 items): Activities of daily living, weighted 50%

Scoring: 0-100 scale (higher = better function)
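
The 50/50 weighting can be made concrete with a short sketch. It assumes the commonly published ASES scoring, in which each of the 10 function items is scored 0-3 (an assumption not stated above); the function name `ases_score` is hypothetical.

```python
def ases_score(pain_vas, function_items):
    """ASES total: pain (0-50 points) plus function (0-50 points)."""
    if not 0 <= pain_vas <= 10:
        raise ValueError("pain VAS must be between 0 and 10")
    if len(function_items) != 10:
        raise ValueError("expected 10 function items")
    if any(not 0 <= score <= 3 for score in function_items):
        raise ValueError("each function item is assumed to be scored 0-3")
    pain_component = (10 - pain_vas) * 5               # no pain = 50 points
    function_component = sum(function_items) * 5 / 3   # 30 raw points = 50
    return pain_component + function_component


best = ases_score(0, [3] * 10)      # no pain, full function
worst = ases_score(10, [0] * 10)    # maximal pain, no function
typical = ases_score(4, [2] * 10)   # illustrative patient
print(best, worst, round(typical, 1))
```

Because a single pain item carries half of the total, a one-point change on the pain VAS moves the score by 5 points.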

MCID: Approximately 6-7 points.

Advantages: Brief, simple, sensitive to shoulder pathology.

Limitations: Heavy weighting on pain may not capture all aspects of shoulder function.

ASES is widely used in shoulder surgery research.

ODI (Oswestry Disability Index)

Description: Disease-specific measure for low back pain and disability.

Domains (10 items):

  • Pain intensity
  • Personal care
  • Lifting
  • Walking
  • Sitting
  • Standing
  • Sleeping
  • Sex life
  • Social life
  • Traveling

Scoring: 0-100% (0 = no disability, 100 = bed-bound)
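
The percentage score can be sketched as below. This assumes the usual ODI convention that each of the 10 sections is scored 0-5 and that unanswered sections are dropped from the denominator (assumptions not stated above); the function name `odi_percent` is hypothetical.

```python
def odi_percent(section_scores):
    """ODI as a percentage; use None for an unanswered section."""
    if len(section_scores) != 10:
        raise ValueError("expected 10 sections")
    answered = [s for s in section_scores if s is not None]
    if not answered:
        raise ValueError("no sections answered")
    if any(not 0 <= s <= 5 for s in answered):
        raise ValueError("each section is assumed to be scored 0-5")
    # Maximum possible score is 5 points per answered section
    return 100 * sum(answered) / (5 * len(answered))


complete = odi_percent([2] * 10)
one_missing = odi_percent([3] * 9 + [None])  # e.g. sex life left blank
print(complete, one_missing)
```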

MCID: Approximately 10 points.

Advantages: Gold standard for lumbar spine disability, well-validated.

Limitations: Specific to low back pain - different measure needed for neck (NDI).

ODI is essential for lumbar spine outcome assessment.

Measurement Properties

Validity

Definition: Does the measure assess what it claims to assess?

Types of Validity

Type | Definition | How to Assess | Example
Content Validity | Covers all relevant aspects of construct | Expert panel review, patient input | WOMAC includes pain, stiffness, function for OA
Construct Validity | Correlates with related measures, discriminates from unrelated | Correlation with similar PROMs (convergent), lack of correlation with dissimilar (discriminant) | WOMAC correlates with knee ROM (convergent) but not with mental health scores (discriminant)
Criterion Validity | Correlates with gold standard | Compare to established measure | New knee score correlates with WOMAC

Reliability

Definition: Does the measure give consistent results when condition is stable?

Types of Reliability

Type | Definition | How to Assess | Target
Test-Retest | Same result when repeated in stable patients | Intraclass Correlation Coefficient (ICC) | ICC greater than 0.70
Inter-Rater | Different raters get same result | ICC for clinician-administered measures | ICC greater than 0.70
Internal Consistency | Items within scale measure same construct | Cronbach alpha | Alpha 0.70 to 0.95 (too high suggests redundancy)
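
Internal consistency can be illustrated with a short calculation of Cronbach alpha. This is a sketch using the standard formula, alpha = k/(k-1) × (1 - sum of item variances / variance of total scores); the function name `cronbach_alpha` and the response data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """responses: one list of k item scores per respondent."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    if any(len(r) != k for r in responses):
        raise ValueError("every respondent must answer all items")
    # Sample variance of each item across respondents
    item_variances = [variance([r[i] for r in responses]) for i in range(k)]
    # Sample variance of each respondent's total score
    total_variance = variance([sum(r) for r in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Made-up scores for 5 respondents on a 3-item scale
example = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 4, 5], [0, 1, 1]]
alpha = cronbach_alpha(example)
print(round(alpha, 2))
```

In this made-up data the items move almost in lockstep, so alpha lands above 0.95, the range the table flags as possible redundancy.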

Responsiveness

Definition: Can the measure detect clinically meaningful change over time?

Floor Effect: High proportion (over 15%) score at minimum (worst possible).

  • Problem: Cannot detect worsening in these patients.

Ceiling Effect: High proportion (over 15%) score at maximum (best possible).

  • Problem: Cannot detect improvement in these patients.
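
A quick check for floor and ceiling effects might look like the sketch below. The function name `floor_ceiling` and its `worst`/`best` arguments are hypothetical; passing the scale ends explicitly avoids confusion on measures where lower scores are better.

```python
def floor_ceiling(scores, worst, best, threshold=0.15):
    """Proportion of patients at the worst and best possible scores."""
    if not scores:
        raise ValueError("no scores supplied")
    n = len(scores)
    floor = sum(1 for s in scores if s == worst) / n
    ceiling = sum(1 for s in scores if s == best) / n
    return {
        "floor": floor,
        "ceiling": ceiling,
        "floor_problem": floor > threshold,      # cannot detect worsening
        "ceiling_problem": ceiling > threshold,  # cannot detect improvement
    }


# Made-up post-operative scores on a 0-100 scale where higher = better
scores = [100, 100, 100, 100, 90, 85, 70, 60, 95, 80,
          75, 88, 92, 65, 55, 98, 79, 83, 91, 0]
effects = floor_ceiling(scores, worst=0, best=100)
print(effects)
```

Here 4 of 20 patients (20%) sit at the maximum, so further improvement in that group would go unmeasured.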

Responsiveness Index: Standardized Response Mean (SRM) or Effect Size.

  • SRM greater than 0.8: Large responsiveness (good)
  • SRM 0.5 to 0.8: Moderate responsiveness
  • SRM less than 0.5: Small responsiveness (may miss change)

Understanding responsiveness prevents choosing measures that cannot detect improvement.
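
Both responsiveness indices can be computed from paired scores. The sketch below assumes the usual definitions (SRM = mean change / SD of change; effect size = mean change / SD of baseline); function names and data are hypothetical.

```python
from statistics import mean, stdev


def paired_changes(baseline, follow_up):
    """Follow-up minus baseline for each patient."""
    if len(baseline) != len(follow_up) or len(baseline) < 2:
        raise ValueError("need matching lists with at least two patients")
    return [post - pre for pre, post in zip(baseline, follow_up)]


def standardized_response_mean(baseline, follow_up):
    changes = paired_changes(baseline, follow_up)
    return mean(changes) / stdev(changes)


def effect_size(baseline, follow_up):
    changes = paired_changes(baseline, follow_up)
    return mean(changes) / stdev(baseline)


# Made-up scores on a scale where higher = better
pre_op = [40, 35, 50, 45, 30]
post_op = [60, 50, 75, 55, 55]
srm = standardized_response_mean(pre_op, post_op)
es = effect_size(pre_op, post_op)
print(round(srm, 2), round(es, 2))
```

On scales where lower is better the change is negative, so compare the absolute value with the 0.5 and 0.8 thresholds.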

Minimal Clinically Important Difference (MCID)

What is MCID?

Definition: The smallest change in PROM score that patients perceive as beneficial and would mandate a change in management.

Purpose: Distinguish statistically significant from clinically meaningful change.

How MCID is Determined

Methods:

  1. Anchor-Based: Compare PROM change to external anchor (patient global assessment)

    • "Compared to before surgery, how would you rate your improvement: Much better, Better, Same, Worse?"
    • Calculate MCID as mean change for "Better" group.
  2. Distribution-Based: Use statistical thresholds (0.5 SD, Standard Error of Measurement)

    • MCID = 0.5 × standard deviation
    • Less clinically intuitive than anchor-based.
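
The two approaches above can be sketched side by side. This is an illustration under stated assumptions: the distribution-based estimate uses the SD of baseline scores, and the Standard Error of Measurement uses the standard formula SD × sqrt(1 - reliability). Function names, the anchor labels, and all data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mcid_anchor(changes, anchors, target="Better"):
    """Mean score change among patients who chose the target anchor."""
    selected = [c for c, a in zip(changes, anchors) if a == target]
    if not selected:
        raise ValueError(f"no patients rated themselves '{target}'")
    return mean(selected)


def mcid_half_sd(baseline_scores):
    """Distribution-based estimate: half the baseline standard deviation."""
    return 0.5 * stdev(baseline_scores)


def standard_error_of_measurement(baseline_scores, reliability):
    """SEM from baseline SD and a reliability coefficient such as ICC."""
    return stdev(baseline_scores) * sqrt(1 - reliability)


# Made-up data: score change and global rating for 8 patients
changes = [22, 12, 9, 11, 1, -3, 25, 8]
anchors = ["Much better", "Better", "Better", "Better",
           "Same", "Worse", "Much better", "Better"]
baseline = [40, 35, 50, 45, 30]

anchor_mcid = mcid_anchor(changes, anchors)
half_sd_mcid = mcid_half_sd(baseline)
sem = standard_error_of_measurement(baseline, reliability=0.90)
print(anchor_mcid, round(half_sd_mcid, 2), round(sem, 2))
```

The two methods rarely agree exactly, which is one reason published MCID values for the same PROM differ between studies.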

Clinical Application:

  • If mean improvement = 8 points and MCID = 10 points → Even if statistically significant, the improvement falls below the MCID and is NOT clinically meaningful.
  • If 95% CI = 12 to 18 points and MCID = 10 → Entire CI exceeds MCID → Clinically meaningful improvement.

Always compare treatment effects to MCID, not just p-values.
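
The comparison of a confidence interval with the MCID can be written as a small decision rule. This is a sketch, assuming positive values mean improvement and a two-sided 95% CI for the treatment effect; the function name and return labels are hypothetical.

```python
def interpret_against_mcid(ci_lower, ci_upper, mcid):
    """Return (statistically_significant, clinical_interpretation)."""
    if ci_lower > ci_upper:
        raise ValueError("ci_lower must not exceed ci_upper")
    # Statistically significant when the CI excludes zero
    significant = ci_lower > 0 or ci_upper < 0
    if ci_lower > mcid:
        clinical = "clinically meaningful"  # entire CI exceeds MCID
    elif ci_upper < mcid:
        clinical = "below MCID"             # entire CI falls short of MCID
    else:
        clinical = "uncertain"              # CI crosses the MCID
    return significant, clinical


print(interpret_against_mcid(12, 18, 10))
print(interpret_against_mcid(5, 11, 10))
print(interpret_against_mcid(1, 6, 10))
```

The middle case, a CI of 5 to 11 points against an MCID of 10, is statistically significant yet clinically uncertain, which is why the p-value alone is not enough.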

Clinical Application and Relevance

Choosing the Right PROM

Joint-specific for sensitivity (WOMAC for THA). Generic for cross-disease comparison and population norms (SF-36). Use both when possible to capture joint-specific and overall health.

Interpreting PROM Data

Compare change to MCID, not just statistical significance. Check floor/ceiling effects - over 15% suggests measure may not detect change. Report mean change AND proportion exceeding MCID.
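
Reporting both the mean change and the proportion exceeding the MCID might look like the sketch below. It assumes positive change means improvement and counts a patient as a responder when the change is at least the MCID; the function name and data are hypothetical.

```python
from statistics import mean


def responder_rate(changes, mcid):
    """Proportion of patients whose improvement reaches the MCID."""
    if not changes:
        raise ValueError("no change scores supplied")
    return sum(1 for c in changes if c >= mcid) / len(changes)


# Made-up change scores for 10 patients, MCID assumed to be 10 points
patient_changes = [22, 12, 9, 11, 1, -3, 25, 8, 14, 10]
mean_change = mean(patient_changes)
rate = responder_rate(patient_changes, mcid=10)
print(mean_change, rate)
```

A mean close to the MCID can hide a mix of strong responders and non-responders, so the proportion adds information the mean cannot.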

Registry Requirements

AOANJRR and many registries require PROMs. Pre-operative baseline and post-operative follow-up (1 year, 5 year). Allows benchmarking and quality improvement.

Value-Based Care

Payers increasingly link reimbursement to PROMs. Demonstrating patient-reported improvement justifies procedures. PROMs essential for value-based contracts.

Evidence Base

WOMAC Measurement Properties

3
Bellamy N, Buchanan WW, Goldsmith CH, et al • Journal of Rheumatology (1988)
Key Findings:
  • WOMAC developed and validated for hip and knee osteoarthritis
  • 24-item questionnaire assessing pain, stiffness, physical function
  • Excellent test-retest reliability (ICC greater than 0.90)
  • Good construct validity (correlates with other arthritis measures)
  • Responsive to change after arthroplasty
Clinical Implication: WOMAC is the gold standard PROM for hip and knee osteoarthritis and arthroplasty research.
Limitation: Designed for arthritis - less applicable to trauma or ligament injuries.

MCID for Common Orthopaedic PROMs

3
Copay AG, Subach BR, Glassman SD, et al • Spine Journal (2007)
Key Findings:
  • Systematic review of MCID values for musculoskeletal PROMs
  • SF-36 PCS: MCID approximately 5 points
  • WOMAC: MCID 10-15% of scale (10-15 points on 100-point scale)
  • VAS Pain: MCID 15-20mm on 100mm scale
  • DASH: MCID 10-15 points
Clinical Implication: Use validated MCID values to interpret clinical significance of PROM changes, not just statistical significance.
Limitation: MCID may vary by population, disease severity, and intervention - use disease-specific values when available.

Floor and Ceiling Effects in PROMs

5
Terwee CB, Bot SD, de Boer MR, et al • Journal of Clinical Epidemiology (2007)
Key Findings:
  • Floor or ceiling effects over 15% considered problematic
  • Indicates measure cannot detect worsening (floor) or improvement (ceiling)
  • Reduces responsiveness and statistical power
  • Should report floor/ceiling effects when validating PROMs
  • Consider alternative measure if effects exceed 15%
Clinical Implication: Check for floor/ceiling effects when selecting PROMs. High effects limit ability to detect change and require larger sample sizes.
Limitation: Acceptable threshold (15%) is somewhat arbitrary - context-dependent.

Exam Viva Scenarios

Practice these scenarios to excel in your viva examination

VIVA SCENARIO: Standard

Scenario 1: PROM Selection

EXAMINER

"You are planning an RCT comparing cemented vs uncemented THA. What outcome measures would you use and why?"

EXCEPTIONAL ANSWER
For a THA trial, I would use a combination of joint-specific and generic PROMs to comprehensively assess outcomes.

My primary outcome would be the WOMAC score, which is the gold standard patient-reported measure for hip osteoarthritis and arthroplasty. WOMAC assesses three domains: pain, stiffness, and physical function, with 24 items total. It is highly validated, reliable, and responsive to change after THA, with an MCID of approximately 10-15 points on a 100-point scale.

As a secondary outcome, I would include the SF-36, particularly the Physical Component Summary (PCS), to assess overall health status and quality of life. This allows comparison to population norms and captures health benefits beyond the hip joint. I would also measure the EQ-5D to generate utility scores for cost-effectiveness analysis, which is increasingly required by payers and health systems for value-based care.

Additionally, I would include objective measures such as range of motion and radiographic assessment of component positioning and osseointegration.

I would collect PROMs at baseline (pre-operative), and post-operatively at 6 weeks, 3 months, 6 months, 1 year, and annually thereafter. This comprehensive approach captures patient-reported outcomes (WOMAC), overall health (SF-36), economic value (EQ-5D), and clinical success (ROM, radiographs).
KEY POINTS TO SCORE
Primary outcome: WOMAC (joint-specific, most sensitive to hip pathology)
Secondary outcomes: SF-36 PCS (generic health), EQ-5D (utility for cost-effectiveness)
Include objective measures: ROM, radiographs
Timing: Baseline and multiple post-op timepoints (6 weeks, 3 months, 6 months, 1 year, annually)
COMMON TRAPS
✗ Using only generic measures (less sensitive to joint-specific change)
✗ Not mentioning MCID or how to interpret clinical significance
✗ Not including economic outcome measure (EQ-5D for utility)
✗ Not specifying outcome timing (baseline and follow-up intervals)
LIKELY FOLLOW-UPS
"What is the MCID for WOMAC and why does it matter?"
"What is the difference between WOMAC and SF-36?"
"How would you handle missing PROM data in your analysis?"
VIVA SCENARIO: Challenging

Scenario 2: MCID Interpretation

EXAMINER

"An RCT of 200 patients found that a new rehab protocol improved WOMAC score by a mean of 8 points (95% CI 5 to 11 points, p = 0.001) compared to the standard protocol. The MCID for WOMAC is 10 points. How do you interpret this result?"

EXCEPTIONAL ANSWER
This result requires careful interpretation because it demonstrates statistical significance without clear clinical significance. Let me analyze each component.

First, statistical significance: p = 0.001 is highly statistically significant, well below the conventional 0.05 threshold, and the 95% CI of 5 to 11 points excludes zero, confirming a real difference exists.

Second, effect size: the mean improvement is 8 points.

Third, clinical significance: the MCID for WOMAC is 10 points, meaning patients perceive a 10-point change as meaningful benefit. The observed 8-point improvement falls short of this threshold.

Fourth, confidence interval assessment: the CI ranges from 5 to 11 points. The lower bound (5 points) is well below the MCID, but the upper bound (11 points) exceeds it. This creates uncertainty - the true effect could be clinically meaningful (if at the upper end of CI) or not (if at the lower end).

Interpretation: While statistically significant, this result is clinically uncertain. The point estimate of 8 points suggests the benefit may not be meaningful to most patients. However, the CI crossing the MCID threshold indicates we cannot definitively rule out a clinically important effect.

Recommendation: I would interpret this as weak evidence for clinical benefit. To make a confident recommendation, we would need a larger study with adequate power to detect a 10-point difference, which would narrow the CI and clarify whether the true effect exceeds the MCID. Additionally, I would analyze the proportion of patients who achieved the MCID - if 60-70% of patients improved by at least 10 points, this might be clinically worthwhile despite the mean being below MCID.
KEY POINTS TO SCORE
Statistical significance (p = 0.001) does NOT equal clinical significance
Mean improvement (8 points) below MCID (10 points) suggests limited clinical benefit
CI (5-11) crosses MCID, creating uncertainty about clinical importance
Need larger study to narrow CI and clarify if effect exceeds MCID
Alternative analysis: proportion of patients achieving MCID
COMMON TRAPS
✗ Concluding treatment is effective based solely on p less than 0.05
✗ Not comparing effect size to MCID
✗ Not interpreting confidence interval in relation to MCID threshold
✗ Not suggesting larger study or alternative analyses
LIKELY FOLLOW-UPS
"What is the Minimal Clinically Important Difference (MCID)?"
"How would you design a study specifically powered to detect the MCID?"
"What other analyses could help interpret this borderline result?"

MCQ Practice Points

PROM Types

Q: What is the difference between a generic PROM (SF-36) and a joint-specific PROM (WOMAC)? A: Generic PROMs assess overall health status across any condition, allow comparison between diseases and to population norms, but are less sensitive to specific joint pathology. Joint-specific PROMs are highly sensitive to pathology in a single joint but cannot compare across different joints or to general population.

MCID Importance

Q: Why is MCID important when interpreting PROM changes? A: MCID defines clinically meaningful change - the smallest improvement that patients perceive as beneficial. Statistically significant changes (p less than 0.05) may not exceed MCID and thus not be clinically important. Always compare observed change to MCID, not just p-value.

Floor and Ceiling Effects

Q: What is a ceiling effect and why does it matter? A: Ceiling effect occurs when high proportion (over 15%) of patients score at maximum (best possible score). This prevents the measure from detecting improvement in these patients and reduces responsiveness. Choose a different measure or add a more challenging domain if ceiling effects are problematic.

Validity vs Reliability

Q: What is the difference between validity and reliability? A: Validity = Does the measure assess what it claims to assess? (accuracy). Reliability = Does the measure give consistent results when repeated in stable patients? (precision). A measure can be reliable but not valid (consistently wrong), but cannot be valid without being reliable.

Test-Retest Reliability

Q: What ICC value indicates good test-retest reliability? A: ICC greater than 0.70 indicates acceptable reliability. ICC (Intraclass Correlation Coefficient) ranges 0-1. ICC greater than 0.90 is excellent, 0.70-0.90 is good, less than 0.70 is poor. This measures consistency when same patient completes PROM twice with stable condition.

Responsiveness Measures

Q: How is responsiveness quantified? A: Standardized Response Mean (SRM) or Effect Size. SRM = mean change / SD of change. SRM greater than 0.8 = large responsiveness (good), 0.5-0.8 = moderate, less than 0.5 = small (may miss clinically important change). Responsiveness is essential for detecting treatment effects.

Australian Context

The Australian Orthopaedic Association National Joint Replacement Registry (AOANJRR) systematically collects PROMs for hip and knee arthroplasty, providing benchmarking data across Australian hospitals. The registry uses the Oxford Hip Score (OHS) and Oxford Knee Score (OKS) as primary joint-specific outcome measures, with EQ-5D for health utility assessment.

The ACSQHC (Australian Commission on Safety and Quality in Health Care) Clinical Care Standards increasingly incorporate PROM collection as quality indicators. Pre-operative and post-operative PROM collection at standardized intervals (baseline, 6 months, 1 year, 5 years) allows meaningful comparison across institutions and surgeons.

Australian validation studies have confirmed the measurement properties (validity, reliability, responsiveness) of commonly used PROMs in the Australian population, supporting their use in clinical practice and research. The AOANJRR PROM data demonstrates that contemporary arthroplasty procedures achieve mean improvements exceeding the MCID for the majority of patients.

Management Algorithm

Figure: Management algorithm for outcome measures and PROMs. Credit: OrthoVellum

High-Yield Exam Summary

Common Orthopaedic PROMs

  • Generic: SF-36 (PCS/MCS, 0-100, higher better), EQ-5D (utility 0-1)
  • Hip/Knee: WOMAC (pain/stiffness/function, 0-96 or 0-100, lower or higher better depending on version)
  • Upper Extremity: DASH (0-100, 0 = no disability), QuickDASH (11 items)
  • Shoulder: ASES (0-100, higher better), Constant score
  • Spine: ODI (Oswestry 0-100%, lower better), NDI (Neck Disability)

MCID Values

  • SF-36 PCS/MCS: MCID approximately 5 points
  • WOMAC: MCID 10-15 points (on 100-point scale)
  • DASH: MCID 10-15 points
  • VAS Pain: MCID 15-20 mm (on 100 mm scale)
  • Always compare treatment effect to MCID for clinical significance

Measurement Properties

  • Validity = Does it measure what it claims? (content, construct, criterion)
  • Reliability = Consistent results? (test-retest ICC greater than 0.70, Cronbach alpha 0.70-0.95)
  • Responsiveness = Detects change? (SRM greater than 0.8 = large, less than 15% floor/ceiling effects)
  • Floor effect = Too many at minimum (cannot detect worsening)
  • Ceiling effect = Too many at maximum (cannot detect improvement)

PROM Selection

  • Joint-specific for sensitivity (WOMAC for THA trial)
  • Generic for cross-disease comparison and population norms (SF-36)
  • Utility measure for cost-effectiveness (EQ-5D)
  • Use combination: Joint-specific (primary) + Generic (secondary)
  • Check floor/ceiling effects (over 15% problematic)

Interpreting PROM Data

  • Compare mean change to MCID, not just p-value
  • Check whether the 95% CI lies entirely above the MCID threshold
  • Report proportion of patients achieving MCID
  • Wide CI crossing MCID = uncertain clinical significance
  • Large sample with trivial effect (below MCID) = not clinically important

Clinical Application

  • AOANJRR and registries require PROMs (baseline and follow-up)
  • Value-based care links reimbursement to PROM improvement
  • Statistical significance ≠ Clinical significance
  • Generic vs Specific trade-off: Comparison vs Sensitivity
  • Pre-specify primary PROM and timing in study protocol