Bauer et al. (2019) on Implementation Science
Implementation Science: What Is It and Why Should I Care?
Mark S. Bauer & JoAnn Kirchner · Psychiatry Research · 2019
Establishing the effectiveness of a clinical innovation is not sufficient to guarantee its uptake into routine use. Implementation science addresses the contextual factors that determine whether and how evidence-based practices actually reach patients.
The Motivating Problem
The authors open with a striking case study: a well-funded, multi-site randomized controlled trial of a Collaborative Chronic Care Model (CCM) for bipolar disorder showed significant improvements in mood episodes, quality of life, and social function at no extra cost. The intervention was endorsed by national guidelines and listed on SAMHSA’s evidence registry. Yet within one year of the study’s end, all 15 participating sites had abandoned the CCM and returned to treatment as usual. The question this paper answers: why does this keep happening, and what can be done about it?
The Implementation Gap
Classic research findings paint a sobering picture: clinical innovations typically take 17 to 20 years to enter routine practice, and fewer than half ever achieve general use. Chalmers and Glasziou (2009) estimate that 80% of medical research spending fails to generate meaningful public health impact. The problem is not new to the digital age; the first observation that citrus cures scurvy was made in 1601, an RCT was conducted in 1747, yet the British Navy did not adopt routine citrus use until 1795 and the merchant marine not until 1865. Evidence alone has never been sufficient.
What Implementation Science Is
Implementation science is defined as the scientific study of methods to promote the systematic uptake of research findings and other evidence-based practices into routine practice, with the aim of improving quality and effectiveness of health services (Eccles & Mittman, 2006). Its focus is not on proving that a clinical innovation works, but on identifying the factors that affect its uptake and developing strategies to overcome barriers across multiple levels of context: individual patients, providers, organizations, communities, and policy environments.
The Biomedical Research Pipeline (Three Models)
The authors conceptualize biomedical research as a pipeline from concept development to public health impact, depicted across three evolving models. The first (old) model assumed efficacy evidence alone was sufficient. The second model extended the pipeline to include effectiveness research prioritizing external validity in real-world settings. The third and current model adds implementation science as the stage that specifically tests strategies for achieving routine uptake, actively engaging with rather than controlling or ignoring context.
Research Pipeline (Simplified)
Does it work at all?
Internal validity
External validity
Uptake & context
Routine use
■ Pipeline & Validity ■ Implementation Core ■ Context & Adoption ■ Related Fields
Pipeline & Validity Concepts
Implementation Science Core Concepts
Context & Adoption
Related Fields & Distinctions
Implementation trials differ from clinical trials at the level of the hypothesis: rather than contrasting health outcomes, they test strategies to increase uptake and sustainability of an evidence-based innovation in real-world settings.
Trial Type Comparison (Table 1 from the paper)
The authors use intranasal ketamine for depression as a concrete illustrative example across all three trial types.
| Dimension | Efficacy | Effectiveness | Implementation |
|---|---|---|---|
| Hypothesis | Ketamine beats control condition | Ketamine beats control in real-world settings | A multifaceted strategy increases ketamine use vs. education alone |
| Setting | Academic medical centers or closely affiliated sites | More typical clinical sites where the intervention would be used | More typical clinical sites; resembles effectiveness setting |
| Population | Exclusions for psychosis, bipolar, anxiety; cooperative subjects selected | Include most comorbidities; minimize exclusion criteria | Unit of observation may be patients, providers, or entire clinics |
| Intervention fidelity | Trained to criterion; closely monitored | Trained to criterion; QI-type monitoring as in usual practice | Monitor and intervene; accommodate adaptations that preserve core components |
| Outcomes | Extensive health outcome battery | Focused, efficient battery (less research tolerance) | Uptake measures; health outcomes may supplement |
| Healthcare context | Control context at all costs | Work within typical conditions | Work within typical conditions and actively intervene to improve uptake |
| Research support | “Crypto-case management”: close follow-up and outreach | Some research support, firewalled to prevent artificial engagement | Support only for implementation tasks; light-touch remote assessments |
| Validity emphasis | Internal >> external | External > internal | Implementation strategy may be modified mid-course to maximize uptake while maintaining fidelity |
Key Methodological Distinctions
Context as the Variable, Not the Nuisance
- Efficacy trials control context to isolate treatment effects.
- Effectiveness trials tolerate context in the interest of generalizability.
- Implementation trials actively engage with and intervene in context to improve uptake.
- Researcher involvement at sites is drastically reduced in implementation trials to avoid distorting natural adoption dynamics.
Multi-Level Unit of Observation
- Clinical trials observe individual patients (subject-level).
- Implementation trials may observe patients, providers, clinics, facilities, organizations, communities, or policy environments.
- This multi-level focus requires multi-disciplinary teams: clinical scientists, social scientists, economists, systems engineers, and health services researchers.
Formative Evaluation
- In efficacy trials, mid-course modifications are minimized to protect invariance.
- In implementation trials, planned mid-course adjustments are a methodological feature, not a threat to validity.
- These adjustments may be specific study hypotheses (“what works, for whom, and under what conditions”).
- Stetler et al. (2006) is the foundational reference for this approach.
Operational Partners as Co-Researchers
- In clinical research, healthcare system leaders and staff play a primarily permissive role.
- In implementation research, they are full partners from study design through analysis, because the research intervenes directly in structures they control.
- Cultural gaps between researchers and operational colleagues must be actively managed (Kilbourne et al., 2012).
Why Audit and Feedback Alone Fails
- A Cochrane meta-analysis (Ivers et al., 2012) found that audit and feedback increased target provider behaviors by only 4.3% (range 0.5 to 16%).
- This finding demonstrates that education and monitoring approaches do not reliably change provider behavior.
- Implementation science moves beyond these passive strategies to test active implementation strategies that can be rigorously evaluated.
Eccles & Mittman (2006) — Defining Implementation Science
Definition FoundationalRogers (1962) — Diffusion of Innovations
Theory HistoricalMosteller (1981) — Innovation and Evaluation
Epidemiology Evidence BaseMorris et al. (2011) — “The Answer Is 17 Years”
Time Lag Evidence BaseDamschroder et al. (2009) — CFIR
Framework FoundationalCurran et al. (2012) — Hybrid Designs
Methods DesignIvers et al. (2012) — Audit and Feedback Meta-Analysis
Cochrane LimitationStetler et al. (2006) — Formative Evaluation
Methods QUERICenturies of evidence show that proving an intervention works is not enough to get it into practice. Implementation science is the scientific discipline that closes the gap between evidence and uptake, by studying and actively modifying the contextual factors that determine whether effective innovations actually reach patients.
-
The Implementation Gap Is Structural, Not Incidental
It takes 17 to 20 years on average for clinical innovations to reach routine practice, and fewer than half ever do. Eighty percent of medical research spending may fail to generate public health impact. This problem predates the digital era and is driven by contextual factors, not by weaknesses in the innovations themselves.
-
Effectiveness Is Necessary but Not Sufficient
Even well-funded, multi-site RCTs producing compelling results at no extra cost do not guarantee adoption. The CCM for bipolar disorder case illustrates this vividly: all 15 sites returned to usual care within a year of a landmark positive trial. Moving clinical research into real-world settings (effectiveness trials) does not, by itself, solve the uptake problem.
-
Implementation Science Has a Distinct Hypothesis Structure
Implementation trials do not ask “does this intervention produce better outcomes?” They ask “which strategy best increases uptake and sustainability of this evidence-based innovation?” The unit of observation may be providers, clinics, or entire organizations, and the outcome is adoption, not clinical health improvement.
-
Context Is the Variable, Not the Nuisance
Where efficacy trials control context and effectiveness trials tolerate it, implementation trials actively engage with and intervene in context. This requires multi-level investigation across patient, provider, organization, community, and policy environments, and demands collaboration with social scientists, economists, and systems engineers alongside clinicians.
-
Operational Partners Must Be Full Co-Investigators
Healthcare leaders, administrators, and clinical staff are not merely gatekeepers in implementation research. They must be full partners from study design to analysis, because the research intervenes directly in systems they control and understand. An innovation is implemented because of them, not in spite of them, and cultural barriers between researchers and practitioners must be actively addressed.
-
Formative Evaluation Transforms Mid-Course Adaptation from Threat to Tool
In efficacy trials, any change to protocol threatens internal validity. In implementation trials, planned mid-course adjustments are a designed feature. Formative evaluation integrates real-time data to adapt implementation strategies in order to optimize uptake, and the adaptations themselves can become specific study hypotheses, enabling progressive learning throughout the trial.
