How AI closes biopharma R&D workflow loops


Biopharma innovation has delivered immense societal value, from vaccines that have halted global epidemics to cell therapies that have put certain cancers in check. But the economics of producing these breakthroughs are becoming unsustainable. Nearly 70 percent of R&D spending is concentrated in clinical development, clinical success rates hover around 13 percent for assets that enter Phase I trials, and the cost per successful new molecular entity has risen from approximately $2.5 billion in 2016 to $4 billion today.¹

Faced with these mounting pressures, industry leaders are increasingly turning to AI, with early applications already showing real promise. AI-enhanced genetic evidence can double or triple the current probability of success²; AI-driven molecule design has improved binding performance and shortened lead optimization cycles³; and AI-enabled trial optimization has increased probability of success while reducing time to approval. These applications, combined with operational efficiency improvements in clinical performance, have materially shortened the path from lead identification to investigational new drug submission and reduced late-stage development timelines by as much as 12 months.⁴

Yet the prevailing R&D operating model cannot fully capture the benefits of modern AI. Machine learning (ML), causal modeling, generative AI, and agentic AI can now accelerate decisions, reduce cycle times, and reshape core inflection points in R&D. The prevailing model, however, remains fundamentally linear, advancing through stage gates that move work forward but rarely create systematic feedback across decisions. Accelerating individual steps may improve efficiency, but it does not compound learning.

Extracting the full value from AI requires a structural redesign: organizing R&D around five connected decision points, each operating as a closed loop⁵ in which data, decisions, and outcomes reinforce one another. The ambition is not simply to deploy AI tools or replace human judgment, but to rethink how leaders and clinicians can exploit a richer trove of accumulated insights to make faster, better-informed decisions where it matters most.

A successful R&D program turns on decisions shaped by the outputs of five interconnected stages (Exhibit 1):

understanding patients and disease biology
identifying and validating drug targets
discovering and optimizing therapeutic candidates
designing and executing clinical trials
maximizing the patient impact of approved therapies

Exhibit 1: A closed-loop, AI-powered pharma R&D model enables richer insights and faster decisions.

Realizing the potential of AI-powered decision loops requires more than deploying isolated tools and technologies. It demands clear decision rights, strong accountability, and an operating model built for continuous learning rather than sequential handoffs. In this article, we outline the five loops, the AI capabilities that power them, the organizational conditions required to connect them, and the anchors needed to establish them as a durable source of competitive advantage. Properly implemented, decision loops can create a high-performance R&D delivery engine—but only if the infrastructure, governance, and organizational capabilities are in place to scale performance improvements across the portfolio.

The case for closed-loop decision points

Drug discovery has always depended on iteration—forming hypotheses, testing them, and refining them. Yet in most R&D organizations, insight does not reliably travel upstream. When a trial fails because the patient population was too broadly defined, biopharma organizations may not have a systematic mechanism to ensure that the lesson reshapes target selection. When manufacturability constraints emerge late, they may likewise fail to inform earlier molecule design.

By contrast, a closed-loop R&D model is designed for continuous learning across experimentation, evidence generation, and decision-making. This architecture closes the circuit: Every pivotal decision generates data that informs the next and refines the one that preceded it.

The technology stack for AI-powered decision loops

The five decision loops described in this article are powered by four AI technologies—each described here in this glossary—that work in concert. Each of them addresses a distinct challenge in the R&D process. Together, they form the orchestrated stack that makes closed-loop learning practical.

Machine learning: Finding patterns in complex biological data

Machine learning (ML) analyzes large, heterogeneous data sets, such as genomics, imaging, electronic health records, and laboratory results, to surface patterns that inform R&D decisions. ML models are typically trained for specific, bounded tasks: In patient characterization, ML identifies disease subtypes and candidate biomarkers from real-world clinical data; in candidate screening, it predicts compound activity and biological behavior, enabling teams to prioritize the most promising molecules before committing laboratory resources. ML forms the analytical backbone of the loops; it is the layer that continuously processes incoming data and translates it into signals that sharpen the next decision.

Causal modeling: Knowing which variables drive outcomes

Standard ML identifies correlations. Causal modeling goes further, distinguishing the variables that cause outcomes from those that merely accompany them. In target selection, this distinction is consequential: Many associations observed in patient data reflect disease biology without driving it, and pursuing them as targets has been a significant source of late-stage attrition. Causal models evaluate whether a target gene genuinely influences disease progression, which is the difference between a valid drug target and an incidental biomarker. In clinical development, causal analytics identify which patient characteristics most strongly predict treatment response, enabling tighter trial designs and more efficient studies.
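To make the correlation-versus-cause distinction concrete, here is a minimal sketch in Python. It is an illustration only, built on a synthetic cohort invented for this example: disease severity drives both a hypothetical biomarker and the outcome, while the biomarker itself has no effect. The function names (`make_cohort`, `naive_effect`, `adjusted_effect`) are assumptions, and real causal analyses are far more involved than this simple stratification.

```python
from statistics import mean

def make_cohort():
    """Synthetic cohort (invented data): severity drives biomarker AND outcome."""
    rows = []
    # (severity, biomarker, patients): the biomarker is more common in severe disease
    for severity, biomarker, n in [(0, 0, 80), (0, 1, 20), (1, 0, 20), (1, 1, 80)]:
        for i in range(n):
            noise = 1.0 if i % 2 == 0 else -1.0      # balanced, zero-mean noise
            outcome = 10.0 + 5.0 * severity + noise  # biomarker has NO causal effect
            rows.append((severity, biomarker, outcome))
    return rows

def naive_effect(rows):
    """Raw difference in mean outcome between biomarker-high and biomarker-low patients."""
    high = mean(o for _, b, o in rows if b == 1)
    low = mean(o for _, b, o in rows if b == 0)
    return high - low

def adjusted_effect(rows):
    """Stratify on the confounder, then average stratum differences by stratum size."""
    total = len(rows)
    effect = 0.0
    for s in sorted({r[0] for r in rows}):
        stratum = [r for r in rows if r[0] == s]
        effect += naive_effect(stratum) * len(stratum) / total
    return effect

rows = make_cohort()
print("naive association:", naive_effect(rows))   # biomarker looks predictive
print("severity-adjusted:", adjusted_effect(rows))  # association disappears
```

In this toy cohort the raw comparison suggests the biomarker matters, but the apparent effect vanishes once severity is held fixed. That is the pattern the text describes as an incidental biomarker rather than a valid target.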

Generative AI and foundation models: Designing molecules and hypotheses at scale

Generative AI expands the hypothesis space at every loop; the other three capabilities determine which hypotheses are worth pursuing. Where ML analyzes what exists, generative AI creates what does not yet exist, including novel molecular structures, experimental hypotheses, and synthetic routes. In small-molecule discovery, generative models design compounds optimized for binding affinity, selectivity, and safety, enabling teams to explore a far larger solution space than conventional screening allows. In biologics, generative AI is increasingly applied to antibody and protein engineering, including de novo sequence generation, affinity maturation, epitope targeting, and developability optimization. Foundation models trained on large-scale protein sequence and structure data enable rapid exploration of antibody variants, prediction of binding and stability, and prioritization of candidates with favorable manufacturability and safety profiles. Beyond molecular design, generative models can synthesize complex patient-level outputs, including longitudinal disease trajectories and multimodal data integrating imaging with omics, opening new possibilities for trial simulation and patient stratification.

Agentic AI: Coordinating the research workflow end to end

Agentic AI refers to systems that can plan, reason, and execute multistep tasks with limited human intervention. Rather than answering a single query, an agentic system pursues a goal: It proposes a next step, executes it, evaluates the result, and updates its plan accordingly. In R&D, agentic systems coordinate the other three capabilities—dispatching computational predictions, commissioning laboratory experiments, analyzing results, and revising the research plan in real time. In clinical development, agentic systems automate complex data workflows, monitor trial execution, and generate protocol amendment recommendations for human review (see sidebar, “Embedding agentic AI across the clinical trial life cycle”). When all these components come together, human productivity can climb by an order of magnitude compared with conventional methods. Agentic AI is what makes the closed loop self-sustaining: It is the connective tissue that keeps the system learning rather than requiring human coordination at every handoff.
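The propose, execute, evaluate, and update cycle described above can be sketched as a small control loop. This is a hedged illustration, not a description of any vendor's system: `run_loop`, `LoopState`, and the toy `propose`, `execute`, and `approve` callables are all hypothetical, and the `approve` callback stands in for the human review step.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

@dataclass
class LoopState:
    best_score: float = float("-inf")
    best_candidate: Optional[float] = None
    history: List[Tuple[float, Optional[float]]] = field(default_factory=list)

def run_loop(
    propose: Callable[[LoopState], float],
    execute: Callable[[float], float],
    approve: Callable[[float], bool],
    max_steps: int,
    target: float,
) -> LoopState:
    state = LoopState()
    for _ in range(max_steps):
        candidate = propose(state)              # plan the next step
        if not approve(candidate):              # human-in-the-loop gate
            state.history.append((candidate, None))
            continue
        score = execute(candidate)              # run the step
        state.history.append((candidate, score))
        if score > state.best_score:            # evaluate and update the plan state
            state.best_score, state.best_candidate = score, candidate
        if state.best_score >= target:
            break
    return state

# Toy usage (all values invented): step a single parameter toward an optimum at 3.0.
state = run_loop(
    propose=lambda s: 0.0 if s.best_candidate is None else s.best_candidate + 1.0,
    execute=lambda x: 10.0 - (x - 3.0) ** 2,
    approve=lambda x: x <= 5.0,   # reviewer rejects out-of-range proposals
    max_steps=10,
    target=10.0,
)
print(state.best_candidate, len(state.history))
```

The design choice worth noting is that the approval gate sits inside the loop, so a rejected proposal is logged but never executed, which mirrors the supervised role the article assigns to agentic systems.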

What makes this achievable now is a specific convergence of automation and AI capabilities operating in concert—ML for prediction and pattern recognition, generative AI for hypothesis creation, causal modeling for identifying key outcome drivers, and agentic AI for coordinating workflows—and preconfigured for human-in-the-loop management. (For more on these technologies, see sidebar “The technology stack for AI-powered decision loops.”) Together, these capabilities allow organizations to materially compress these loops—and generate data and insights that can benefit others.

The following examples illustrate the real-world potential of AI-powered loops:

Robin, a multiagent AI system developed by FutureHouse, integrates literature review, hypothesis generation, experimentation, and data analysis to rapidly turn scattered scientific clues into testable hypotheses. The system proposed repurposing the glaucoma drug ripasudil for dry age-related macular degeneration via a mechanism that enhances phagocytosis by the retinal pigment epithelium.⁶ Robin also identified KL001, a circadian-clock modulator, as a promising therapeutic candidate. Both findings were subsequently evaluated in vitro using primary human retinal pigment epithelium cells.
Google DeepMind’s Co-Scientist, another multiagent system, has demonstrated similar capabilities across several biomedical applications. Co-Scientist has identified novel candidates for drug repurposing and synergistic combination therapies for acute myeloid leukemia, several of which demonstrated selective cytotoxicity against leukemia cells in laboratory testing.⁷ The system also proposed new epigenetic targets for liver fibrosis, and it independently generated a mechanistic explanation for antimicrobial resistance gene transfer that matched unpublished wet-lab findings.

The five decision loops—and how AI compresses each one

In the traditional R&D model, the five decisions unfold sequentially. Teams first define the disease and patient population, then identify a target, and then design and optimize a candidate. Only after preclinical success do they move into clinical development, where trial design decisions lock in timelines and costs. Once approved, companies address commercial positioning and manufacturing strategy, often with limited opportunity to revisit earlier assumptions.

The five loops represent a new way to manage these same decision points, using an AI-powered operating model—combining data, AI and automation technologies, simulations, and human oversight (Exhibit 2)—to generate a richer trove of insights from simultaneous processes unfolding across the drug development system. Across all loops, outputs become inputs for other cycles, enabling continuous learning and compression of timelines through integrated AI, data, and orchestration layers.

Exhibit 2: Each loop combines data, simulation, experimentation, and agentic orchestration in new ways of working supervised by scientists.

Understanding patients and disease biology

Historically, characterizing patient heterogeneity has relied on slow, fragmented analyses across clinical, genomic, imaging, and real-world data sets. Based on observed AI-enabled deployments, integrating AI outputs into cross-functional decision-making allows teams to generate and test hypotheses about disease variability in weeks rather than months. This accelerates decisions on target selection, biomarker identification, and early translational intent.

The outputs of this loop are intentionally bounded: ranked hypotheses on disease heterogeneity, early biomarker candidates, and translational constraints that inform target prioritization and modality choice. They do not constitute validated endotypes or formal development strategies. By using live data across the enterprise to explicitly surface uncertainty and confidence levels—a departure from current methods—this loop strengthens downstream decisions while avoiding the false precision that has historically undermined patient stratification.

Identifying and validating drug targets

In this loop, companies can integrate machine learning, generative AI, causal analytics, and agentic orchestration across biological knowledge graphs. In so doing, they can rapidly map mechanistic pathways, prioritize high-confidence targets, and simulate novel target–disease relationships. Target identification builds on patient stratification but does not require full resolution of that data. Signals from the first loop provide human-relevant context, highlighting where biological variability matters and where biomarker differentiation may be required. In AI-powered operating models, a coordinated team of agents works to synthesize evidence and acts as connective tissue across biology, translational, and data science teams. These teams evaluate the signals together, enabling faster and more confident target decisions.

Discovering and optimizing therapeutic candidates

Historically constrained by laboratory throughput, design-make-test-analyze (DMTA) cycles can now accelerate through predictive screening, generative molecule design, causal analysis of phenotypic effects, and agentic screening campaigns. Rather than sequential experimentation, AI-guided design and automated testing operate as an integrated, iterative system.

Timelines are compressed and human productivity increases when AI-guided design, automated experimentation, and scientific judgment are embedded within a unified operating model. The same AI-guided design-test-learn paradigm is now being applied across therapeutic modalities—including antibodies and engineered proteins—enabling iterative optimization through modality-specific workflows built on a common loop architecture.⁸

Designing and executing clinical trials

AI enhances this loop through predictive trial simulation, integration of trial and real-world evidence, optimized subpopulation identification, and continuous reassessment of protocol assumptions. Agentic systems generate scenario analyses and amendment recommendations for review within established governance processes.

At the frontier, ML, generative AI, and foundation models are increasingly being applied to create digital twins of trials and patients—currently deployed in pilot and exploratory settings—that can produce additional in silico evidence, with the potential to substantially reduce cost and duration at scale. Capturing the full value of agentic AI requires embedding the technology directly into protocol design, trial operations, and decision governance while preserving human ownership of clinical judgment.

Agentic systems function as supervised decision support tools; they do not autonomously modify approved protocols or make regulatory decisions. Where AI influences trial-critical decisions, or is embedded in the trial itself, these systems must meet applicable quality, validation, auditability, and oversight standards under the guidance of regulatory and compliance authorities. As regulatory frameworks evolve, the scope of these systems may expand within clearly defined and approved boundaries.

Maximizing patient impact

Rather than treating approval as an end point, organizations can continuously evaluate indication expansion, patient segmentation, competitive positioning, and portfolio fit. By integrating real-world data, trial evidence, and advanced analytics, companies move beyond static positioning toward ongoing optimization. Clinical positioning and chemistry, manufacturing, and controls (CMC) strategy are integrated within the same loop. Early manufacturing decisions materially affect scalability, cost, and differentiation; when aligned with clinical intent, they reduce late-stage surprises and accelerate launch readiness. Asset-level insights also feed back into earlier loops, shaping target prioritization and portfolio investment decisions.

Loops in action: AI-powered R&D workflow scenarios

To illustrate how these loops function in practice, we have mapped a range of AI-powered R&D workflow scenarios and selected two decision points—lead optimization and clinical trial design—for deeper examination.

Embedding agentic AI across the clinical trial life cycle

Agentic AI is already functioning as a supervised orchestration layer in clinical development, extending established practices such as risk-based monitoring, trial simulation, and evidence synthesis into a continuously learning execution system. Throughout, the human governance structure remains intact: Agentic systems generate recommendations and analyses for review, operating within the boundaries defined by clinical and regulatory governance.

The following workflows illustrate the scope of this role across the full arc of a trial:

Enrollment and assumption monitoring. Agentic systems continuously evaluate blinded and operational data to stress-test enrollment velocity, dropout rates, and control-arm assumptions. Rather than surfacing problems only at scheduled reviews, they flag emerging risks in real time, giving clinical and operational teams the lead time to intervene before a trial is structurally compromised.
Protocol amendment simulation. When assumptions shift, agentic systems simulate alternative designs—such as eligibility adjustments, enrichment strategies, and end-point refinements—and quantify the impact of each on statistical power, timelines, and cost. Teams receive evidence-based options rather than having to generate them manually under pressure.
Site and country strategy. By synthesizing site performance data, recruitment velocity, and real-time operational metrics, agentic systems generate recommendations for site activation, rebalancing, or deactivation through standard clinical operations channels, continuously rather than at fixed intervals.
Integrated evidence assembly. Agentic systems support continuous assembly of evidence packages that combine randomized controlled trial outputs, real-world data, and literature synthesis, keeping regulatory submissions current and reducing the compression that typically occurs in the final months before filing.
Regulatory review simulation. Before submission, agentic systems can function as virtual reviewers—evaluating draft content against past agency feedback and standards, and flagging risky language, missing evidence, and structural gaps. Teams resolve foreseeable problems before they reach the agency, reducing rework, accelerating agency interaction, and improving first-cycle approval rates. Insights from this step also feed back into earlier loops, informing how future trials are designed and how patient populations are characterized from the outset.
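
As a rough illustration of the enrollment monitoring workflow in the list above, the following Python sketch projects trial enrollment duration from the observed rate and flags a risk when the projection overruns the plan. The data structure, the 10 percent tolerance, and the example figures are assumptions made for this sketch; a production system would draw on far richer operational data.

```python
from dataclasses import dataclass

@dataclass
class EnrollmentSnapshot:
    weeks_elapsed: int
    enrolled: int
    target: int
    planned_weeks: int

def projected_weeks(s: EnrollmentSnapshot) -> float:
    """Total weeks needed to reach the target at the observed enrollment rate."""
    rate = s.enrolled / s.weeks_elapsed   # patients per week so far
    return s.target / rate

def flag_risk(s: EnrollmentSnapshot, tolerance: float = 0.10) -> bool:
    """Flag when the projection exceeds the plan by more than the tolerance."""
    if s.weeks_elapsed == 0 or s.enrolled == 0:
        return True  # nothing to project from; escalate for human review
    return projected_weeks(s) > s.planned_weeks * (1 + tolerance)

# Hypothetical trial: 300 patients planned over 52 weeks, 40 enrolled after 10 weeks.
slow = EnrollmentSnapshot(weeks_elapsed=10, enrolled=40, target=300, planned_weeks=52)
print(projected_weeks(slow), flag_risk(slow))
```

Running a check like this on every data refresh, rather than at scheduled reviews, is one simple reading of what flagging emerging risks in real time could mean in practice.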

As regulators grow more comfortable with adaptive trial designs and AI-supported workflows, agentic AI systems will increasingly orchestrate trials that self-optimize within predefined boundaries—adjusting cohorts, enriching on biomarkers, or reallocating doses according to statistical and operational rules established upfront through governance and regulatory processes. Humans retain full oversight and accountability throughout, while AI accelerates execution of decisions and compresses the latency between signal detection and action.

In both scenarios, agentic AI operates as a supervised decision support layer (see sidebar “Embedding agentic AI across the clinical trial life cycle”). It generates recommendations and scenario analyses for human review; it does not autonomously modify approved research or clinical protocols or make regulatory decisions.

Scenario A: Discovering and optimizing therapeutic candidates

In an integrated architecture, in silico predictive models assess molecular and biologic properties—such as binding, stability, and immunogenicity—to inform experimental design. Automated laboratory systems generate data at scale, while agentic systems orchestrate the cycle: planning experiments, synthesizing results, and updating hypotheses in real time so that each iteration directly informs the next (Exhibit 3).

Exhibit 3: A fully automated R&D wet lab, paired with simulations and agentic AI orchestration, could unlock exponential gains in productivity. The exhibit steps through five stages:

Historically, scientists in a wet lab operated instruments and captured data. Productivity is limited to what human energy can accomplish.
With the addition of AI point solutions, scientists’ productivity grows.
AI agents create a closed-loop research system, dramatically expanding scientists’ productivity.
As labs reengineer their processes and workflows to fully realize the benefits of agents and automation, productivity leaps again.
Fully automated wet labs, synched across all of R&D, deliver the next step change in productivity.

The result is a continuous DMTA flywheel. Instead of moving through manually coordinated steps, the system learns with every cycle: Low-probability candidates are eliminated earlier, promising molecules are refined faster, and scientific effort shifts toward higher-value decisions. The same orchestration principles extend into clinical development, where agentic systems can monitor assumptions, simulate protocol alternatives, synthesize evidence, and generate structured analyses for review within established governance. Across discovery and development, value emerges from integrating AI capabilities into a coherent architecture in which outputs continuously become inputs.
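
A toy version of this DMTA flywheel can be written in a few lines of Python. The sketch below is an assumption-laden illustration: candidates are reduced to a single integer descriptor, `assay` is a made-up stand-in for a wet-lab measurement, and the predictive model is a simple nearest-neighbor lookup rather than a trained model. It also selects greedily, whereas real campaigns would typically balance exploration against exploitation.

```python
def assay(x: int) -> float:
    """Stand-in for a wet-lab measurement (hypothetical); activity peaks at x = 13."""
    return 100.0 - float((x - 13) ** 2)

def predict(x: int, tested: dict) -> float:
    """Nearest-neighbor surrogate: borrow the activity of the closest tested candidate."""
    nearest = min(sorted(tested), key=lambda t: abs(t - x))
    return tested[nearest]

def dmta(candidates, seeds, batch: int, rounds: int):
    tested = {x: assay(x) for x in seeds}             # initial measurements
    for _ in range(rounds):
        untested = [x for x in candidates if x not in tested]
        if not untested:
            break
        # Design: rank untested candidates by predicted activity (highest first).
        ranked = sorted(untested, key=lambda x: -predict(x, tested))
        # Make and test: measure only the top-ranked batch.
        for x in ranked[:batch]:
            tested[x] = assay(x)
        # Analyze: the enlarged `tested` set sharpens the next round's predictions.
    best = max(tested, key=tested.get)
    return best, tested

best, tested = dmta(range(20), seeds=[0, 19], batch=2, rounds=2)
print(best, len(tested))
```

In this toy run the loop reaches the best candidate after measuring 6 of the 20 options, because each round's results redirect the next round's design. The numbers are artifacts of the invented setup and should not be read as expected real-world gains.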

This architecture delivers impact only when paired with clear decision rights, embedded workflows, and defined human oversight. Without these conditions, even advanced AI risks remaining an isolated analytical asset rather than a driver of loop compression.

Scenario B: Designing and executing clinical trials

The clinical evidence loop illustrates how AI can compress both trial duration and decision cycles. In a closed-loop architecture, predictive models optimize patient selection and trial design; data integration and synthesis combine randomized controlled trial and real-world data; and agentic systems orchestrate operations—continuously monitoring assumptions and generating structured analyses for review. Rather than treating trial design and execution as separate activities, the loop integrates them into a single learning system.

AI can sharpen the assumptions most often tied to trial failure by identifying how patients in the control arm are likely to perform, and whether the treatment effect will hold across different patient subgroups. Probabilistic modeling enables teams to evaluate multiple patient populations in parallel rather than sequentially. And generative AI supports structured literature synthesis and scenario development, allowing teams to assess alternative enrollment, enrichment, and end-point strategies before committing to protocol changes.
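
One simple way to picture evaluating several patient populations side by side is to compute approximate statistical power for each candidate design. The Python sketch below uses a standard normal-approximation formula for a two-arm comparison of means; the scenario names, effect sizes, and sample sizes are invented for illustration and are not drawn from the article.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(effect: float, sd: float, n_per_arm: int, alpha: float = 0.05) -> float:
    """Normal-approximation power for a two-sided, two-arm comparison of means."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # critical value for the test
    se = sd * sqrt(2 / n_per_arm)                # standard error of the difference
    return std_normal.cdf(abs(effect) / se - z_crit)

# Hypothetical design options: a broad population versus a smaller enriched one.
scenarios = {
    "broad population": {"effect": 0.3, "sd": 1.0, "n_per_arm": 100},
    "biomarker-enriched": {"effect": 0.6, "sd": 1.0, "n_per_arm": 50},
}
power_by_scenario = {name: approx_power(**p) for name, p in scenarios.items()}
for name, power in power_by_scenario.items():
    print(f"{name}: power ~ {power:.2f}")
```

Under these assumed inputs the enriched design achieves higher power with half the patients, which is the kind of trade-off that scenario analyses are meant to surface before a protocol change is committed.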

Agentic systems coordinate these capabilities. Based on defined inputs, such as mechanism of action, indication, inclusion and exclusion criteria, and PICO (population, intervention, comparator, and outcomes) parameters, they generate integrated analyses that inform governance decisions. These systems do not autonomously modify protocols; they function as supervised decision support layers embedded within existing clinical oversight structures.

The same orchestration logic extends to regulatory readiness. Agentic systems can simulate a virtual review, screening draft submissions against prior agency feedback and established standards to flag potential gaps before filing. By identifying structural weaknesses or evidentiary risks early, teams reduce rework, accelerate agency interactions, and improve the probability of first-cycle approval—while feeding insights back into upstream trial and patient selection decisions.

Starting steps for implementing AI-powered decision loops

McKinsey research has established the organizational capabilities required to implement R&D decision loops across the enterprise.

Create a future-state blueprint that focuses investment and delivers early wins. Rather than tackling one capability at a time, leaders should first define a compelling future-state architecture across all five loops. That blueprint should identify the early wins worth pursuing, clarify sequencing, and guide investment toward discrete, value-adding elements that can be built and released quarter by quarter. Without a clear destination, early wins do not compound, and investment decisions can fragment across disconnected capabilities.
Redesign the operating model around decisions, not functions. Realizing the full potential of loop compression requires a fundamentally redesigned, AI-native operating model. Success depends on orchestrating complementary AI capabilities within standardized data architectures spanning research, clinical, and manufacturing domains, while cross-functional teams integrate science, engineering, and analytics. Workflows must be reimagined so that AI informs decisions at every stage—not in parallel with them. These structural changes are prerequisites for digital transformation, not outcomes of it.
Build the talent mix the loops require. The required talent spans a specific blend: expertise in causal modeling, foundation models, and agentic AI, paired with the ability to interface with the business and shape how technology delivers scientific impact. No company can outsource its way to closed-loop R&D excellence; the bench must be built in-house, embedded within scientific workflows rather than housed in a separate analytics function.
Build a flexible technology architecture that is designed for loop compression. To meet the demands of complex biopharma R&D, effective architectures must synthesize heterogeneous scientific, clinical, and operational data across R&D while connecting those insights to CMC and manufacturing. Purpose-built architectures that embed scientific logic and specialized decision pathways—and that are flexible enough to absorb new AI capabilities—form the foundation for genuine loop compression. McKinsey’s next-generation pharma R&D technology stack provides the organizing framework for designing that architecture.
Invest in organizational change alongside technology rewiring. The organizational change required to close R&D decision loops is routinely underestimated. Scientists should trust AI-generated outputs enough to act on them, and leaders need to redesign workflows so that AI informs decisions rather than augmenting them. Building that trust requires accessible data, transparent governance, and clear policies, particularly in regulated environments. Organizations will need to prioritize changes such as training, process redesign, and adoption infrastructure to enable the AI-native operating model.

How to separate from the pack

With AI-powered R&D tools now widely available for closed-loop infrastructures, organizations can build competitive distance by acting on the following imperatives:

Build data as a strategic asset. AI-driven R&D engines depend on the quality, structure, and continuity of biological, translational, and clinical data. Companies that intentionally design data generation to support closed-loop learning, rather than assembling fragmented, hypothesis-specific data sets, will build durable advantage through faster iteration and higher-confidence decisions.
Integrate AI broadly and quickly. Competitive distance will accrue to organizations that rapidly integrate machine learning, causal modeling, generative AI, and agentic systems across decision loops, rather than deploying them as isolated tools. Few companies have achieved true end-to-end integration; this creates an opening for early movers to compress learning cycles ahead of their peers.
Decide what to build and what to outsource. Sustained advantage requires deliberate choices about which scientific and AI capabilities to develop internally, particularly those that define how the organization learns and makes decisions. Hyperscalers can provide scalable infrastructure, but they cannot redesign scientific workflows or define decision architectures. Specialized vendors may offer leading algorithms, but these must be integrated into a coherent system. Leading organizations retain ownership of the capabilities that drive learning and differentiation, such as molecule and biologics design, patient insight, and decision orchestration.

The biopharma industry’s next leap will not come from layering more powerful AI tools onto a linear R&D model. It will come from redesigning workflows around connected decision loops, where every experiment, clinical trial, regulatory submission, and launch systematically builds institutional knowledge rather than resetting it. Organizations that move beyond isolated pilots and commit to a truly AI-native operating model—one built on connected loops, not disconnected tools—can unlock step-change performance and the ability to advance more programs within the same budget. The result is more medicines, delivered faster, at lower cost, and with greater predictability.

