Vineeta Agarwala on the promise—and limits—of AI in drug discovery

By helping us learn from every single patient and every piece of data, AI could enable medical breakthroughs, says Andreessen Horowitz general partner Vineeta Agarwala.

‘I hope you’re learning from my journey.’ It’s a sentiment that many patients have expressed to Vineeta Agarwala, who agrees with it wholeheartedly—and believes that the healthcare and scientific communities are, indeed, starting to learn from every patient’s journey. Much of the learning is made possible through AI, which has tremendous potential to aid scientists and improve patient outcomes by rapidly analyzing reams of data—even from failed experiments and inconclusive lab tests.

Agarwala is a general partner at the venture capital firm Andreessen Horowitz (also known as a16z), where she leads investments in companies that straddle the worlds of biology, health, and technology. The a16z Bio + Health portfolio now comprises some 50 companies. Agarwala, who is also a physician, sees genuine opportunity for AI to break new ground in drug discovery, as she recently told McKinsey’s Lydia The during an interview in San Francisco. An edited transcript of their conversation follows.

Lydia The: You’re a physician, a scientist, and a venture capitalist. That gives you a unique vantage point from which to observe AI’s growing role in drug discovery and development. What gives you confidence that AI can revolutionize the field? And in which specific parts of the drug discovery and development cycle do you think AI could be most powerful?

Vineeta Agarwala: I think the intuition for applying AI to drug discovery comes from nature. Nature has created a lot of great drugs, a lot of incredible protein machinery, a lot of very complex sequence-encoded instruction manuals. I mean, we have protein machines in our bodies crawling along DNA that figured out how to do transcription and translation and all kinds of mechanics.

To me, that says the design space is huge. The design space of proteins, nucleotide sequences, and small molecules that could have therapeutic effect is vast. At the very highest level, my reason for believing in the role and application of AI to drug discovery is just that we’ve barely tapped the level of complexity that nature has proven can exist.

Video: Nature as a reason to believe in AI

AI and machine learning [ML] could plug into the drug development life cycle at every step. Learning systems can help humans conduct an end-to-end task better and faster at any stage: in target discovery, in the design of a modality, in the design of preclinical experimentation, in the clinical-development phase, in deciding which patients are most informative for your clinical trial, and so on. I think we will see this technology help us inch closer toward the frontier of what’s possible.

One thing that excites me is the breadth of ML platform companies that are now being stood up across the industry—and they’re all taking different approaches and learning from different data sets. An ML company in the small-molecule space is not the same as an ML company in the nucleic acid space or in the biologic space. There’s a growing list of companies that are picking a very specific problem and getting very, very good at it.

What patients care about

Lydia The: What will the proliferation of ML companies mean for patients? What’s most exciting to you about AI’s potential impact on people who need drug therapies?

Video: AI’s impact on patients

Vineeta Agarwala: The reality is that patients don’t care how their drugs were developed, but they care intensely about the speed and accessibility of those medicines. What I get most excited about is the potential for AI to fundamentally change the timescales, as well as the cost structures, of therapeutic development. Development costs are ultimately passed on to patients, and that’s something patients care a lot about.

We’re unlikely to see a patient express pride and say, “Hey, I’m on an AI-guided drug!” But we are very likely to see patients—and patient communities that have been underserved by a lack of drug development—say, “Wow, new technologies have really created a surge of shots on goal for me that weren’t possible before.”

One thing that a lot of patients tell me—and tell their providers—is, “I really hope you’re learning from my journey. I’ve done so many lab tests; I hope you’re learning from all of them.” That’s something the field is starting to get better at. We’re starting to build systems for harnessing clinical data and real-world data at scale and using them as evidence. We’re starting to get better at making sure that every time a patient walks in the door, their data is not discarded as exhaust. It’s captured and harnessed, we learn from it, and we get better at diagnosing or treating the next patient, or at selecting the next patient for a trial.

Video: Learning from every piece of data

The satisfaction of learning-inspired drug development is that almost nothing you do gets lost. Nothing you do is useless. The non-hits might be just as informative and important for your underlying model as the hits. That’s a really satisfying regime to live in. It helps you want to generate data. It helps frame a lot of the problems you’re solving as so-called type-two scientific questions, where either answer is still informative and important. So that’s a theme that really inspires me about this space.


‘A constant loop between the lab and the learning’

Lydia The: As you say, there is potential for AI across the entire discovery and development workflow. Given that enormous scope and opportunity, how do you prioritize? How should a company—and the industry—think about which areas are ripe to start with, versus areas where more groundwork is still needed?

Vineeta Agarwala: For a company, it is productive to ask the question, “Where do I have training data? Where have I run an experiment so many times that I might already have the data I need?” I would flag lead optimization as a part of the drug development cycle where large pharma companies do have a lot of data.

But taking a step back, where should the industry invest resources in machine learning? That’s a much vaster question, which should be untethered from where there already is training data, because our capacity to generate new biological data and to profile patient samples is at an all-time high. I don’t love the framework that says, “I have a lot of data, so let me throw it into a learning system and make a list of all my best insights and then hope I become a smarter actor going forward.” Sometimes large companies gravitate to that framework because they’re sitting on a lot of data.

More interesting, though, is to ask, “How can the learning system change the design of my next experiment such that I only gather the data that most incrementally improves my understanding of the problem at hand?” That becomes a big unlock because now you’re designing the most informative experiment you can, using an underlying learning model. It’s not a one-time learning effort. It’s a constant loop between the lab and the learning.
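The loop Agarwala describes is essentially active learning: use the current model to choose the single most informative next experiment, run it, and fold the result back into the model. Below is a minimal Python sketch under stated assumptions: the assay, the candidate pool, and the distance-based uncertainty proxy are all illustrative stand-ins, not taken from any real platform.

```python
def run_assay(x):
    """Stand-in for a wet-lab experiment: returns 1 for a 'hit', 0 otherwise.
    The true active window [0.4, 0.6] is unknown to the learner."""
    return 1 if 0.4 <= x <= 0.6 else 0

# Candidate compounds, parameterized on [0, 1] in steps of 0.01.
pool = [i / 100 for i in range(101)]
labeled = {}  # compound -> assay result

def informativeness(x):
    """Uncertainty proxy: distance to the nearest compound already tested.
    A real platform would use genuine model uncertainty instead."""
    return min(abs(x - c) for c in labeled) if labeled else 1.0

# The constant loop between the lab and the learning: run the single
# most informative experiment, record the result, repeat.
for _ in range(20):
    candidates = [x for x in pool if x not in labeled]
    x_next = max(candidates, key=informativeness)
    labeled[x_next] = run_assay(x_next)  # back to the lab

hits = sorted(x for x, y in labeled.items() if y == 1)
print(f"Tested {len(labeled)} of {len(pool)} compounds; hits near {hits}")
```

Note that the non-hits matter as much as the hits here: every 0 result shrinks the region the model still needs to explore, which is exactly the "nothing you do gets lost" property described above. A production system would swap the distance heuristic for model-derived uncertainty (for example, from a Bayesian or ensemble model), but the shape of the loop (propose, test, learn, repeat) is the same.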

Could AI make drug development cheaper?

Lydia The: The elephant in the room is the capital intensity required for a lot of these companies to develop machines and robots so they can automate their data collection. What’s the advice you’re giving to companies about investing up front in order to build the platform that allows for learning-enabled drug development?

Vineeta Agarwala: I often hear, “Well, to build an AI platform, you’re going to have to burn millions on accumulating the training data needed to learn.” I think it’s a misconception that an AI platform needs to have a capital-intensive initial build phase. There are some settings where that will be true, but there are others where the AI platform can actually improve your spend efficiency and help you conduct just the right set of experiments that are most informative in helping you iteratively design a great therapeutic. The ultimate promise of AI-guided drug development is that you become more capital efficient because you’re learning from every piece of data generated in your lab.

Part of the reason AI-enabled business models could be so exciting is that the cash outlay needed to conduct all the preclinical work required just to get a program into the clinic, and then ultimately through trials, has historically been very steep. The promise, in principle, is that AI-first technologies could fundamentally shift that cost curve, so that both the wall-clock time and the cash required to nominate a program and advance it into clinical trials could be blunted. We’re spending something like $1.5 billion per approved drug at the moment, so there’s a lot of room for improvement.

Efficiency and efficacy

Lydia The: We’ve built a case around AI, and now I feel like we need to address the skeptics. What should people look for to be convinced? Is it, “We have demonstrated that we were able to increase PTS [probability of success]” or “We identified a target we would never have been able to identify using manual experiments alone”? Are there goalposts that you’re looking for?

Vineeta Agarwala: The proof points for AI-guided drug development, in my mind, fall broadly into two buckets. One is around efficiency of drug development. The other is around properties like efficacy and safety.

In the category of efficiency, there’s now starting to be some literature on this topic: For a company that said it was leveraging AI in its drug development process, how long did it take to get to an IND [investigational new drug] application? There’s so much variation in the baseline that it’s hard to do these analyses, but there’s early data suggesting that those timelines have shrunk—to three to five years, versus historically six to ten years. That’s an important goalpost in the efficiency domain. Also, can you burn less capital in getting to an IND? We’re very keen to back companies that can say, “We can stretch it. We can take more shots on goal with the same capital investment.”

For what it’s worth, I think it’s going to be really hard to figure out what program AI touched and what program AI never touched. I don’t think it’s black and white. But if you were to try to separate programs into more AI-guided versus not, these are the metrics that matter: PTS, time to IND, and capital expenditure per IND.

The other important bucket is efficacy at tolerable safety profiles. That’s the mantra of our industry, and if AI can help us get that more frequently, more quickly, and at lower cost, that would be huge: Have you been able to unlock development goals that were previously not possible? Could you drug a target because you found a pocket that no one else could see? Could you design an antibody to be delivered in a particular formulation that was really important for patients that couldn’t have been designed without AI-guided insights? These are all examples of fundamentally new—and very exciting—unlocks in the industry.

AI won’t be a ‘catchall’ solution

Lydia The: What do you think people get wrong about applying AI to drug development?

Vineeta Agarwala: I think what people get wrong is to assume that AI is a catchall solution, a silver bullet, for problems that are hard. Those problems will still be hard. There are vexing problems of target biology, of interconnected pathways that we don’t yet understand, of preclinical models that are not fully predictive of human biology.

AI will create insights that make those problems a little bit more tractable, more scalable, more engineerable. But many of the problems won’t be solved by AI alone. They will continue to be solved by really creative molecular biologists who come up with new assays, or through great synthetic solutions that help us generate a lot of diversity in different spaces, or through thorough preclinical studies. So you can’t assume that there’s an AI solution for every hard problem. Most likely, it’s going to be a scientific solution that might benefit from AI.

My bet is that learning platforms and technologies that help us learn from large amounts of data are going to be ubiquitous. I hope that, some years from now, people won’t just try to pluck software modules and incorporate them into a specific problem set, but rather that there will be a much more integrated experimental, computational, and clinical development regime in which the goal is to learn from the ever-mounting biological data that we’re collecting.
