AI at scale with MLOps: What CEOs need to know

What if a company built each component of its product from scratch with every order, without any standardized or consistent parts, processes, and quality-assurance protocols? Chances are that any CEO would view such an approach as a major red flag preventing economies of scale and introducing unacceptable levels of risk—and would seek to address it immediately.

Yet every day this is how many organizations approach the development and management of artificial intelligence (AI) and analytics in general, putting themselves at a tremendous competitive disadvantage. Significant risk and inefficiencies are introduced as teams scattered across an enterprise regularly start efforts from the ground up, working manually without enterprise mechanisms for effectively and consistently deploying and monitoring the performance of live AI models.

Ultimately, for AI to make a sizable contribution to a company’s bottom line, organizations must scale the technology across the organization, infusing it in core business processes, workflows, and customer journeys to optimize decision making and operations daily. Achieving such scale requires a highly efficient AI production line, where every AI team quickly churns out dozens of race-ready, risk-compliant, reliable models. Our research indicates that companies moving toward such an approach are much more likely to realize scale and value—with some adding as much as 20 percent to their earnings before interest and taxes (EBIT) through their use of AI as they tap into the $9 trillion to $15 trillion in economic value potential the technology offers.

CEOs often recognize their role in providing strategic pushes around the cultural changes, mindset shifts, and domain-based approach necessary to scale AI, but we find that few recognize their role in setting a strategic vision for the organization to build, deploy, and manage AI applications with such speed and efficiency. The first step toward taking this active role is understanding the value at stake and what’s possible with the right technologies and practices.

The highly bespoke and risk-laden approach to AI applications that is common today is partly a function of decade-old data science practices, necessary in a time when there were few (if any) readily available AI platforms, automated tools, or building blocks that could be assembled to create models and analytics applications and no easy way for practitioners to share work. In recent years, massive improvements in AI tooling and technologies have dramatically transformed AI workflows, expediting the AI application life cycle and enabling consistent and reliable scaling of AI across business domains. A best-in-class framework for ways of working, often called MLOps (short for “machine learning operations”), now can enable organizations to take advantage of these advances and create a standard, company-wide AI “factory” capable of achieving scale.

In this article, we’ll help CEOs understand how these tools and practices come together and identify the right levers they can pull to support and facilitate their AI leaders’ efforts to put these practices and technologies firmly in place.

The bar for AI keeps rising

Gone are the days when organizations could afford to take a strictly experimental approach to AI and analytics broadly, pursuing scattered pilots and a handful of disparate AI systems built in silos. In the early days of AI, the business benefits of the technology were not apparent, so organizations hired data scientists to explore the art of the possible with little focus on creating stable models that could run reliably 24 hours a day. Without a focus on achieving AI at scale, the data scientists created “shadow” IT environments on their laptops, using their preferred tools to fashion custom models from scratch and preparing data differently for each model. They left on the sidelines many scale-supporting engineering tasks, such as building crucial infrastructure on which all models could be reliably developed and easily run.

Today, market forces and consumer demands leave no room for such inefficiencies. Organizations recognizing the value of AI have rapidly shifted gears from exploring what the technology can do to exploiting it at scale to achieve maximum value. Tech giants leveraging the technology continue to disrupt and gain market share in traditional industries. Moreover, consumer expectations for personalized, seamless experiences continue to ramp up as consumers are delighted by more and more AI-driven interactions.

Thankfully, as AI has matured, so too have roles, processes, and technologies designed to drive its success at scale. Specialized roles such as data engineer and machine learning engineer have emerged to offer skills vital for achieving scale. A rapidly expanding stack of technologies and services has enabled teams to move from a manual and development-focused approach to one that’s more automated, modular, and fit to address the entire AI life cycle, from managing incoming data to monitoring and fixing live applications. Start-up technology companies and open-source solutions now offer everything from products that translate natural language into code to automated model-monitoring capabilities. Cloud providers now incorporate MLOps tooling as native services within their platform. And tech natives such as Netflix and Airbnb that have invested heavily in optimizing AI workflows have shared their work through developer communities, enabling enterprises to stitch together proven workflows.

Alongside this steady stream of innovation, MLOps has arisen as a blueprint for combining these platforms, tools, services, and roles with the right team operating model and standards for delivering AI reliably and at scale. MLOps draws from existing software-engineering best practices, called DevOps, which many technology companies credit for enabling faster delivery of more robust, risk-compliant software that provides new value to their customers. MLOps is poised to do the same in the AI space by extending DevOps to address AI’s unique characteristics, such as the probabilistic nature of AI outputs and the technology’s dependence on the underlying data. MLOps standardizes, optimizes, and automates processes, eliminates rework, and ensures that each AI team member focuses on what they do best (exhibit).

Since MLOps is relatively new and still evolving, definitions of what it encompasses within the AI life cycle can vary. Some, for example, use the term to refer only to practices and technologies applied to monitoring running models. Others see it as only the steps required to move new models into live environments. We find that when the practice encompasses the entire AI life cycle—data management, model development and deployment, and live model operations—and is supported by the right people, processes, and technologies, it can dramatically raise the bar for what companies can achieve.

The business impact of MLOps

To understand the business impact of end-to-end MLOps, it is helpful to examine the potential improvements from four essential angles: productivity and speed, reliability, risk, and talent acquisition and retention. Inefficiencies in any of these areas can choke an organization’s ability to achieve scale.

Increasing productivity and speed to embed AI organization-wide

We frequently hear from executives that moving AI solutions from idea to implementation takes nine months to more than a year, making it difficult to keep up with changing market dynamics. Even after years of investment, leaders often tell us that their organizations aren’t moving any faster. In contrast, companies applying MLOps can go from idea to a live solution in just two to 12 weeks without increasing head count or technical debt, reducing time to value and freeing teams to scale AI faster. Achieving productivity and speed requires streamlining and automating processes, as well as building reusable assets and components, managed closely for quality and risk, so that engineers spend more time putting components together instead of building everything from scratch.

Organizations should invest in many types of reusable assets and components. One example is the creation of ready-to-use data “products” that unify a specific set of data (for instance, combining all customer data to form a 360-degree view of the customer), using common standards, embedded security and governance, and self-service capabilities. This makes it much faster and easier for teams to leverage data across multiple current and future use cases, which is especially crucial when scaling AI within a specific domain where AI teams often rely on similar data.

An Asian financial-services company, for example, was able to reduce the time to develop new AI applications by more than 50 percent—in part by creating a common data-model layer on top of source systems that delivered high-quality, ready-to-use data products for use in numerous product and customer-centric AI applications. The company also standardized supporting data-management tooling and processes to create a sustainable data pipeline, and it created assets to standardize and automate time-consuming steps such as data labeling and data-lineage tracking. This was a stark difference from the company’s previous approach, where teams structured and cleaned raw data from source systems using disparate processes and tools every time an AI application was being developed, which contributed to a lengthy AI development cycle.

Another critical element for speed and productivity improvements is developing modular components, such as data pipelines and generic models that are easily customizable for use across different AI projects. Consider the work of a global pharmaceutical company that deployed an AI recommendation system to optimize the engagement of healthcare professionals and better inform them of more than 50 drug–country combinations, ultimately helping more appropriate patient populations get access to and benefit from these medicines. By building a central AI platform and modular premade components on top, the company was able to industrialize a base AI solution that could rapidly be tailored to account for different drug combinations in each market. As a result, it completed this massive deployment in under a year and with only ten AI project teams (a global team and one in each target country), which was five times faster and less resource intensive than delivering in the “traditional” way. To get there, executives made investments in new operating models, talent, and technologies. For example, they established an AI center of excellence, hired MLOps engineers, and standardized and automated model development to create “model production pipelines” that speed time to value and reduce errors that can cause delays and introduce risks.

Enhancing reliability to ensure 24/7 operation of AI solutions

Organizations often invest significant time and money in developing AI solutions only to find that the business stops using nearly 80 percent of them because they no longer provide value—and no one can figure out why that’s the case or how to fix them. In contrast, we find that companies using comprehensive MLOps practices shelve 30 percent fewer models and increase the value they realize from their AI work by as much as 60 percent.

One way they’re able to do this is by integrating continuous monitoring and efficacy testing of models into their workflows, instead of bolting them on as an afterthought, as is common. Data integrity and the business context for certain analytics can change quickly with unintended consequences, making this work essential to create always-on AI systems. When setting up a monitoring team, organizations should, where possible, make sure this team is independent from the teams that build the models, to ensure independent validation of results.

The aforementioned pharmaceutical company, for instance, put a cross-functional monitoring team in place to ensure stable and reliable deployment of its AI applications. The team included engineers specializing in site reliability, DevOps, machine learning, and cloud, along with data scientists and data engineers. The team had broad responsibility for managing the health of models in production, from detecting and solving basic issues, such as model downtime, to complex issues, such as model drift. By automating key monitoring and management workflows and instituting a clear process for triaging and fixing model issues, the team could rapidly detect and resolve issues and easily embed learnings across the application life cycle to improve over time. As a result, nearly a year after deployment, model performance remains high, and business users continue to trust and leverage model insights daily. Moreover, by moving monitoring and management to a specialized operations team, the company reduced the burden on those developing new AI solutions, so they can maintain a laser focus on bringing new AI capabilities to end users.
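
One way a monitoring team might automate the detection of shifts in underlying data is to compare the live distribution of a model input against its training-time baseline. The Python sketch below uses the population stability index (PSI), a common choice for this kind of check; it is not taken from the company described above. The function names, the ten-bin layout, and the 0.1 and 0.25 thresholds are assumptions for illustration (the thresholds are a widely used rule of thumb, not a standard).

```python
import math
from typing import List, Sequence


def population_stability_index(
    expected: Sequence[float], actual: Sequence[float], bins: int = 10
) -> float:
    """PSI between a baseline sample and a live sample of one numeric feature."""
    lo, hi = min(expected), max(expected)
    if hi == lo:
        raise ValueError("baseline sample has no spread")
    width = (hi - lo) / bins

    def shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Values outside the baseline range are clamped into the edge bins.
            idx = max(0, min(bins - 1, int((v - lo) / width)))
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def drift_status(psi: float, warn: float = 0.1, alert: float = 0.25) -> str:
    """Map a PSI value to a triage label (thresholds are assumptions)."""
    if psi >= alert:
        return "alert"
    if psi >= warn:
        return "warn"
    return "ok"


if __name__ == "__main__":
    # Hypothetical data: a uniform baseline and a live sample shifted upward.
    baseline = [i / 100 for i in range(100)]
    live = [0.5 + i / 200 for i in range(100)]
    print(drift_status(population_stability_index(baseline, live)))
```

A check like this would typically run on a schedule for each important model input, with "warn" and "alert" results routed into whatever triage process the monitoring team has defined.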

Reducing risk to ensure regulatory compliance and trust at scale

Despite substantial investments in governance, many organizations still lack visibility into the risks their AI models pose and what, if any, steps have been taken to mitigate them. This is a significant issue, given the increasingly critical role AI models play in supporting daily decision making, the ramp-up of regulatory scrutiny, and the weight of reputational, operational, and financial damage companies face if AI systems malfunction or contain inherent biases.

While a robust risk-management program driven by legal, risk, and AI professionals must underlie any company’s AI program, many of the measures for managing these risks rely on the practices used by AI teams. MLOps bakes comprehensive risk-mitigation measures into the AI application life cycle by, for example, reducing manual errors through automated and continuous testing. Reusable components, replete with documentation on their structure, use, and risk considerations, also limit the probability of errors and allow for component updates to cascade through AI applications that leverage them. One financial-services company using MLOps practices has documented, validated, and audited deployed models to understand how many models are in use, how those models were built, what data they depend on, and how they are governed. This provides its risk teams with an auditable trail so they can show regulators which models might be sensitive to a particular risk and how they’re correcting for this, enabling them to avoid heavy penalties and reputational damage.
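
The kind of model inventory described above can be pictured as a simple structured record per model that risk teams can query. The Python sketch below is illustrative only: the `ModelRecord` fields, the risk tags, and the example entries are hypothetical and do not describe the financial-services company's actual system.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ModelRecord:
    """One entry in a hypothetical model inventory."""
    name: str
    owner: str
    data_sources: Tuple[str, ...]  # what data the model depends on
    risk_tags: Tuple[str, ...]     # risks the model is known to be sensitive to
    validated: bool                # has independent validation been completed?
    in_production: bool


def models_exposed_to(inventory: List[ModelRecord], risk_tag: str) -> List[str]:
    """Names of live models flagged as sensitive to a given risk."""
    return sorted(
        m.name for m in inventory if m.in_production and risk_tag in m.risk_tags
    )


def unvalidated_in_production(inventory: List[ModelRecord]) -> List[str]:
    """Names of live models that have not yet been validated."""
    return sorted(m.name for m in inventory if m.in_production and not m.validated)


# Hypothetical inventory entries, for illustration only.
INVENTORY = [
    ModelRecord("credit_scoring", "risk-analytics", ("loan_history",),
                ("bias", "data_drift"), validated=True, in_production=True),
    ModelRecord("churn_forecast", "marketing", ("crm_events",),
                ("data_drift",), validated=False, in_production=True),
    ModelRecord("pricing_pilot", "pricing", ("transactions",),
                ("bias",), validated=False, in_production=False),
]

if __name__ == "__main__":
    print(models_exposed_to(INVENTORY, "bias"))
    print(unvalidated_in_production(INVENTORY))
```

Keeping such records in one place is what lets a risk team answer, on request, which live models are sensitive to a given risk and which still await validation.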

Better talent retention and acquisition for implementing AI at scale

In many companies, the availability of technical talent is one of the biggest bottlenecks for scaling AI and analytics in general. When deployed well, MLOps can serve as part of the proposition to attract and retain critical talent. Most technical professionals get excited about doing cutting-edge work with the best tools, which lets them focus on challenging analytics problems and see the impact of their work in production. Without a robust MLOps practice, top tech talent will quickly become frustrated by working on transactional tasks (for instance, data cleansing and data-integrity checks) and not seeing their work have a tangible business impact.

The CEO’s role

Implementing MLOps requires significant cultural shifts to loosen firmly rooted, siloed ways of working and focus teams on creating a factory-like environment around AI development and management. Building an MLOps capability will materially shift how data scientists, engineers, and technologists work as they move from bespoke builds to a more industrialized production approach. As a result, CEOs play a critical role in three key areas: setting aspirations, facilitating shared goals and accountability, and investing in talent.

Setting a clear aspiration for impact and productivity

As in any technology transformation, CEOs can break down organizational barriers by vocalizing company values and their expectations that teams will rapidly develop, deliver, and maintain systems that generate sustainable value. CEOs should be clear that AI systems operate at the level of other business-critical systems that must run 24/7 and drive business value daily. While vision setting is key, it pays to get specific on what’s expected.

Among the key performance metrics CEOs can champion are the following:

the percentage of models built that are deployed and delivering value, with an expectation of 90 percent of models in production having real business impact
the total impact and ROI from AI as a measurement of true scalability
near-real-time identification of model degradation and risks, including shifts in underlying data (particularly important in regulated industries)

Fully realizing such goals can take 12 to 24 months, but with careful prioritization of MLOps practices, many teams we work with see significant progress toward those goals in just two to three months.
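
To make the first two metrics above concrete, the following Python sketch shows one way they could be computed from a portfolio of models. The definitions are assumptions made for illustration: a model counts as "having impact" if its measured annual value is positive, and ROI is taken as total value minus total cost, divided by total cost. The `ModelOutcome` fields and the sample figures are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ModelOutcome:
    """Hypothetical per-model tracking record."""
    name: str
    deployed: bool
    annual_value: float  # measured business value, in any consistent currency unit
    annual_cost: float   # build and run cost, in the same unit


def deployment_rate(models: List[ModelOutcome]) -> float:
    """Share of all models built that reached production."""
    return sum(m.deployed for m in models) / len(models)


def share_with_impact(models: List[ModelOutcome]) -> float:
    """Share of production models with positive measured value."""
    live = [m for m in models if m.deployed]
    if not live:
        return 0.0
    return sum(m.annual_value > 0 for m in live) / len(live)


def portfolio_roi(models: List[ModelOutcome]) -> float:
    """(Total value - total cost) / total cost across all models built."""
    cost = sum(m.annual_cost for m in models)
    value = sum(m.annual_value for m in models)
    return (value - cost) / cost


# Hypothetical portfolio, for illustration only.
PORTFOLIO = [
    ModelOutcome("demand_forecast", True, 500.0, 100.0),
    ModelOutcome("lead_scoring", True, 0.0, 50.0),
    ModelOutcome("route_pilot", False, 0.0, 80.0),
    ModelOutcome("fraud_alerts", True, 300.0, 70.0),
]

if __name__ == "__main__":
    print(f"deployed: {deployment_rate(PORTFOLIO):.0%}")
    print(f"with impact: {share_with_impact(PORTFOLIO):.0%}")
    print(f"meets 90% target: {share_with_impact(PORTFOLIO) >= 0.9}")
    print(f"ROI: {portfolio_roi(PORTFOLIO):.2f}")
```

Including the cost of models that never reached production in the ROI denominator is a deliberate choice in this sketch, since shelved work is part of the true cost of the portfolio.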

Ensuring shared goals and joint accountability among business, AI, data, and IT teams

One of the fundamental litmus tests for impact is the degree to which goals are shared across business leaders and the respective AI, data, and IT teams. Ideally, the majority of goals for AI and data teams should be in service of business leaders’ goals. Conversely, business leaders should be able to articulate what value they expect from AI and how it will come to fruition.

Another measure is the level of collaboration around strategic technology investments to provision tooling, technologies, and platforms that optimize AI workflows. With the rapid pace of technological change, IT often struggles to balance the need for new AI tooling and technologies with concerns that short-term fixes increase technology costs over the long term. Comprehensive MLOps practices ensure a road map to reduce both complexity and technical debt when integrating new technologies.

Most AI leaders we know spend significant time building strong relationships with their IT counterparts to gain the support they need. When CEOs actively encourage these partnerships, the relationships develop considerably faster.

Investing in upskilling existing AI talent and new roles

The role of data scientists is changing. Where they previously depended on low-level coding, they must now possess knowledge of software engineering to assemble models from modular components and build production-ready AI applications from the start.

Newer roles needed on AI teams have emerged as well. One is that of the machine learning engineer who is skilled in turning AI models into enterprise-grade production systems that run reliably. To build out its ML engineering team, a North American retailer combined existing expertise of internal IT developers who understood and could effectively navigate the organization’s systems with new external hires who brought broad experience in MLOps from different industries.

AI is no longer just a frontier for exploration. While organizations increasingly realize value from AI applications, many fail to scale up because they lack the right operational practices, tools, and teams. As demand for AI has surged, so has the pace of technological innovations that can automate and simplify building and maintaining AI systems. MLOps can help companies incorporate these tools with proven software-engineering practices to accelerate the development of reliable AI systems. With knowledge of what good MLOps can do and what levers to pull, CEOs can facilitate the shift to more systematic AI development and management.
