Technology leaders in the public sector are increasingly adopting cloud services to drive innovation, agility, and economic value. Cloud spend in the US public sector, for example, is expected to increase from around $15 billion in 2022 to around $23 billion by 2025. The value to the business from all of that spend, however, has so far been mixed: two-thirds of cloud programs do not generate expected benefits, and 30 percent of the spend is wasted (that is, the money is spent, but the potential services aren’t utilized). More than ten years ago, the Centers for Medicare & Medicaid Services (CMS), the federal agency that provides health coverage to more than 100 million Americans, embarked on a cloud transformation journey. The agency views cloud as a core enabler of its mission to drive innovation in healthcare, provide and scale a seamless and secure experience to beneficiaries, and be a responsible steward of public funds.
McKinsey’s Naufal Khan, Wasim Lala, Abdallah Saleme, and Sakshi Jain sat down with Rajiv Uppal, CIO at CMS and incoming CIO at the IRS, and Mark Oh, director of infrastructure and user services group at CMS, to discuss the benefits, challenges, and learnings from the agency’s ongoing transformation. Naufal Khan led the wide-ranging discussion. What follows are edited highlights from their conversation.
Cloud transformation rooted in empathy
Naufal Khan: Cloud requires a different engagement model. What have you done to transform how your team works?
Rajiv Uppal: We realized soon after we started moving to cloud that the larger transformation was on the business side, and it was on multiple levels. It wasn’t enough to say, “Give us the business requirements, and we’ll go build it.” In many cases, the business side is almost telling you what they think you want to hear, just to get through the requirement sessions. The IT organization needs to make an effort to understand the business so that the solutions we create address the needs of that business. That’s a big shift. We had to build trust with our business partners and work with them to think about cloud decisions. For instance, we had to think together on whether a workload should be a lift and shift or more of a new development on cloud. It required a fair bit of trust-building with CMS’s business units.
The truth is, IT organizations can hide behind policies, so if our partners are going to trust us, they need to know we understand what it means to move something to cloud and are familiar with new technologies like AI and machine learning. So we set up a program to train our folks on human-centered design, product management, and cloud technologies.
Mark Oh: It all comes back to leadership emphasis on empathy. If you’re not empathetic, you cannot deliver services and solutions that people will use. You have to put yourself in the user’s position and understand what solution is needed and how it will be consumed. We’ve learned that being empathetic means your job’s not done just because you’ve delivered cloud services. Your job is to make sure it’s been optimized and is efficient for the user.
Moving from a center of excellence to a ‘community of practice’
Naufal Khan: In the public sector, budgets and decision making are highly decentralized, and it can become critical to manage a large group of stakeholders. How did you navigate these complexities?
Rajiv Uppal: One of the fundamental approaches we took was to avoid mandating things and encourage enablement instead. A lot of the success we had in this regard can be attributed to the “community of practice” we established, where we involved the entire stakeholder community to come and work with us and help shape solutions [see sidebar, “What a ‘community of practice’ is at CMS”]. The fact that we can sit down with our business partners and say, “This worked, and this did not. Let’s see how we can make it better,” is a big success. This community was instrumental in sharing best practices so we all could benefit. It also allowed people to feel they were part of the solution instead of it being mandated. This is core to the commitment to empathy. By doing this, we were able to build and provide solutions that met business needs and became the obvious choice, eliminating the need for mandates.
Mark Oh: In a typical IT environment, you have a center of excellence, where the best ideas and practices are determined and the right rules and guidelines are implemented. We found that method doesn’t really work at CMS, so we took the community-of-practice approach, in which leadership from each of the business units and offices participates as a part of the community that looks at the data and makes decisions.
In our case, CMS has benefited significantly from the community of practice when it comes to our FinOps program. So instead of seeing our costs rise by 30 percent this past year, we achieved, conservatively, a 15 percent overall savings on annual spend. That is a significant amount of money for an agency like CMS, where our spend is over $100 million.
FinOps to drive greater efficiency, not just cut costs
Naufal Khan: One of the things that often comes up when we talk about cloud operating models and team skills is the FinOps operating model. Can you tell us more about your approach?
Mark Oh: A key starting point for FinOps is to understand that not every workload is suited for cloud. If we try to move all applications to cloud, we risk having the cost of migrating workloads exceed the benefits we get from cloud. Secondly, you need to optimize and set workloads up right or you won’t see the value from cloud. When you just scale the cloud-first approach to 100 systems, you may be duplicating the inefficiencies 100 times. You have to be smarter about your approach. That means embedding FinOps and other relevant services into your cloud transformation program from the beginning—what we refer to as a “smart cloud-first” approach.
A critical component of this smart cloud-first approach for FinOps is that cost savings aren’t the main focus. Our approach to cloud transformation has been focused more on improving effectiveness by, for example, shortening how long it takes to get a solution to market or ensuring the solution meets the need. A by-product of this focus is that we have been able to better manage costs as well.
Rajiv Uppal: To determine how best to drive efficiency, we relied on data, which really is at the core of this journey. We invested in data analysis so we could better understand what was happening and make data-driven decisions. This allowed us to collect data, provide evidence to show usage patterns, and recommend what improvements to make. One thing we learned, for example, was that more than 50 percent of our servers, in some cases, were running at less than 10 percent capacity. That visibility allowed us to have the right conversations with the community of practice.
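As an illustration of the kind of utilization analysis described here, the following is a minimal sketch, not the agency's actual tooling. It assumes each server's average utilization is already available as a fraction of capacity; the `ServerUsage` type, the function names, and the sample fleet are all hypothetical, and the 10 percent threshold is taken from the figure quoted above.

```python
from dataclasses import dataclass


@dataclass
class ServerUsage:
    """Hypothetical record: one server and its average utilization."""
    name: str
    avg_utilization: float  # fraction of capacity in use, 0.0 to 1.0


def underutilized(servers, threshold=0.10):
    """Return the servers whose average utilization is below the threshold."""
    return [s for s in servers if s.avg_utilization < threshold]


def underutilized_share(servers, threshold=0.10):
    """Return the fraction of the fleet running below the threshold."""
    if not servers:
        return 0.0
    return len(underutilized(servers, threshold)) / len(servers)


# Made-up sample data, for illustration only.
fleet = [
    ServerUsage("app-01", 0.04),
    ServerUsage("app-02", 0.07),
    ServerUsage("db-01", 0.62),
    ServerUsage("batch-01", 0.09),
]

share = underutilized_share(fleet)
print(f"{share:.0%} of servers run below 10% capacity")
```

A list like the one returned by `underutilized` is the sort of evidence that could be brought to a community-of-practice discussion about rightsizing.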
Mark Oh: With this data and focus on efficiency, we wanted to create a FinOps tool that provided transparency, but we knew that alone wasn’t enough. This is where the empathy part again came in because we worked closely with the community of practice to understand what they needed and how they would use the tool, so we could tailor it to make it easy to use. What we built was a tool that didn’t just show the savings potential to teams but also had accompanying tailored presentations and reports that showed to the business users how they could drive savings for their specific workloads. The result of this partnership was that some business units managed 20 percent cost savings, and others achieved as much as 50 percent.
How cloud helps reach sustainability goals
Naufal Khan: Given the federal guidance on clean energy, how is CMS thinking about sustainability as you continue to scale up cloud adoption?
Rajiv Uppal: We are pursuing multiple avenues. We are rightsizing our footprint (both on premises and on cloud) by, for example, consolidating our data centers from eight to two and moving to cloud. We are also leveraging the flexibility and scalability that cloud offers to optimize our consumption in line with our business needs. This ties to the efficiency concept we mentioned: the more efficient we are in using compute resources, the closer we are to our sustainability goals.
Supporting other federal agencies with a smart cloud-first approach
Naufal Khan: How might your learnings help other federal agencies accelerate their cloud transitions?
Mark Oh: I’ve talked to a number of my counterparts at federal agencies who are looking to learn from our experience and understand how to take a smart cloud-first approach grounded in best practices such as FinOps. I believe a community of practice at the federal level can provide significant benefits to help other federal agencies leverage what we have. We share not only the technology and tools but also the processes.
Rajiv Uppal: In addition, we have built two solutions that could be sharable across various agencies. One is our spend-transparency solution, which allows teams to see funding available by application, what cloud resources they have used compared to what was budgeted, and where there are potential opportunities to be more effective, such as by turning off unused compute instances. The other is our platform solution, which has a set of controls built in that are needed to pass the required security check, called “the authority to operate.”
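The spend-transparency idea described above can be sketched in a few lines. This is a hedged illustration under simple assumptions, not the CMS solution itself: the `Instance` and `AppSpend` types, their field names, and the sample figures are hypothetical. It shows budget versus actual spend per application and the potential savings from turning off unused compute instances.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical compute instance with a flat monthly cost."""
    instance_id: str
    monthly_cost: float
    in_use: bool


@dataclass
class AppSpend:
    """Hypothetical per-application budget and its running instances."""
    app: str
    budget: float
    instances: list = field(default_factory=list)

    def spent(self):
        # Total monthly cost across all instances, used or not.
        return sum(i.monthly_cost for i in self.instances)

    def remaining(self):
        # Funding still available against the budget.
        return self.budget - self.spent()

    def idle_savings(self):
        # Cost that could be avoided by turning off unused instances.
        return sum(i.monthly_cost for i in self.instances if not i.in_use)


def report(apps):
    """Build one summary row per application."""
    return [
        {
            "app": a.app,
            "budget": a.budget,
            "spent": a.spent(),
            "remaining": a.remaining(),
            "idle_savings": a.idle_savings(),
        }
        for a in apps
    ]


# Made-up sample data, for illustration only.
claims = AppSpend(
    app="claims-portal",
    budget=1000.0,
    instances=[
        Instance("i-001", 400.0, in_use=True),
        Instance("i-002", 250.0, in_use=True),
        Instance("i-003", 150.0, in_use=False),
    ],
)

for row in report([claims]):
    print(row)
```

In this sample, the application has spent 800.0 of its 1000.0 budget, and 150.0 of that spend comes from an instance that is not in use.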
Leveraging learnings from cloud for other technologies
Naufal Khan: How are you leveraging learnings from cloud when it comes to generative AI [gen AI]?
Mark Oh: We’re looking to apply the lessons we learned from our cloud journey and start a community of practice for gen AI as well. One reason for that is that there are many vendors coming forward with a variety of gen AI solutions. We are creating a storefront of pretested and preapproved gen AI solutions that CMS employees and partners can use. As Rajiv stressed, we’re not mandating. We’re creating a safe place for people to contribute and collaborate, using best practices. Just as we learned the lessons about smart cloud adoption, we need to figure out smart gen AI adoption and utilization.
Rajiv Uppal: One of our greatest assets is our information repository. We are exploring gen AI, or other AI solutions, to see how we can solve some of our most challenging and time-consuming issues when it comes to leveraging this asset. For example, privacy is critical to our mission, and in crafting our policies, our privacy team researches and reviews a lot of legislation. Thanks to AI and some of the solutions we’ve created, what traditionally would have taken us more than two months now takes just a few hours.