Thomas Dohmke on improving engineering experience using generative AI

In this episode of McKinsey on Building Products, a podcast dedicated to the exploration of software product management and engineering, McKinsey partners Rikki Singh and Shivam Srivastava talk with GitHub CEO Thomas Dohmke. Together, they discuss how generative AI (gen AI) has changed and improved the software engineering experience, how the developer role may change over time, and the opportunity to usher in a new era of innovation. This interview took place in September 2023 as part of the McKinsey Engineering Academy speaker series. An abridged version of their conversation follows.

The influence of technology on the developer’s experience

Rikki Singh: Thomas, tell us about yourself and your perspective on how technology has impacted developers in recent years.

Thomas Dohmke: I’m a developer, and I’ve been doing software development since I was 11 years old. The biggest changes since I started have been the internet and open source. Now, in many ways, every company is a software company, whether it’s an energy supplier on the West Coast, a bank in Paris, an automaker in China, or any federal or local government. They all produce software every day.

Most developers add more lines of code than they remove, so we’re sitting on a growing pile of software that we have to manage. At GitHub, we realized a while back that we needed to improve the developer experience. Engineering projects are hard, and solving hard problems takes a long time. So we needed a new solution. We believe the answer is AI, which will help us as human software developers fix problems so we can accelerate into the future.

We started our journey of building Copilot, and then Stable Diffusion came out and ChatGPT changed the world. By that time, we had GitHub Copilot available to all developers. It was a once-in-a-lifetime moment: a big company such as Microsoft had a product ready before the big discussion in the market even happened.

Rikki Singh: Tell us more about your experience building Copilot and what you’ve learned so far.

Thomas Dohmke: The magic with Copilot is that it meets developers where they are. It doesn’t introduce a new concept to their interaction models; they’re still typing in their editor and writing code. If they don’t like the suggestion, they can just keep typing until they get something that is useful. It keeps developers in the flow.

Regardless of whether you’re a more experienced developer or a junior developer, you will have to look something up when you’re writing code. You have to figure out the right answer, copy and paste the solution, and then adjust it so it works with your code. Copilot is already in that workflow, so it finds and provides the solution and modifies it until it works. By using an auto-completion workflow and keeping developers as the pilot in charge of that workflow, we could offer the full value of gen AI to software developers.

In fact, we conducted a case study with about 100 developers, in which 50 had Copilot and 50 didn’t have it. We asked them to build an HTTP web server using JavaScript. The group using Copilot was 55 percent faster than the other group: they got the project done in an hour and 10 minutes, compared with two hours and 40 minutes. Not only that, but the group with Copilot also had a higher success rate: 78 percent of them got it done, compared with 70 percent of the group without Copilot. Modern developer productivity means we can get more stuff done in the same amount of time.
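Taking the rounded times above at face value, the implied saving is

\[ \frac{160\ \text{minutes} - 70\ \text{minutes}}{160\ \text{minutes}} \approx 0.56, \]

which is roughly the quoted 55 percent reduction in the time needed to complete the task.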

At GitHub, our mission is to increase developer happiness because we believe that if developers are not happy—if they’re grumpy, frustrated, or bogged down by the system—they’re not going to be creative. Developing is a creative job. Building things is the fun part of development; going through a thousand linter errors is not. We can replace our linters with AI.

Not only are CTOs [chief technology officers] and CIOs [chief information officers] happy, but developers are, too. We did a survey of more than 2,000 developers, and 75 percent of them told us they’re more fulfilled, expend less mental effort, do fewer menial tasks, and feel better and more creative when using Copilot.

Rikki Singh: What other implications might the adoption of Copilot have on software development?

Thomas Dohmke: Slowly but surely, more companies are allowing Copilot as part of interview loops, and more professors are allowing Copilot and ChatGPT to be part of learning. The space went from thinking these technologies were evil and a form of cheating to realizing that people who don’t know how to use AI or ChatGPT are not a fit for a company that understands these technologies and is ramping up its usage of them.

Moreover, Copilot can teach people coding without them having to learn English first. Software development is predominantly done in English, and if you file an issue on GitHub, you have to speak English. If you grew up in Germany like me, or in any other non-English-speaking country, you had to learn English before you could code. Now, you can use Copilot to learn coding first. I’m convinced that, in the future, kids will learn coding before they learn English.

Copilot can also teach people how to write better code. For instance, we looked at acceptance rate, which measures how often someone presses the tab key to accept a suggestion from Copilot. Less experienced developers have a higher acceptance rate than more experienced developers, because if you’re less experienced, you have to ask more questions. Regardless of whether you’re experienced or not, your acceptance rate goes up over time; that shows you’re learning to write better code. Copilot can also help with story-pointing issues and can write a summary of a pull request once the code is done.
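As a concrete illustration of the metric, here is a minimal sketch of how an acceptance rate could be computed from completion telemetry; the event schema and field names are hypothetical, not GitHub’s actual instrumentation.

```python
from dataclasses import dataclass


@dataclass
class SuggestionEvent:
    """One code suggestion shown to a developer (hypothetical schema)."""
    developer_id: str
    accepted: bool  # True if the developer pressed Tab to accept it


def acceptance_rate(events: list[SuggestionEvent]) -> float:
    """Share of shown suggestions that were accepted."""
    if not events:
        return 0.0
    return sum(e.accepted for e in events) / len(events)


# Example: three of four suggestions accepted -> 75 percent
events = [
    SuggestionEvent("dev-1", True),
    SuggestionEvent("dev-1", False),
    SuggestionEvent("dev-2", True),
    SuggestionEvent("dev-2", True),
]
print(f"Acceptance rate: {acceptance_rate(events):.0%}")
```

Tracked per developer over time, a rising value is the learning signal described above.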

All these things will be supported by AI to free up our time to build more for this world.

Balancing intellectual property and innovation

Shivam Srivastava: A lot of engineering leaders tell us they hear concerns about intellectual property [IP] infringement from their legal counterparts. How do you approach that?

Thomas Dohmke: First, I’ll provide clarity on how customer data is shared with Microsoft. The way Copilot works is that when someone types in their editor, the information is sent to a back end that is owned and operated by GitHub. So there is no sharing of what people send to GitHub with OpenAI or other teams at Microsoft. It is similar with your source code: your repositories are processed for the sake of giving you a code completion, but there is no transfer of ownership of what you’re writing, and it does not become part of what OpenAI is trained on.

The second piece is legal oversight. The general counsel at Microsoft created a blog post about indemnification, which means providing compensation for harm. Microsoft has committed to giving all Copilot for Business customers uncapped indemnification for any output generated by these models to guarantee the safety of their IP.

Third, we built a feature that compares generated code with all the public code on GitHub. If a 150-character chunk of a suggestion matches any of the public code on GitHub, Copilot won’t suggest it. That way, we prevent developers from unknowingly reproducing code that already exists in public repositories on GitHub. Only about 1 percent of suggestions fall into that category and get filtered, so it doesn’t happen often. A new version of that feature shows developers a preview of the matching code instead of blocking it, so they can see the list of repositories and the licenses. That way, they can determine whether those licenses are compatible with their project and can attribute those lines of code if they end up using them.
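To make the mechanics concrete, here is a rough sketch of that kind of duplication filter, assuming a pre-built set of 150-character windows taken from public code; GitHub’s actual implementation is not public, so treat this only as an illustration of the idea.

```python
CHUNK_LEN = 150  # the threshold described above


def build_public_index(public_files: list[str]) -> set[str]:
    """Collect every CHUNK_LEN-character window that appears in public code."""
    index: set[str] = set()
    for text in public_files:
        for i in range(len(text) - CHUNK_LEN + 1):
            index.add(text[i:i + CHUNK_LEN])
    return index


def matches_public_code(suggestion: str, index: set[str]) -> bool:
    """True if any CHUNK_LEN-character window of the suggestion appears verbatim
    in the public-code index; such suggestions would be blocked or annotated
    with their source repositories and licenses."""
    return any(
        suggestion[i:i + CHUNK_LEN] in index
        for i in range(len(suggestion) - CHUNK_LEN + 1)
    )
```

At GitHub’s scale an exact sliding-window set would be far too large to hold in memory; a production system would presumably normalize whitespace and rely on hashing or an index service, but the filtering decision has the same shape.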

There’s going to be a lot of innovation in that space. We’re looking to see how much code we’re already reusing across the space. One day, maybe we will have Copilot embedded into applications so it can execute the code at runtime. Then the whole IP question may fully dissolve.

Rolling out gen AI capabilities among teams

Shivam Srivastava: Are some teams or organizations adopting gen AI particularly well? If so, what are they doing? How are they changing their development experience and the development teams?

Thomas Dohmke: Microsoft has more than 50,000 engineers on Copilot, which effectively means every team at Microsoft is using Copilot to some degree. At Microsoft, we had a small team adopt it first, and then we did an organic rollout. We created a landing page where people could try Copilot themselves. What often works well with large companies is to have office hours, where teams provide weekly informal check-ins to assess how the program is working before it’s rolled out to the rest of the organization.

For people who live in markets outside Silicon Valley and are not as highly paid as we are in the Western world, there’s an even stronger focus on catching up and realizing productivity gains. Mercado Libre in Argentina, for example, went from a proof of concept to integrating Copilot into its tool set and buying it for everybody. In these cases, companies cannot go another day without that kind of productivity. They want to be the leader in their space.

Shivam Srivastava: In terms of productivity, are there concerns around burnout and engineers being asked to do too much?

Thomas Dohmke: There are two sides to every coin. We have invented these technologies—video messaging platforms and async communication—to make it possible to work outside of an office. GitHub, for example, is a remote company, but we haven’t figured out how to remove ourselves from all this communication. I recently read an interview with Lewis Hamilton, the Formula One driver. He said he put his iPhone away in a safe on a family vacation so he could be present and occupy his time in other ways. We’re so addicted to our phones that they’re hard to put down. We as humans need to evolve, and that starts with ourselves. The technology won’t help us to get away from technology. Practices and talking about it will help. And I’m the worst offender in many ways, too.

The other side of that is assessing how we spend our time as developers. We know that most developers don’t write code for more than two to four hours a day. The rest of the day, they’re doing emails, stand-ups, DevOps and live-site reviews, debugging live incidents, and planning ahead. There are many other things we have to do. So removing the boilerplate tasks and having AI do them for us, so we can build bigger systems, will be incredibly gratifying and create enough energy to overcome feelings of burnout.

Shivam Srivastava: What metrics are you using to measure success? How can other companies ensure they’re improving their productivity by 55 percent as well?

Thomas Dohmke: The caveat with the 55 percent is that it measured the end-to-end time from getting a task to solving it in one case study. We have other things to do before we get an end-to-end 55 percent productivity improvement. For instance, we don’t yet use AI in pull requests, test generation, or CI/CD [continuous integration and continuous delivery]. For now, I think a 5 to 10 percent productivity improvement is still pretty good.

At GitHub, we measure success by tracking the number of accepted lines of code and the acceptance rate, which are leading indicators. We also conduct surveys about our engineering systems to track developer happiness, and we run employee polls and pulse surveys.

The future of gen AI and the skills required to leverage it

Shivam Srivastava: Copilot is part of a broader vision you’re driving with AI and with search and code. Looking three to five years ahead, what do the developer workload and experience look like?

Thomas Dohmke: It feels a bit like the 1990s, when the dot-com boom happened. We’re in the AI boom right now, which makes it harder to predict where things are going. But it’s clear that there are certain tasks we will do only with AI, such as reading code, explaining code, and finding SQL [structured query language] injections in code. You can ask Copilot to find a security vulnerability, and it finds the SQL injection and explains it to you. You can even ask it how an attacker would exploit that SQL injection. Security problems will shift from the developer to the editor, so the software can solve issues before they appear. Productivity, workflow, security scanning, and linters will eventually be more or less automated with AI. Developers will get a powerful assistant that lets them uplevel to become systems thinkers.
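To make the SQL injection example concrete, the snippet below shows the kind of vulnerable pattern an assistant can flag and explain, using Python’s built-in sqlite3 module; the table, data, and variable names are purely illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin'), ('bob', 'user')")

user_input = "alice' OR '1'='1"  # attacker-controlled value

# Vulnerable: concatenating input into the SQL string lets the
# OR '1'='1' clause return every row, not just one user.
vulnerable = conn.execute(
    "SELECT * FROM users WHERE name = '" + user_input + "'"
).fetchall()

# Safe: a parameterized query treats the input as a literal value.
safe = conn.execute(
    "SELECT * FROM users WHERE name = ?", (user_input,)
).fetchall()

print(len(vulnerable), len(safe))  # 2 rows vs. 0 rows returned
```

An assistant asked to find the vulnerability would point at the string concatenation and suggest the parameterized form.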

Shivam Srivastava: What would be your advice for current computer science students or experienced developers who are thinking about their next set of necessary skills?

Thomas Dohmke: Being a software developer is the best job in the world because you can build things with almost no investment. So many founders from nowhere have found success in this space. So expand your skills as software developers, find your happiness, and find what you love doing. If you have to do menial tasks every day, figure out how you can automate them and leverage AI. I think we will create exciting innovations that way. That’s what’s driving us as software developers. The world has a lot of challenges, whether it’s climate change, wars around the world, or cancer, and AI will certainly help with a lot of these issues. For instance, there are start-ups that are trying to create vaccines that are adapted for specific body proteins. There’s so much to build. There’s never been a more exciting time as a software developer, so keep building and learning. And the truth is that it’s never been easier to learn as a software developer with the documentation that is available.

Shivam Srivastava: Will Copilot or gen AI tools work efficiently with hardware languages like Verilog? More broadly, if you’re an embedded software developer, when should you expect similar benefits?

Thomas Dohmke: There are three potential solutions. One is training a specific model for this, which is expensive. At the same time, the innovation happening on those large models is so fast that it’s hard to keep up with; this is not our preferred option. Second, you could fine-tune models by taking your additional data and tuning the billions of parameters to be more aligned with that specific language, such as CUDA [compute unified device architecture] in the GPU [graphics processing unit] ecosystem. This option still takes time and is expensive.

The third approach is what we believe to be the most efficient: it’s called skill embeddings. In this option, you can feed additional context into the editor to inform it, such as QuickBooks files or a repository. You can copy and paste information and then ask the system to summarize it. Or you can train the model to pull that information from an API. Then it can pull all that additional information into the context of your prompt and provide that information to the model.
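The pattern described here, retrieving extra context and placing it in the prompt rather than retraining the model, can be sketched roughly as follows; the retrieval logic and all names are hypothetical placeholders, not a real Copilot API.

```python
def retrieve_context(question: str, sources: dict[str, str], max_chars: int = 2000) -> str:
    """Pick the source snippets most relevant to the question.

    Naive keyword overlap stands in for what a real system would do with
    embeddings or an API call to the tool that owns the data (an issue
    tracker, a database schema registry, a hardware-language toolchain).
    """
    words = set(question.lower().split())
    ranked = sorted(
        sources.items(),
        key=lambda item: -len(words & set(item[1].lower().split())),
    )
    picked, total = [], 0
    for name, text in ranked:
        if total + len(text) > max_chars:
            break
        picked.append(f"## {name}\n{text}")
        total += len(text)
    return "\n\n".join(picked)


def build_prompt(question: str, sources: dict[str, str]) -> str:
    """Assemble the final prompt: retrieved context first, then the question.
    Only the prompt changes; the underlying model is untouched."""
    return (
        "Use the following project context to answer.\n\n"
        + retrieve_context(question, sources)
        + "\n\nQuestion: " + question
    )


sources = {
    "schema.sql": "CREATE TABLE orders (id INTEGER, customer TEXT, total REAL);",
    "README": "Internal billing service. Orders are exported nightly.",
}
print(build_prompt("Write a query that sums order totals per customer", sources))
```

The design point is the one Thomas makes: the base model stays the same, and the relevant project-specific knowledge rides along in the prompt at inference time.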

The challenge in the user interface will be for VS [Visual Studio] developers—or the editor itself—to specify the context that is needed to answer the question. Inference time depends on how much data you provide, so there’s a trade-off between accuracy and latency. But there are easy solutions for this. You can command Copilot to create a Jupyter Notebook with all that additional information, for instance. Regardless of your hardware languages, database schemas, or C++ libraries, Copilot can be the agent that embeds this information to fulfill the prompt.
