Forward Thinking on artificial intelligence with Microsoft CTO Kevin Scott

In this episode of the McKinsey Global Institute’s Forward Thinking podcast, MGI’s James Manyika explores the implications of artificial general intelligence (AGI) for jobs, particularly in rural America, with Kevin Scott, Microsoft’s chief technology officer and author of Reprogramming the American Dream: From Rural America to Silicon Valley—Making AI Serve Us All (HarperCollins, 2020).

An edited transcript of this episode follows.

Michael Chui: We’ve been hearing for a long time that robots are coming for our jobs. Now, with widespread global unemployment due to COVID-19, that sounds even more ominous. But what if robots and AI could, in fact, help with recovery? Well, it’s possible. For instance, in some rural parts of the US, artificial intelligence and machine learning are making these regions more economically viable.

One of the big topics we analyze at the McKinsey Global Institute (MGI) is artificial intelligence, how it’s impacting work, and what that means for society.

In this episode of Forward Thinking, we’ll hear an interview with one of the leading technology strategists in the world: Kevin Scott. Kevin is the chief technology officer and executive vice president of artificial intelligence and research at Microsoft. He also has a new book out called Reprogramming the American Dream.

The interview is conducted by MGI’s own James Manyika, who is a co-chairman and director of the McKinsey Global Institute, and a senior partner at McKinsey & Company. He’s also a deep expert in his own right when it comes to artificial intelligence and machine learning. That’s why James sat down with Kevin to discuss how AI might be the key to democratizing technology to work better for all of us. It’s a fascinating conversation. See what you think.

Kevin Scott: Thank you so much for having me, James.

James Manyika: Delighted to have you. I’ve been looking forward to this conversation for some time. We’re going to spend a fair amount of time discussing your book, but first I wanted to talk about what you’re working on right now. You’re building some of the largest, most complicated computer systems in the world. And much of that is being applied to AI systems. What are you most excited about?

Kevin Scott: There are many things that we’ve been working on for the past couple of years that I’m excited about, including these large-scale computing platforms for training a new type of deep neural network model. Some people called them unsupervised—we’ve taken to calling them self-supervised learning systems.

It’s been really thrilling to build all of the systems infrastructure to support these training computations that are absolutely enormous, and to see the progress made on these large self-supervised models and with deep reinforcement learning. I never thought we would get to some of the milestones that we’ve been able to hit over the past couple of years.

James Manyika: Well, Kevin, as always, you’ve already said a lot that’s interesting in your opening remarks. Say more about supervised learning for a moment. This is the idea that the AI techniques that we use today mostly learn from examples that we give them. You’re talking about going beyond that to self-supervised or unsupervised systems. Why is that such a big shift?

Kevin Scott: Around 2012 or so, the big revolution began happening with deep neural networks in machine learning, and these models doing supervised learning have been able to accomplish a lot in speech recognition and computer vision and a whole bunch of these perceptual domains. We very quickly went from a plateau that we had hit with the prior set of techniques to new levels, that in many cases approximate or exceed human performance at the equivalent task.

The challenge with supervised learning is that you need a lot of data and a lot of computational power. And the data that you train on is labeled. With self-supervised models, you also are training on huge amounts of data. And you need enormous amounts of compute. But you require very little or, in some cases, no supervision. No labels, no examples, so to speak, to tell the model what it is that you want it to do. The models are training over these enormous amounts of data to learn general representations of a particular domain. Then you use them to solve a whole variety of problems in that domain. This has absolutely transformed natural language processing over the past couple of years.
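Editor's illustration: the idea that self-supervised models need "no labels" can be sketched in a few lines. The snippet below is a minimal sketch, not the method used for any model discussed in this interview. It assumes a simple next-word prediction setup, and the function name `next_word_pairs` and the `context_size` parameter are hypothetical. The point is only that the training target comes from the raw text itself rather than from a human annotator.

```python
def next_word_pairs(text, context_size=3):
    """Turn raw, unlabeled text into (context, target) training pairs.

    The "label" for each example is simply the next word in the text,
    so no human annotation is needed.
    """
    words = text.split()
    pairs = []
    for i in range(context_size, len(words)):
        context = tuple(words[i - context_size:i])
        target = words[i]
        pairs.append((context, target))
    return pairs


# Every sliding window over the sentence yields one training example.
pairs = next_word_pairs("the cat sat on the mat", context_size=2)
for context, target in pairs:
    print(context, "->", target)
```

A real system would feed pairs like these to a neural network over far larger amounts of text; the sketch stops at the data preparation step.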

James Manyika: I remember last year when you published your results with Turing-NLG (Turing Natural Language Generation). That was quite impressive.

Kevin Scott: The really interesting thing is that you don’t have the constraint of having to supply these models with large numbers of labeled training data points or examples. You really are in this mode where the models are scaling up, mostly as a function of the compute power you can apply to them.

James Manyika: When we start training systems with these very large-scale models, does that help us with another problem—the possibility of transfer learning? Because that hopefully will obviate the need to train systems every single time.

Kevin Scott: The exciting thing that we’re seeing with these big self-supervised models is that transfer learning does work. That means that you can train a model on a general set of data and then deploy it to solve a whole variety of different tasks.

James Manyika: That also takes us down the path of potentially democratizing technology access.

Kevin Scott: That’s one of the things that I wrote about in my book, and it’s perhaps the primary thing that gets me out of bed every morning and makes me excited to go to work. I really do believe that when we’re thinking about technology, we should always be thinking about what platforms we can create that empower other people to solve the problems that they are seeing, and to help them achieve what they want to achieve.

It can’t only be a small handful of large companies, or companies that are only located in urban innovation centers, that are able to make full use of the tech that we’re developing to solve problems. It really does have to become democratized. What I’ve been telling folks as I’ve talked about the book is that in 2003 or 2004, when I wrote my first real machine learning system, you really did need a graduate degree in an analytical discipline. You would sit down with these daunting graduate-level textbooks and stacks of papers. You had to be relatively mathematically sophisticated in order to understand how the systems worked. Then you would spend a whole bunch of time and energy writing lots of low-level code to do something. I went through this process for that first system I wrote, and it took about six months of coding to solve the problem that I was working on.

Fast-forward 16 years to 2020. Because of open-source software, because we’ve thought about how to solve these problems in a platform-oriented way, because we have cloud computing infrastructure that makes training power accessible to everyone, and because you have things like YouTube and online training materials that help you more quickly understand how all of these framework pieces work, my guess is that a motivated high school student could solve in a weekend the same problem that took me six months more than 16 years ago. That really is a tangible testament to how far these tools have already become democratized. All the indicators point to the fact that they’re going to become further democratized over the next handful of years.

James Manyika: Kevin, as you know, I’m involved with the Broad Institute, which is one of the leading genomics research institutes in the world. And today, fully a third of the people there are AI computational people. That’s become necessary and integral to the research enterprise that places like the Broad are doing. But before we leave the question of AI and computing, I have to ask this question. Where do you think we are on this path towards AGI? And here I’m not really asking you to make a prediction, because I know that’s difficult. But I’m curious to hear where you think we are on that journey and what you believe are some of the big problems we have to solve before we can even get to AGI.

Kevin Scott: Ever since the Dartmouth workshop in 1956, where they invented the name for the field, the whole history of AI has been about attempting to create AGI. At that workshop, the luminaries of the AI discipline of computer science met and laid out a road map for building software that could emulate, in a very general way, human intelligence.

That has proven to be a very difficult task. It’s unclear exactly how many problems of intelligence you can solve with more data and more compute, which I think is one of the reasons why it’s tricky to make accurate predictions about when you get to general intelligence.

Every time that we have used AI to solve a problem that we thought was some high watermark of human intelligence, we have changed our mind about how important that watermark was. When we were both much younger, when I was in graduate school, the problem people were trying to address was whether we could build a computer, an AI, that can beat a grandmaster at chess.

Turns out the answer to that was yes. And we did it. There was a whole bunch of fanfare at that time. I think it helped us a little bit. It shone a light on how you could advance a particular part of artificial intelligence. But it certainly didn’t mean that the machines were suddenly taking over the role that human beings played.

It really hasn’t even made a material dent in chess, other than some of the techniques that we built in our AI are now used to help humans practice to become better human chess players. That’s actually the thing we care about: humans playing humans at chess.

James Manyika: Exactly. I remember when I was a grad student researching and studying AI, we used to think about Turing tests and all kinds of tests. But we keep moving the goalposts, as you said. Every time we solve a problem, we shift what we think the watermark really is. Which leads me to the subject matter and the topics you address in your book. One of the things I loved about your book is that it also gave me a window into you, Kevin, because I’ve always thought about you as a technologist building these very large-scale systems. But you grew up in a place that most people don’t associate with technology. You grew up in a small town in Virginia. Say more about that.

Kevin Scott: I grew up in this small town in rural central Virginia called Gladys. I’m fairly certain that there are more cows in Gladys than there are residents. There’s one state route that runs through the town. There isn’t even a stoplight to slow traffic down as it flows through. It’s a farming community. It is not a place that was associated with technology.

Neither of my parents had gone to college. I was the first person in my family to get a four-year college degree. I grew up in the ’70s and ’80s. When I was born, there was no personal computer.

There’s a lot of good luck that I benefited from in my career and in my life. I was between ten and 12 years old when the personal computer really started to hit—remember the Commodore C64, the RadioShack color computers, and the TRS-80s and the Apple IIe? You had computers on the shelves in department stores. And I was just completely fascinated by these things. I wanted to understand how they worked.

James Manyika: That’s fascinating. I remember I had the Sinclair Spectrum. I don’t know if you’re familiar with it—we had it in the UK and other parts of the world. One of the claims that you make in your book that is very exciting is this idea that technology, and AI in particular, can bring prosperity to all parts of America, including rural America. Tell me more about why you think that’s possible.

Kevin Scott: When I started writing the book, I had been doing machine learning for so long in a bunch of different contexts. I had been living in Silicon Valley and working in the technology industry for such a long time that I really had this idea in my head that maybe these technologies weren’t going to benefit people in rural America.

One of the first things I did was to go back home and chat with some people I grew up with. And I had this “aha” moment, almost the second that I set foot in some of these places that my friends were working. I was just reminded that these are some of the most ingenious and industrious people that I know.

They were already running businesses. They had pivoted with all the twists and turns that the economy had thrown at them and built businesses that were already using the most advanced technology that they could lay their hands on.

What’s more, I believe that the machine learning systems are going to get exposed to entrepreneurs who are in these communities in more concrete ways, so that they can build even more ambitious things.

If they didn’t have this technology, these businesses wouldn’t exist. The reason that they are competitive in this fierce global market for manufacturing is because the automation that they can leverage is just as efficient no matter where it’s running geographically.

They created a whole bunch of high-skilled jobs in this tiny little community in central Virginia that wouldn’t exist otherwise. With more high-skilled jobs, they create this beneficial effect inside of the community. Enrico Moretti wrote this wonderful book called The New Geography of Jobs. Enrico is an economist at the University of California, Berkeley. In his research, he posited that a single high-skilled job can create five lower-skilled jobs inside of the community where the high-skilled job is created. And you can see that economic effect in some of these places.

James Manyika: I guess the question in my mind is, why don’t we see this happen in more places?

Kevin Scott: Well, the thing that I saw—and this is, granted, just anecdotes—is that it’s happening in more places than I thought. As soon as I saw this pattern, I thought, wow, where else might this be happening?

You can see it at scale in Germany with the Mittelstand, which typifies this model of taking high-skill, highly trained labor and augmenting it with really sophisticated technology, whether it’s a manufacturing business or a services business, or whatever.

You have lots of these businesses in Germany creating lots of economic output. In some ways, I think the Mittelstand is this pillar of the German economy. And when I started looking here in the United States, I saw more of these sorts of businesses than I expected.

One of the challenges in getting more of these businesses running in communities is capital allocation. Do the entrepreneurs in these communities have reasonable access to venture capital, so that they can try out their most ambitious ideas?

Then you have this basic stuff that’s just shameful that it isn’t already solved, like access to broadband, or the vocational education required to ensure people can use these tools effectively to do the work of the future.

James Manyika: Think about all the examples you’ve got in your book and coupling those together with the moment that we’re in. We’re now in this extraordinary moment where it feels like the economy fell out from underneath us.

At the same time, there’s all this transformation that’s required. What do you think we could do, and America could do, in this moment in order to capitalize on the ideas you’ve got in the book, and also use this galvanizing moment to reimagine what the future might be? Any thoughts?

Kevin Scott: One of the things that I wrote about in the book is this idea that government investment can be a good catalyst for innovation. Think about the self-driving industry, for instance, and all these autonomous vehicles. I would argue that there is a primary reason that these ingenious people decided when they were graduate students at Stanford and Carnegie Mellon to focus on solving that problem of how do you get a vehicle to be able to drive itself. At the time there were these DARPA Grand Challenge problems. You had funding that was going to graduate schools and a prize at these milestones toward solving this problem.

I think that could be applied in tangible ways to helping solve some of the big challenges that we, as a society, face. You could even go much more ambitious than something like a DARPA Grand Challenge. And I don’t think it’s an either/or. Maybe you want to do a bunch of these things.

We ran the Apollo program in the ’60s not because there was anything especially necessary about putting a human being on the moon, but because solving that problem was a great way to focus human ingenuity at a massive scale on a set of technologies that turned out to be very beneficial. For example, our modern aerospace industry came out of the Apollo program.

I think we could pick a challenge like healthcare. We could say, enough is enough, it’s time that every human being on the planet has access to high-quality, low-cost healthcare. And here’s this list of diseases and conditions that we want to radically transform. Let’s cure, eliminate, minimize the impact and suffering from these things and spend at the same level of the Apollo program. Maybe you don’t even need to spend that much.

It’s not a huge amount. It’s about 2 percent of GDP for a handful of years. I think you really could transform not just human well-being through the end product of what you’re building. The process of solving the problem could put into place this infrastructure that could also define entire new sectors of the industry and our economic outputs for decades ahead.

James Manyika: I couldn’t agree more. It’s quite sobering to remember, to your point, Kevin, that the peak of investment in overall federal R&D funding as a percentage of GDP was in 1964. That’s when it was close to 2 percent. It stayed there for a while. Today, at least in the US, it has dropped to about 0.6 percent. I think there’s a role, as you said, for what reallocation in investment could do to drive the change that we’re talking about.

But let me ask this. There’s clearly so much that society and the economy could benefit from democratizing technology and innovation and doing it at a very large scale. But there’s also always the concern about potential misuse, misapplication, or risks associated with technology. How do you think about that question? How can we be more thoughtful and ensure we don’t misapply these technologies?

Kevin Scott: One of the ways that I think about it is that, as we invented software engineering as a discipline over the course of the past 60 years or so, we realized that finding all the bugs in software is hard. We built a whole bunch of practices to try to catch the most common type of software bugs. We created a set of techniques to help us mitigate the impact that the bugs that slip through will have.

We’re going to have to build a similar set of things for machine learning models and AI. To give you a few examples of what we’re thinking about, we at Microsoft have a sensitive uses practice for AI now. For anybody building a machine learning model that’s going to be used in a product, the company has a set of guidelines that define what is or what isn’t a potential sensitive use of that technology.

If it is a sensitive use, then it has to go through a very rigorous review process to make sure that we are using the technology in ways that are making fair and unbiased decisions, that the data that we’re using to train the model and that the data that we’re collecting as a byproduct of the use of the model is treated in a proper way, preserving all of our covenants that we have with all of our stakeholders, having the degree of transparency and control that you need in the most sensitive situations.

Bias in data is something we’ve talked a lot about as a community over the past handful of years, and we now have tools that can detect when a data set has fundamental biases in it. We’re using GANs (generative adversarial networks), which are another type of neural network, to generate synthetic data to compensate for those biases, so that we can train on unbiased data sets even though we may not have representative data that is naturally occurring in the data set to help us train the model in the way that we want.
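Editor's illustration: the two steps described above, detecting that a data set is skewed and then adding data to compensate, can be sketched in miniature. This is a minimal sketch under loud assumptions: it swaps in plain duplication of existing records (simple oversampling) where the interview describes GAN-generated synthetic data, and the function names `group_shares` and `oversample_to_balance` are hypothetical. It is not a description of Microsoft's tools.

```python
from collections import Counter
from itertools import cycle, islice


def group_shares(records, key):
    """Return each group's share of the data set (a crude skew check)."""
    counts = Counter(r[key] for r in records)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}


def oversample_to_balance(records, key):
    """Repeat records from smaller groups until every group matches the largest.

    Assumption: duplication stands in for the synthetic data a GAN would generate.
    """
    groups = {}
    for r in records:
        groups.setdefault(r[key], []).append(r)
    target = max(len(rows) for rows in groups.values())
    balanced = []
    for rows in groups.values():
        # cycle() loops over the group's rows; islice() stops at the target size.
        balanced.extend(islice(cycle(rows), target))
    return balanced


data = [{"group": "A"}] * 8 + [{"group": "B"}] * 2
print(group_shares(data, "group"))       # group B is underrepresented
balanced = oversample_to_balance(data, "group")
print(group_shares(balanced, "group"))   # both groups now have equal shares
```

Duplication only rebalances counts; it adds no new information, which is one reason generative approaches like the one Kevin describes are attractive.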

It’s a whole spectrum of things, and I think the dialogue that we’ve got right now between stakeholders, people building these models, and people who are analyzing them and sort of pushing us and advocating for accountability—all of that’s good. It’s a good thing that’s happening, and I’m delighted that we have this ongoing debate.

James Manyika: One of the things I like about what you’re describing, Kevin, is that you’re emphasizing the fact that there isn’t a silver bullet solution to these issues. That it’s going to take concerted effort by the engineers, the scientists, the ethicists, a whole range of people thinking together about how to make sure these systems are safe. That we get all the benefits that we should be getting out of them.

Kevin Scott: Absolutely.

James Manyika: Kevin, before we go, I wanted to hear more about the smaller-scale AI projects you’re working on. I’ve heard something about an AI coffeemaker. Does it know how much caffeine you need, by the way?

Kevin Scott: I think that’s an incomputable thing! I need lots of caffeine. I really can’t help myself. One of the ways that my curiosity manifests as I try to understand the world is, like, I just want to build things. It’s something that I get from my grandfathers and from my dad, who spent their entire lives making things with their hands, in their work and in their hobbies.

I weirdly have this thing for coffee machines. I’ve built four of them in the past, and I’m working on one right now, which is a vacuum siphon coffee machine that has an AI user interface. Which creates this cognitive dissonance when I’m trying to slam these two things together.

Vacuum siphon coffeemaking has been around since Victorian England, potentially longer than that. But they were very popular in Victorian society. And when you look at one of these things, it is almost the definition of steampunk. I’m building this steampunk coffee machine that has a modern user interface on it. Instead of buttons and a screen, it has a speaker, a camera, and a microphone. And so, when you approach the machine, it uses facial recognition to see whether it recognizes you. If it doesn’t, it has a dialogue with you about how you want your coffee made. And then it offers to remember your preferences associated with your face. And if it recognizes your face because you’ve given it your preferences before, it will ask you “Kevin, would you like a cup of coffee?”
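Editor's illustration: the interaction Kevin describes is a small decision flow, sketched below. This is a toy model under stated assumptions, not his actual code. Face recognition is stubbed out as a plain `face_id` string, the spoken dialogue is reduced to function arguments, and the class name `CoffeeMachine` and its `approach` method are hypothetical.

```python
class CoffeeMachine:
    """Toy model of the dialogue flow; face recognition is stubbed out."""

    def __init__(self):
        # Maps a recognized face ID to (name, coffee preference).
        self.known_faces = {}

    def approach(self, face_id, name=None, preference=None, remember=False):
        # Recognized face: greet by name and offer the stored preference.
        if face_id in self.known_faces:
            known_name, known_pref = self.known_faces[face_id]
            return f"{known_name}, would you like a cup of coffee? ({known_pref})"
        # Unknown face: take the order, and store it only if the person agrees.
        if remember:
            self.known_faces[face_id] = (name, preference)
        return f"Making {preference}."


machine = CoffeeMachine()
first = machine.approach(
    "face-001", name="Kevin", preference="a strong siphon brew", remember=True
)
second = machine.approach("face-001")
print(first)
print(second)
```

Note that preferences are stored only when `remember` is true, mirroring the machine offering, rather than assuming, to remember a face.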

The funny thing about this machine is that the hard part wasn’t the AI. All the speech recognition, face recognition, and the code that you need to store preferences and whatnot is not that complicated, given open-source tools. The electronics required to run the stuff are a little bit pricier than you would want to use in a consumer-grade coffee device. The fascinating thing is that it takes about $30 worth of electronics right now to run the AI part of the machine, plus the control loop for the rest of the device. The compute there is still one of the places where Moore’s Law is working well. These low-cost microprocessors and microcontrollers are on silicon fabrication technology that’s a few generations behind.

And those cheap microcontrollers and microprocessors are going to be getting much more capable over the coming years. I think we might get to the point soon where you can do all of the stuff that I’m doing for three bucks or a dollar’s worth of electronics, which then makes it a very feasible way to build a user interface for something. It’s coming.

James Manyika: That’s very cool. Well, I can’t wait to try out the coffee. Hopefully, it’s actually good coffee too.

Kevin Scott: We will see [laughs]!

James Manyika: Well, Kevin, I want to thank you again so much for joining us. This was such a pleasure for me and for hopefully the audience that’s going to listen to this. I’m looking forward to catching up again and continuing our conversations.

Kevin Scott: Thank you so much for having me. And as always, it’s a pleasure chatting with you.
