Changing the game: “How I beat Watson and came out a different player”


When the producers of Jeopardy! e-mailed me in October 2009 to ask if I wanted to spar against Watson, IBM’s game show–savvy supercomputer, my first thought was: “I’m going to lose.” My apprehension was reinforced several months later when, while signing the hefty nondisclosure agreement, I noticed that no one who had won more than two episodes of the show was allowed to face off against the machine. (I had won twice the previous fall.) It was then that I realized I was the trivia equivalent of chum, fated to be tossed overboard as bait for a still-adolescent Watson.

In the months before Watson faced off against his primary human opponents, 74-time champ Ken Jennings and Brad Rutter, the quiz show’s highest-earnings player, IBM recruited several hundred former contestants to playtest it on all of the soft skills required to win at Jeopardy!, including buzzer speed and betting strategy. The sparring matches were staged in a mock studio off the lobby of the company’s Thomas J. Watson Research Center (named after IBM’s founder), in Yorktown Heights, New York, where the real-life Watson’s famous dictum “THINK” is mounted in the lobby. The building’s vintage 1960s, Eero Saarinen–designed interiors are reminiscent of 2001: A Space Odyssey, which felt appropriate, considering we had come to see a modern-day HAL.

The studio had been hastily constructed in the room adjacent to Watson’s hardware, a cluster of IBM POWER7 servers comprising 2,880 processing cores stuffed full—really full—of facts and algorithms. His brains were hidden behind a curtain and double-paned windows, the dull roar of his cooling fans still faintly audible. As far as actual game play was concerned, verisimilitude was the goal—the contestant podiums and buzzers were near-perfect replicas of those on the show, and IBM had even hired an archly funny host (actor Todd Alan Crain) to stand in for real host Alex Trebek.

What we didn’t know at the time was that IBM and Jeopardy!’s owners at Sony Pictures Entertainment had been battling for months over the details of the televised matches, including whether Watson would need to buzz with an electro-mechanical hand or not (he would). It had been Jeopardy!’s decision to stall Watson’s development by feeding it substandard opponents at first.

How Watson “thinks” has been covered at length elsewhere, most notably in Stephen Baker’s recent book, Final Jeopardy: Man vs. Machine and the Quest to Know Everything (Houghton Mifflin Harcourt, 2011), and in the NOVA episode “Smartest Machine on Earth.” My sparring rounds also make an appearance in Baker’s book, in which he describes me “thrashing” Watson. In fact, I won all three of my matches against the machine and was his only opponent to do so.

While I was pleased to win, what stands out for me about our matches is how easily Watson trained me to play the game his way. Yes, I won, but my confrontation with an alien mind changed me as a player.

To beat Watson, I was convinced I’d need to play a perfect game. His speed on the buzzer would be lightning-fast, and with a full 30 seconds to calculate an answer in Final Jeopardy, I feared he would be invincible. My only hope was to steer him into categories where the clues would be full of allusions and wordplay—and pray that the semantic difficulty would trip him up long enough for me to buzz in first and bet big. Otherwise, he would wage a war of attrition, strip-mining clues off the board as I fell further and further behind. I would need to beat him to the Daily Doubles in order to launch the Hail Marys that would keep me in contention—and guarantee a lead going into Final Jeopardy.

The only problem with this strategy is that Watson’s assumed weaknesses were my weaknesses as well. Classic Jeopardy! categories like Rhyme Time and Before & After added an extra cognitive step that usually threw me off. My own style of play, which I had developed over a decade of playing Quiz Bowl (the team trivia format that had evolved from the old “GE College Bowl”), was to read the clues in two seconds or less, and only buzz if I immediately knew the answer, or had a strong guess—in other words, a practically subconscious series of calculations that produced an answer which “felt” right. (For example, during the first of my three televised appearances, I was trailing by a very large margin when I selected the last clue on the board, the final Daily Double. Wagering almost everything on a clue about Colonial Williamsburg, at first I drew a blank. After spending a few seconds desperately probing my memory banks, the correct response, “What is the House of Burgesses?” came to me with the physical sensation of dislodging itself from the roof of my skull and falling onto my tongue. I’ll never forget it.)

While my strategy turned out to be sound (and Watson wasn’t the invincible opponent I had expected), what struck me was how quickly I jettisoned long-established patterns of playing the game against human opponents to grapple with a computer who had no idea how to play “the right way.” This manifested itself during our matches in two ways.

The first was in how he chose clues. Most players start with the easiest, lowest-value ones and work their way down the board, learning from patterns buried in the clues. Not realizing this, or caring, Watson tended to start each new category with the highest-value clue and work his way up. As Baker notes in his book, “There was a logic to this. While humans heard all the answers, right and wrong, and learned from them, Watson was deaf to the proceedings. If it won the buzz, answered the clue, and got to pick another one, it could assume that it had been right. But that was its only feedback. Watson was senseless to all of the signals its human competitors displayed—the smiles, the gasps, the confident tapping of fingers, the halting speech and darting eyes spelling panic.”

My eyes started darting when I realized what Watson was doing—preying on our cognitive blind spot and vacuuming the highest-value clues off the board before we had a chance to gain our sea legs. But my response surprised me—by my second match, I was doing the same thing. Trailing Watson and my fellow human opponent at the start of Double Jeopardy, I chose the $2,000 clue in my category of choice: Nonfiction. As it happened, I had lucked into a Daily Double. I wagered everything on the following clue: “A 2009 biography of this builder of Grand Central Terminal calls him ‘the first tycoon.’” (Answer: “Who is Cornelius Vanderbilt?”) From there, I rallied to win the game.

Trailing again late in my third match against Watson, I abandoned all pretense of trying to beat him outright and began scouring the board for the remaining Daily Double, knowing I had to find it and nail it to have any chance of winning. I had never seen anyone adopt such tactics in all my years of watching Jeopardy!. Against humans, I might have been content to grind it out, but I was on the run, and scared. Once again, the plan worked, as I took the lead just in time for Final Jeopardy. The clue for the category “Olympic Venues” was: “At above 7,000 feet, this Western Hemisphere city had the highest altitude ever of a Summer Olympics host city.” Correct response: “What is Mexico City?” Both Watson and I got this one right, but thanks to my lead going in, I won the match.

I was hardly the first person to try to beat a computer at its own game rather than stick to a human one. World chess champion Garry Kasparov, in the third game of his match with IBM’s Deep Blue in May 1997, chose to open with the esoteric Mieses Opening in a deliberate attempt to drag the computer out of its well-practiced repertoire of openings. It worked, but required Kasparov to abandon his repertoire as well. The game eventually ended in a draw.

Kasparov lost that match three games later in crushing fashion, leading Newsweek to dub his defeat “The brain’s last stand.” Rather than be embittered by his loss and computing’s subsequent hostile takeover of chess, Kasparov has become a proponent of man–machine collaboration. In “freestyle” tournaments, human–computer teams running the most basic commercial software have managed to crush the best chess programs on the market, which in turn had crushed most grandmasters. “Having a computer program available during play was as disturbing as it was exciting,” he wrote in the New York Review of Books.

“The machine doesn’t care about style or patterns or hundreds of years of established theory,” he added. “It is entirely free of prejudice and doctrine, and this has contributed to the development of players who are almost as free of dogma as the machines with which they train. Increasingly, a move isn’t good or bad because it looks that way or because it hasn’t been done that way before. It’s simply good if it works and bad if it doesn’t. Although we still require a strong measure of intuition and logic to play well, humans today are starting to play more like computers.”

Ironically, IBM’s rationale for developing Watson was to make computers understand us better, or at least how we use language. What I gained from playing Watson was a better understanding of my own prejudices. In more than a decade of playing trivia games of one kind or another, I had the run of my life against him. He made me a better player.
