The Atlantic Monthly
In the race to build computers that can think like humans, the proving ground is the Turing Test—an annual battle between the world’s most advanced artificial-intelligence programs and ordinary people. The objective? To find out whether a computer can act “more human” than a person. In his own quest to beat the machines, the author discovers that the march of technology isn’t just changing how we live, it’s raising new questions about what it means to be human.
Brighton, England, September 2009. I wake up in a hotel room 5,000 miles from my home in Seattle. After breakfast, I step out into the salty air and walk the coastline of the country that invented my language, though I find I can’t understand a good portion of the signs I pass on my way—LET AGREED, one says, prominently, in large print, and it means nothing to me.
I pause, and stare dumbly at the sea for a moment, parsing and reparsing the sign. Normally these kinds of linguistic curiosities and cultural gaps intrigue me; today, though, they are mostly a cause for concern. In two hours, I will sit down at a computer and have a series of five-minute instant-message chats with several strangers. At the other end of these chats will be a psychologist, a linguist, a computer scientist, and the host of a popular British technology show. Together they form a judging panel, evaluating my ability to do one of the strangest things I’ve ever been asked to do.
I must convince them that I’m human.
Fortunately, I am human; unfortunately, it’s not clear how much that will help.
The Turing Test
Each year for the past two decades, the artificial-intelligence community has convened for the field’s most anticipated and controversial event—a meeting to confer the Loebner Prize on the winner of a competition called the Turing Test. The test is named for the British mathematician Alan Turing, one of the founders of computer science, who in 1950 attempted to answer one of the field’s earliest questions: can machines think? That is, would it ever be possible to construct a computer so sophisticated that it could actually be said to be thinking, to be intelligent, to have a mind? And if indeed there were, someday, such a machine: how would we know?
Instead of debating this question on purely theoretical grounds, Turing proposed an experiment. Several judges each pose questions, via computer terminal, to several pairs of unseen correspondents, one a human “confederate,” the other a computer program, and attempt to discern which is which. The dialogue can range from small talk to trivia questions, from celebrity gossip to heavy-duty philosophy—the whole gamut of human conversation. Turing predicted that by the year 2000, computers would be able to fool 30 percent of human judges after five minutes of conversation, and that as a result, one would “be able to speak of machines thinking without expecting to be contradicted.”
Turing’s prediction has not come to pass; however, at the 2008 contest, the top-scoring computer program missed that mark by just a single vote. When I read the news, I realized instantly that the 2009 test in Brighton could be the decisive one. I’d never attended the event, but I felt I had to go—and not just as a spectator, but as part of the human defense. A steely voice had risen up inside me, seemingly out of nowhere: Not on my watch. I determined to become a confederate.
The thought of going head-to-head (head-to-motherboard?) against some of the world’s top AI programs filled me with a romantic notion that, as a confederate, I would be defending the human race, à la Garry Kasparov’s chess match against Deep Blue.
During the competition, each of four judges will type a conversation with one of us for five minutes, then the other, and then will have 10 minutes to reflect and decide which one is the human. Judges will also rank all the contestants—this is used in part as a tiebreaking measure. The computer program receiving the most votes and highest ranking from the judges (regardless of whether it passes the Turing Test by fooling 30 percent of them) is awarded the title of the Most Human Computer. It is this title that the research teams are all gunning for, the one with the cash prize (usually $3,000), the one with which most everyone involved in the contest is principally concerned. But there is also, intriguingly, another title, one given to the confederate who is most convincing: the Most Human Human award.
One of the first winners, in 1994, was the journalist and science-fiction writer Charles Platt. How’d he do it? By “being moody, irritable, and obnoxious,” as he explained in Wired magazine—which strikes me as not only hilarious and bleak, but, in some deeper sense, a call to arms: how, in fact, do we be the most human we can be—not only under the constraints of the test, but in life?
THE IMPORTANCE OF BEING YOURSELF
Since 1991, the Turing Test has been administered at the so-called Loebner Prize competition, an event sponsored by a colorful figure: the former baron of plastic roll-up portable disco dance floors, Hugh Loebner. When asked his motives for orchestrating this annual Turing Test, Loebner cites laziness, of all things: his utopian future, apparently, is one in which unemployment rates are nearly 100 percent and virtually all of human endeavor and industry is outsourced to intelligent machines.
To learn how to become a confederate, I sought out Loebner himself, who put me in touch with contest organizers, to whom I explained that I’m a nonfiction writer of science and philosophy, fascinated by the Most Human Human award. Soon I was on the confederate roster. I was briefed on the logistics of the competition, but not much else. “There’s not much more you need to know, really,” I was told. “You are human, so just be yourself.”
Just be yourself has become, in effect, the confederate motto, but it seems to me like a somewhat naive overconfidence in human instincts—or at worst, like fixing the fight. Many of the AI programs we confederates go up against are the result of decades of work. Then again, so are we. But the AI research teams have huge databases of test runs for their programs, and they’ve done statistical analysis on these archives: the programs know how to deftly guide the conversation away from their shortcomings and toward their strengths, know which conversational routes lead to deep exchange and which ones fizzle. The average off-the-street confederate’s instincts—or judge’s, for that matter—aren’t likely to be so good. This is a strange and deeply interesting point, amply proved by the perennial demand in our society for dating coaches and public-speaking classes. The transcripts from the 2008 contest show the humans to be such wet blankets that the judges become downright apologetic for failing to provoke better conversation: “I feel sorry for the humans behind the screen, I reckon they must be getting a bit bored talking about the weather,” one writes; another offers, meekly, “Sorry for being so banal.” Meanwhile a computer appears to be charming the pants off one judge, who in no time at all is gushing LOLs and smiley-face emoticons. We can do better.
Thus, my intention from the start was to thoroughly disobey the advice to just show up and be myself—I would spend months preparing to give it everything I had.
Ordinarily this notion wouldn’t be odd at all, of course—we train and prepare for tennis competitions, spelling bees, standardized tests, and the like. But given that the Turing Test is meant to evaluate how human I am, the implication seems to be that being human (and being oneself) is about more than simply showing up.
To understand why our human sense of self is so bound up with the history of computers, it’s important to realize that computers used to be human. In the early 20th century, before a “computer” was one of the digital processing devices that permeate our 21st-century lives, it was something else: a job description.
From the mid-18th century onward, computers, many of them women, were on the payrolls of corporations, engineering firms, and universities, performing calculations and numerical analysis, sometimes with the use of a rudimentary calculator. These original, human computers were behind the calculations for everything from the first accurate prediction, in 1757, for the return of Halley’s Comet—early proof of Newton’s theory of gravity—to the Manhattan Project at Los Alamos, where the physicist Richard Feynman oversaw a group of human computers.
It’s amazing to look back at some of the earliest papers on computer science and see the authors attempting to explain what exactly these new contraptions were. Turing’s paper, for instance, describes the unheard-of “digital computer” by making analogies to a human computer: “The idea behind digital computers may be explained by saying that these machines are intended to carry out any operations which could be done by a human computer.”
Philosophers, psychologists, and scientists have been puzzling over the essential definition of human uniqueness since the beginning of recorded history. The Harvard psychologist Daniel Gilbert says that every psychologist must, at some point in his or her career, write a version of what he calls “The Sentence.” Specifically, The Sentence reads like this: “The human being is the only animal that ______.”
We once thought humans were unique for using language, but this seems less certain each year; we once thought humans were unique for using tools, but this claim also erodes with ongoing animal-behavior research; we once thought humans were unique for being able to do mathematics, and now we can barely imagine being able to do what our calculators can.
We might ask ourselves: Is it appropriate to allow our definition of our own uniqueness to be, in some sense, reactive to the advancing front of technology? And why is it that we are so compelled to feel unique in the first place?
“Sometimes it seems,” says Douglas Hofstadter, a Pulitzer Prize–winning cognitive scientist, “as though each new step towards AI, rather than producing something which everyone agrees is real intelligence, merely reveals what real intelligence is not.” While at first this seems a consoling position—one that keeps our unique claim to thought intact—it does bear the uncomfortable appearance of a gradual retreat, like a medieval army withdrawing from the castle to the keep. But the retreat can’t continue indefinitely. Consider: if everything that we thought hinged on thinking turns out to not involve it, then … what is thinking? It would seem to reduce to either an epiphenomenon—a kind of “exhaust” thrown off by the brain—or, worse, an illusion.
Where is the keep of our selfhood?
The story of the 21st century will be, in part, the story of the drawing and redrawing of these battle lines, the story of Homo sapiens trying to stake a claim on shifting ground, flanked by beast and machine, pinned between meat and math.
Is this retreat a good thing or a bad thing? For instance, does the fact that computers are so good at mathematics in some sense take away an arena of human activity, or does it free us from having to do a nonhuman activity, liberating us into a more human life? The latter view seems to be more appealing, but less so when we begin to imagine a point in the future when the number of “human activities” left for us to be “liberated” into has grown uncomfortably small. What then?
Alan Turing proposed his test as a way to measure technology’s progress, but it just as easily lets us measure our own. The Oxford philosopher John Lucas says, for instance, that if we fail to prevent the machines from passing the Turing Test, it will be “not because machines are so intelligent, but because humans, many of them at least, are so wooden.”
Beyond its use as a technological benchmark, the Turing Test is, at bottom, about the act of communication. I see its deepest questions as practical ones: How do we connect meaningfully with each other, as meaningfully as possible, within the limits of language and time? How does empathy work? What is the process by which someone enters into our life and comes to mean something to us? These, to me, are the test’s most central questions—the most central questions of being human.
Part of what’s fascinating about studying the programs that have done well at the Turing Test is seeing how conversation can work in the total absence of emotional intimacy. A look at the transcripts of Turing Tests past is, frankly, a sobering tour of the various ways in which we demur, dodge the question, lighten the mood, change the subject, distract, burn time: what shouldn’t pass for real conversation at the Turing Test probably shouldn’t be allowed to pass for real conversation in everyday life either.
HOW TO TALK LIKE A HUMAN
When the Turing Test was first proposed in 1950, it was a hypothetical: technology was nowhere near the point at which a practical test was possible. But, as we know, it got there; the first conversational computer program to attract significant notice and attention was Eliza, written in 1964 and 1965 by Joseph Weizenbaum at MIT. Modeled after a Rogerian therapist, Eliza worked on a very simple principle: extract key words from the users’ own language, and pose their statements back to them. (“I am unhappy.” “Do you think coming here will help you not to be unhappy?”) If in doubt, it might fall back on some completely generic phrases, like “Please go on.” This technique of fitting the users’ statements into predefined patterns and responding with a prescribed phrasing of its own—called “template matching”—was Eliza’s only capacity.
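Eliza’s template matching can be sketched in a few lines of modern code. The rules below are invented for illustration, not Weizenbaum’s originals, but the mechanism is the one described above: match the user’s statement against a list of patterns, echo the captured words back inside a canned template, and fall back on a generic phrase when nothing fits.

```python
import re

# Illustrative Eliza-style rules: (pattern, response template).
# These particular patterns are made up for this sketch.
RULES = [
    (r"i am (.*)", "Do you think coming here will help you not to be {0}?"),
    (r"i need (.*)", "Why do you need {0}?"),
    (r"my (.*)", "Tell me more about your {0}."),
]
FALLBACK = "Please go on."

def respond(statement: str) -> str:
    """Match the statement against each template; echo captured words back."""
    text = statement.lower().strip(".!?")
    for pattern, template in RULES:
        match = re.match(pattern, text)
        if match:
            return template.format(*match.groups())
    return FALLBACK  # generic phrase when no pattern applies

print(respond("I am unhappy."))  # -> Do you think coming here will help you not to be unhappy?
print(respond("It's raining."))  # -> Please go on.
```

That this handful of rules is the program’s entire “understanding” is exactly what makes the transcripts below so striking.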
A look at an Eliza transcript reveals how adeptly such an impoverished set of rules can, in the right context, pass at a glance for understanding. The most famous exchange, from Weizenbaum’s 1966 paper, begins (the program’s replies in capitals): “Men are all alike.” “IN WHAT WAY.” “They’re always bugging us about something or other.” “CAN YOU THINK OF A SPECIFIC EXAMPLE.” “Well, my boyfriend made me come here.” “YOUR BOYFRIEND MADE YOU COME HERE.” “He says I’m depressed much of the time.” “I AM SORRY TO HEAR YOU ARE DEPRESSED.”
One of the strangest twists to the Eliza story, however, was the reaction of the medical community, which decided Weizenbaum had hit upon something both brilliant and useful. The Journal of Nervous and Mental Disease, for example, said of Eliza in 1966:
Mind vs. Machine