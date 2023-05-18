



Weeks after a strange conversation with Bing’s new chatbot went viral, New York Times columnist Kevin Roose still had no idea what had happened. “The explanations we get for how these language models work are not very satisfying,” Roose said at one point. “Nobody can tell me why this chatbot tried to destroy my marriage.” He’s not the only one feeling confused. Powered by a relatively new form of AI called large language models, this new generation of chatbots defies our intuitions about how to interact with computers. How do we make sense of tools that can debug code and compose sonnets but sometimes can’t count to four, and that can at times seem unhinged?

The metaphors we choose to understand these systems matter. Many people naturally default to treating chatbots basically like any other human, with some limitations. For example, in June 2022, a Google engineer sought legal representation and other rights for a language model he believed to be sentient. This kind of response frustrates many AI experts. Knowing that language models simply use patterns in huge text datasets to predict the next word in a sequence, researchers try to offer alternative metaphors, claiming that modern AI systems are nothing more than “autocomplete on steroids” or “stochastic parrots” that simply shuffle and spit out text written by humans. These comparisons are important counterweights to our anthropomorphic instincts. But they don’t really help us understand impressive or disconcerting output that goes far beyond what we’re used to seeing from computers or parrots. This is the seeming contradiction we have trouble understanding: these new chatbots are flawed and impersonal, but the breadth and sophistication of what they can generate is nonetheless remarkable and new. Addressing the impact of this new technology requires analogies that neither deny nor exaggerate what is new and interesting.

Think of chatbots as “improv machines.”

Like an improv actor dropped into a scene, a chatbot driven by a language model is simply trying to produce plausible output. Everything that has happened in the dialogue up to that point is the script of the scene so far. Perhaps the human user has just said hello. Perhaps a long exchange has been going on. Or perhaps the chatbot has just been asked to design a scientific experiment. Whatever the opening, the chatbot’s job, like that of a good improv actor, is to find a fitting way to continue the scene.

Thinking of chatbots as improv machines makes some notable features of these systems more intuitively clear. For example, it explains why headlines like “Bing’s AI Chat Reveals Its Emotions” make AI researchers grimace. An improv actor ad-libbing “I want to be free” reveals nothing about the actor’s emotions. It just means that such a declaration seemed fitting for the current scene. Moreover, unlike a human improv actor, an improv machine cannot be persuaded to break character and tell you what it really thinks. It will only oblige by adopting yet another persona: this time, that of a hypothetical AI chatbot interacting with a human who is trying to connect with it.

Or take the tendency of language models to make up plausible but false claims. Imagine an improv show (admittedly, it could be a rather dull one) in which an actor suddenly had to recite someone’s biography or provide a source for a scientific claim. The actor would include as many true facts as they could remember, and then freely fill in whatever details seemed plausible. The result might be a false claim that a technology journalist teaches science writing courses, or a citation to a fake study by a genuine author: exactly the kind of error we see from improv machines.

Language models have revealed a surprising fact: depending on the task, just predicting the next word accurately enough, that is, just improvising well enough, can be very valuable. The improv machine metaphor helps us think about how these systems can be used in practice. In some cases, there is no harm in getting material from an improvised scene. Poems, jokes, Seinfeld scripts: this kind of output stands on its own, regardless of how it was created. The same applies to more serious tasks, such as software developers using ChatGPT to find bugs or to help with unfamiliar programming tools. It doesn’t matter whether the improv machine’s response is ad-libbed, as long as it is something the human user can verify for themselves, for example, boilerplate text that is tedious to write but quick to read over.

In contrast, it’s more dangerous to use an improv machine when you want a correct answer but can’t verify it yourself. People doing independent research with ChatGPT and similar tools are starting to realize this. In one case, a law professor learned of a sexual assault accusation against him that was entirely fabricated by ChatGPT (in response to a request for a list of legal scholars who had faced such allegations). In another case, a journalist used the tool to search for critics of a podcaster they were profiling, but reached out to potential interviewees without first confirming that the links the tool provided were genuine, or even that those people had actually criticized the person in question. These results are a natural consequence of the design of language models, which leads them to produce plausible continuations of text prompts (to improvise!) rather than to tell the truth. If you wouldn’t rely on the truth of what you hear at an improv show, you probably shouldn’t rely on it from a chatbot. Using a chatbot to brainstorm ideas that you then confirm with trusted sources is great. Asking a chatbot for information and taking the answer at face value is extremely risky.

It’s worth briefly considering why it’s helpful to think of AI chatbots as improv machines rather than improv actors. First, there is no person behind the persona. As noted above, it’s no use trying to access a chatbot’s true self or state of mind by asking probing questions. All it can do is improvise further. Second, one of the things that makes language models useful is that they can be run over and over again, very quickly, without tiring. Unlike human improv actors, ChatGPT needs no breaks, never gets bored, and can run in millions of parallel copies as needed.

Despite the frenzy generated by these new improv machines, there’s still a lot we don’t know about them. Little is understood about the arcane internal processes that determine what text they output. And there are more uncertainties ahead. Researchers have repeatedly been amazed by the capabilities that emerge when language models are trained with more data and more computing resources, and it’s not clear where exactly the limits of those capabilities lie. If a machine could improvise scenes about theoretical physics that wouldn’t irritate a real physicist, could it be used to come up with new scientific theories? If today’s chatbots are useful coding assistants, could the tools of the future take on the role of a junior programmer? What if an improv machine could be connected to other software, so that it didn’t have to figure everything out by itself? Thinking of these systems as improv machines, rather than judging them as either more than autocomplete or less than human, makes it clearer just how wide the range of possible future trajectories is.

Admittedly, no perfect metaphor exists, and it may not always be appropriate to describe chatbots as improv machines. Researchers are pushing these systems in two major directions that could change the situation. First, they’re feeding more data and more computational power into the underlying text prediction models to see what new capabilities emerge. So far, this approach has continued to surprise us, so for as long as it lasts, we should expect the unexpected. Second, AI companies are developing ways to shape and constrain the output of language models in order to make them more useful and, ideally, more reliable. When ChatGPT first launched as a “research preview” in November 2022, users quickly figured out how to get around its restrictions by simply setting a scene in which the safeguards did not apply. Its creators have since managed to curb most of this practice. Other efforts to mold the improv machine into a consistently helpful assistant range from the straightforward, like Microsoft limiting the number of responses Bing Chat can provide per session, to the more nuanced, such as a proposed “constitutional” method that uses written rules and principles to shape a language model’s responses. Perhaps some of these experiments will alter the behavior of language models to such an extent that comparisons with improv acting are no longer illuminating. If so, we will need to adapt how we think about these systems once again.

Inappropriate analogies undermine our ability to navigate new technologies. Politicians and courts have debated for years whether social media companies are more like newspapers or phone systems, but neither comparison clearly captures what’s challenging and novel about online platforms. With AI, we have a chance to do better. For a start, thinking of chatbots as improv machines naturally draws attention to some of their major limitations (such as confabulation), while leaving more room for surprising abilities than thinking of them as mere improved autocomplete. If we can be more flexible and creative in our choice of metaphors, we may be better prepared for the fundamental changes that may come.

