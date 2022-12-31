



Last week, The New York Times said that ChatGPT is a “code red” for Google. Countless people on Twitter have echoed the same sentiment, among them George Hotz, the one-time Elon Musk adversary who was hired by Twitter to fix its search, only to quit five weeks later.

And sure enough, companies like perplexity.ai, neeva.com, and you.com are already piloting systems that combine traditional search with large language models (like ChatGPT). Even a good friend of mine, a well-known author, is trying it, and wrote to me:

“I broke down and got a ChatGPT account yesterday, and I have been using it for two purposes. I have started writing a new book, and I figured I’d use it instead of Google, as a smarter search engine for references that can’t easily be specified as a text string (what was the poll that showed how people feel about something? What is the name of a particular thing?).”

Can ChatGPT really do it all? Should Google be shaking in its boots?

Perhaps, but I’m not sure. On December 1st, I scored the first round for Google.

Today it’s time for Round 2. That was almost a month ago, and a month is almost forever in AI hype. How do things stand now, at the end of December?

It turns out my writer friend wasn’t too happy either:

“Unfortunately, it has been completely useless: zero usable hits, though plenty of lectures about how one should never generalize about groups of people and how people should be respected for what they say and believe. It’s only been two days, but the shallowness that you have repeatedly pointed out is a real handicap for what could be its most legitimate use: as a more conceptual and semantic search engine.”

In all fairness, large language models are not inherently ultra-PC. ChatGPT is; Meta AI’s Galactica most certainly was not, writing essays on the alleged benefits of anti-Semitism and the like, which led to its abrupt removal by Meta AI. Rather, it is the guardrails that OpenAI added to ChatGPT that filter out many of the most offensive responses an LLM might otherwise generate.

The problem is that these guardrails don’t come for free. The system is still shallow; ChatGPT doesn’t really know what it is guarding against. Most of the time it seems to be just looking for keywords, which is why it is prone to generating nonsense.

I like to call this kind of nonsense incomprehension: mindless answers that show that the system has no idea what it is talking about.

There is another problem, too, perhaps an even bigger concern in the context of search engines: hallucination. When ChatGPT (which was not explicitly designed as a search engine) botched my friend’s queries, I tried submitting them to YouChat, which (unlike ChatGPT) is explicitly tailored for search. Once again the result was fluent prose, and once again hallucination ruled.

Sounds perfectly plausible. But A Mighty Wind is not about Orthodox Jews, nor is it a British comedy. (And where is Harry Shearer?) We get a bunch of true stuff (the year really was 2003, and Christopher Guest really did direct) mashed together with other things that don’t belong. It’s safe to say this is not the film my writer friend was looking for.

What good is a search engine that makes stuff up?

Here’s my friend’s other query:

Sorry, wrong again. The NORC Center for Public Affairs Research does sometimes run surveys with the AP, but as far as I can tell it never ran one on this specific topic in 2019, and the numbers are made up. (And when I tried again, I got a different set of numbers.)

And wait: if I want a search engine, maybe I would like a link to the actual study? Maybe because I want to read the details for myself?

Sorry, no such luck. As ResearchRabbit.ai founder Michael Ma told me in a DM, with Google you can follow things up, whereas with pure chat the output is limited to lines of text, and your journey of discovery ends.

All right, one last chance. Here is perplexity.ai. I like its interface much better, and it provides references!

But wait, does the 2019 Pew study it links to at the bottom really address the bigger issue of people watching what they say? If you click through, it is mainly about religion. It’s not at all clear that this is the study my author friend was looking for. (Trust me: with human ingenuity combined with the power of a traditional search engine, one can find such a study in old-fashioned link-by-link, page-by-page results.)

Meanwhile, a friend with access to Neeva.com queried it for my views and got back a bio that was about 90% correct and 10% horrifying. It claimed that I believe 90% of jobs will be replaced by AI within a few years. (Spoiler alert: I never said any such thing.)

It is insidious the way truth and falsehood are mixed together so thoroughly and so authoritatively. I, for one, am not ready for our post-truth information overlords.

What’s going on here? I said it once before, in my essay “How come GPT can seem so brilliant one minute and so breathtakingly dumb the next?”, but I’ll say it again using different terms: large language models are not databases. They are glommers-together of bits that don’t always belong together.

(Traditional) search engines are databases: organized collections of data that can be stored, updated, and retrieved at will. More precisely, (traditional) search engines are indexes, a form of database that connects things like keywords to URLs. They can be updated swiftly and incrementally, bit by bit, much as you would update a phone number in the database that holds your contacts.
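To make the index idea concrete, here is a minimal sketch in Python. It is only a toy, not how any real search engine is built: the `TinyIndex` class, its method names, and the example.com URLs are all made up for illustration. What it shows is the property described above: one entry can be replaced in place, and the next lookup reflects the change immediately.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index: maps each keyword to the set of URLs that mention it."""

    def __init__(self):
        self.postings = defaultdict(set)  # keyword -> {url, ...}
        self.pages = {}                   # url -> text currently indexed

    def remove(self, url):
        """Drop one page from the index, leaving every other page untouched."""
        old_text = self.pages.pop(url, "")
        for word in old_text.lower().split():
            self.postings[word].discard(url)

    def upsert(self, url, text):
        """Add a page, or replace the stored version of a page already indexed."""
        self.remove(url)
        self.pages[url] = text
        for word in text.lower().split():
            self.postings[word].add(url)

    def search(self, query):
        """Return the URLs that contain every keyword in the query."""
        words = query.lower().split()
        if not words:
            return set()
        results = set(self.postings.get(words[0], set()))
        for word in words[1:]:
            results &= self.postings.get(word, set())
        return results


# Hypothetical pages, for illustration only.
index = TinyIndex()
index.upsert("https://example.com/twitter", "twitter is a public company")
index.upsert("https://example.com/film", "a mighty wind is a 2003 film")

# An incremental update: one page changes, nothing else is rebuilt.
index.upsert("https://example.com/twitter", "twitter was acquired in 2022")

print(index.search("public"))     # set(): the stale entry is gone
print(index.search("acquired"))   # {'https://example.com/twitter'}
print(index.search("2003 film"))  # {'https://example.com/film'}
```

Every answer here is a pointer back to a stored page, which is why a reader can always click through and check.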

Large language models do something very different. They are not databases; they are text predictors, turbocharged versions of autocomplete. Essentially, what they learn are relationships between bits of text, such as words, phrases, and even entire sentences, and they use those relationships to predict other bits of text. And then they do something almost magical: they paraphrase those bits of text, almost like a thesaurus but much, much better. But as they do so, as they glom stuff together, something is often lost in translation: which bits of text do and do not truly belong together.

Every individual bit of text is plausible enough. “An early 2000s British comedy film about Orthodox Jews” is a perfectly legitimate phrase, and so is a phrase about A Mighty Wind being a Christopher Guest film. But that doesn’t mean those two bits belong together.
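A drastically simplified sketch can show how this kind of mash-up arises. The toy below is a bigram model, which only tracks which word may follow which; real large language models are neural networks conditioned on far longer contexts, so treat this as an analogy rather than a description of ChatGPT or YouChat. The two training sentences and the function names are invented for illustration.

```python
from collections import defaultdict

# Two made-up training sentences. Each one is internally consistent.
corpus = [
    "a mighty wind is a comedy directed by christopher guest",
    "the other film is a comedy about orthodox jews",
]


def train_bigrams(sentences):
    """Record, for every word, the set of words seen immediately after it."""
    successors = defaultdict(set)
    for sentence in sentences:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            successors[prev].add(nxt)
    return successors


def locally_plausible(sentence, successors):
    """True if every adjacent word pair in the sentence was seen in training."""
    words = sentence.split()
    return all(nxt in successors.get(prev, set())
               for prev, nxt in zip(words, words[1:]))


successors = train_bigrams(corpus)

# After "comedy" the model has two continuations and no record of which
# film each one came from.
print(sorted(successors["comedy"]))  # ['about', 'directed']

blend = "a mighty wind is a comedy about orthodox jews"
print(locally_plausible(blend, successors))  # True: every step looks fine
print(blend in corpus)                       # False: no source ever said it
```

Every link in the blended sentence is one the model has seen before, yet the sentence as a whole appears nowhere in the training text. That is the sense in which the knowledge of which bits belong together gets lost.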

(Update: Remember that odd bit about me and AI soon replacing jobs? A linked article on Medium reveals that I spoke at a conference in Beijing in 2017. And so did someone else, a prominent Chinese researcher named Feiyue Wang. It was Wang, in the opening keynote, who said the things attributed to me; the LLM erroneously associated his words with my name.)

As it happens, large language models are also very difficult to update, usually requiring a complete retraining that sometimes takes weeks or months. That means, for example, that ChatGPT, released in November, is already sufficiently out of date that it doesn’t know who owns Twitter.

Score one more for Google.

The best I can say is that Perplexity.ai and you.com’s chat are genuinely exploring an interesting idea: hybrids that combine a traditional search engine with a large language model, possibly allowing for faster updates. However, there is still a lot of work to be done to properly integrate the two, classical search and large language models. We have a proof of concept and some interesting research directions, but nothing like a system we can count on. (There are also economic and speed issues. The average Google search is almost instant and surely costs Google less than a penny, whereas compiling an answer to a ChatGPT query can take several seconds, and some estimate that ChatGPT queries cost a few cents each. It is also less clear how to place ads.)
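The hybrid idea can be sketched roughly as follows: retrieve real documents first, hand only those to the text generator, and return the links alongside the answer. This is a guess at the general shape, not a description of how Perplexity.ai or you.com actually work. The `retrieve` and `answer_with_references` functions, the word-overlap scoring, the documents, and the `stub_generate` stand-in for a language model are all assumptions made for illustration.

```python
def retrieve(query, documents, top_k=2):
    """Rank documents by shared query words; keep only documents that match."""
    query_words = set(query.lower().split())
    scored = []
    for url, text in documents.items():
        overlap = len(query_words & set(text.lower().split()))
        if overlap:
            scored.append((overlap, url))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best match first
    return [url for _, url in scored[:top_k]]


def answer_with_references(query, documents, generate):
    """Answer only from retrieved passages, and report where they came from."""
    urls = retrieve(query, documents)
    if not urls:
        # Nothing retrieved: decline rather than let the generator improvise.
        return {"answer": None, "references": []}
    passages = [documents[url] for url in urls]
    return {"answer": generate(query, passages), "references": urls}


def stub_generate(query, passages):
    """Stand-in for a language model call; it simply repeats the passages."""
    return " ".join(passages)


# Hypothetical documents, for illustration only.
documents = {
    "https://example.com/a-mighty-wind":
        "a mighty wind is a 2003 film directed by christopher guest",
    "https://example.com/search":
        "a search index connects keywords to urls",
}

hit = answer_with_references("mighty wind directed", documents, stub_generate)
print(hit["references"])  # ['https://example.com/a-mighty-wind']

miss = answer_with_references("polling 2019", documents, stub_generate)
print(miss)  # {'answer': None, 'references': []}
```

Note what the sketch does not guarantee. A real generator can still paraphrase the passages incorrectly, and a retrieved link can be genuine without being about the question asked, so the references still need to be read.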

I look forward to the day when search engines can reliably return text along with genuine, relevant references, just as perplexity.ai aims to do. But until all these bits come together in a reliable way, I prefer to borrow a memorable phrase from Ariana Grande: Thank U, Next.

Self-driving cars have taken years longer than we were originally told. Outliers (aka edge cases) have, at least so far, prevented them from making the transition from demo to widely available reality. In our quest for a new kind of search, I suspect we are once again in for a rough ride.

Gary Marcus (@garymarcus) is a scientist, best-selling author, and entrepreneur. His latest book, Rebooting AI, co-authored with Ernest Davis, is one of Forbes’s 7 Must Read Books in AI.

Addendum: hallucination, incomprehension, and obscenity (the by-product of yet another guardrail failure), all rolled into one.

Caveat emptor when it comes to LLM-driven search.

