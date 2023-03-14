



Less than four months after the release of ChatGPT, its text-generating AI, OpenAI has announced a new product called GPT-4. Rumors and hype about this program have been circulating for over a year. Pundits said it would be immeasurably powerful, able to write a 60,000-word book from a single prompt and produce videos out of whole cloth. Today’s announcement suggests that GPT-4’s capabilities, while impressive, are more modest. It performs better than previous models on standardized tests and other benchmarks, works across dozens of languages, and can take images as input, meaning that it can describe the contents of photos and charts.

Unlike ChatGPT, this new model is not currently available for public testing (although you can apply for access). As such, the available information comes from OpenAI’s blog post and from a New York Times article based on a demonstration. From what we know, relative to other programs, GPT-4 appears to have added 150 points to its SAT score, which is now 1410 out of 1600, and to have jumped from the bottom to the top 10 percent of performers on a mock bar exam. Despite pronounced fears about AI writing, the program’s AP English scores remain in the bottom quintile. And while ChatGPT can handle only text, in one example GPT-4 correctly answered a question about a picture of computer cables. Image inputs have not yet been made public, even to those eventually granted access from the waiting list, so OpenAI’s claims cannot be verified.

The new GPT-4 model is the latest in a long lineage that includes GPT-1, GPT-2, GPT-3, GPT-3.5, InstructGPT, and ChatGPT. These programs, now known as large language models (LLMs), are AI programs that learn to predict which words are most likely to follow each other. They work under a premise that traces its origins to some of the earliest AI research in the 1950s: that a computer that understands and generates language will necessarily be intelligent. This belief underpinned Alan Turing’s famous imitation game, now known as the Turing Test, which judged a computer’s intelligence by how humanlike its text output read.
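The idea of learning "which words are most likely to follow each other" can be illustrated with a deliberately tiny sketch. This is not how GPT-4 works internally (OpenAI has not published its code); it is a simple word-pair counter, and the function names `train_bigram` and `predict_next`, along with the sample sentence, are invented here purely for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for every word, which words follow it in the text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def predict_next(following, word):
    """Return the word seen most often after `word`, or None if unseen."""
    counts = following.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical toy corpus; real models are trained on vastly more text.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)
print(predict_next(model, "the"))
```

In this toy corpus "the" is followed by "cat" twice and "mat" once, so the sketch predicts "cat". Modern LLMs replace these raw counts with statistical patterns learned over far longer stretches of text.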

Those early language AI programs involved computer scientists deriving complex, handwritten rules, rather than the deep statistical inferences used today. The predecessors of modern LLMs date back to work from the early 2000s. The technology has advanced rapidly in recent years thanks to several key breakthroughs, notably improvements in the programs’ attention: GPT-4 can make predictions based on many previous words, not just the previous phrase, and weigh the importance of each word differently. Today’s LLMs read books, Wikipedia entries, social media posts, and countless other sources to find these deep statistical patterns. OpenAI has also started using human researchers to fine-tune its models’ output. As a result, GPT-4 and similar programs are remarkably capable with language, writing short stories, essays, advertising copy, and more. Some linguists and cognitive scientists believe that these AI models show a good grasp of syntax and, at least according to OpenAI, perhaps even a glimmer of understanding or reasoning. The latter point is highly controversial, however, and formal grammatical fluency is a separate matter from thinking.
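The notion of weighing "the importance of each word differently" can be sketched with scaled dot-product attention, the standard textbook form of the technique. The snippet below is a minimal illustration under stated assumptions, not GPT-4’s actual implementation, which has not been made public; the vectors are made-up numbers standing in for three earlier words, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys):
    """Score each earlier word (key) against the current word (query)."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)


def attend(query, keys, values):
    """Blend the value vectors according to the attention weights."""
    weights = attention_weights(query, keys)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


# Invented two-dimensional vectors for three earlier words.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
query = [2.0, 0.5]

weights = attention_weights(query, keys)
blended = attend(query, keys, values)
print(weights, blended)
```

Here the third word receives the largest weight because its key vector lines up best with the query, so it contributes most to the blended result. Real models learn these vectors during training and apply the same idea across many previous words at once.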

GPT-4 is the latest milestone in this research on language, as well as part of a broader explosion of generative AI (programs that can generate images, text, code, music, and videos in response to prompts). If such software delivers on its grand promise, it could redefine human perception and creativity, just as the internet, writing, and even fire have done before. OpenAI frames each new iteration of its LLMs as a step toward the company’s declared mission to create artificial general intelligence, a computer that can learn and excel at everything, in a beneficial way. OpenAI CEO Sam Altman told the New York Times that while GPT-4 doesn’t solve reasoning or intelligence, it’s a big step up from what already exists.

With the goal of AGI in mind, the organization began as a nonprofit that provided much public documentation of its code. However, it soon adopted a capped-profit structure, allowing investors to earn returns of up to 100 times their investment, with all profits above that going back to the nonprofit, so that OpenAI could raise the funds needed to support its research. (Analysts estimate that training a high-end language model costs millions of dollars.) In addition to the economic shift, OpenAI has taken a more secretive approach to its code. Critics say this makes it harder to hold the technology accountable for incorrect and harmful output, but the company says the opacity protects against malicious use.

The company frames its shift from its founding values as a compromise that, at least in theory, accelerates its arrival at an AI-saturated future that Altman describes as almost paradise: advances in drug discovery and basic science, and the end of unskilled labor. But more advanced AI, generally intelligent or not, may also put large segments of the population out of work, or simply replace rote jobs with new AI-related bureaucratic tasks and higher productivity demands. Email didn’t speed up communication so much as turn every day into a slog of replying to email. Electronic medical records were supposed to save doctors time, but in reality doctors have to spend extra unpaid time updating and consulting these databases.

Whether this technology is a boon or a burden to everyday people, those who control it will undoubtedly reap enormous benefits. Just as OpenAI has headed toward commercialization and opacity, already everyone wants to be part of the AI gold rush. Companies like Snap and Instacart are using OpenAI’s technology to incorporate AI assistants into their services. Earlier this year, Microsoft invested $10 billion in OpenAI and is now embedding chatbot technology into the Bing search engine. Google followed suit with a smaller investment in the rival AI startup Anthropic (recently valued at $4.1 billion) and announced various AI features in Google Search, Maps, and other apps. Amazon is incorporating Hugging Face, a popular website that provides easy access to AI tools, into AWS to keep up with Microsoft’s competing cloud service, Azure. Meta has had an AI department for a long time, and now Mark Zuckerberg is trying to build a dedicated generative AI team out of the pixelated ashes of the Metaverse. Startups are awash in billions of dollars of venture capital investment. GPT-4 could further enhance the new Bing, be integrated into Microsoft Office, and automate many tasks.

At an event last month announcing the new Bing, powered by ChatGPT, Microsoft’s CEO talked about moving fast. Indeed, GPT-4 is already here. But as any good text-prediction tool will tell you, that quote should end with “move fast and break things.” Silicon Valley’s rush, whether toward gold or AGI, shouldn’t distract us from all the ways these technologies fail.

LLMs are great at producing boilerplate copy, but many critics say they don’t, and probably can’t, fundamentally understand the world. They are something like autocomplete on PCP, a drug that gives users a false sense of invincibility and a heightened capacity for delusion. That means they can easily spread convincing lies and reprehensible hatred. GPT-4 seems to complicate that critique with its apparent ability to describe images, but its basic function remains very good pattern matching, and it can only output text.

Those patterns are sometimes harmful. Language models tend to replicate much of the vile text on the internet, a concern that the lack of transparency in their design and training only heightens. As Emily Bender told me in an email, we generally don’t eat food whose ingredients we don’t know or can’t find out.

Precedent suggests that a lot of junk is baked in. Microsoft’s original chatbot, released in 2016 under the name Tay, became misogynistic and racist and was soon discontinued. Last year, Meta’s BlenderBot AI rehashed anti-Semitic conspiracies, and the company’s Galactica model, which was meant to help write scientific papers, was found to be biased and prone to inventing information (Meta removed it within three days). GPT-2 showed bias against women, queer people, and other demographic groups. GPT-3 said racist and sexist things. ChatGPT was also accused of making similarly toxic comments. OpenAI tried and failed to fix the problem each time. The new Bing, which incorporates a more powerful version of ChatGPT, has written its own share of disturbing and offensive text: teaching children ethnic slurs, promoting Nazi slogans, and inventing scientific theories.

It’s tempting to write the next sentence in this cycle automatically, like a language model: “GPT-4 showed [insert bias here].” In fact, OpenAI admits in its blog post that GPT-4 hallucinates facts and makes reasoning errors, that it has not improved much at fact-checking itself, and that its output may contain various biases. Yet, as any ChatGPT user can attest, even the most compelling patterns don’t yield perfectly predictable results.

A Meta spokesperson said in an email that much more work is needed to address bias and hallucinations (what researchers call the information that AI invents) in large language models, and that public research demos such as BlenderBot and Galactica are critical to building better chatbots. A Microsoft spokesperson pointed to a post in which the company explained that it was improving Bing through a “virtuous cycle of [user] feedback.” An OpenAI spokesperson pointed to a blog post about safety that outlines the company’s approach to preventing misuse. It notes, for example, that testing products in the real world and receiving feedback can improve future iterations. In other words, Big AI’s party line is a utilitarian calculation: even though programs can be dangerous, the only way to discover and fix their problems is to release them and risk exposing the public to harm.

As researchers pay more and more attention to bias, a future iteration of a language model, GPT-4 or otherwise, may one day break this established pattern. But no matter what the new model proves capable of, there are still much bigger questions to tackle: Who is this technology for? Whose lives will it disrupt? And if we don’t like the answers, what can we do to challenge them?

