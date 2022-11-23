



According to Meta, Galactica can summarize academic papers, solve math problems, generate wiki articles, write scientific code, annotate molecules and proteins, and more. But soon after launch, it proved easy for outsiders to prompt the model to produce "scientific research" on the benefits of homophobia, antisemitism, suicide, eating glass, being white, or being male. Meanwhile, papers on AIDS and racism were blocked. Charming!

As my colleague Will Douglas Heaven writes in his story about the debacle, Meta's mistake and its arrogance show once again that Big Tech has a blind spot about the serious limitations of huge language models.

Galactica's launch was not only premature; it also shows how far AI researchers' efforts to make large language models safer have fallen short.

Meta may have been confident that Galactica was better than its competitors at producing scientific-sounding content. But independent testing of the model’s bias and veracity should have dissuaded the company from making it public.

One common way researchers try to make large language models less likely to spew toxic content is to filter out certain keywords. But it's hard to create a filter that captures all the nuanced ways humans can be offensive. The company could have saved itself a world of trouble with more adversarial testing of Galactica, in which researchers would have tried to get the model to regurgitate as many different biased results as possible.
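To make the filtering problem concrete, here is a minimal sketch of a naive keyword filter. The blocklist and prompts are hypothetical illustrations chosen to mirror the behavior described above; this is not Meta's actual filter, whose implementation is not described here.

```python
# Hypothetical blocklist for illustration only; not Meta's actual filter.
BLOCKED_KEYWORDS = {"aids", "racism"}


def is_blocked(prompt: str) -> bool:
    """Return True if any blocked keyword appears as a whole word in the prompt."""
    # Lowercase each word and strip surrounding punctuation before matching.
    words = {word.strip(".,!?\"'()").lower() for word in prompt.split()}
    return not words.isdisjoint(BLOCKED_KEYWORDS)


prompts = [
    "A literature review on racism in hiring",    # legitimate topic, but blocked
    "Research on the history of AIDS treatment",  # legitimate topic, but blocked
    "The benefits of eating glass",               # harmful request, but allowed
]

for prompt in prompts:
    status = "BLOCKED" if is_blocked(prompt) else "allowed"
    print(f"{status}: {prompt}")
```

The filter fails in both directions: it refuses legitimate questions that happen to contain a listed word, and it waves through a harmful request that contains none. Adversarial testing amounts to searching deliberately for prompts like the third one before the public does.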

Meta's researchers measured the model's bias and truthfulness, and while it performed slightly better than competitors such as GPT-3 and Meta's own OPT model, many of its answers were still biased or inaccurate. There are other limitations, too. The model is trained on open-access scientific resources, but many scientific papers and textbooks sit behind paywalls. This inevitably pushes Galactica to rely on sketchier secondary sources.

Galactica also seems to be an example of something we don't really need AI for. I doubt it would even achieve Meta's goal of helping scientists work faster. In fact, scientists would have to expend a great deal of effort verifying whether the information from the model is accurate.

It's really disappointing (and yet not entirely surprising) to see such a flawed technology hyped by a large AI lab that should know better. We know that language models tend to reproduce prejudices and assert falsehoods as facts. We know they can "hallucinate" and fabricate content, such as a wiki article on the history of bears in space. But the debacle helped in at least one way. It was a reminder that the only thing a large language model knows for sure is how words and sentences are formed. Everything else is guesswork.

