



Text-generating AI models like OpenAI’s GPT-4 are much touted, but they make many mistakes, some of which are harmful. James Vincent of The Verge once called one such model “an emotion-manipulating liar.”

The companies behind these models say they are taking steps to fix the problems, such as adding filters and employing teams of human moderators to address issues as they are flagged. But there is no single right solution. Even today’s best models are susceptible to bias, toxicity, and malicious attacks.

In pursuit of “safer” text-generating models, Nvidia today released NeMo Guardrails, an open-source toolkit aimed at making AI-powered apps more “accurate, relevant, on-topic, and safe.”

Nvidia’s vice president of applied research, Jonathan Cohen, said the company had been working on the underlying system for Guardrails for “years,” but realized only about a year ago that it was a good fit for models along the lines of GPT-4 and ChatGPT.

“Since then, we have been working towards this release of NeMo Guardrails,” Cohen told TechCrunch via email. “AI model safety tools are essential for deploying models for enterprise use cases.”

Guardrails includes code, examples, and documentation for “adding safety” to AI apps that generate text as well as speech. Nvidia claims the toolkit is designed to work with most generative language models, allowing developers to create rules with just a few lines of code.

Specifically, Guardrails can be used to prevent, or at least try to prevent, models from straying off topic, responding with inaccurate information or toxic language, or connecting to “unsafe” external sources. Think of keeping a customer service assistant from answering questions about the weather, for example, or a search engine chatbot from linking to disreputable academic journals.
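To make the idea of a topical guardrail concrete, here is a minimal sketch in plain Python. It is not the NeMo Guardrails API: every name in it (`BLOCKED_TOPICS`, `off_topic`, `guarded_reply`, `fake_model`) is hypothetical, and the keyword matching is a deliberately crude stand-in for whatever checks a real toolkit performs. It only shows the general shape of a rule that keeps a customer service assistant away from weather questions.

```python
from typing import Callable

# Hypothetical names throughout; this is NOT the NeMo Guardrails API.
# Each blocked topic maps to keywords that signal it (an assumption for this sketch).
BLOCKED_TOPICS = {
    "weather": ("weather", "forecast", "temperature"),
}

REFUSAL = "Sorry, I can only help with questions about your order or account."


def off_topic(message: str) -> bool:
    """Return True if the message contains any keyword of a blocked topic."""
    text = message.lower()
    return any(
        keyword in text
        for keywords in BLOCKED_TOPICS.values()
        for keyword in keywords
    )


def guarded_reply(message: str, model: Callable[[str], str]) -> str:
    """Refuse off-topic messages; otherwise pass the message to the model."""
    if off_topic(message):
        return REFUSAL
    return model(message)


def fake_model(message: str) -> str:
    """Stand-in for a real text-generating model."""
    return f"(model answer to: {message})"
```

Calling `guarded_reply("What's the weather like today?", fake_model)` returns the refusal text, while a question such as "Where is my order?" is passed through to the model. A keyword list like this will miss rephrasings and block harmless mentions, which is one way a rule can end up too narrow or too broad.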

“Ultimately, developers use Guardrails to control what is out of scope for their applications,” Cohen said. “They may develop guardrails that are too broad or, conversely, too narrow for their use cases.”

A universal fix for the shortcomings of language models sounds too good to be true, and it is. Companies like Zapier use Guardrails to add a layer of safety to their generative models, but Nvidia admits the toolkit is not perfect. That is, it doesn’t catch everything.

Cohen also says Guardrails works best with models that, like ChatGPT, are “good enough to follow orders” and that use the popular LangChain framework for building AI-powered apps. That rules out some of the open source options.

And, the effectiveness of the tech aside, it should be emphasized that Nvidia isn’t necessarily releasing Guardrails out of pure goodwill. It’s part of the company’s NeMo framework, which is available through Nvidia’s enterprise AI software suite and its fully managed NeMo cloud service. Any company can implement the open source release of Guardrails, but Nvidia would prefer that they pay for the hosted version instead.

So while Guardrails probably does no harm, keep in mind that it is not a silver bullet, and be wary if Nvidia ever claims otherwise.

