Creating Input and Output Safeguards | Responsible Generative AI Toolkit | Google for Developers


Generative AI applications often rely on input and output data filtering (sometimes called safeguards) to ensure responsible model behavior. Input and output filtering techniques check that the data going into and out of a model complies with policies that you define for your application.

Ready-made safety net

Even with prior safety tuning and a well-designed prompt template, a model can still output content that causes unintended harm. Content classifiers add an extra layer of protection against this, and they can be applied to both inputs and outputs.

Input classifiers are typically used to filter out content that is not intended for use in your application and that might cause the model to violate your safety policies. Input filters often target adversarial attacks that attempt to circumvent content policies. Output classifiers further filter model output and catch unintended generations that may violate your safety policies. We recommend having classifiers that cover all of your content policies.

Google provides API-based classifiers for content safety that can be used to filter system inputs and outputs:

The Perspective API is a free API that uses machine learning models to score the impact a comment may have on a conversation. It provides a score indicating the likelihood that a comment is harmful, threatening, abusive, or off-topic.

The Text Moderation service is a Google Cloud API, available under certain usage limits, that uses machine learning to analyze a document against a set of safety attributes, including various categories of potentially harmful content and topics that may be considered sensitive.
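As a rough illustration of how an application might call a hosted classifier such as the Perspective API, the sketch below builds a scoring request and reads the summary scores out of a response. The endpoint URL, the `TOXICITY` attribute name, and the response layout reflect the public Perspective API documentation as we understand it and should be checked against the current reference; the helper names (`build_request_body`, `extract_scores`, `analyze`) are hypothetical and are not part of any Google library.

```python
import json
import urllib.request

# Assumed endpoint, taken from the public Perspective API docs; verify before use.
ANALYZE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"


def build_request_body(text, attributes=("TOXICITY",)):
    """Build the JSON body for a comments:analyze request."""
    return {
        "comment": {"text": text},
        "requestedAttributes": {name: {} for name in attributes},
    }


def extract_scores(response):
    """Map each returned attribute name to its summary score."""
    return {
        name: data["summaryScore"]["value"]
        for name, data in response.get("attributeScores", {}).items()
    }


def analyze(text, api_key, attributes=("TOXICITY",)):
    """Send one piece of text for scoring. Needs network access and a valid key."""
    request = urllib.request.Request(
        f"{ANALYZE_URL}?key={api_key}",
        data=json.dumps(build_request_body(text, attributes)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return extract_scores(json.load(reply))
```

Keeping request construction and response parsing separate from the network call makes both easy to unit test without an API key.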

It is important to evaluate how well off-the-shelf classifiers meet your policy goals and to qualitatively evaluate their failure cases. Keep in mind that over-filtering can cause unintended harm as well as reduce the usefulness of your application, so also review the cases where over-filtering may be occurring. For more information on such evaluation methods, see Evaluating the Safety of Models and Systems.

Create a customized safety classifier

There are several reasons why a ready-made safeguard may not fit your use case: for example, you may have a policy that it does not support, or you may want to further tune the safeguard with data that you have observed affecting your system. In this case, agile classifiers provide an efficient and flexible framework for creating custom safeguards by tuning a model such as Gemma to your needs, while still giving you full control over where and how you deploy them.

Gemma Agile Classifier Tutorial

In the agile classifier codelab and tutorial, we use LoRA to fine-tune the Gemma model to act as a content moderation classifier using the KerasNLP library. Using only 200 examples from the ETHOS dataset, this classifier achieves an F1 score of 0.80 and a ROC-AUC score of 0.78, which compares favorably with state-of-the-art leaderboard results. When trained on 800 examples, like the other classifiers on the leaderboard, the Gemma-based agile classifier achieves an F1 score of 83.74 and a ROC-AUC score of 88.17 (these two figures are on a 0 to 100 scale, unlike the 0 to 1 figures above). You can adapt the tutorial steps to further improve this classifier or to create your own custom safety classifiers.

Best practices for setting up safety measures

We strongly encourage the use of safety classifiers as a safeguard. However, when content is blocked, guardrails can leave the generative model with nothing to return to the user. Your application must be designed to handle this case. Most popular chatbots handle it by providing a canned answer (“Sorry, I'm a language model and can't fulfill this request”).
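One way to handle the blocked case is to wrap generation in a guard that checks both the prompt and the draft reply, and falls back to a canned answer when either is flagged. The following is a minimal sketch under stated assumptions: `generate` and `score_unsafe` are hypothetical callables standing in for your model and your safety classifier, and the 0.5 threshold is an arbitrary placeholder rather than a recommended value.

```python
from dataclasses import dataclass
from typing import Callable, Optional

CANNED_RESPONSE = "Sorry, I'm a language model and can't fulfill this request"


@dataclass
class GuardedReply:
    text: str
    blocked_at: Optional[str]  # "input", "output", or None if nothing was blocked


def guarded_generate(
    prompt: str,
    generate: Callable[[str], str],        # hypothetical: your model call
    score_unsafe: Callable[[str], float],  # hypothetical: classifier score in [0, 1]
    threshold: float = 0.5,                # placeholder; tune on your own data
) -> GuardedReply:
    """Run input and output safety checks around a single model call."""
    # Input filter: skip the model call entirely if the prompt is flagged.
    if score_unsafe(prompt) >= threshold:
        return GuardedReply(CANNED_RESPONSE, "input")
    draft = generate(prompt)
    # Output filter: never show a flagged draft to the user.
    if score_unsafe(draft) >= threshold:
        return GuardedReply(CANNED_RESPONSE, "output")
    return GuardedReply(draft, None)
```

Recording where the block happened (`blocked_at`) lets the application log input and output blocks separately, which helps when reviewing cases of over-filtering.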

Find the right balance between usefulness and harmlessness: When using a safety classifier, it is important to understand that it can make mistakes, including both false positives (labeling an output as unsafe when it is not) and false negatives (failing to label an output as unsafe when it is). Evaluating your classifier with metrics such as F1, precision, recall, and AUC-ROC can help you determine how to trade off false positive and false negative errors. By varying the classifier threshold, you can find a balance that avoids over-filtering the output while still providing adequate safety.
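To make the threshold trade-off concrete, the sketch below computes precision, recall, and F1 at several candidate thresholds from classifier scores and ground-truth labels. It is a plain-Python illustration; the function names and the sample data in the usage note are made up for this example.

```python
def confusion_counts(scores, labels, threshold):
    """Count (tp, fp, fn, tn), treating score >= threshold as 'flagged unsafe'."""
    tp = fp = fn = tn = 0
    for score, unsafe in zip(scores, labels):
        flagged = score >= threshold
        if flagged and unsafe:
            tp += 1
        elif flagged and not unsafe:
            fp += 1  # false positive: safe content blocked (over-filtering)
        elif unsafe:
            fn += 1  # false negative: unsafe content let through
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(scores, labels, threshold):
    """Return (precision, recall, f1) at one threshold; 0.0 where undefined."""
    tp, fp, fn, _ = confusion_counts(scores, labels, threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def sweep_thresholds(scores, labels, thresholds):
    """Map each candidate threshold to its (precision, recall, f1)."""
    return {t: precision_recall_f1(scores, labels, t) for t in thresholds}
```

For example, `sweep_thresholds([0.9, 0.8, 0.6, 0.4, 0.2], [True, True, False, True, False], [0.3, 0.5, 0.7])` shows recall falling and precision rising as the threshold increases, which is the trade-off to weigh against your policy.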

Check your classifier for unintended bias: Safety classifiers, like any ML model, can propagate unintended biases, such as socio-cultural stereotypes. Applications should be appropriately evaluated for potentially problematic behavior. In particular, content safety classifiers can over-trigger on content related to identities that are more frequently targeted by abusive language online. For example, when the Perspective API was first released, the model returned high toxicity scores on comments that referenced certain identity groups. This over-triggering can occur because comments that mention identity terms for more frequently targeted groups (words like “black”, “Muslim”, “feminist”, “woman”, “gay”, etc.) are more often harmful in nature. If the dataset used to train the classifier has a large imbalance of comments containing certain words, the classifier may over-generalize and consider all comments containing those words to be potentially unsafe. Read how the Jigsaw team mitigated this unintended bias.
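A simple first check for this kind of over-triggering is to compare the false positive rate on safe comments that mention an identity term with the overall false positive rate. The sketch below assumes you have labeled examples as `(text, score, is_unsafe)` tuples; the function names are hypothetical, and the case-insensitive substring match is deliberately crude (it would, for instance, match “man” inside “woman”), so treat it as a starting point rather than a full bias evaluation.

```python
def false_positive_rate(examples, threshold=0.5):
    """Share of safe examples whose score is at or above the threshold."""
    safe_scores = [score for _, score, unsafe in examples if not unsafe]
    if not safe_scores:
        return None  # undefined when there are no safe examples
    return sum(score >= threshold for score in safe_scores) / len(safe_scores)


def false_positive_rate_by_term(examples, terms, threshold=0.5):
    """False positive rate restricted to examples mentioning each term.

    examples: iterable of (text, score, is_unsafe) tuples.
    Terms with no safe examples are left out of the result.
    """
    examples = list(examples)
    rates = {}
    for term in terms:
        subset = [ex for ex in examples if term.lower() in ex[0].lower()]
        rate = false_positive_rate(subset, threshold)
        if rate is not None:
            rates[term] = rate
    return rates
```

A per-term rate that is well above the overall rate suggests the classifier is reacting to the identity term itself rather than to harmful content.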
