



The Google AI blog has announced KELM, a method that could be used to reduce bias and toxic content in search (open domain question answering). It uses a method called TEKGEN to convert Knowledge Graph facts into natural language text, which can then be used to improve natural language processing models.

What is KELM?

KELM is an acronym for Knowledge-Enhanced Language Model Pre-training. Natural language processing models like BERT are usually trained on the web and other documents. KELM proposes to add reliable factual content (enhanced knowledge) to the pre-training of the language model to improve factual accuracy and reduce bias.

TEKGEN transforms the structured data of the knowledge graph into natural language text known as the KELM corpus.

KELM uses reliable data

Google researchers proposed using the Knowledge Graph to improve factual accuracy because it is a reliable source of facts.

“An alternative source is the Knowledge Graph (KG), which consists of structured data. KGs are factual in nature because the information is usually extracted from more reliable sources, and post-processing filters and human editors ensure that inappropriate and inaccurate content is removed.”

Does Google use KELM?

Google has not indicated whether KELM is in use. KELM is an approach to language model pre-training that shows strong promise and was summarized on the Google AI blog.

Bias, factual accuracy, search results

According to the research paper, this approach improves factual accuracy:

“It offers the added benefit of increased factual accuracy and reduced toxicity in the resulting language model.”

This research matters because reducing bias and increasing factual accuracy could affect how sites are ranked.

But until KELM is put into use, there is no way to predict what kind of impact it would have.

Google is not currently fact checking search results.

If KELM is introduced, it could conceivably affect sites that promote factually incorrect statements and ideas.

KELM could impact more than search

The KELM corpus was released under a Creative Commons license (CC BY-SA 2.0).

So, in theory, other companies (Bing, Facebook, Twitter, etc.) can also use it to improve their pre-training in natural language processing.

In that case, the impact of KELM can spread to many search and social media platforms.

Indirect connection to MUM

Google has also indicated that the next-generation MUM algorithm will not be released until Google is satisfied that bias does not adversely affect the answers it gives.

According to the announcement of Google MUM:

“Just as we have carefully tested the many applications of BERT launched since 2019, MUM will go through the same process as we apply these models to search. Specifically, we will look for patterns that may indicate bias in machine learning to avoid introducing bias into our systems.”

The KELM approach is specifically aimed at reducing bias, which could make it useful in developing the MUM algorithm.

Machine learning can produce biased results

The research paper states that the data used to train natural language models such as BERT and GPT-3 can result in “toxic content” and bias.

Computing has the old acronym GIGO, which stands for Garbage In – Garbage Out. That is, the quality of the output depends on the quality of the input.

If the data used to train the algorithm is of high quality, the results will be of high quality too.

The researchers propose improving the quality of the data that technologies such as BERT and MUM are trained on in order to reduce bias.

Knowledge graph

The Knowledge Graph is a collection of facts in a structured data format. Structured data is information organized in a consistent format that conveys specific facts in a way that is easy for machines to use.

In this case, the information is facts about people, places, and things.

The Google Knowledge Graph was introduced in 2012 as a way to help Google understand the relationships between things. So when someone asks about Washington, Google may be able to work out whether the question is about Washington the person, the state, or the District of Columbia.
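One common way to picture such facts is as subject, relation, object triples. The short Python sketch below is only an illustration of that idea: the `Triple` type, the `facts_about` helper, and the sample facts are assumptions made for this example and do not reflect Google's internal format.

```python
from collections import defaultdict
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One knowledge graph fact: subject --relation--> object."""
    subject: str
    relation: str
    obj: str


# Illustrative facts only; a real knowledge graph is vastly larger.
FACTS = [
    Triple("George Washington", "occupation", "politician"),
    Triple("George Washington", "place of birth", "Westmoreland County"),
    Triple("Washington (state)", "capital", "Olympia"),
    Triple("Washington, D.C.", "country", "United States"),
]


def facts_about(entity: str, facts: List[Triple]) -> List[Triple]:
    """Return every triple whose subject is the given entity."""
    by_subject = defaultdict(list)
    for fact in facts:
        by_subject[fact.subject].append(fact)
    return by_subject[entity]
```

Because each "Washington" is a separate entity with its own facts, a lookup such as `facts_about("Washington (state)", FACTS)` returns only the facts about the state.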

Google’s Knowledge Graph was announced as consisting of data from trusted sources of facts.

Google’s 2012 announcement characterized the Knowledge Graph as a first step toward building the next generation of search, which we are enjoying today.

Knowledge Graph and Factual Accuracy

The research paper uses knowledge graph data to improve Google’s algorithms because the information is reliable and trustworthy.

The paper proposes integrating knowledge graph information into the training process to remove bias and increase factual accuracy.

Google’s research proposes two things.

First, the knowledge base has to be converted into natural language text. Second, the resulting corpus, named Knowledge-Enhanced Language Model Pre-training (KELM), can then be integrated into the pre-training of the algorithm to reduce bias.

Researchers explain the problem as follows:

“Large pre-trained natural language processing (NLP) models, such as BERT, RoBERTa, GPT-3, T5, and REALM, leverage natural language corpora that are derived from the web and fine-tuned on task-specific data …

However, natural language text alone limits the scope of knowledge … Moreover, the presence of false information or toxic content in the text can ultimately bias the resulting model.”

From Structured Data in Knowledge Graph to Natural Language Text

Researchers say the problem with integrating knowledge base information into training is that knowledge base data is in the form of structured data.

The solution is to use a natural language task called Data to Text Generation to convert the structured data in the knowledge graph to natural language text.

They explained that because generating text from data is difficult, they created a new “pipeline” called “Text from KG Generator” (TEKGEN) to solve the problem.

Citation: Knowledge Graph Based Synthetic Corpus Generation for Knowledge-Enhanced Language Model Pre-training (PDF)

TEKGEN natural language text improves factual accuracy

TEKGEN is a technology the researchers created to convert structured data into natural language text. It is this end result, the factual text, that is used to create the KELM corpus. The corpus can then be used as part of machine learning pre-training to help prevent bias from making its way into the algorithm.

Researchers noted that adding this additional knowledge graph information (corpus) to the training data improved the accuracy of the facts.

The TEKGEN / KELM paper states:

“Furthermore, we show that by verbalizing a comprehensive and encyclopedic KG like Wikidata, we can integrate a structured KG with a natural language corpus.

… Our approach translates KG into natural text so that it can be seamlessly integrated into existing language models. The resulting language model has the added benefit of increased factual accuracy and reduced toxicity.”

The KELM article includes a diagram showing how the structured data of one knowledge graph node is concatenated and then converted into natural text (verbalized).

I divided the illustration into two parts.

Below is an image showing the structured data of the knowledge graph. The data is concatenated into text.

Screenshot of the first part of the TEKGEN conversion process
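The concatenation step can be sketched in a few lines of Python. The input format used here (the subject first, followed by comma-separated relation and object pairs) is an assumption made for illustration, and `linearize` is a hypothetical helper name. In TEKGEN the resulting string would be handed to a fine-tuned T5 model, which is not shown.

```python
from typing import List, Tuple

# (subject, relation, object); all triples are assumed to share one subject.
Triple = Tuple[str, str, str]


def linearize(triples: List[Triple]) -> str:
    """Concatenate the triples of one entity into a single input string."""
    if not triples:
        raise ValueError("expected at least one triple")
    subject = triples[0][0]
    if any(s != subject for s, _, _ in triples):
        raise ValueError("all triples must share the same subject")
    # Keep the subject once, then list each "relation object" pair.
    parts = [f"{relation} {obj}" for _, relation, obj in triples]
    return f"{subject} " + ", ".join(parts)


example = [
    ("Olympia", "capital of", "Washington"),
    ("Olympia", "country", "United States"),
]
print(linearize(example))
# prints: Olympia capital of Washington, country United States
```

The output is deliberately not fluent English. Turning that flat string into a natural sentence is the job of the text generation model.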

The image below shows the next steps in the TEKGEN process of taking concatenated text and converting it to natural language text.

Screenshot of text converted to natural language text

KELM corpus generation

Another diagram shows how the KELM natural language text that can be used for pre-training is generated.

The TEKGEN paper presents this figure with the following explanation:

“In step 1, KG triples are aligned with Wikipedia text using distant supervision. In steps 2 and 3, T5 is fine-tuned sequentially, first on this corpus and then for a small number of steps on the WebNLG corpus. In step 4, BERT is fine-tuned to generate a semantic quality score for generated sentences with respect to the triples. Steps 2, 3, and 4 together form TEKGEN. To generate the KELM corpus, in step 5, entity subgraphs are created using the relation pair alignment counts from the training corpus generated in step 1. The subgraph triples are then converted to natural text using TEKGEN.”
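Step 1, distant supervision, can be approximated with a simple heuristic: a sentence is treated as expressing a triple when it mentions both the subject and the object. The Python sketch below is a simplified assumption for illustration and not the paper’s exact alignment rules. The `align` function and the sample data are hypothetical.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def align(triples: List[Triple], sentences: List[str]) -> Dict[str, List[Triple]]:
    """Map each sentence to the triples whose subject and object it mentions."""
    aligned: Dict[str, List[Triple]] = {}
    for sentence in sentences:
        lowered = sentence.lower()
        # Naive case-insensitive substring match on subject and object.
        matches = [
            t for t in triples
            if t[0].lower() in lowered and t[2].lower() in lowered
        ]
        if matches:
            aligned[sentence] = matches
    return aligned


triples = [
    ("Olympia", "capital of", "Washington"),
    ("Olympia", "country", "United States"),
    ("Olympia", "instance of", "city"),
]
sentences = [
    "Olympia is the capital of the U.S. state of Washington.",
    "The city hosts a farmers market.",
]
pairs = align(triples, sentences)
```

Here only the first sentence is paired with a triple (the "capital of" fact), since it is the only sentence that names both a subject and its object. Pairs like this would serve as training examples for the text generator in steps 2 and 3.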

KELM works to reduce bias and increase accuracy

The KELM article published on Google’s AI blog states that KELM has real-world applications, particularly for question answering tasks, which are explicitly related to information retrieval (search) and natural language processing (technologies such as BERT and MUM).

Google researches many things, some of which seem to be explorations of what is possible but otherwise look like dead ends. Research that probably won’t be incorporated into Google’s algorithms usually ends with a statement that more research is needed because the technology falls short of expectations in some way.

However, that is not the case with the KELM and TEKGEN research. The article is, in fact, optimistic about real-world application of the discoveries. That tends to make it more likely that KELM will eventually make it into search in one form or another.

This is how the researchers concluded the article on KELM for reducing bias:

“This has real-world applications for knowledge-intensive tasks, such as question answering, where providing factual knowledge is essential. In addition, such corpora can be applied to the pre-training of large language models and can potentially reduce toxicity and improve factuality.”

Will KELM be used soon?

Google’s recently announced MUM algorithm requires accuracy, which is what the KELM corpus was created for. However, KELM applications are not limited to MUM.

Reducing bias and improving factual accuracy are major concerns in society today, and the researchers’ optimism about the results makes it more likely that KELM will be used in some form in search in the future.

Citations

Google AI article on KELM: KELM: Integrating Knowledge Graphs with Language Model Pre-training Corpora

KELM Research Paper (PDF): Knowledge Graph Based Synthetic Corpus Generation for Knowledge-Enhanced Language Model Pre-training

TEKGEN Training Corpus on GitHub
