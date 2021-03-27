



Open-domain long-form question answering (LFQA) is a fundamental challenge in natural language processing (NLP): it involves retrieving documents relevant to a given question and using them to generate a detailed, paragraph-length answer.

Recently, significant progress has been made in factoid open-domain question answering (QA), where a short phrase or entity is enough to answer a question, but much less work has been done on long-form question answering. LFQA is nevertheless an important task, especially because it provides a testbed for measuring the factuality of generative text models. However, current benchmarks and evaluation metrics are not well suited to making progress on LFQA.

In a recent paper, “Hurdles to Progress in Long-form Question Answering,” to be published at NAACL 2021, Google AI researchers introduce a new system for open-domain long-form question answering that takes advantage of two recent advances in NLP. The first is state-of-the-art sparse attention models such as the Routing Transformer (RT), which allow attention-based models to scale to longer sequences. The second is retrieval-based models such as REALM, which make it easier to retrieve Wikipedia articles related to a particular query.

The system combines information from multiple retrieved Wikipedia articles related to a given question before generating an answer. It sets a new state of the art on ELI5, the only large publicly available dataset for long-form question answering.

However, while the system tops the public leaderboard, the researchers found some troubling trends in the ELI5 dataset and its associated evaluation metrics. In particular, they found little evidence that models actually use the retrievals on which they are conditioned, and they found that trivial baselines (such as copying the input) beat modern systems. They also observed significant train/validation overlap in the dataset. The paper proposes mitigation strategies for each of these issues.

Text generation

The main workhorse of NLP models is the Transformer architecture. Every token in a sequence attends to every other token in the sequence, so the cost of the model scales quadratically with sequence length. The RT model introduces a dynamic, content-based mechanism that reduces the complexity of attention in the Transformer model.

The key insight of the RT work is that having each token attend to every other token is often redundant and can be approximated by a combination of local and global attention. The RT model is pre-trained on the Project Gutenberg (PG-19) dataset with a language modeling objective.
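The idea of replacing full attention with a mix of local and content-based attention can be illustrated with boolean attention masks. The following is a minimal NumPy sketch, not the Routing Transformer implementation: the function names, the window size, and the random centroids are all assumptions made for illustration, and nearest-centroid routing is used here as a simplified stand-in for the clustering that RT learns during training.

```python
import numpy as np

def full_mask(n):
    """Causal full attention: token i attends to every token j <= i."""
    return np.tril(np.ones((n, n), dtype=bool))

def local_mask(n, window):
    """Causal local attention: token i attends to itself and the previous window - 1 tokens."""
    i = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    return (j <= i) & (i - j < window)

def cluster_mask(x, centroids):
    """Content-based attention: token i attends to earlier tokens routed to the same centroid."""
    cluster_ids = np.argmax(x @ centroids.T, axis=1)  # nearest centroid by dot product
    same = cluster_ids[:, None] == cluster_ids[None, :]
    return same & full_mask(len(x))

def masked_attention(q, k, v, mask):
    """Softmax attention restricted to the pairs allowed by mask.

    For clarity this still builds the dense n x n score matrix; a real sparse
    implementation would only compute the allowed pairs.
    """
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
n, d, window, num_clusters = 16, 8, 4, 4           # hypothetical sizes
x = rng.normal(size=(n, d))                        # toy token representations
centroids = rng.normal(size=(num_clusters, d))     # stand-in for learned centroids

sparse = local_mask(n, window) | cluster_mask(x, centroids)
out = masked_attention(x, x, x, sparse)

print("full attention pairs:  ", int(full_mask(n).sum()))
print("sparse attention pairs:", int(sparse.sum()))
```

The number of allowed pairs in the full mask grows quadratically with the sequence length, while the local part grows only linearly for a fixed window, which is the saving the paragraph above describes.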

Information retrieval

The researchers demonstrated the effectiveness of the RT model on LFQA by combining it with retrievals from REALM. REALM is a retrieval-based model that uses maximum inner product search to fetch Wikipedia articles related to a particular query or question. The researchers improved the quality of REALM retrievals by training with a contrastive loss.
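Both retrieval ideas mentioned above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not REALM's code: `mips_top_k` does exact brute-force search over a small matrix (a production system would use an approximate index), and `in_batch_contrastive_loss` shows one common form of contrastive loss that uses the other items in a batch as negatives. The article does not give the exact loss used in the paper, so that choice and all names here are hypothetical.

```python
import numpy as np

def mips_top_k(query_vec, doc_matrix, k):
    """Exact maximum inner product search by brute force.

    Returns the indices and scores of the k documents whose embeddings
    have the largest dot product with the query embedding.
    """
    scores = doc_matrix @ query_vec
    top = np.argsort(-scores)[:k]
    return top, scores[top]

def in_batch_contrastive_loss(q, d):
    """Contrastive loss with in-batch negatives.

    q: (B, dim) query embeddings; d: (B, dim) embeddings of the matching items.
    Row i of q is paired with row i of d; every other row of d is a negative.
    """
    logits = q @ d.T                                   # (B, B) similarity matrix
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))         # cross-entropy on the diagonal

# Toy document embeddings (hypothetical values).
docs = np.array([[1.0, 0.0],
                 [0.0, 3.0],
                 [2.0, 2.0]])
query = np.array([1.0, 1.0])
indices, scores = mips_top_k(query, docs, k=2)
print("retrieved document indices:", indices.tolist())

aligned = 5.0 * np.eye(3)
print("loss, matched pairs:   ", in_batch_contrastive_loss(aligned, aligned))
print("loss, mismatched pairs:", in_batch_contrastive_loss(aligned, np.roll(aligned, 1, axis=0)))
```

Minimizing a loss of this kind pulls each query embedding toward its paired item and away from the other items in the batch, which is what makes the inner product a more useful retrieval score.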

Evaluation

The model was tested on long-form question answering using the ELI5 dataset, which is part of the KILT benchmark and is the only large publicly available LFQA dataset. The pre-trained RT model was then fine-tuned on the KILT version of ELI5 together with retrievals from c-REALM, the contrastively trained REALM retriever.

The submission ranks first on the KILT leaderboard for long-form question answering on ELI5, with a combined KILT R-L score of 2.36. Although the model tops the leaderboard, the researchers identified several problems with this result.

The researchers found little to no evidence that the model actually grounds its text generation in the retrieved documents. They also found a large overlap between the ELI5 training, validation, and test sets. In addition, the ROUGE-L metric used to evaluate the quality of generated text has problems: trivial, nonsensical baselines can achieve high scores. The researchers hope that the community will work together to solve these problems so that meaningful progress can be made in this area.
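To see why a trivial baseline can score well on ROUGE-L, it helps to look at how the metric is computed: it rewards the longest common subsequence (LCS) of words between a candidate and a reference, regardless of whether the candidate makes sense. The following is a simplified sketch, assuming lowercase whitespace tokenization, no stemming, and the F1 variant of the score; official implementations differ in these details, and the example question and reference are invented for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def rouge_l_f1(candidate, reference):
    """Simplified ROUGE-L: F1 of LCS-based precision and recall."""
    c = candidate.lower().split()
    r = reference.lower().split()
    if not c or not r:
        return 0.0
    lcs = lcs_length(c, r)
    if lcs == 0:
        return 0.0
    precision = lcs / len(c)
    recall = lcs / len(r)
    return 2 * precision * recall / (precision + recall)

# Hypothetical example: the "answer" is just the question copied back.
question = "why is the sky blue during the day"
reference = "the sky is blue during the day because air scatters blue light more than red light"
print("input-copy baseline ROUGE-L:", round(rouge_l_f1(question, reference), 3))
```

Because the question shares many words in the same order with a typical reference answer, copying it back earns a non-zero score even though it answers nothing, which is the kind of weakness the researchers point out.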

