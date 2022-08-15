



A look at the methodology and use cases of Google’s BERT model for patents, and how it differs from PatentBERT and DeepPatent

If you are going to market and need to identify competitors’ patents, a rough idea of the numbers is not enough. I am working on a patent and needed to validate it to the point where it could be submitted for examination. After researching multiple websites, information sources, and legal services, I came across a model developed by Google [2] that helps with these tasks: another BERT model. I also came across two other approaches, DeepPatent and PatentBERT. After figuring out which was which and which one applied to my task, I decided to write up the differences myself.

A patent is a form of intellectual property that gives its owner the legal right to exclude others from making, using, selling, or importing the invention for a specified period. (So there are time constraints. How time applies in principle and in practice is a complex area, and I won’t cover it here.) To expand further, the inventor essentially walks away with a legal document that grants them the exclusive legal right to make, use, and sell the invention. As a form of protection, a patent keeps an invention from being made, used, or sold by others without the inventor’s permission.

There are over 20 million active patents and applications worldwide. Each patent contains about 10,000 words.

Two points should be made upfront. (1) This is a model created by Google and trained on over 100 million patent publications, and these are not only patents that cover the United States. (2) This is a BERT model.

When Google released the BERT model in 2018 [2], researchers flocked to validate its reported performance. Because it outperformed many other leading-edge models [5] across natural language processing (NLP) benchmarks (GLUE, MultiNLI, SQuAD, etc. [2]), many of us were quick to apply BERT to a myriad of NLP pipelines.

Note, however, that the maximum input length is 512 tokens [3], the same as standard BERT. Additionally, about 8,000 words have been added to the standard BERT vocabulary [3].
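To make the 512-token limit concrete, here is a minimal sketch of how a long patent text could be split into windows that fit. The function name `chunk_tokens` is my own, whitespace splitting stands in for a real tokenizer, and reserving two slots for `[CLS]` and `[SEP]` follows the usual BERT input convention rather than anything specific to Google's model.

```python
MAX_LEN = 512  # maximum input length in tokens


def chunk_tokens(tokens, max_len=MAX_LEN):
    """Split a token list into windows that fit max_len once [CLS] and [SEP] are added."""
    body = max_len - 2  # reserve two slots for the special tokens
    if body <= 0:
        raise ValueError("max_len must be greater than 2")
    return [
        ["[CLS]"] + tokens[i:i + body] + ["[SEP]"]
        for i in range(0, len(tokens), body)
    ]


# Whitespace splitting is a stand-in for a real tokenizer (assumption).
tokens = ("claim " * 1200).split()
chunks = chunk_tokens(tokens)
print(len(chunks), [len(c) for c in chunks])
```

With roughly 10,000 words per patent, a single document would need many such windows, which is why the section a passage comes from matters.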

One way to describe the text being processed is the context token, an indicator of which part of the patent the text comes from. It takes one of the following labels: (1) Abstract, (2) Claims, (3) Summary, (4) Invention [3]. This categorical label crosswalks the text to patent information.
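As a rough illustration of the context token idea, the sketch below prepends a section label to a piece of patent text before it would be tokenized. The bracketed token spellings and the helper `with_context` are hypothetical; check the model's vocabulary file for the actual strings.

```python
# Section labels from the list above; the bracketed spellings are assumptions.
CONTEXT_TOKENS = {
    "abstract": "[abstract]",
    "claims": "[claims]",
    "summary": "[summary]",
    "invention": "[invention]",
}


def with_context(section, text):
    """Prefix text with the context token for its patent section."""
    key = section.lower()
    if key not in CONTEXT_TOKENS:
        raise ValueError(f"unknown patent section: {section!r}")
    return f"{CONTEXT_TOKENS[key]} {text}"


example = with_context("Claims", "A device comprising a housing and a sensor.")
print(example)
```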

A Boolean search, in the context of finding information about patents, is a type of search that requires specific terms combined with operators such as AND, OR, and NOT. The challenge then becomes knowing which terms the search needs. The terminology used in patents is intentionally constructed to be unambiguous, which makes it hard to guess the exact wording a relevant patent uses.
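To show why exact terms matter, here is a small sketch of a Boolean search over an inverted index. The three documents and the helper names `build_index` and `search` are made up for illustration and have nothing to do with the USPTO's actual search tools.

```python
import re


def build_index(docs):
    """Map each lowercase term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in set(re.findall(r"[a-z0-9]+", text.lower())):
            index.setdefault(term, set()).add(doc_id)
    return index


def search(index, all_ids, must=(), any_of=(), must_not=()):
    """AND over `must`, OR over `any_of`, NOT over `must_not`."""
    result = set(all_ids)
    for term in must:
        result &= index.get(term, set())
    if any_of:
        result &= set().union(*(index.get(t, set()) for t in any_of))
    for term in must_not:
        result -= index.get(term, set())
    return result


# Toy documents (invented for this example).
docs = {
    "P1": "A battery pack with a liquid cooling system",
    "P2": "A battery charger for electric vehicles",
    "P3": "Cooling fan for a computer housing",
}
index = build_index(docs)

# battery AND (cooling OR charger) NOT vehicles
hits = search(index, docs, must=["battery"],
              any_of=["cooling", "charger"], must_not=["vehicles"])
print(sorted(hits))

# A near-synonym the documents never use finds nothing.
print(search(index, docs, must=["accumulator"]))
```

The second query comes back empty even though two of the documents are about batteries, which is the gap a language model trained on patent text is meant to narrow.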

The United States Patent and Trademark Office (USPTO) has provided nearly 9,000 examples to help you perform a Boolean search (see [6] for an example), and they apply across the entire Cooperative Patent Classification (CPC) code search.

Google created a custom tokenizer to capture specific terms used in patents. It acts like a dictionary that helps improve the accuracy of prediction tasks, especially for patent terms. They noted a 0.5% improvement [9], in the hope that it will eventually allow for better interpretation, especially when it comes to synonym generation [9].
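To see what a domain-specific vocabulary changes, here is a minimal sketch of the greedy longest-match-first splitting that WordPiece tokenizers (the kind BERT uses) apply to a single word. The two tiny vocabularies are invented for the example and are not Google's actual vocabulary.

```python
def wordpiece(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first split of one lowercase word into vocabulary pieces."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a ## prefix
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # no piece matches, so the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces


# Toy vocabularies (assumptions for illustration only).
general_vocab = {"photo", "resist", "##resist", "##or"}
patent_vocab = general_vocab | {"photoresist"}

print(wordpiece("photoresist", general_vocab))
print(wordpiece("photoresist", patent_vocab))
```

With the general vocabulary the term is broken into two pieces, while the patent vocabulary keeps it as one token, so the model can learn a single representation for it.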

PatentBERT is outlined here [7]; don’t confuse the two. Led by Jieh Hsiang et al., the authors present a specific study in which a pre-trained BERT-Base model (uncased, 110 million parameters [7]) was applied using a different training dataset (unrelated to Google’s BERT model for patents).

Yet another use case: DeepPatent demonstrates the task of analyzing and retrieving technical drawings [8] from design patents. Interestingly, PatentBERT used DeepPatent’s F1 score as a benchmark baseline [7]. Additionally, PatentBERT’s F1 scores show that it outperforms DeepPatent [7].
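Since the comparison rests on F1 scores, here is a short sketch of how an F1 score can be computed for multi-label predictions such as classification codes. Micro-averaging is my assumption, since the averaging used in [7] is not stated here, and the label assignments below are invented.

```python
def f1_score(true_labels, predicted_labels):
    """Micro-averaged precision, recall, and F1 over multi-label predictions."""
    tp = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        truth, pred = set(truth), set(pred)
        tp += len(truth & pred)   # labels predicted and correct
        fp += len(pred - truth)   # labels predicted but wrong
        fn += len(truth - pred)   # labels that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Invented example: three patents, each with a set of classification codes.
truth = [{"H01L", "G06F"}, {"A61K"}, {"G06N"}]
preds = [{"H01L"}, {"A61K", "C07D"}, {"G06F"}]

precision, recall, f1 = f1_score(truth, preds)
print(precision, recall, f1)
```

F1 is the harmonic mean of precision and recall, so a model only scores well if it both finds the right codes and avoids assigning wrong ones.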

In addition to covering the methodology and foundations of this new approach, I briefly explained how it differs from PatentBERT and DeepPatent. If you’d like to suggest an edit to this post or have any recommendations for further expanding this topic area, please share your thoughts with me.

