On the Gap Between AI-Generated and Human-Written Patent Texts

EasyChair Preprint 15361

10 pages. Date: November 4, 2024

Abstract

Since the GPT-X models made progress in generative tasks, a large number of large language models (LLMs) have emerged. While the powerful capabilities of LLMs have attracted the interest of numerous researchers, their misuse has become a source of growing concern. In fact, LLMs have been used to generate fake news, fake academic papers, and fake patent application documents. Detecting whether content is generated by artificial intelligence (AI) has therefore become a significant problem. Unfortunately, to our knowledge, no existing research focuses on detecting AI-generated patent text, nor are any datasets tailored to patents publicly available. In this paper, to explore the differences between AI-generated and human-written patent texts, we use ChatGPT to generate a set of patent abstracts, in Chinese and English, from granted patent claims. Each generated abstract is paired with the original abstract of the same patent. We analyze the linguistic characteristics of the two types of patent texts through various comparison experiments. We anticipate that our work can help people identify AI-generated patents among the vast number of existing patents.

Keyphrases: Artificial Intelligence Generation, Large Language Model, Patent Texts, Text Generation Detection

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:15361,
  author    = {Zhanhao Xiao and Wei Hu and Yanqiang Wu and Weiqi Chen and Huihui Li and Xiaoyong Liu},
  title     = {On the Gap Between AI-Generated and Human-Written Patent Texts},
  howpublished = {EasyChair Preprint 15361},
  year      = {EasyChair, 2024}}