On the Gap Between AI-Generated and Human-Written Patent Texts

EasyChair Preprint 15361
10 pages • Date: November 4, 2024

Abstract

Since the GPT-X models made progress in generative tasks, a large number of large language models (LLMs) have emerged. While the powerful capabilities of LLMs have attracted the interest of numerous researchers, their misuse has become a source of growing concern. In fact, LLMs have been used to generate fake news, fake academic papers, and fake patent application documents. Detecting whether content is generated by artificial intelligence (AI) has become a significant problem. Unfortunately, to our knowledge, no existing research focuses on detecting AI-generated patent text, nor are any datasets tailored for patents publicly available. In this paper, to explore the differences between AI-generated and human-written patent texts, we use ChatGPT to generate a set of patent abstracts, in Chinese and English, from granted patent claims. Each generated abstract is paired with its original patent abstract. We analyze the linguistic characteristics of the two types of patent texts through a range of comparison experiments. We anticipate that our work can help people identify AI-generated patents among the ocean of patents.

Keyphrases: Artificial Intelligence Generation, Large Language Model, Patent Texts, Text Generation Detection