Team Zhang at Factify 2: Unimodal Feature-Enhanced and Cross-Modal Correlation Learning for Multi-Modal Fact Verification

EasyChair Preprint 10104 • 15 pages • Date: May 12, 2023

Abstract

In recent years, social media has exposed users to a myriad of misinformation and disinformation, which has attracted a great deal of attention from the research community. Despite the progress in text-based fact-checking, there has been very limited work on applying multi-modal techniques to fact verification. In this work, we propose a novel unimodal feature-enhanced and cross-modal correlation learning approach (UFCC) for multi-modal fact verification, which jointly models the basic intra-modal semantic correlation and the inter-modal correlation. Specifically, UFCC consists of a text-semantic feature module, an image-semantic feature module and a text-image correlation module. In the text-semantic feature module, UFCC exploits pre-trained backbones to separately extract text features from claims and documents, and then applies a signed attention mechanism to enhance the text representation by using the different text features as queries. The image-semantic feature module works analogously for images. In the text-image correlation module, UFCC first adopts a fine-tuned CLIP model to encode the textual and visual features of the claims (or documents). UFCC then explores the cross-modal relationships between the extracted features using a similarity layer. Based on this, we finally fuse the text and image features for better performance. Our team, Zhang, won the fourth prize (F1-score: 77.423%) in the Factify challenge hosted by De-Factify2 @ AAAI 2023, which demonstrates the effectiveness of the method.

Keyphrases: De-Factify2, multi-modal fact verification, attention, fine-tuned CLIP
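To make the text-image correlation idea concrete, the sketch below (not the authors' code; module names, the 512-dimensional feature size and the 5-way label space are assumptions for illustration) takes pre-extracted CLIP embeddings for the claim/document text and images, computes pairwise cosine similarities as a simple similarity layer, and fuses them with the unimodal features for classification.

```python
# Minimal sketch of cross-modal correlation + fusion over pre-extracted CLIP features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossModalCorrelation(nn.Module):
    def __init__(self, feat_dim: int = 512, num_classes: int = 5):
        super().__init__()
        # project both modalities into a shared space before measuring similarity
        self.text_proj = nn.Linear(feat_dim, feat_dim)
        self.image_proj = nn.Linear(feat_dim, feat_dim)
        # fuse the four unimodal features with the four pairwise similarity scores
        self.classifier = nn.Sequential(
            nn.Linear(4 * feat_dim + 4, 256),
            nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, claim_txt, claim_img, doc_txt, doc_img):
        # each input: (batch, feat_dim) embedding from a (fine-tuned) CLIP encoder
        t_c, t_d = self.text_proj(claim_txt), self.text_proj(doc_txt)
        v_c, v_d = self.image_proj(claim_img), self.image_proj(doc_img)
        # cosine similarities capture text-text, image-image and cross-modal correlation
        sims = torch.stack([
            F.cosine_similarity(t_c, t_d, dim=-1),  # claim text  vs document text
            F.cosine_similarity(v_c, v_d, dim=-1),  # claim image vs document image
            F.cosine_similarity(t_c, v_d, dim=-1),  # claim text  vs document image
            F.cosine_similarity(t_d, v_c, dim=-1),  # document text vs claim image
        ], dim=-1)
        fused = torch.cat([t_c, t_d, v_c, v_d, sims], dim=-1)
        return self.classifier(fused)

# usage with random stand-in features
model = CrossModalCorrelation()
features = [torch.randn(2, 512) for _ in range(4)]
logits = model(*features)  # (2, 5) class scores
```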