Bilingual COVID-19 Fake News Detection Based on LDA Topic Modeling and BERT Transformer

Published in 2023 6th International Conference on Pattern Recognition and Image Analysis (IPRIA), 2023

Authors

Pouria Omrani, Zahra Ebrahimian, Ramin Toosi, Mohammad Ali Akhaee

Abstract

The spread of fake news has become more prevalent with the popularity of social media and the volume of news circulating on it. As a result, it is crucial to distinguish real news from fake news. During the COVID-19 pandemic, numerous tweets, posts, and news items about the illness have appeared on social media and electronic media worldwide. This research presents a bilingual model that combines Latent Dirichlet Allocation (LDA) topic modeling with the BERT transformer to detect COVID-19 fake news in both Persian and English. First, a dataset is prepared in Persian and English; the proposed method is then applied to this dataset to detect COVID-19 fake news. Finally, the proposed model is evaluated using accuracy, precision, recall, and F1-score. The approach achieves 92.18% accuracy, which shows that adding topic information to the pre-trained contextual representations produced by BERT significantly improves the classification of domain-specific instances. The results also show that the proposed approach outperforms previous state-of-the-art methods.
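The four evaluation metrics named in the abstract can be illustrated with a minimal sketch for a binary fake/real classifier. The function name `binary_metrics`, the label convention (1 = fake, 0 = real), and the toy labels are assumptions made for illustration only; they are not the authors' code or data.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels.

    Hypothetical helper for illustration; `positive` marks the class
    treated as "fake" (an assumed convention, not from the paper).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels only (1 = fake, 0 = real); not results from the paper.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Accuracy alone can be misleading when fake and real items are imbalanced, which is one reason to report precision, recall, and F1 alongside it.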