MNPF: A Multimedia News Processing Framework For Fake News Detection Using OCR, Speech-to-Text, And Transformer-Based Classification | International Journal of Science, Engineering and Technology

Authors: Yogesh Sopan Modhe, Prof. (Dr.) D. B. Kshirsagar

Abstract: The rapid proliferation of digital communication platforms has enabled fake news to spread across diverse media formats, including textual articles, screenshots, scanned documents, images, and audio recordings. Most existing fake news detection systems assume that news content is already available in structured textual form, thereby neglecting the practical challenge of extracting information from heterogeneous multimedia sources. This limitation significantly reduces their effectiveness in real-world misinformation analysis. To address this gap, this paper proposes the Multimedia News Processing Framework (MNPF), a unified architecture that integrates Optical Character Recognition (OCR), Speech-to-Text conversion, text preprocessing, feature extraction, and multi-paradigm classification for multimedia-aware fake news detection. The MNPF processes image-based news, scanned documents, screenshots, and audio content and transforms them into a standardized textual corpus. Six feature extraction techniques spanning statistical (TF-IDF), semantic (Word2Vec, GloVe, FastText), and contextual (BERT, XLNet) representations are systematically compared. Nine classification architectures from Machine Learning (Logistic Regression, SVM, Random Forest, AdaBoost), Deep Learning (CNN, LSTM, BiLSTM), and Transformer (BERT, XLNet) paradigms are evaluated using three benchmark datasets: LIAR, ISOT, and WELFake. Experimental results demonstrate that the MNPF pipeline preserves sufficient semantic fidelity through OCR and speech extraction (EasyOCR: 96.2%, Whisper: 97.1%) for accurate downstream classification. Among all evaluated models, XLNet achieves the highest performance with 98.1% accuracy, 97.8% precision, 97.6% recall, 97.7% F1-score, and 0.99 ROC-AUC.
The proposed framework bridges the critical gap between multimedia content processing and intelligent fake news detection, providing a scalable and practically deployable solution for real-world misinformation analysis. The findings further establish a strong experimental foundation for the development of advanced hybrid architectures for fake news detection.
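
Illustrative sketch (not taken from the paper): the abstract describes routing image, audio, and text inputs through modality-specific extractors and merging the results into one standardized textual corpus. The Python sketch below shows one way such a routing step could look. The extension sets, the function names `modality_of`, `normalize`, and `build_corpus`, and the normalization rule are all assumptions made for illustration. The extractors are passed in as plain callables, so a real OCR or speech-to-text backend (for example EasyOCR or Whisper, as named in the abstract) would have to be wrapped by the reader.

```python
import re
from pathlib import Path
from typing import Callable, Dict, Iterable, List

# Hypothetical extension-to-modality mapping; the paper does not specify one.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".tiff"}
AUDIO_EXTS = {".wav", ".mp3", ".flac"}
TEXT_EXTS = {".txt"}

# An extractor takes a file path and returns the raw text found in that file.
Extractor = Callable[[Path], str]


def modality_of(path: Path) -> str:
    """Classify a file as 'image', 'audio', or 'text' by its extension."""
    ext = path.suffix.lower()
    if ext in IMAGE_EXTS:
        return "image"
    if ext in AUDIO_EXTS:
        return "audio"
    if ext in TEXT_EXTS:
        return "text"
    raise ValueError(f"unsupported file type: {path.name}")


def normalize(text: str) -> str:
    """Assumed minimal normalization: collapse whitespace, trim, lowercase."""
    return re.sub(r"\s+", " ", text).strip().lower()


def build_corpus(paths: Iterable[str], extractors: Dict[str, Extractor]) -> List[str]:
    """Route each file to the extractor for its modality and normalize the result."""
    corpus = []
    for raw in paths:
        path = Path(raw)
        extract = extractors[modality_of(path)]
        corpus.append(normalize(extract(path)))
    return corpus


if __name__ == "__main__":
    # Stub extractors stand in for real OCR / speech-to-text / file-reading backends.
    stubs = {
        "image": lambda p: "  BREAKING\nNews ",
        "audio": lambda p: "Spoken  claim",
        "text": lambda p: "Plain text",
    }
    print(build_corpus(["shot.png", "clip.wav", "post.txt"], stubs))
```

Keeping the extractors injectable means the downstream feature extraction and classification stages see only a list of normalized strings, regardless of which medium each item came from.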

DOI: http://doi.org/10.5281/zenodo.20642234