Authors :
Saranya N.; Karmukilan B.; Kishore P.; Mohammed Safik M.; Nithan Anto J.
Volume/Issue :
Volume 11 - 2026, Issue 3 - March
Google Scholar :
https://tinyurl.com/2n3t5363
Scribd :
https://tinyurl.com/26hk5sbz
DOI :
https://doi.org/10.38124/ijisrt/26mar1568
Abstract :
With the rise of digital media, misleading information spread through manipulated images and poorly constructed text on news websites and social media has become a worldwide problem. Detecting such manipulated or false information is necessary to protect public opinion, state security, and trust in digital environments. This work proposes a deep learning dual-stream fusion model that detects misinformation from a single input: an image that also contains embedded text, such as a meme, poster, or screenshot. The system reads the text in the image using optical character recognition (OCR) and derives its semantic meaning with a BERT-BiLSTM model. In parallel, high-level features of the image are extracted by passing it through a ResNet convolutional neural network. The text and visual features are then combined in a deep fusion layer that preserves context and multimodal dependencies, yielding a more robust and accurate classification. The resulting classifier assesses the authenticity of the supplied image-text input, which can help limit further propagation of misinformation from multimedia sources and improve the accuracy of fake news recognition by combining natural language understanding with visual semantics.
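As a rough illustration of the fusion step described in the abstract, the sketch below concatenates a text feature vector and an image feature vector, passes the result through one dense layer, and produces class probabilities. It is a toy under stated assumptions, not the authors' implementation: the abstract does not specify the fusion operation, so concatenation followed by a dense layer is assumed here. The random vectors merely stand in for BERT-BiLSTM and ResNet outputs, the weights are untrained, and every name and dimension (`fuse_and_classify`, `TEXT_DIM`, `IMAGE_DIM`, `HIDDEN_DIM`) is hypothetical.

```python
import math
import random


def dense(x, weights, bias):
    """Affine layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(x):
    # Subtract the maximum for numerical stability.
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_out, n_in, rng):
    """Small random weights and zero bias (untrained, illustration only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def fuse_and_classify(text_feat, image_feat, params):
    """Assumed fusion: concatenate both streams, then dense + softmax."""
    fused = list(text_feat) + list(image_feat)
    hidden = relu(dense(fused, *params["fusion"]))
    return softmax(dense(hidden, *params["classifier"]))


rng = random.Random(0)
TEXT_DIM, IMAGE_DIM, HIDDEN_DIM = 8, 6, 4  # toy sizes, not from the paper

params = {
    "fusion": init_layer(HIDDEN_DIM, TEXT_DIM + IMAGE_DIM, rng),
    "classifier": init_layer(2, HIDDEN_DIM, rng),  # two classes: real, fake
}

# Stand-ins for the text-stream and image-stream feature vectors.
text_feat = [rng.gauss(0, 1) for _ in range(TEXT_DIM)]
image_feat = [rng.gauss(0, 1) for _ in range(IMAGE_DIM)]

probs = fuse_and_classify(text_feat, image_feat, params)
print(probs)  # two probabilities that sum to 1
```

In a real system the two input vectors would come from the trained text and image encoders, and the fusion and classifier weights would be learned jointly rather than drawn at random.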
Keywords :
Disinformation Detection, Sustainable Digital Infrastructure, Multimodal Deep Learning, BERT-BiLSTM.
References :
- Zidan, A., et al. (2025). Multimodal Fake News Detection using Cross-Modal Attention Fusion. IEEE Transactions on Neural Networks and Learning Systems, 36(4), 1256–1267.
- Ahmad, R., et al. (2025). BMMFN: Domain-Specific Multimodal Fake News Detection Framework. Expert Systems with Applications, 243, 123–145.
- Alam, F., et al. (2024). TM-FID: Transfer-Learned Multimodal Fake News Identification. Information Processing & Management, 61(2), 102–118.
- Song, Y., et al. (2023). MMCN: Multi-Modal Cross-Attention Network for Fake News Detection. Proceedings of the 2023 IEEE International Conference on Data Mining (ICDM), 802–809.
- Qian, H., et al. (2023). HMCAN: Hierarchical Multimodal Contextual Attention Network for Fake News Detection. Knowledge-Based Systems, 272, 110–129.
- Chen, X., et al. (2023). CAFE: Contextual Ambiguity-Aware Fusion for Multimodal Misinformation Detection. Pattern Recognition Letters, 169, 25–33.
- Yadav, R., et al. (2022). ETMA: Enhanced Transformer-based Multimodal Alignment for Fake News Detection. Neural Computing and Applications, 34(11), 8415–8430.
- Zhou, J., et al. (2022). FND-CLIP: Contrastive Learning for Multimodal Fake News Detection Using CLIP and BERT. ACM Multimedia Conference, 4120–4131.
- Akhtar, N., et al. (2023). A Comprehensive Survey on Automated Fact Checking and Multimodal Fake News Detection. ACM Computing Surveys, 55(9), 1–39.
- Alam, F., et al. (2022). COLING 2022 Survey: Comparative Study of Text, Image, and Video-Based Disinformation Detection Techniques. Proceedings of COLING 2022, 612–626.
- A. Vlachos and S. Riedel, “Fact checking: Task definition and dataset construction,” in Proceedings of the ACL 2014 Workshop on Language Technologies and Computational Social Science, 2014.
- K. Shu, D. Mahudeswaran, S. Wang, D. Lee, and H. Liu, “Fakenewsnet: A data repository with news content, social context and dynamic information for studying fake news on social media,” arXiv preprint arXiv:1809.01286, 2018.
- K. Crammer and Y. Singer, “On the algorithmic implementation of multiclass kernel-based vector machines,” J. Mach. Learn. Res., vol. 2, pp. 265–292, Mar. 2002.