Authors :
Utsha Sarker; Archy Biswas; Yubraj Kumar Rauniyar; Dhiraj Jha; Apsara Das Shreshtha
Volume/Issue :
Volume 11 - 2026, Issue 2 - February
Google Scholar :
https://tinyurl.com/4atsv9tp
Scribd :
https://tinyurl.com/25uarut6
DOI :
https://doi.org/10.38124/ijisrt/26feb1223
Note : A published paper may take 4-5 working days from the publication date to appear in PlumX Metrics, Semantic Scholar, and ResearchGate.
Abstract :
Alzheimer's disease (AD) is a growing global health burden that demands early-detection methods capable of
integrating multimodal biomarkers from neuroimaging and longitudinal electronic health records (EHRs) [22], [18].
Recent advances in artificial intelligence-based clinical decision support systems, despite their instrumental
importance, are still hindered by the predominance of opaque, black-box architectures, which limit interpretability,
undermine clinician confidence, and impede approval by regulatory bodies in the high-stakes setting of neurology
[1], [4]. To address these challenges, we propose XLLM-CDSS, an explainable large language model (LLM)-augmented
multimodal architecture that fuses a Vision Transformer-based neuroimaging encoder and a BERT-based medical-text
encoder with a LLaMA backbone and explainable AI (XAI) modules [9], [11]. Evaluated on the Alzheimer's Disease
Neuroimaging Initiative (ADNI) dataset, XLLM-CDSS achieved an area under the receiver operating characteristic
curve of 92.3% for early AD classification, together with a 15% improvement in explainability metrics over
state-of-the-art multimodal baselines [15], [16]. The system additionally generates radiologist-style rationales in
natural language that are grounded in salient neuroimaging biomarkers and longitudinal cognitive indicators, further
promoting clinical transparency and interpretability [2], [3]. The key contributions of this work are: (1) a unified
cross-modal representation learning paradigm for integrating neuroimaging and EHRs; (2) an LLM-augmented reasoning
architecture with embedded explainability; and (3) a comprehensive evaluation protocol that combines predictive
performance with interpretability measurements [6], [8]. This work is a step towards trustworthy, human-centred
artificial intelligence for precision neurology and scalable clinical decision support systems.
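The fusion step described above (modality-specific encoders whose pooled embeddings are combined into a joint representation for downstream classification) can be sketched in miniature. This is an illustrative stand-in only: the random projections below substitute for the paper's ViT, BERT, and LLaMA components, and all dimensions and variable names are assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

D_IMG, D_TXT, D_FUSED = 768, 768, 256

# Stand-in encoder outputs: one pooled embedding per modality per patient
# (e.g. a ViT [CLS] token for an MRI scan, a BERT [CLS] token for EHR notes).
img_embedding = rng.standard_normal(D_IMG)
txt_embedding = rng.standard_normal(D_TXT)

# A (here untrained) fusion projection maps the concatenated modalities into
# a joint representation that an LLM backbone or classifier head would consume.
W_fuse = rng.standard_normal((D_FUSED, D_IMG + D_TXT)) / np.sqrt(D_IMG + D_TXT)
fused = np.tanh(W_fuse @ np.concatenate([img_embedding, txt_embedding]))

# Logistic head producing a probability for the early-AD class.
w_cls = rng.standard_normal(D_FUSED) / np.sqrt(D_FUSED)
p_ad = 1.0 / (1.0 + np.exp(-(w_cls @ fused)))

print(fused.shape, float(p_ad))
```

In practice the projection and head would be learned end-to-end, and the fused representation would additionally feed the XAI modules that produce the natural-language rationales.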
Keywords :
Alzheimer's Disease, Explainable Artificial Intelligence (XAI), Large Language Models (LLMs), Multimodal Deep Learning, Clinical Decision Support Systems (CDSS), Neuroimaging Analytics, Electronic Health Records (EHRs), Vision Transformer Fusion.
References :
- M. Mesinovic, P. Watkinson, and T. Zhu, “Explainability in the age of large language models for healthcare,” Communications Engineering, vol. 4, art. no. 128, Jul. 2025, doi: 10.1038/s44172-025-00453-y.
- “Orchestrating explainable artificial intelligence for multimodal and longitudinal data in medical imaging,” npj Digital Medicine, 2024, doi: 10.1038/s41746-024-01190-w.
- “Large language model as clinical decision support system: evaluation for identifying prescribing errors,” Cell Reports Medicine, 2025, Art. no. S2666379125003969 (ScienceDirect/PII).
- “Large language model–based clinical decision support framework for syncope recognition in the emergency department,” 2025, Art. no. S0953620524004059 (ScienceDirect/PII).
- “Leveraging ChatGPT and explainable AI to enhance machine learning classification using tabular data in healthcare: The HealthAI Prompt framework,” Scientific Reports, vol. 15, art. no. 6837, 2025, doi: 10.1038/s41598-025-22784-8.
- J. Li, Z. Zhou, H. Lyu, and Z. Wang, “Large language models-powered clinical decision support: Enhancing or replacing human expertise?,” Intelligent Medicine, vol. 5, no. 1, pp. 1–4, Feb. 2025, doi: 10.1016/j.imed.2025.01.001.
- “Advances, Evaluation, and Explainability of Large Language Models for Healthcare,” ACM Computing Surveys (early access), Feb. 2026, doi: 10.1145/3786334.
- S. Maity and M. J. Saikia, “Large Language Models in Healthcare and Medical Applications,” Bioengineering, vol. 12, no. 6, art. no. 631, Jun. 2025, doi: 10.3390/bioengineering12060631.
- J. Vrdoljak, Z. Boban, M. Vilović, M. Kumrić, and J. Božić, “A Review of Large Language Models in Medical Education, Clinical Decision Support, and Healthcare Administration,” Healthcare, vol. 13, no. 6, art. no. 603, Mar. 2025, doi: 10.3390/healthcare13060603.
- L. Xu, H. Sun, Z. Ni, H. Li, and S. Zhang, “MedViLaM: A multimodal large language model with advanced generalizability and explainability for medical data understanding and generation,” arXiv preprint arXiv:2409.19684, Sep. 2024.
- M. Moor et al., “Med-Flamingo: A multimodal medical few-shot learner,” arXiv preprint arXiv:2307.15189, Jul. 2023.
- C. Li et al., “LLaVA-Med: Training a large language-and-vision assistant for biomedicine in one day,” arXiv preprint arXiv:2306.00890, Jun. 2023.
- D. Dai et al., “PA-LLaVA: A large language-vision assistant for human pathology image understanding,” arXiv preprint arXiv:2408.09530, Aug. 2024.
- X. Chen et al., “R-LLaVA: Improving Med-VQA understanding through visual region of interest,” arXiv preprint arXiv:2410.20327, Oct. 2024 (rev. Mar. 2025).
- X. Yang et al., “Medical large vision language models with multi-image visual ability,” arXiv preprint arXiv:2505.19031, May 2025.
- A. Shourya, M. Dumontier, and C. Sun, “Adapting lightweight vision language models for radiological visual question answering,” arXiv preprint arXiv:2506.14451, Jun. 2025.
- J. Ye and H. Tang, “Multimodal Large Language Models for Medicine: A Comprehensive Survey,” arXiv preprint arXiv:2504.21051, Apr. 2025.
- M. Paschali et al., “Foundation models in radiology: What, how, why, and why not,” Radiology, vol. 314, no. 2, art. no. e240597, Feb. 2025, doi: 10.1148/radiol.240597.
- “A medical multimodal-multitask foundation model for lung cancer screening,” Nature Communications, vol. 16, art. no. 10260, 2025, doi: 10.1038/s41467-025-56822-w.
- S.-C. Huang, M. Jensen, S. Yeung-Levy, M. P. Lungren, H. Poon, and A. S. Chaudhari, “Multimodal Foundation Models for Medical Imaging – A Systematic Review and Implementation Guidelines,” medRxiv preprint, Oct. 23, 2024, doi: 10.1101/2024.10.23.24316003.
- S.-C. Huang, M. Jensen, S. Yeung-Levy, M. P. Lungren, H. Poon, and A. S. Chaudhari, “A Systematic Review and Implementation Guidelines of Multimodal Foundation Models in Medical Imaging,” Research Square [Preprint], Apr. 2025, doi: 10.21203/rs.3.rs-5537908/v1.
- S. Chakraborty et al., “A Multimodal Vision Transformer for Interpretable Fusion of Neuroimaging Data and Genetic Data in Alzheimer’s Disease,” Aging (Albany NY), vol. 16, no. 20, pp. 10968–10985, Oct. 2024, doi: 10.18632/aging.206188.
- A. Barragán-Montero, C. Rodríguez-Huertas, M. Granados, et al., “An interpretable approach for anomaly detection in medical images and reports via multimodal foundation models,” Frontiers in Bioengineering and Biotechnology, 2025, art. no. 1644697, doi: 10.3389/fbioe.2025.1644697.
- “Explainable artificial intelligence for medical imaging systems: A review,” Cluster Computing, 2025, doi: 10.1007/s10586-025-05281-5.
- X. Chen, H. Xie, X. Tao, F. L. Wang, M. Leng, and B. Lei, “Artificial intelligence and multimodal data fusion for smart healthcare: topic modeling and bibliometrics,” Artificial Intelligence Review, vol. 57, art. no. 91, Mar. 2024, doi: 10.1007/s10462-024-10712-7.