


Cross Domain Transfer of Natural Language Explanation Models: Pretraining on e-SNLI and Adapting to a New Target Task


Authors : Md. Farhad Rahman; Mohammad Sayduzzaman; Tawhidur Rahman; Monira Mostafa

Volume/Issue : Volume 11 - 2026, Issue 3 - March


Google Scholar : https://tinyurl.com/3bbczsc5

Scribd : https://tinyurl.com/2ub9d9dr

DOI : https://doi.org/10.38124/ijisrt/26mar233



Abstract : The widespread use of AI in critical ICT systems demands models that are not only accurate but also transparent and trustworthy. State-of-the-art models, however, are usually black boxes: their decisions can be fragile and often fail to generalize across operational domains. We address the problem of building effective, transferable natural language explanations (NLEs) by training a multi-task T5-based model that uses a label-prefixed decoder format to jointly predict NLI labels and generate explanations. The model is pretrained on e-SNLI and then fine-tuned under several cross-domain conditions, including label-only supervision, frozen encoders, and varied loss weights. Even though no explanations are available during fine-tuning, the experiments show that explanation pretraining substantially improves the fluency, structure, and relevance of the generated explanations. Token-deletion tests provide evidence of partial faithfulness, showing that the explanations draw on the same evidence as the classifier. Ablation studies indicate that balanced loss weighting, encoder adaptation, and explanation supervision are required for stable, transferable explanations. These results underscore the need for standardized NLE evaluation tools and point to how explanation-capable models can be incorporated into ICT systems that demand transparency and accountability.
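To make the label-prefixed decoder format concrete, here is a minimal sketch using the HuggingFace Transformers library. The checkpoint name, prompt wording, and target phrasing are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of a label-prefixed, multi-task T5 setup: one
# sequence-to-sequence model jointly classifies an NLI pair and
# generates an explanation. Prompt and target formats are assumptions.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

premise = "A man is playing a guitar on stage."
hypothesis = "A person is performing music."

# Encoder input: the NLI pair, framed as a text-to-text task.
source = f"nli premise: {premise} hypothesis: {hypothesis}"

# Decoder target: the label prefix followed by the explanation, so the
# same decoder emits both the decision and its justification.
target = "entailment explanation: playing a guitar on stage is performing music"

inputs = tokenizer(source, return_tensors="pt")
labels = tokenizer(target, return_tensors="pt").input_ids

# A single cross-entropy loss covers both the label tokens and the
# explanation tokens; re-weighting those two spans yields the
# loss-weight variations the abstract refers to.
loss = model(**inputs, labels=labels).loss
loss.backward()
```

During label-only fine-tuning on a new domain, the same format applies with the target truncated to the label prefix alone, which is what allows the explanation behaviour learned on e-SNLI to transfer without target-domain explanations.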

Keywords : Natural-Language Explanations, e-SNLI, Cross-Domain Transfer, Multi-Task Learning, Faithfulness Evaluation, Explainable AI, ICT Standardization, Trustworthy AI.
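The token-deletion faithfulness probe mentioned in the abstract can be sketched as follows; the `predict` helper, its signature, and the whitespace tokenization are hypothetical stand-ins for the paper's actual evaluation harness.

```python
def deletion_drop(predict, premise, hypothesis, explanation_tokens):
    """Measure how much the label probability drops when the tokens the
    explanation cites are deleted from the premise.

    `predict(premise, hypothesis)` is an assumed helper returning the
    model's probability for its predicted label.
    """
    base = predict(premise, hypothesis)
    kept = [w for w in premise.split() if w.lower() not in explanation_tokens]
    ablated = predict(" ".join(kept), hypothesis)
    # A large drop suggests the classifier relies on the same evidence
    # the explanation names, i.e. partial faithfulness.
    return base - ablated

# Example usage (hypothetical values):
# drop = deletion_drop(predict, "A man plays guitar on stage.",
#                      "A person performs music.", {"guitar", "stage"})
```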

