An Automated MCQ Generator using NLP


Authors : Arpan Neupane; Sonam Chaudhari; Suruchi Shah

Volume/Issue : Volume 10 - 2025, Issue 6 - June


Google Scholar : https://tinyurl.com/yvy57eub

DOI : https://doi.org/10.38124/ijisrt/25jun1878



Abstract : The automation of multiple-choice question (MCQ) generation has emerged as a crucial advancement in educational assessment, aiming to reduce the time, effort, and domain expertise required for manual question creation. This research introduces a Natural Language Processing (NLP)-based system that generates high-quality, contextually relevant MCQs from textual content. The system accepts diverse input formats, including plain text and PDF documents, and utilizes advanced transformer models such as T5 (Text-to-Text Transfer Transformer), Flan-T5 (Fine-tuned Language Net T5), and DistilBERT (Distilled Bidirectional Encoder Representations from Transformers) for keyword extraction, question formulation, and distractor generation. A web-based interface, developed using Django, enables users to customize parameters like question quantity and model selection, ensuring flexibility across educational domains. Evaluation using BLEU (Bilingual Evaluation Understudy) and ROUGE (Recall-Oriented Understudy for Gisting Evaluation) metrics confirms that fine-tuned models outperform their base counterparts in coherence and relevance. Additionally, the use of efficient fine-tuning techniques like LoRA (Low-Rank Adaptation) and QLoRA (Quantized Low-Rank Adaptation) significantly reduces computational overhead without degrading performance. The proposed system demonstrates strong potential to streamline formative assessment and enhance learning feedback loops, while future work will focus on mobile deployment and integration with digital learning platforms to expand accessibility.
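The abstract evaluates generated questions with BLEU, an n-gram overlap metric. A minimal sketch of a single-sentence BLEU score is shown below; it assumes whitespace tokenization and uniform n-gram weights, and a real evaluation would use an established implementation (e.g. a library such as sacrebleu or NLTK) rather than this illustrative version.

```python
# Illustrative single-sentence BLEU: geometric mean of n-gram precisions
# (up to max_n) times a brevity penalty. Floors zero counts so the log is
# defined; library implementations use proper smoothing instead.
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=2):
    cand, ref = candidate.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clipped overlap: each candidate n-gram counts at most as often
        # as it appears in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        log_precisions.append(math.log(max(overlap, 1e-9) / total))
    # Brevity penalty discourages overly short candidates.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(sum(log_precisions) / max_n)
```

An identical candidate and reference score 1.0, while disjoint sentences score near zero; fine-tuned question generators are compared against reference questions on exactly this kind of overlap.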

Keywords : Automated MCQ Generation; NLP; Transformer Models; Distractor Generation; Educational Technology.
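Among the techniques named above, LoRA reduces fine-tuning cost by freezing the pretrained weight matrix W and learning only a low-rank update B·A scaled by alpha/r. The sketch below illustrates that update with plain Python lists; the function names and the tiny matrices are illustrative, not part of the paper's implementation, which would apply this inside a transformer framework.

```python
# Illustrative LoRA weight update: effective weight W' = W + (alpha / r) * B @ A,
# where A is r x k, B is d x r, and r << min(d, k). During fine-tuning only
# A and B are trained; W stays frozen.

def matmul(X, Y):
    rows, inner, cols = len(X), len(Y), len(Y[0])
    return [[sum(X[i][t] * Y[t][j] for t in range(inner)) for j in range(cols)]
            for i in range(rows)]

def lora_effective_weight(W, A, B, alpha):
    r = len(A)  # adapter rank: number of rows in A
    delta = matmul(B, A)
    scale = alpha / r
    return [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]
```

Because only the r·(d + k) adapter parameters are trained instead of the full d·k matrix, memory and compute drop sharply; QLoRA pushes this further by keeping the frozen W in a quantized format.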


