Authors :
Jinay M. Patel; Shruti G. Patel
Volume/Issue :
Volume 11 - 2026, Issue 4 - April
Google Scholar :
https://tinyurl.com/3rhjdy44
Scribd :
https://tinyurl.com/ye4a5j82
DOI :
https://doi.org/10.38124/ijisrt/26apr1789
Abstract :
The exponential growth in the parameter counts of large language models (LLMs) has amplified concerns regarding computational cost, energy consumption, and deployment feasibility in resource-limited environments. This paper investigates three interconnected strategies within the Green AI paradigm: (i) post-training quantization of domain-specific LLMs (Llama 3 and Mistral 7B) for medical and legal natural language processing tasks; (ii) benchmarking of small language models (SLMs) with 1B–3B parameters against their large counterparts across standard NLP tasks, including sentiment analysis and named-entity recognition; and (iii) edge deployment of compact vision models such as MobileNetV3 and nano-YOLOv8 for real-time agricultural disease detection on embedded IoT hardware.
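
To make the quantization strategy concrete, the following is a minimal sketch of load-time 4-bit post-training quantization, assuming the Hugging Face transformers and bitsandbytes libraries; the Mistral 7B checkpoint name, the NF4 settings, and the clinical prompt are illustrative assumptions, not necessarily the paper's exact configuration.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Store weights in 4-bit NF4 and dequantize to fp16 for matmuls,
# a common post-training quantization setting (illustrative, not the paper's).
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model_id = "mistralai/Mistral-7B-v0.1"  # assumed checkpoint for illustration
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)

# Hypothetical medical-domain prompt, standing in for the paper's tasks.
prompt = "Patient presents with fever and rash. Likely diagnosis:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))

Storing weights in 4 bits while computing in fp16 cuts weight memory roughly fourfold relative to an fp16 baseline, which is the efficiency lever the Green AI framing targets.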
Keywords :
Green AI; Model Quantization; Small Language Models; Edge AI; IoT; LLM Compression; MobileNet; YOLO; Knowledge Distillation; Sustainable Machine Learning.