


Green AI and Model Efficiency: A Comprehensive Study of Quantization, Small Language Models, and Edge Deployment for Resource-Constrained Environments


Authors : Jinay M. Patel; Shruti G. Patel

Volume/Issue : Volume 11 - 2026, Issue 4 - April


Google Scholar : https://tinyurl.com/3rhjdy44

Scribd : https://tinyurl.com/ye4a5j82

DOI : https://doi.org/10.38124/ijisrt/26apr1789

Note : A published paper may take 4-5 working days from the publication date to appear in PlumX Metrics, Semantic Scholar, and ResearchGate.


Abstract : The exponential growth in the parameter counts of large language models (LLMs) has amplified concerns regarding computational cost, energy consumption, and deployment feasibility in resource-limited environments. This paper investigates three interconnected strategies within the Green AI paradigm: (i) post-training quantization of domain-specific LLMs (Llama 3 and Mistral 7B) for medical and legal natural language processing tasks; (ii) benchmarking of small language models (SLMs) with 1B–3B parameters against their large counterparts across standard NLP tasks including sentiment analysis and named-entity recognition; and (iii) edge deployment of compact vision models such as MobileNetV3 and YOLOv8-nano for real-time agricultural disease detection on embedded IoT hardware.
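To illustrate the post-training quantization strategy the abstract refers to, the sketch below shows symmetric per-tensor int8 quantization in plain Python. This is a deliberate simplification: production LLM quantizers (e.g. GPTQ, LLM.int8()) work per-channel on transformer weight matrices and use calibration data, none of which is modeled here.

```python
# Minimal, framework-free sketch of symmetric per-tensor int8
# post-training quantization. Illustrative only; not the method
# used in the paper's experiments.

def quantize_int8(weights):
    """Map float weights to int8 codes using one symmetric scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    codes = [max(-128, min(127, round(w / scale))) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate float weights from int8 codes."""
    return [c * scale for c in codes]

weights = [0.42, -1.27, 0.08, 0.9]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
print(codes)     # int8 codes, stored in 1 byte each instead of 4
print(restored)  # approximate reconstruction of the original weights
```

Each weight is stored in one byte instead of four, a ~4x memory reduction, at the cost of rounding error bounded by half the scale step; per-channel scaling tightens that bound further.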

Keywords : Green AI; Model Quantization; Small Language Models; Edge AI; IoT; LLM Compression; MobileNet; YOLO; Knowledge Distillation; Sustainable Machine Learning.


