Authors :
Sumit Autade; Saurav Yadav; Deepti Vijay Chandran; Chirag Rathod
Volume/Issue :
Volume 9 - 2024, Issue 1 - January
Google Scholar :
http://tinyurl.com/58c6yujw
Scribd :
http://tinyurl.com/2amrr2ws
DOI :
https://doi.org/10.5281/zenodo.10490435
Abstract :
"This research introduces a groundbreaking
project, "Enabling Independence through Sound
Classification," leveraging Artificial Neural Networks
(ANNs), Convolutional Neural Networks (CNNs), Mel-
frequency cepstral coefficients (MFCCs), and the
Librosa library to offer real-time auditory feedback to
individuals who are hearing impaired. The project's
core objective is to enhance the independence and safety
of this community by translating environmental sounds
into meaningful alerts and descriptions. Beyond the
technical aspects of sound classification, the study
emphasizes the profound social impact of promoting
inclusivity, self-reliance, and equity for those with
auditory challenges. Through a comprehensive
exploration of CNN and RNN architectures, along with
comparisons to TensorFlow and PyTorch models on a
prototype dataset, the proposed approach,
incorporating envelope functions, normalization,
segmentation, regularization, and dropout layers,
demonstrates superior accuracy and reduced loss
percentages. This research signifies a pivotal step
towards a more accessible and inclusive society,
harmonizing technology and empathy for the benefit of
individuals with sensory challenges."
Keywords :
ANN, MFCC, DL, ML, RNN, CNN, API, ReLU, CPU, GPU.
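
Illustrative sketch :
The following is a minimal, illustrative sketch of the kind of pipeline the abstract describes (MFCC features extracted with Librosa, fed to a small CNN with regularization and dropout). It is not the authors' released code; the file name "siren.wav", the number of classes, and all hyperparameters are assumptions made for illustration only.

# Minimal sketch: Librosa MFCC extraction + small Keras CNN with dropout.
# Not the paper's implementation; paths, class count, and hyperparameters are assumed.
import numpy as np
import librosa
import tensorflow as tf

def extract_mfcc(path, sr=22050, n_mfcc=40, duration=2.0):
    """Load a clip, trim silence (a simple stand-in for envelope-based cleaning),
    pad/segment to a fixed length, and return a normalized MFCC matrix."""
    y, sr = librosa.load(path, sr=sr, duration=duration)
    y, _ = librosa.effects.trim(y)                            # crude silence removal
    y = librosa.util.fix_length(y, size=int(sr * duration))   # fixed-length segment
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    mfcc = (mfcc - mfcc.mean()) / (mfcc.std() + 1e-8)         # normalization
    return mfcc[..., np.newaxis]                              # channel axis for the CNN

def build_cnn(input_shape, num_classes=10):
    """Small CNN with ReLU activations, L2 regularization, and dropout layers."""
    reg = tf.keras.regularizers.l2(1e-4)
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=input_shape),
        tf.keras.layers.Conv2D(32, 3, activation="relu", padding="same",
                               kernel_regularizer=reg),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Dropout(0.25),
        tf.keras.layers.Conv2D(64, 3, activation="relu", padding="same",
                               kernel_regularizer=reg),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Dropout(0.25),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu", kernel_regularizer=reg),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

if __name__ == "__main__":
    features = extract_mfcc("siren.wav")          # hypothetical environmental-sound clip
    model = build_cnn(input_shape=features.shape)
    probs = model.predict(features[np.newaxis, ...])
    print("predicted class:", int(np.argmax(probs)))

In a real deployment such a model would be trained on labeled environmental-sound clips before the predicted class is mapped to an alert or description for the user; the sketch above only shows the feature-extraction and architecture choices the abstract mentions.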
"This research introduces a groundbreaking
project, "Enabling Independence through Sound
Classification," leveraging Artificial Neural Networks
(ANNs), Convolutional Neural Networks (CNNs), Mel-
frequency cepstral coefficients (MFCCs), and the
Librosa library to offer real-time auditory feedback to
individuals who are hearing impaired. The project's
core objective is to enhance the independence and safety
of this community by translating environmental sounds
into meaningful alerts and descriptions. Beyond the
technical aspects of sound classification, the study
emphasizes the profound social impact of promoting
inclusivity, self-reliance, and equity for those with
auditory challenges. Through a comprehensive
exploration of CNN and RNN architectures, along with
comparisons to TensorFlow and PyTorch models on a
prototype dataset, the proposed approach,
incorporating envelope functions, normalization,
segmentation, regularization, and dropout layers,
demonstrates superior accuracy and reduced loss
percentages. This research signifies a pivotal step
towards a more accessible and inclusive society,
harmonizing technology and empathy for the benefit of
individuals with sensory challenges."
Keywords :
ANN, MFCC, DL, ML, RNN, CNN, API, ReLu, CPU, GPU.