Hexatalk using ANN and DNNS


Authors : M Ravi; Dr. A Obulesu; CH Vinod Vara Prasad; N Abhishek; N Rithish Reddy; V Anil Chary

Volume/Issue : Volume 10 - 2025, Issue 4 - April


Google Scholar : https://tinyurl.com/ykwrhxme

Scribd : https://tinyurl.com/267fh5p9

DOI : https://doi.org/10.38124/ijisrt/25apr1252


Abstract : Speaker recognition is an essential aspect of human-computer interaction, with applications in security, personalized services, and more. This project proposes an end-to-end speaker recognition system built on Long Short-Term Memory (LSTM) neural networks. Mel-Frequency Cepstral Coefficients (MFCCs) are extracted from the audio as features and passed to an LSTM model, which classifies the speaker with high accuracy. The proposed system demonstrates the efficacy of LSTMs for temporal feature analysis and performs robustly in noisy environments.

Keywords : Speaker Recognition, Deep Learning, MFCC, LSTM, Audio Classification.
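
The abstract describes a pipeline in which a sequence of MFCC frames is read by an LSTM and mapped to a speaker label. The sketch below illustrates only that forward pass, in plain Python with no external libraries. It is not the authors' implementation: the class name `TinyLSTMClassifier`, the layer sizes (13 coefficients per frame, 8 hidden units, 4 speakers), and the random untrained weights are all assumptions chosen for illustration, and the input frames are random numbers standing in for real MFCCs.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class TinyLSTMClassifier:
    """Forward pass only: one LSTM layer followed by a softmax over speakers.

    Hypothetical illustration; weights are random and untrained.
    """

    GATES = ("input", "forget", "output", "candidate")

    def __init__(self, n_features, n_hidden, n_speakers, seed=0):
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        # One weight matrix per gate, applied to the concatenation [x_t, h_{t-1}].
        self.W = {gate: rand_matrix(n_hidden, n_features + n_hidden, rng)
                  for gate in self.GATES}
        self.b = {gate: [0.0] * n_hidden for gate in self.GATES}
        self.b["forget"] = [1.0] * n_hidden  # common choice: start with the forget gate open
        self.W_out = rand_matrix(n_speakers, n_hidden, rng)
        self.b_out = [0.0] * n_speakers

    def step(self, x, h, c):
        """Advance the LSTM by one frame and return the new (h, c)."""
        z = x + h  # list concatenation: [x_t, h_{t-1}]
        pre = {gate: [a + b for a, b in zip(matvec(self.W[gate], z), self.b[gate])]
               for gate in self.GATES}
        i_t = [sigmoid(v) for v in pre["input"]]
        f_t = [sigmoid(v) for v in pre["forget"]]
        o_t = [sigmoid(v) for v in pre["output"]]
        g_t = [math.tanh(v) for v in pre["candidate"]]
        # Cell state: keep part of the old memory, add part of the new candidate.
        c_new = [f * c_prev + i * g for f, c_prev, i, g in zip(f_t, c, i_t, g_t)]
        h_new = [o * math.tanh(cv) for o, cv in zip(o_t, c_new)]
        return h_new, c_new

    def predict_proba(self, frames):
        """Read all frames in order, then classify from the final hidden state."""
        h = [0.0] * self.n_hidden
        c = [0.0] * self.n_hidden
        for x in frames:
            h, c = self.step(list(x), h, c)
        logits = [a + b for a, b in zip(matvec(self.W_out, h), self.b_out)]
        m = max(logits)  # subtract the max for a numerically stable softmax
        exps = [math.exp(v - m) for v in logits]
        total = sum(exps)
        return [e / total for e in exps]


# Stand-in input: 50 frames of 13 random values in place of real MFCCs.
rng = random.Random(1)
frames = [[rng.gauss(0.0, 1.0) for _ in range(13)] for _ in range(50)]

model = TinyLSTMClassifier(n_features=13, n_hidden=8, n_speakers=4)
probs = model.predict_proba(frames)
predicted_speaker = max(range(len(probs)), key=probs.__getitem__)
```

Because the weights are untrained, the resulting probabilities carry no meaning; the point is the data flow. In practice the MFCC frames would come from a feature-extraction library, and the weights would be learned with a deep learning framework rather than written by hand.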
