Artificial Intelligence Powered Voice to Text and Text to Speech Recognition Model – A Powerful Tool for Student Comprehension of Tutor Speech


Authors : Sonali Padhi; Kranthi Kiran; Ambica Thakur; Adityaveer Dhillon; Bharani Kumar Depuru

Volume/Issue : Volume 9 - 2024, Issue 3 - March

Google Scholar : https://tinyurl.com/3fsjfe32

Scribd : https://tinyurl.com/7smh6vjk

DOI : https://doi.org/10.38124/ijisrt/IJISRT24MAR1984

Abstract : Speech-to-Text (STT) and Text-to-Speech (TTS) are natural language processing (NLP) powered models that convert speech to text and text to speech, widening the scope of learning for the parties involved. Over the past few years, students have increasingly moved abroad for quality education and better financial aid. An accent gap between students and tutors often reduces how much of the tutor's speech students understand, and this work addresses that problem. Using state-of-the-art STT and TTS software, it aims to ease the learning curve for students. The primary target users are international students and individuals with disabilities. The tool can also transcribe meetings, quickly converting discussion points into text, and companies can use the model to extract data from call recordings for sentiment analysis and similar tasks. This paper gives a detailed walkthrough of the product as it stands: the tech stacks used, how those technologies are implemented, the reports shown to the different end users, and the overall workflow of the product.

Keywords : Artificial Intelligence, Deep Learning, Large Language Models, Automatic Speech Recognition, Transcription, Whisper AI, gTTS.
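
The abstract describes a speech-to-text and text-to-speech workflow, and the keywords name Whisper and gTTS as the tools involved. The following is a minimal sketch of such a round trip, not the authors' implementation. It assumes the `openai-whisper` and `gTTS` Python packages are installed (Whisper also needs ffmpeg, and gTTS needs network access). The function names, the "base" model size, and the file names are hypothetical choices made for illustration.

```python
from typing import Dict, List


def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as MM:SS."""
    total = int(seconds)
    return f"{total // 60:02d}:{total % 60:02d}"


def segments_to_notes(segments: List[Dict]) -> str:
    """Turn Whisper-style segments (dicts with "start" and "text") into
    timestamped lines a student can read alongside the lecture."""
    lines = []
    for seg in segments:
        text = seg["text"].strip()
        if text:  # skip segments that contain only whitespace
            lines.append(f"[{format_timestamp(seg['start'])}] {text}")
    return "\n".join(lines)


def transcribe_lecture(audio_path: str, model_size: str = "base") -> Dict:
    """Speech-to-text using the openai-whisper package.

    The import is deferred so the helpers above work without the package.
    The model size is an assumption; the source does not state one.
    """
    import whisper

    model = whisper.load_model(model_size)
    # transcribe() returns a dict that includes "text" and "segments".
    return model.transcribe(audio_path)


def speak_text(text: str, out_path: str, lang: str = "en") -> None:
    """Text-to-speech using gTTS; writes an MP3 file to out_path."""
    from gtts import gTTS

    gTTS(text=text, lang=lang).save(out_path)


def run_demo(audio_path: str = "lecture.mp3") -> None:
    """Hypothetical end-to-end run: transcribe, print notes, read back."""
    result = transcribe_lecture(audio_path)
    print(segments_to_notes(result["segments"]))
    speak_text(result["text"], "lecture_readback.mp3")
```

Calling `run_demo()` on a recorded lecture would print timestamped notes and save a synthesized read-back of the transcript. Keeping the formatting helpers separate from the model calls means they can be checked without downloading a model.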

