Authors :
Sonali Padhi; Kranthi Kiran; Ambica Thakur; Adityaveer Dhillon; Bharani Kumar Depuru
Volume/Issue :
Volume 9 - 2024, Issue 3 - March
Google Scholar :
https://tinyurl.com/3fsjfe32
Scribd :
https://tinyurl.com/7smh6vjk
DOI :
https://doi.org/10.38124/ijisrt/IJISRT24MAR1984
Abstract :
Speech-to-Text (STT) and Text-to-Speech (TTS) are both
NLP (natural language processing) powered models
that transform speech to text and vice versa, widening
the scope of learning for the parties involved.
Over the past couple of years, it has been observed that
students have been moving abroad for quality education
and better financial aid. An accent gap between students
and tutors often reduces the students' understanding.
Our work aims to solve this problem. With its
state-of-the-art STT and TTS software, this work
intends to ease the learning curve for students.
The key targets of this work are international
students and individuals with disabilities. It can also be
used to transcribe meetings, quickly converting
discussion points into text. Companies can also use the
model to extract data from call recordings and then
perform sentiment analysis and similar activities.
This research aims to give a detailed walkthrough
of the product as it stands and to provide details regarding
all aspects of it. It covers the tech stacks used,
the implementation of those technologies, and the
reports shown to the different end users, and it
presents the workflow of the product.
Keywords :
Artificial Intelligence, Deep Learning, Large Language Models, Automatic Speech Recognition, Transcription, Whisper AI, gTTS.