Authors :
B Mupini; S Chaputsira; Bk Sibanda
Volume/Issue :
Volume 9 - 2024, Issue 1 - January
Google Scholar :
http://tinyurl.com/26bz8f4f
Scribd :
http://tinyurl.com/3v5dkphd
DOI :
https://doi.org/10.5281/zenodo.10609671
Abstract :
Speech-to-text (STT) conversion is of great interest for
many applications, and innovative technological
approaches are needed to accommodate the languages
spoken in Africa. However, African countries are falling
behind in adopting STT technologies, with Automatic
Speech Recognition (ASR) work so far concentrated on
popular East African languages. This has kept
transcription to a minimum and has slowed the use of
many African languages on a worldwide scale; a further
problem is that a single African language may
encompass several dialects. This
research reviews the modern technologies and models used
to build ASR and STT systems for African languages,
together with existing datasets, with particular
interest in the Shona language spoken by the people of
Zimbabwe. A survey of STT for the Shona language was
carried out, and it identifies existing techniques that
can be used to achieve effective STT for this language.
One example is accounting for the procedures used to
convert spoken words into text that can be displayed.
The use of ASR techniques
can help in many application areas, such as assisting
individuals with hearing impairment, transcription
services, voice command and control, dictation and
note-taking, language learning and translation, customer
service and support, and voice search and content
indexing. ASR is becoming dominant alongside related
technologies such as STT conversion, Text-to-Speech
(TTS) conversion and language translation.
Together, these technologies have helped bridge
the gap between people who speak different languages,
especially tourists and language enthusiasts. In African
countries, most of which are underdeveloped, many
spoken languages are underrepresented and poorly
resourced, which has hampered the advancement of
ASR technology for these low-resource languages.
Bridging this gap will result in African languages,
especially Shona, gaining wider recognition in the world
and finding use in everyday applications and technologies.
Keywords :
Transcribe, Dataset, Models, Dialect, Conversion.