Sign Language to Text and Speech Conversion


Authors : W. Sweta; J. Kartiki; K. Prerana; M. Aarya; P. Rutuja

Volume/Issue : Volume 10 - 2025, Issue 6 - June


Google Scholar : https://tinyurl.com/c9jc6bws

DOI : https://doi.org/10.38124/ijisrt/25jun285



Abstract : This report presents the design and development of a Sign Language to Text and Speech Conversion System. The main goal of the project is to improve communication for people who are deaf or hard of hearing by translating sign language gestures into text and spoken words in real time, bridging the communication gap between sign language users and people who do not know sign language and making everyday conversation easier and more inclusive. The system uses a gesture recognition model based on Convolutional Neural Networks (CNNs) to detect hand gestures that represent different signs. Major challenges include changing lighting conditions, varied backgrounds, and differences in hand shape and signing style. To address these issues, the system applies vision-based techniques and landmark detection using the MediaPipe library, which improves recognition accuracy and robustness. Once a gesture is recognized, it is converted into text, and Text-to-Speech (TTS) technology generates clear spoken output, allowing people with hearing disabilities to communicate more smoothly with those unfamiliar with sign language. The report also discusses the positive impact this technology can have in schools, offices, and public service areas, and emphasizes the importance of continued advances in machine learning and computer vision to make such systems more reliable and user-friendly. Overall, the project highlights how modern technology can promote a more inclusive and accessible world.
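The pipeline the abstract describes (MediaPipe hand landmarks fed to a CNN classifier) typically normalizes the raw landmarks before classification so that the model is less sensitive to where the hand appears in the frame. The sketch below is illustrative only, not code from the paper: it assumes MediaPipe's convention of 21 (x, y) hand landmarks per hand with the wrist at index 0, and shows one common way to make the feature vector invariant to hand position and scale.

```python
def normalize_landmarks(landmarks):
    """Translate landmarks so the wrist (index 0) sits at the origin,
    then rescale so the largest coordinate magnitude is 1.0.

    `landmarks` is a list of (x, y) tuples in normalized image
    coordinates, as produced by a hand tracker such as MediaPipe Hands.
    """
    wx, wy = landmarks[0]
    # Express every point relative to the wrist (translation invariance).
    rel = [(x - wx, y - wy) for x, y in landmarks]
    # Divide by the largest magnitude (scale invariance); guard against
    # a degenerate all-zero hand with `or 1.0`.
    scale = max(max(abs(x), abs(y)) for x, y in rel) or 1.0
    return [(x / scale, y / scale) for x, y in rel]

# Toy example with three points (a real hand contributes 21).
pts = [(0.5, 0.5), (0.6, 0.7), (0.4, 0.3)]
norm = normalize_landmarks(pts)
print(norm[0])  # the wrist always maps to the origin: (0.0, 0.0)
```

In a full system along the lines the abstract sketches, the normalized landmark list would be flattened into a feature vector for the trained CNN, and the predicted label would then be voiced with a TTS engine such as gTTS (see reference 4).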

Keywords : Sign Language to Text and Speech Conversion, Gesture Recognition, Convolutional Neural Networks (CNN), Text-to-Speech, Accessibility, Inclusivity, Computer Vision, MediaPipe, Real-Time Communication.

References :

  1. A. S. K. Raj and P. S. Babu, "A survey on sign language recognition system for Indian sign language using CNN," Journal of Computational and Theoretical Nanoscience, vol. 16, pp. 3983–3991, 2019.
  2. A. Z. Choudhury and S. P. Ghosh, "Sign language recognition using convolutional neural networks and MediaPipe for real-time applications," in Proc. IEEE Int. Conf. on Advanced Networks and Telecommunications Systems (ANTS), 2021.
  3. P. S. Patil, P. S. M. Sharma, and R. K. Chatterjee, "Real-time hand gesture recognition for sign language translation," in Proc. Int. Conf. on Artificial Intelligence and Computer Science (AICS), pp. 213–217, 2020.
  4. Google Inc., "Google Text-to-Speech (gTTS) Documentation," [Online]. Available: https://pypi.org/project/gTTS/. [Accessed: Feb. 15, 2025].
  5. H. J. Nguyen and M. B. Y. Chang, "Real-time sign language recognition using MediaPipe framework and deep learning," Int. J. of Computer Vision, vol. 31, no. 6, pp. 524–538, 2020.
  6. T. M. Soong, "Deep learning for gesture recognition in sign language communication," Journal of Artificial Intelligence in Engineering, vol. 28, pp. 215–225, 2018.
  7. A. Shalal, "Survey of modern techniques for real-time gesture recognition systems," IEEE Access, vol. 8, pp. 55871–55881, 2020.
  8. K. L. R. R. Reddy, S. S. Srinivas, and K. C. S. Prasad, "Sign language recognition using CNNs and deep learning," IEEE Access, vol. 8, pp. 117148–117160, 2020.
  9. A. Z. Choudhury and S. P. Ghosh, "Sign language recognition using convolutional neural networks and MediaPipe for real-time applications," in Proc. IEEE Int. Conf. on Advanced Networks and Telecommunications Systems (ANTS), 2021.
  10. A. C. R. D. S. A. Rajasekaran, "Sign language recognition using hand gestures with MediaPipe and CNN," Int. J. of Computer Science and Network Security, vol. 21, pp. 241–245, 2021.
  11. P. M. B. C. E. J. Doe, "Exploring CNN-based approaches to real-time sign language recognition," Int. J. of Computer Vision, vol. 41, no. 7, pp. 891–904, 2019.
  12. F. H. P. Zhang and J. W. Li, "Convolutional neural networks for sign language recognition with real-time processing," IEEE Transactions on Neural Networks and Learning Systems, vol. 32, pp. 35–47, 2021.
  13. S. A. P. R. U. B. A. Kumar, "Real-time hand gesture recognition using MediaPipe framework and CNNs," Journal of Artificial Intelligence and Computer Vision, vol. 32, pp. 108–120, 2021.
  14. M. M. A. Singh and J. G. Ghosh, "Deep learning-based sign language recognition system for communication assistance," Int. J. of Applied Artificial Intelligence, vol. 30, pp. 1254–1265, 2020.
  15. R. Sharma and J. P. W. Wang, "Combining CNNs with MediaPipe for real-time gesture recognition in sign language," in Proc. Int. Conf. on Image Processing and Computer Vision, pp. 225–231, 2020.
  16. R. H. K. W. Wang, "Real-time translation of sign language to text and speech using machine learning," IEEE Transactions on Human-Machine Systems, vol. 50, no. 8, pp. 743–750, 2020.

