Optimizing Speech Models with Freezing
Authors : Revanth Reddy Pasula
Volume/Issue : RISEM–2025
Google Scholar : https://tinyurl.com/3v3yut9v
Scribd : https://tinyurl.com/y65zshfm
DOI : https://doi.org/10.38124/ijisrt/25jun167
Abstract : Adapting speech models to new languages requires balancing accuracy against computational cost. In this work, we investigate the optimization of Mozilla’s DeepSpeech model when adapted from English to German and Swiss German through selective freezing of layers. Employing a transfer-learning strategy, we analyze how freezing different numbers of network layers during fine-tuning affects performance. The experiments reveal that freezing the initial layers achieves significant performance improvements: training time decreases and accuracy increases. This layer-freezing technique hence offers an extensible way to improve automatic speech recognition for under-resourced languages.
Keywords : Automatic Speech Recognition (ASR); Deep Speech; German; Layer Freezing; Low-Resource Languages; Swiss German; Transfer Learning.
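The layer-freezing strategy described in the abstract can be sketched as follows. This is a minimal, hypothetical illustration in PyTorch, not the paper's actual TensorFlow-based DeepSpeech code: the layer stack and its sizes are stand-ins chosen for brevity, and `freeze_first_n` is an assumed helper name. The core idea is simply to disable gradient updates for the first n layers so that only the remaining layers are fine-tuned on the new language.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a DeepSpeech-style layer stack: dense layers
# followed by a recurrent layer and an output layer. Dimensions are
# illustrative, not the real model's.
model = nn.Sequential(
    nn.Linear(26, 64),   # layer 1: acoustic features in
    nn.Linear(64, 64),   # layer 2
    nn.Linear(64, 64),   # layer 3
    nn.LSTM(64, 64),     # recurrent layer
    nn.Linear(64, 29),   # output layer: character probabilities
)

def freeze_first_n(model: nn.Sequential, n: int) -> None:
    """Freeze the parameters of the first n sub-modules so the
    optimizer leaves them untouched during fine-tuning."""
    for layer in list(model)[:n]:
        for p in layer.parameters():
            p.requires_grad = False

# Freeze the two earliest layers, which tend to capture
# language-independent acoustic features.
freeze_first_n(model, 2)

# Hand only the still-trainable parameters to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-4)

frozen = sum(p.numel() for p in model.parameters() if not p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"frozen {frozen}/{total} parameters")
```

Because frozen layers need no gradient computation or weight updates, each fine-tuning step is cheaper, which is consistent with the training-time reduction the abstract reports.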

