Targeted Voice Separation


Authors : Aakanksha Desai; Varsha Kini; Vrunda Mange; Suvarna Chaure

Volume/Issue : Volume 7 - 2022, Issue 10 - October

Google Scholar : https://bit.ly/3IIfn9N

Scribd : https://bit.ly/3NObXqa

DOI : https://doi.org/10.5281/zenodo.7306591

Abstract : Speech is the preferred means of communication between people, and it is becoming a primary means of interaction between humans and machines. Machines are increasingly able to imitate many conversational capabilities for well-defined tasks. As a result, sophisticated machines can serve social needs without demanding anything of the user beyond natural spoken language. Speaker separation is the task of isolating a target speaker’s voice from interference, such as the voices of other speakers in the background. In this paper, we present a neural-network-based approach to the cocktail party problem. The input consists of an audio file in which multiple speakers talk at the same time and a clean speech sample of the target speaker. The output is the target speaker’s speech separated from the mixed input audio.

Keywords : Cocktail Party Problem, Neural Networks, Voice Separation.
