Authors :
Aakanksha Desai; Varsha Kini; Vrunda Mange; Suvarna Chaure
Volume/Issue :
Volume 7 - 2022, Issue 10 - October
Google Scholar :
https://bit.ly/3IIfn9N
Scribd :
https://bit.ly/3NObXqa
DOI :
https://doi.org/10.5281/zenodo.7306591
Abstract :
Speech is the preferred means of
communication between people, and it is becoming a
primary means of interaction between humans and machines.
Machines are increasingly able to imitate many
conversational capabilities for well-defined tasks.
As a result, sophisticated machines can be
used to meet social needs without burdening the user
beyond the use of natural spoken language.
Speaker separation is the task of distinguishing a target
speaker’s voice from interference, such as
the voices of other speakers in the background. In this
paper, we present a neural-network-based approach to the
cocktail party problem. The
input consists of an audio file containing the voices of multiple
speakers talking at the same time, together with clean speech from
the target speaker. The output is the target speaker’s speech,
separated from the mixed input audio.
Keywords :
Cocktail Party Problem, Neural Networks, Voice Separation.