Solving permutations in frequency-domain for blind separation of an arbitrary number of speech sources

Iván Durán-Díaz; Auxiliadora Sarmiento; Sergio Cruces; Pablo Aguilera

doi:10.1121/1.3678657

J Acoust Soc Am. 2012 Feb;131(2):EL139-44. doi: 10.1121/1.3678657.

Authors

Iván Durán-Díaz¹, Auxiliadora Sarmiento, Sergio Cruces, Pablo Aguilera

Affiliation

¹ Signal Theory and Communications Department, University of Seville, Camino de los Descubrimientos S/N, 41092, Seville, Spain. duran@us.es

PMID: 22352613
DOI: 10.1121/1.3678657

Abstract

Blind separation of speech sources in reverberant environments is usually performed in the time-frequency domain, which gives rise to the permutation problem: the different ordering of estimated sources for different frequency components. A two-stage method to solve permutations with an arbitrary number of sources is proposed. The suggested procedure is based on the spectral consistency of the sources. At the first stage frequency bins are compared with each other, while at the second stage the neighboring frequencies are emphasized. Experiments for perfect separation situations and for live recordings show that the proposed method improves the results of existing approaches.
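To make the permutation problem concrete, the following is a minimal sketch of a generic two-pass alignment based on envelope correlation. It is not the authors' algorithm, which the abstract does not specify in detail. The sketch assumes that each frequency bin has already been separated and that the amplitude envelopes of the estimated sources are available as an array of shape (bins, sources, frames). The function names `best_permutation` and `align_permutations`, the parameters `n_iter` and `half_width`, and the use of an averaged envelope as the reference are all illustrative assumptions. The first pass aligns every bin against a global reference; the second pass re-checks each bin against its neighboring bins only.

```python
import itertools

import numpy as np


def _corr(a, b):
    """Row-wise correlation coefficients between a and b, each (n_sources, n_frames)."""
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    a = a / (np.linalg.norm(a, axis=1, keepdims=True) + 1e-12)
    b = b / (np.linalg.norm(b, axis=1, keepdims=True) + 1e-12)
    return a @ b.T


def best_permutation(ref, env):
    """Return p such that env[p] has the highest total correlation with ref.

    Exhaustive search over all orderings: only practical for a few sources.
    """
    n = ref.shape[0]
    r = _corr(ref, env)
    best, best_score = None, -np.inf
    for perm in itertools.permutations(range(n)):
        score = sum(r[i, perm[i]] for i in range(n))
        if score > best_score:
            best, best_score = perm, score
    return np.array(best)


def align_permutations(env, n_iter=5, half_width=2):
    """Illustrative two-pass alignment (an assumption, not the published method).

    env: (n_bins, n_sources, n_frames) amplitude envelopes per frequency bin.
    Returns (perms, aligned) with aligned[f] == env[f][perms[f]].
    """
    env = np.array(env, dtype=float)
    n_bins, n_src, _ = env.shape
    perms = np.tile(np.arange(n_src), (n_bins, 1))

    # Pass 1 (global): align all bins to a common reference. The first bin
    # seeds the reference; later iterations use the mean over aligned bins.
    ref = env[0].copy()
    for _ in range(n_iter):
        changed = False
        for f in range(n_bins):
            p = best_permutation(ref, env[f])
            if not np.array_equal(p, np.arange(n_src)):
                env[f] = env[f][p]
                perms[f] = perms[f][p]
                changed = True
        ref = env.mean(axis=0)
        if not changed:
            break

    # Pass 2 (local): re-check each bin against the mean of nearby bins only.
    for f in range(n_bins):
        lo, hi = max(0, f - half_width), min(n_bins, f + half_width + 1)
        neigh = [g for g in range(lo, hi) if g != f]
        if not neigh:
            continue
        p = best_permutation(env[neigh].mean(axis=0), env[f])
        env[f] = env[f][p]
        perms[f] = perms[f][p]

    return perms, env
```

The exhaustive search over orderings grows factorially with the number of sources, so a sketch like this is only suitable for small source counts; handling an arbitrary number of sources would call for a cheaper assignment step.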

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Humans
Psychophysics
Sound Localization / physiology*
Speech Perception / physiology*