Solving permutations in frequency-domain for blind separation of an arbitrary number of speech sources

J Acoust Soc Am. 2012 Feb;131(2):EL139-44. doi: 10.1121/1.3678657.

Abstract

Blind separation of speech sources in reverberant environments is usually performed in the time-frequency domain, which gives rise to the permutation problem: the estimated sources may be ordered differently at different frequency components. A two-stage method to solve permutations with an arbitrary number of sources is proposed. The suggested procedure is based on the spectral consistency of the sources. In the first stage, frequency bins are compared with each other, while in the second stage, the neighboring frequencies are emphasized. Experiments for perfect separation situations and for live recordings show that the proposed method improves on the results of existing approaches.
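The two-stage idea can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details the abstract does not give. It is a generic envelope-correlation alignment, assuming that each frequency bin provides one magnitude envelope per separated output and that envelopes of the same source are positively correlated across bins. The names `align_permutations`, `best_permutation`, `mean_profile`, and the parameters `neighbors` and `sweeps` are hypothetical.

```python
from itertools import permutations


def correlation(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / (var_a * var_b) ** 0.5


def best_permutation(envs, refs):
    """Exhaustively pick the ordering p that maximizes sum_k corr(envs[p[k]], refs[k])."""
    n = len(refs)
    return max(
        permutations(range(n)),
        key=lambda p: sum(correlation(envs[p[k]], refs[k]) for k in range(n)),
    )


def mean_profile(bins, k):
    """Average envelope of output k over the given frequency bins."""
    n_frames = len(bins[0][k])
    return [sum(b[k][t] for b in bins) / len(bins) for t in range(n_frames)]


def align_permutations(envelopes, neighbors=1, sweeps=3):
    """envelopes[f][k] is the magnitude envelope of output k at frequency bin f.

    Returns a reordered copy in which index k is intended to refer to the
    same source at every bin (assumed sketch, not the published method).
    """
    n_src = len(envelopes[0])
    aligned = [list(bin_envs) for bin_envs in envelopes]

    # Stage 1 (global): compare every bin with the average profile over all bins.
    for _ in range(sweeps):
        refs = [mean_profile(aligned, k) for k in range(n_src)]
        for f, bin_envs in enumerate(aligned):
            p = best_permutation(bin_envs, refs)
            aligned[f] = [bin_envs[i] for i in p]

    # Stage 2 (local): re-check each bin against its already aligned lower neighbors.
    for f in range(1, len(aligned)):
        lo = max(0, f - neighbors)
        refs = [mean_profile(aligned[lo:f], k) for k in range(n_src)]
        p = best_permutation(aligned[f], refs)
        aligned[f] = [aligned[f][i] for i in p]

    return aligned
```

The exhaustive search over orderings grows factorially with the number of sources, so this sketch is only practical for a handful of outputs. For larger problems, an assignment solver such as `scipy.optimize.linear_sum_assignment` could replace `best_permutation`.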

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Humans
  • Psychophysics
  • Sound Localization / physiology*
  • Speech Perception / physiology*