Improving separation of harmonic sources with iterative estimation of spatial cues

Jinyu Han*, Bryan A Pardo

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

Recent work in source separation of two-channel mixtures has used spatial cues (cross-channel amplitude and phase difference coefficients) to estimate time-frequency masks for separating sources. As sources increasingly overlap in the time-frequency domain, or as the spatial angle between sources decreases, these spatial cues become unreliable. We introduce a method to re-estimate the spatial cues for mixtures of harmonic sources. The newly estimated spatial cues are fed back to the system to update each source estimate and the pitch estimate of each source. This procedure is repeated until the difference between the current estimate of the spatial cues and the previous one falls below a preset threshold. Results on a set of three-source mixtures of musical instruments show that this approach significantly improves the separation performance of two existing time-frequency masking systems.
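The loop described in the abstract (estimate masks from the spatial cues, re-estimate the cues, stop once they change by less than a threshold) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the update equations, so the harmonic-mask and pitch-estimation steps are replaced here by a simple nearest-cue assignment (a k-means-style stand-in). The function name `reestimate_cues`, its parameters, and the toy data are all hypothetical.

```python
import math

def reestimate_cues(bin_cues, init_cues, threshold=1e-4, max_iter=100):
    """Iteratively refine per-source spatial cues (illustrative stand-in only).

    bin_cues:  (amplitude_ratio, phase_difference) for each time-frequency bin.
    init_cues: initial (amplitude_ratio, phase_difference) for each source.
    Returns (cues, labels, iterations_used).
    """
    cues = [tuple(c) for c in init_cues]
    labels = []
    iteration = 0
    for iteration in range(1, max_iter + 1):
        # Masking step (ASSUMPTION: nearest-cue assignment instead of the
        # paper's harmonic masks). Phase wrap-around is ignored for brevity.
        labels = [
            min(range(len(cues)), key=lambda j: math.dist(b, cues[j]))
            for b in bin_cues
        ]
        # Re-estimation step: average the cues of the bins given to each source.
        new_cues = []
        for j, old in enumerate(cues):
            members = [b for b, lab in zip(bin_cues, labels) if lab == j]
            if members:
                new_cues.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_cues.append(old)  # no bins assigned: keep the previous cue
        # Stopping rule: largest movement of any source's cue in this pass.
        change = max(math.dist(n, o) for n, o in zip(new_cues, cues))
        cues = new_cues
        if change < threshold:
            break
    return cues, labels, iteration

# Toy usage: four bins drawn from two well-separated sources (made-up values).
bins = [(0.5, -0.2), (0.6, -0.1), (1.9, 0.3), (2.1, 0.4)]
cues, labels, n_iter = reestimate_cues(bins, init_cues=[(1.0, 0.0), (1.5, 0.0)])
```

In this sketch the threshold plays the same role as the preset threshold in the abstract: iteration ends when a further pass no longer moves the cue estimates appreciably. A faithful version would insert the pitch and harmonic-mask update between the two steps.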

Original language: English (US)
Title of host publication: 2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2009
Pages: 77-80
Number of pages: 4
State: Published - 2009
Event: 2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2009 - New Paltz, NY, United States
Duration: Oct 18 2009 → Oct 21 2009

Publication series

Name: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics

Other

Other: 2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2009
Country/Territory: United States
City: New Paltz, NY
Period: 10/18/09 → 10/21/09

Keywords

  • Audio source separation
  • Harmonic mask
  • Spatial cues

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Computer Science Applications
