Multi-resolution common fate transform

Fatemeh Pishdadian, Bryan A Pardo

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

The multi-resolution common fate transform (MCFT) is an audio signal representation useful for representing mixtures of multiple audio signals that overlap in both time and frequency. The MCFT combines the invertibility of a state-of-the-art representation, the common fate transform (CFT), and the multi-resolution property of the cortical stage output of an auditory model. Since the MCFT is computed based on a fully invertible complex time-frequency representation, separation of audio sources with high time-frequency overlap may be performed directly in the MCFT domain, where there is less overlap between sources than in the time-frequency domain. The MCFT circumvents the resolution issue of the CFT by using a multi-resolution two-dimensional (2D) filter bank instead of fixed-size 2D windows. This enables higher quality separation without the need to hand-tune the window size to the specific case. In this work, we describe the MCFT, discuss the properties of the MCFT with the aid of illustrative examples, and provide definitions and objective measures for two desirable representation properties: separability of source signals and clusterability of components of each signal. The utility of the MCFT for source separation is illustrated by performing ideal masking on a comprehensive dataset of audio mixtures of musical tones played in unison, including audio samples from a wide pitch range and a variety of instruments/playing techniques. Results show that the ideal masks made in the MCFT domain yield better separability than those made in commonly used time-frequency signal representations as well as the CFT. The use of the MCFT also results in more reliable clusterability than the CFT in most cases.
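The ideal-masking evaluation mentioned in the abstract can be illustrated with a minimal sketch. This is not the MCFT itself: a naive 1D DFT stands in for "some invertible complex representation", and all function names (`dft`, `idft`, `ideal_binary_masks`, `separate`) are hypothetical, chosen only for this example. The idea shown is that, when the true sources are known, each coefficient of the mixture is assigned to whichever source has the larger magnitude there, and the masked representation is inverted to obtain the estimate.

```python
import cmath
import math


def dft(x):
    """Naive forward DFT; an assumed stand-in for any invertible complex transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Naive inverse DFT matching dft() above."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def ideal_binary_masks(S1, S2):
    """Return complementary 0/1 masks: 1 where source 1 has the larger magnitude."""
    m1 = [1.0 if abs(a) >= abs(b) else 0.0 for a, b in zip(S1, S2)]
    m2 = [1.0 - m for m in m1]
    return m1, m2


def separate(mixture, source1, source2):
    """Apply ideal binary masks to the mixture's coefficients and invert."""
    M = dft(mixture)
    m1, m2 = ideal_binary_masks(dft(source1), dft(source2))
    est1 = [v.real for v in idft([m * c for m, c in zip(m1, M)])]
    est2 = [v.real for v in idft([m * c for m, c in zip(m2, M)])]
    return est1, est2


# Toy data (made up for illustration): two sinusoids that do not overlap
# in the transform domain, so the ideal masks separate them almost exactly.
n = 64
source1 = [math.sin(2 * math.pi * 3 * t / n) for t in range(n)]
source2 = [0.5 * math.sin(2 * math.pi * 10 * t / n) for t in range(n)]
mixture = [a + b for a, b in zip(source1, source2)]
est1, est2 = separate(mixture, source1, source2)
```

In this toy case the sources occupy disjoint coefficients, which is the best case for masking. The abstract's argument is that a representation with less overlap between sources moves real mixtures closer to this situation.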

Original language: English (US)
Article number: 8516327
Pages (from-to): 342-354
Number of pages: 13
Journal: IEEE/ACM Transactions on Audio Speech and Language Processing
Volume: 27
Issue number: 2
DOIs
State: Published - Feb 2019

Keywords

  • Audio source separation
  • Clusterability
  • Multi-resolution common fate transform
  • Separability

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Acoustics and Ultrasonics
  • Computational Mathematics
  • Electrical and Electronic Engineering
