Abstract
Musical works are often composed of two characteristic components: the background (typically the musical accompaniment), which generally exhibits a strong rhythmic structure with distinctive repeating time elements, and the melody (typically the singing voice or a solo instrument), which generally exhibits a strong harmonic structure with a distinctive predominant pitch contour. Drawing from findings in cognitive psychology, we propose to investigate the simple combination of two dedicated approaches for separating those two components: a rhythm-based method that extracts the background via a rhythmic mask derived from identifying the repeating time elements in the mixture, and a pitch-based method that extracts the melody via a harmonic mask derived from identifying the predominant pitch contour in the mixture. Evaluation on a data set of song clips showed that combining two such contrasting yet complementary methods can improve separation performance, from the point of view of both components, compared with using only one of those methods, and also compared with two other state-of-the-art approaches.
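The idea of combining a rhythmic mask and a harmonic mask can be illustrated with a minimal sketch. The abstract does not state the combination rule, so the rule used here (a time-frequency bin is assigned to the background only to the extent that the rhythmic mask claims it and the harmonic mask does not) is an assumption chosen for illustration, not the authors' method. The function names `combine_masks` and `apply_mask` are hypothetical, and the masks are taken as given rather than estimated from audio.

```python
def _clip01(value):
    """Clamp a mask value to the range [0, 1]."""
    return min(1.0, max(0.0, float(value)))


def combine_masks(rhythmic_mask, harmonic_mask):
    """Combine two soft time-frequency masks (lists of rows, values in [0, 1]).

    rhythmic_mask: estimate of how much of each bin belongs to the background.
    harmonic_mask: estimate of how much of each bin belongs to the melody.

    Assumed rule (illustrative only): background = rhythmic * (1 - harmonic),
    and the melody mask is its complement, so the two masks sum to 1 per bin.
    """
    if len(rhythmic_mask) != len(harmonic_mask):
        raise ValueError("masks must have the same number of rows")
    background_mask, melody_mask = [], []
    for r_row, h_row in zip(rhythmic_mask, harmonic_mask):
        if len(r_row) != len(h_row):
            raise ValueError("mask rows must have the same length")
        b_row = [_clip01(r) * (1.0 - _clip01(h)) for r, h in zip(r_row, h_row)]
        background_mask.append(b_row)
        melody_mask.append([1.0 - b for b in b_row])
    return background_mask, melody_mask


def apply_mask(mask, mixture_spectrogram):
    """Multiply a (possibly complex) mixture spectrogram by a mask, bin by bin."""
    return [[m * x for m, x in zip(m_row, x_row)]
            for m_row, x_row in zip(mask, mixture_spectrogram)]


# Toy 2x2 example with made-up mask values and a made-up complex spectrogram.
rhythmic = [[1.0, 0.0], [0.5, 1.0]]
harmonic = [[0.0, 0.0], [0.5, 1.0]]
mixture = [[1 + 2j, 3 + 0j], [4 - 4j, 0 + 8j]]

background_mask, melody_mask = combine_masks(rhythmic, harmonic)
background = apply_mask(background_mask, mixture)
melody = apply_mask(melody_mask, mixture)
```

Because the two masks are complementary, the separated background and melody spectrograms add back up to the mixture in every bin, which is a common design choice for mask-based separation.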
Original language | English (US) |
---|---|
Pages (from-to) | 1884-1893 |
Number of pages | 10 |
Journal | IEEE/ACM Transactions on Audio, Speech, and Language Processing |
Volume | 22 |
Issue number | 12 |
State | Published - Dec 1 2014 |
Keywords
- Background
- Melody
- Pitch
- Rhythm
- Separation
ASJC Scopus subject areas
- Computer Science (miscellaneous)
- Acoustics and Ultrasonics
- Computational Mathematics
- Electrical and Electronic Engineering