TY - GEN
T1 - Adaptive filtering for music/voice separation exploiting the repeating musical structure
AU - Liutkus, Antoine
AU - Rafii, Zafar
AU - Badeau, Roland
AU - Pardo, Bryan A.
AU - Richard, Gaël
PY - 2012
Y1 - 2012
AB - The separation of the lead vocals from the background accompaniment in audio recordings is a challenging task. Recently, an efficient method called REPET (REpeating Pattern Extraction Technique) has been proposed to extract the repeating background from the non-repeating foreground. While effective on individual sections of a song, REPET does not allow for variations in the background (e.g. verse vs. chorus), and is thus limited to short excerpts only. We overcome this limitation and generalize REPET to permit the processing of complete musical tracks. The proposed algorithm tracks the period of the repeating structure and computes local estimates of the background pattern. Separation is performed by soft time-frequency masking, based on the deviation between the current observation and the estimated background pattern. Evaluation on a dataset of 14 complete tracks shows that this method can perform at least as well as a recent competitive music/voice separation method, while being computationally efficient.
KW - Music/voice separation
KW - adaptive algorithms
KW - repeating pattern
KW - time-frequency masking
UR - http://www.scopus.com/inward/record.url?scp=84867605849&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84867605849&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2012.6287815
DO - 10.1109/ICASSP.2012.6287815
M3 - Conference contribution
AN - SCOPUS:84867605849
SN - 9781467300469
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 53
EP - 56
BT - 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Proceedings
T2 - 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012
Y2 - 25 March 2012 through 30 March 2012
ER -