TY - GEN
T1 - A simple music/voice separation method based on the extraction of the repeating musical structure
AU - Rafii, Zafar
AU - Pardo, Bryan A
PY - 2011
Y1 - 2011
AB - Repetition is a core principle in music. This is especially true for popular songs, generally marked by a noticeable repeating musical structure, over which the singer performs varying lyrics. On this basis, we propose a simple method for separating music and voice, by extraction of the repeating musical structure. First, the period of the repeating structure is found. Then, the spectrogram is segmented at period boundaries and the segments are averaged to create a repeating segment model. Finally, each time-frequency bin in a segment is compared to the model, and the mixture is partitioned using binary time-frequency masking by labeling bins similar to the model as the repeating background. Evaluation on a dataset of 1,000 song clips showed that this method can improve on the performance of an existing music/voice separation method without requiring particular features or complex frameworks.
KW - Binary Time-Frequency Masking
KW - Music/Voice Separation
KW - Repeating Pattern
UR - http://www.scopus.com/inward/record.url?scp=80051641202&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=80051641202&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2011.5946380
DO - 10.1109/ICASSP.2011.5946380
M3 - Conference contribution
AN - SCOPUS:80051641202
SN - 9781457705397
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 221
EP - 224
BT - 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011 - Proceedings
T2 - 36th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011
Y2 - 22 May 2011 through 27 May 2011
ER -