Separation of sound sources by non-negative matrix factorization.
-
Type: Doctorate
-
Keywords: Source separation, non-negative matrix factorization, musical instrument classification
Description
Given a recording that contains a mixture of several sound sources, source separation consists in estimating each of the sources unambiguously. In this thesis the recordings contain several musical instruments, and the goal is either to identify the parts where a single instrument plays alone or to separate several instruments playing simultaneously. We propose to study non-negative matrix factorization (NMF) as a tool for source separation. Given a matrix M (m×n) whose entries are non-negative (e.g., the source mixture) and a factorization rank r (e.g., the number of sources), the goal of NMF is to compute two non-negative matrices U (m×r) and V (r×n) such that the product UV is as close as possible to M. The matrices U and V can then be used to estimate the contribution of each source [1, 2, 3]. In particular, we will study the effect of NMF on the distortion of the sources after separation.
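As an illustration of the factorization described above, the following minimal sketch computes U and V with the classical multiplicative update rules of Lee and Seung. Several choices here are assumptions and are not taken from the thesis description: closeness of UV to M is measured with the Frobenius norm (reference [3], for instance, uses the Itakura-Saito divergence instead), the function names `nmf`, `matmul`, `transpose` and `frobenius_error` are hypothetical, and plain Python lists are used only to keep the example free of dependencies. The small matrix M is a toy input, not audio data.

```python
import random


def matmul(A, B):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def frobenius_error(M, U, V):
    """Frobenius norm of M - UV (the closeness measure assumed here)."""
    UV = matmul(U, V)
    return sum((m - p) ** 2
               for row_m, row_p in zip(M, UV)
               for m, p in zip(row_m, row_p)) ** 0.5


def nmf(M, r, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative m-by-n matrix M by U (m-by-r) times V (r-by-n).

    Uses multiplicative updates, which keep U and V non-negative because
    every update multiplies an entry by a ratio of non-negative numbers.
    """
    rng = random.Random(seed)
    m, n = len(M), len(M[0])
    # Strictly positive random initialization (an entry at zero stays at zero).
    U = [[rng.random() + eps for _ in range(r)] for _ in range(m)]
    V = [[rng.random() + eps for _ in range(n)] for _ in range(r)]

    for _ in range(n_iter):
        # V <- V * (U^T M) / (U^T U V), entrywise
        Ut = transpose(U)
        num = matmul(Ut, M)
        den = matmul(matmul(Ut, U), V)
        V = [[v * a / (b + eps) for v, a, b in zip(rv, ra, rb)]
             for rv, ra, rb in zip(V, num, den)]

        # U <- U * (M V^T) / (U V V^T), entrywise
        Vt = transpose(V)
        num = matmul(M, Vt)
        den = matmul(U, matmul(V, Vt))
        U = [[u * a / (b + eps) for u, a, b in zip(ru, ra, rb)]
             for ru, ra, rb in zip(U, num, den)]

    return U, V


# Toy non-negative matrix (4x3), factorized with rank r = 2.
M = [[1.0, 2.0, 3.0],
     [2.0, 4.0, 6.0],
     [1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
U, V = nmf(M, r=2)
print("reconstruction error:", frobenius_error(M, U, V))
```

In an audio setting, M is commonly a magnitude or power spectrogram of the mixture; under that assumption, each column of U can be read as a spectral template and the matching row of V as its activation over time. The result depends on the random initialization, so different seeds may give different factors.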
[1] Cichocki, A., Zdunek, R., Phan, A. H., & Amari, S. I. (2009). Nonnegative matrix and tensor factorizations: applications to exploratory multi-way data analysis and blind source separation. John Wiley & Sons.
[2] Schmidt, M. N., & Olsson, R. K. (2006). Single-channel speech separation using sparse non-negative matrix factorization. In ISCA International Conference on Spoken Language Processing (INTERSPEECH).
[3] Févotte, C., Bertin, N., & Durrieu, J. L. (2009). Nonnegative matrix factorization with the Itakura-Saito divergence: With application to music analysis. Neural Computation, 21(3), 793-830.