A fundamental problem in sound separation is that when two or more sounds overlap in time and frequency, separation is difficult and there is no general method for resolving the component sounds. By making some assumptions about the underlying signals, the parameters of the sources can be estimated from the mixture signal, and signals that are perceptually close to the original ones can be synthesized.

- Exemplar-based speech enhancement and its application to noise-robust automatic speech recognition (submitted to the CHiME workshop)
- Non-negative matrix factorization based compensation of music for automatic speech recognition (Interspeech 2010)
- Spectral Covariance in Prior Distributions of Non-Negative Matrix Factorization Based Speech Separation (presented at EUSIPCO 2009)
- Mixtures of Gamma Priors for Non-Negative Matrix Factorization Based Speech Separation (presented at ICA 2009) (pdf)
- Combining Pitch-Based Inference and Non-Negative Spectrogram Factorization in Separating Vocals from Polyphonic Music (SAPA 2008) (pdf)
- Speech Recognition Using Factorial Hidden Markov Models for Separation in the Feature Space (ICSLP 2006) (pdf)
- Monaural Sound Source Separation by Non-Negative Matrix Factorization with Temporal Continuity and Sparseness Criteria
- Audiopianoroll - an interactive Matlab demonstration of a note separation algorithm
- Separation of drums from polyphonic music using non-negative matrix factorization and support vector machine (EUSIPCO 2005)
- Drum transcription with non-negative spectrogram factorisation (EUSIPCO 2005)
- Monaural Sound Source Separation by Perceptually Weighted Non-Negative Matrix Factorization (pdf)
- Separation of Sound Sources by Convolutive Sparse Coding (SAPA 2004) (pdf)
- Sound Source Separation Using Sparse Coding with Temporal Continuity Objective (ICMC 2003) (pdf)
- Separation of harmonic sounds using linear models for the overtone series (ICASSP 2002) (pdf)
- Separation of harmonic sounds using multipitch analysis and iterative parameter estimation (WASPAA 2001) (pdf)
- Separation of harmonic sound sources using sinusoidal modeling (ICASSP 2000) (pdf)

My Master of Science thesis, "Audio Signal Modeling with Sinusoids Plus Noise", is available in PDF format, together with related demonstration signals.

Sinusoidal modeling represents the periodic part of a signal as a sum of sinusoids with time-varying amplitudes and frequencies. The residual, which ideally contains the non-periodic, stochastic part of the signal, is represented with filtered noise that preserves the short-time energies within each Bark band.
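As a rough illustration of the sinusoidal part of this model, the sketch below resynthesizes a signal from per-frame amplitude and frequency estimates. It is a minimal example under stated assumptions, not the method used in the thesis: the function name `synthesize_partials`, the data layout (one list of `(amplitude, frequency)` pairs per partial), and the linear interpolation between frames are all illustrative choices, and the noise residual is left out entirely.

```python
import math


def synthesize_partials(partials, frame_rate, sample_rate):
    """Sum sinusoids whose amplitude and frequency vary from frame to frame.

    partials: list of tracks (hypothetical layout); each track is a list of
        (amplitude, frequency_hz) pairs, one pair per analysis frame.
    frame_rate: analysis frames per second (assumed to divide sample_rate).
    sample_rate: output samples per second.
    """
    hop = sample_rate // frame_rate          # samples between two frames
    n_frames = min(len(track) for track in partials)
    out = [0.0] * ((n_frames - 1) * hop)

    for track in partials:
        phase = 0.0                          # running phase keeps the sinusoid continuous
        for k in range(n_frames - 1):
            a0, f0 = track[k]
            a1, f1 = track[k + 1]
            for i in range(hop):
                w = i / hop                  # position between frame k and k + 1
                amp = (1 - w) * a0 + w * a1
                freq = (1 - w) * f0 + w * f1
                out[k * hop + i] += amp * math.sin(phase)
                phase += 2 * math.pi * freq / sample_rate
    return out


# One steady 100 Hz partial with amplitude 0.5, described by three frames.
steady = [[(0.5, 100.0), (0.5, 100.0), (0.5, 100.0)]]
signal = synthesize_partials(steady, frame_rate=100, sample_rate=8000)
```

Accumulating the phase sample by sample, rather than evaluating `sin(2*pi*f*t)` directly, avoids discontinuities when the frequency changes between frames.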

Sinusoidal modeling is a good mid-level representation and a powerful musical tool, because it preserves the exact frequencies of harmonic partials. The parametric data obtained from sinusoidal or stochastic modeling can be further utilized, for example, in automatic transcription and instrument recognition. The estimated parameters can also be used to modify the pitch and speed of the signal (demonstrations).
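To make the idea of pitch and speed modification concrete, the following sketch operates directly on the parameters of one partial. It assumes a hypothetical track format, a list of `(amplitude, frequency)` pairs with one pair per analysis frame and at least two frames; the helper names `scale_pitch` and `stretch_time` are illustrative and do not come from the thesis. Pitch is changed by scaling the frequencies, and speed is changed by resampling the track along the frame axis.

```python
def scale_pitch(track, ratio):
    """Multiply every frequency by ratio; amplitudes and timing are unchanged."""
    return [(amp, freq * ratio) for amp, freq in track]


def stretch_time(track, factor):
    """Resample a track to roughly `factor` times its original duration.

    Uses linear interpolation between neighbouring frames, so frequencies
    (and therefore pitch) are preserved while the evolution slows or speeds up.
    """
    last = len(track) - 1
    n_out = max(2, round(last * factor) + 1)
    out = []
    for j in range(n_out):
        pos = j * last / (n_out - 1)         # fractional index into the input
        k = min(int(pos), last - 1)          # left neighbour frame
        w = pos - k                          # weight of the right neighbour
        amp = (1 - w) * track[k][0] + w * track[k + 1][0]
        freq = (1 - w) * track[k][1] + w * track[k + 1][1]
        out.append((amp, freq))
    return out


# A two-frame partial that fades out while gliding from 100 Hz to 200 Hz.
glide = [(1.0, 100.0), (0.0, 200.0)]
octave_up = scale_pitch(glide, 2.0)
half_speed = stretch_time(glide, 2.0)
```

Because the two operations act on separate parameters, they can be applied independently or combined, which is not possible when a recording is simply played back faster or slower.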


- Tuomas Virtanen, tuomas.virtanen@tut.fi