Non-negative matrix factorization based compensation of music for automatic speech recognition - Demonstrations

The following audio examples demonstrate the test signals used in the paper. The speech samples are from the Wall Street Journal database, and the music samples are from the RWC Genre and RWC Classical databases. The separated signals are obtained using the style- and speaker-dependent models described in the paper.

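Original signal | Mixture signal | Music style | SNR | Separated speech
female speaker | classical_15db_1.wav | classical | 15 dB | classical_15db_1.wav
female speaker | classical_5db_1.wav | classical | 5 dB | classical_5db_1.wav
female speaker | classical_minus5db_1.wav | classical | -5 dB | classical_minus5db_1.wav
female speaker | jazz_15db_1.wav | jazz | 15 dB | jazz_15db_1.wav
female speaker | jazz_5db_1.wav | jazz | 5 dB | jazz_5db_1.wav
female speaker | jazz_minus5db_1.wav | jazz | -5 dB | jazz_minus5db_1.wav
female speaker | latin_15db_1.wav | latin | 15 dB | latin_15db_1.wav
female speaker | latin_5db_1.wav | latin | 5 dB | latin_5db_1.wav
female speaker | latin_minus5db_1.wav | latin | -5 dB | latin_minus5db_1.wav
female speaker | world_15db_1.wav | world | 15 dB | world_15db_1.wav
female speaker | world_5db_1.wav | world | 5 dB | world_5db_1.wav
female speaker | world_minus5db_1.wav | world | -5 dB | world_minus5db_1.wav
male speaker | classical_15db_3.wav | classical | 15 dB | classical_15db_3.wav
male speaker | classical_5db_3.wav | classical | 5 dB | classical_5db_3.wav
male speaker | classical_minus5db_3.wav | classical | -5 dB | classical_minus5db_3.wav
male speaker | jazz_15db_3.wav | jazz | 15 dB | jazz_15db_3.wav
male speaker | jazz_5db_3.wav | jazz | 5 dB | jazz_5db_3.wav
male speaker | jazz_minus5db_3.wav | jazz | -5 dB | jazz_minus5db_3.wav
male speaker | latin_15db_3.wav | latin | 15 dB | latin_15db_3.wav
male speaker | latin_5db_3.wav | latin | 5 dB | latin_5db_3.wav
male speaker | latin_minus5db_3.wav | latin | -5 dB | latin_minus5db_3.wav
male speaker | world_15db_3.wav | world | 15 dB | world_15db_3.wav
male speaker | world_5db_3.wav | world | 5 dB | world_5db_3.wav
male speaker | world_minus5db_3.wav | world | -5 dB | world_minus5db_3.wav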
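
The mixtures above are listed at SNRs of 15, 5, and -5 dB. This page does not say how the music was scaled to reach those levels, so the following is only a minimal sketch under an assumption: the SNR is taken as the ratio of average speech power to average music power over the whole signal. The function names (mix_at_snr, measure_snr_db, mean_power) and the synthetic sine-wave signals are hypothetical stand-ins, not part of the paper or its data.

```python
import math


def mean_power(signal):
    """Average power (mean squared amplitude) of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, music, target_snr_db):
    """Scale music so the speech-to-music power ratio equals target_snr_db.

    Returns (mixture, gain). Assumes SNR is measured over the whole signal.
    """
    if len(music) < len(speech):
        raise ValueError("music must be at least as long as speech")
    music = music[: len(speech)]
    p_speech = mean_power(speech)
    p_music = mean_power(music)
    if p_speech == 0 or p_music == 0:
        raise ValueError("both signals must have non-zero power")
    # Solve 10*log10(p_speech / (gain**2 * p_music)) = target_snr_db for gain.
    gain = math.sqrt(p_speech / (p_music * 10 ** (target_snr_db / 10)))
    mixture = [s + gain * m for s, m in zip(speech, music)]
    return mixture, gain


def measure_snr_db(speech, interference):
    """SNR in dB between a speech signal and an interfering signal."""
    return 10 * math.log10(mean_power(speech) / mean_power(interference))


# Synthetic stand-ins for one second of speech and music at 8 kHz (assumed values).
sample_rate = 8000
speech = [math.sin(2 * math.pi * 220 * n / sample_rate) for n in range(sample_rate)]
music = [0.3 * math.sin(2 * math.pi * 330 * n / sample_rate) for n in range(sample_rate)]

for target in (15, 5, -5):
    mixture, gain = mix_at_snr(speech, music, target)
    achieved = measure_snr_db(speech, [gain * m for m in music])
    print(f"target {target:>3} dB, music gain {gain:.4f}, achieved {achieved:.2f} dB")
```

With real recordings, the two lists would be replaced by samples loaded from the speech and music files, and the mixture would typically be checked for clipping before being written out.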


- Tuomas Virtanen, tuomas.virtanen@tut.fi