Tuomas Virtanen research homepage
Tampere University of Technology
P.O. Box 553, FI-33101 Tampere, FINLAND
Tel. +358 401981308
tuomas.virtanen@tut.fi
I am a research fellow and an adjunct professor at the Department of Signal Processing,
Multimedia Research Group, Tampere University of Technology, where I lead the Audio Research Team.
My research topic is computational analysis of audio, especially
sound source separation, which has several applications in the analysis, editing and
manipulation of audio signals. These include, for example, structured
audio coding, automatic transcription of music, and noise-robust automatic speech recognition.
I completed my PhD studies in November 2006. My doctoral thesis, "Sound Source
Separation in Monaural Music Signals", is available in PDF format: virtanen_phd.pdf.
Audio demonstrations.
Teaching
Publications
- E. Helander, H. Silen, T. Virtanen, M. Gabbouj.
Voice Conversion Using Dynamic Kernel Partial Least Squares
Regression. IEEE Transactions on Audio, Speech and Language
Processing, Volume 20, Issue 3, 2012.
- R. Saeidi, A. Hurmalainen, T. Virtanen, D.A. van Leeuwen.
Exemplar-based Sparse Representation and Sparse Discrimination for Noise Robust Speaker Identification. In proc.
Odyssey 2012: The Speaker and Language Recognition Workshop, Singapore, 2012.
- Ali Bahrami Rad and Tuomas Virtanen. Phase spectrum prediction of
audio signals,
5th International Symposium on Communications, Control and Signal
Processing, Rome, Italy, 2012.
- J. F. Gemmeke, T. Virtanen, and A. Hurmalainen. Exemplar-based
sparse representations for noise robust automatic speech recognition, IEEE
Trans. Audio, Speech and Language Processing, Volume 19, Issue 7, 2011.
- J.J. Carabias-Orti, T. Virtanen, P. Vera-Candeas, N. Ruiz-Reyes and
F.J. Canadas-Quesada. Musical Instrument Sound Multi-Excitation Model for
Non-Negative Spectrogram Factorization. IEEE Journal of Selected
Topics in Signal Processing, Volume 5, Issue 6, 2011.
- B. Raj, R. Singh, and T. Virtanen. Phoneme-dependent NMF for speech
enhancement in monaural mixtures. 12th Annual Conference of the International Speech
Communication Association, Florence, Italy, 2011.
- K. Mahkonen, A. Hurmalainen, T. Virtanen, and J. Gemmeke. Mapping Sparse
Representation to State Likelihoods in Noise-Robust Automatic Speech
Recognition, 12th Annual Conference of the International Speech
Communication Association, Florence, Italy, 2011.
- H. Kallasjoki, U. Remes, J. F. Gemmeke, T. Virtanen, and K. J. Palomäki.
Uncertainty measures for improving exemplar-based source separation.
12th Annual Conference of the International Speech
Communication Association, Florence, Italy, 2011.
- J. Nikunen, T. Virtanen, and M. Vilermo. Multichannel audio upmixing
based on non-negative tensor factorization representation. IEEE
Workshop on Applications of Signal Processing to Audio and Acoustics,
New Paltz, NY, 2011. Accepted for publication.
- J. F. Gemmeke, A. Hurmalainen, T. Virtanen, and Yang Sun.
Toward a Practical Implementation of Exemplar-Based Noise Robust
ASR. To be presented at the 19th European Signal Processing
Conference 2011, Barcelona, Spain.
- A. Hurmalainen, K. Mahkonen, J. F. Gemmeke, and T. Virtanen.
Exemplar-Based Recognition of Speech in Highly Variable Noise.
International Workshop on Machine Listening in Multisource
Environments, Florence, Italy, 2011.
- T. Heittola, A. Mesaros, T. Virtanen, and A. Eronen.
Sound Event Detection in Multisource Environments Using Source
Separation.
International Workshop on Machine Listening in Multisource
Environments, Florence, Italy, 2011.
- J. F. Gemmeke, T. Virtanen, and A. Hurmalainen. Exemplar-Based Speech
Enhancement and its Application to Noise-Robust Automatic Speech
Recognition.
International Workshop on Machine Listening in Multisource
Environments, Florence, Italy, 2011.
- A. Hurmalainen, J. Gemmeke, and T. Virtanen. Non-negative
matrix deconvolution in noise robust speech recognition, to be presented at ICASSP 2011.
- T. Virtanen, J. Gemmeke, and A. Hurmalainen. State-based
labelling for a sparse representation of speech and its application to
robust speech recognition, presented at Interspeech 2010.
- B. Raj, T. Virtanen, S. Chaudhuri, and R. Singh.
Non-negative matrix factorization based compensation of music for
automatic speech recognition, presented at Interspeech 2010.
- J. Gemmeke and T. Virtanen.
Artificial and online acquired noise dictionaries for noise robust ASR,
presented at Interspeech 2010.
- A. Mesaros and T. Virtanen. Automatic recognition of lyrics in
singing, EURASIP Journal on Audio, Speech and Music Processing, Volume
2010, Article ID 546047. (online version and pdf)
- A. Klapuri and T. Virtanen, Representing Musical Sounds with an
Interpolating State Model, IEEE Trans. Audio, Speech and Language
Processing, vol. 18, no. 3, 2010.
- E. Helander, T. Virtanen, J. Nurminen, and M. Gabbouj.
Voice Conversion Using
Partial Least Squares Regression. IEEE Transactions on
Audio, Speech, and Language Processing, 18 (5), 2010.
- M. Helén and T. Virtanen, Audio query by example using similarity
measures between probability density functions of features, EURASIP
Journal on Audio, Speech and Music Processing, Volume 2010, Article ID
179303. (online version and pdf)
- T. Heittola, A. Mesaros, A. Eronen, and T. Virtanen. Audio
context recognition using audio event histograms, in proc.
2010 European Signal Processing Conference (EUSIPCO-2010)
- A. Mesaros, T. Heittola, A. Eronen, and T. Virtanen. Acoustic
event detection in real life recordings, in proc. 2010 European Signal Processing Conference (EUSIPCO-2010)
- S. Keronen, U. Remes, K. Palomäki, T. Virtanen, and M. Kurimo.
Comparison of Noise Robust Methods in Large Vocabulary Speech
Recognition, in proc.
2010 European Signal Processing Conference (EUSIPCO-2010)
- J. Nikunen and T. Virtanen, Object-Based Audio Coding Using
Non-Negative Matrix Factorization for the Spectrogram
Representation, in proc. 128th Audio Engineering Society
Convention, London, UK, 2010.
- J. F. Gemmeke and T. Virtanen.
Noise robust exemplar-based connected digit recognition,
in proc. of the 35th International Conference on Acoustics, Speech, and
Signal Processing (ICASSP), Dallas, USA, 2010.
- A. Klapuri, T. Virtanen, and T. Heittola. Sound source separation
in monaural music signals using excitation-filter model and EM
algorithm, in proc. of the 35th International Conference on
Acoustics, Speech, and Signal Processing (ICASSP), Dallas, USA, 2010.
- A. Mesaros and T. Virtanen, Recognition of phonemes and words in
singing, in proc. of
the 35th International Conference on Acoustics, Speech, and
Signal Processing (ICASSP),
Dallas, USA, 2010.
- J. Nikunen and T. Virtanen, Noise-to-mask ratio minimization by
weighted non-negative matrix factorization, in proc. of
the 35th International Conference on Acoustics, Speech, and
Signal Processing (ICASSP),
Dallas, USA, 2010.
- T. Heittola, A. Klapuri, and T. Virtanen.
Musical Instrument Recognition in Polyphonic Audio Using Source-Filter
Model for Sound Separation, in Proc. 10th Int. Society for
Music Information Retrieval Conf. (ISMIR 2009), Kobe, Japan, 2009. The
paper won the best paper award of the conference.
- T. Virtanen and T. Heittola.
Interpolating Hidden Markov Model and Its Application to
Automatic Instrument Recognition, in proc. ICASSP 2009.
- A. Mesaros and T. Virtanen.
Adaptation of a speech recognizer for singing voice, in EUSIPCO 2009.
- T. Virtanen.
Spectral Covariance in Prior Distributions of Non-Negative Matrix
Factorization Based Speech Separation, in EUSIPCO 2009.
- M. Myllymäki and T. Virtanen.
Non-Stationary Noise Model Compensation in Voice Activity Detection, in EUSIPCO 2009.
- T. Virtanen and A. T. Cemgil.
Mixtures of Gamma Priors for Non-Negative Matrix
Factorization Based Speech Separation, in proc. ICA 2009. © Springer-Verlag. The
publication will become available at springerlink.com.
- T. Virtanen, A. Mesaros, and M. Ryynänen. Combining
Pitch-Based Inference and Non-Negative Spectrogram Factorization in
Separating Vocals from Polyphonic Music, SAPA 2008.
- T. Virtanen, A. T. Cemgil, and S. J. Godsill. Bayesian
Extensions to Non-negative Matrix Factorisation for Audio Signal
Modelling, ICASSP 2008. This work was carried out at the University of Cambridge,
Signal Processing and Communications Laboratory.
- A. Mesaros and T. Virtanen. Automatic
Alignment of Music Audio and Lyrics, DAFX08.
- M. Myllymäki and T. Virtanen. Voice Activity Detection
in the Presence of Breathing Noise Using Neural Network and Hidden Markov
Model, EUSIPCO 2008.
- M. Ryynänen, T. Virtanen, J. Paulus, and A. Klapuri, Accompaniment
Separation and Karaoke Application Based on Automatic Melody
Transcription, in Proc. 2008 IEEE International Conference on
Multimedia & Expo (ICME'08), Hannover, Germany, June 2008. (demonstrations)
- A. Klapuri and T. Virtanen, Progress towards automatic music
transcription, in Handbook of Signal Processing in Acoustics, David
Havelock, Sonoko Kuwano, and Michael Vorländer (Eds.), Springer-Verlag,
2008.
- Virtanen, Tuomas, Monaural Sound Source Separation by Nonnegative Matrix
Factorization with Temporal Continuity and Sparseness Criteria,
IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 3, March 2007.
- Virtanen, T., Helén, M.,
Probabilistic Model Based Similarity Measures for Audio Query-by-Example,
in proc. WASPAA 2007.
- Mesaros, A., Virtanen, T., Klapuri, A.
Singer Identification in Polyphonic Music Using Vocal Separation and Pattern Recognition
Methods,
International Conference on Music Information Retrieval, Vienna, Austria, 2007.
- Helén, M., Virtanen, T., Query
by Example of Audio Signals Using Euclidean Distance Between Gaussian
Mixture Models, in proc. ICASSP 2007. Note: two small
errors in equations (8)-(11) have been corrected. The corrections do
not appear in the ICASSP conference proceedings.
- Helén, M., Virtanen, T.,
A Similarity Measure for Audio Query by Example Based on Perceptual Coding and Compression, in proc. 10th International Conference on Digital Audio Effects (DAFx-07), September 10-15, 2007.
- Virtanen, Tuomas, Monaural
Sound Source Separation by Perceptually Weighted Non-Negative Matrix
Factorization, Technical report, Tampere University of Technology,
Institute of Signal Processing, 2007.
- Virtanen, T., Klapuri, A., Analysis of polyphonic audio using source-filter model and non-negative matrix factorization, in Advances in Models for Acoustic Processing, Neural Information Processing Systems Workshop, 2006 (extended abstract).
- Virtanen, Tuomas, Speech Recognition Using Factorial Hidden Markov Models for Separation in the Feature Space, in proc. Interspeech 2006, Pittsburgh, USA. (demonstrations) The paper achieved the second-best results among the papers presented in the Interspeech 2006 Speech Separation Challenge special session.
- Virtanen, Tuomas. Unsupervised Learning Methods for Source Separation, in "Signal Processing Methods for Music Transcription", eds. Klapuri, A., Davy, M., Springer-Verlag, 2006.
- Helén, M., Virtanen, T., Separation of Drums From Polyphonic Music Using Non-Negative Matrix Factorization and Support Vector Machine, in proc. 13th European Signal Processing Conference, Antalya, Turkey, 2005.
(demonstrations)
- Klapuri, A., Virtanen, T., Helén, M., Modeling musical sounds with an interpolating state model, in proc. 13th European Signal Processing Conference, Antalya, Turkey, 2005.
- Paulus, J., Virtanen, T., Drum Transcription with Non-negative Spectrogram Factorisation, in proc. 13th European Signal Processing Conference, Antalya, Turkey, 2005.
(demonstrations)
- Virtanen, Tuomas,
Separation of Sound Sources by Convolutive Sparse Coding,
ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing, SAPA 2004. (demonstrations)
- M. Helén, T. Virtanen, Perceptually Motivated Parametric Representation for Harmonic Sounds for Data Compression Purposes, 6th International Conference on Digital Audio Effects (DAFx-03), 2003, London, UK.
- Virtanen, Tuomas,
Algorithm for the separation of harmonic sounds with
time-frequency smoothness constraint, in proc. the 6th
International Conference on Digital Audio Effects (DAFx-03), London, UK.
- Virtanen, Tuomas,
Sound Source Separation Using Sparse Coding with Temporal
Continuity Objective, International Computer Music
Conference, ICMC 2003.
(demonstrations)
- Parviainen, M., Virtanen, T.,
Two-channel separation of speech
using direction-of-arrival estimation and sinusoids plus transients
modeling, IEEE International Symposium on Intelligent Signal
Processing and Communication Systems, ISPACS 2003.
- Virtanen, T., Klapuri, A.,
Separation of Harmonic Sounds Using Linear Models for the Overtone
Series, IEEE International Conference on Acoustics, Speech and
Signal Processing, ICASSP 2002.
(demonstrations)
- Virtanen, Tuomas,
Accurate Sinusoidal Model Analysis and Parameter Reduction
by Fusion of Components, 110th Audio Engineering Society Convention,
Amsterdam, Netherlands, 2001.
- Virtanen, T., Klapuri, A.,
Separation of Harmonic Sounds Using Multipitch Analysis and Iterative
Parameter Estimation, Proc. IEEE Workshop on Applications of
Signal Processing to Audio and Acoustics, New Paltz, New York, 2001.
(demonstrations)
- Klapuri, A., Virtanen, T., Holm, J.-M.,
Robust multipitch estimation for the analysis and manipulation of
polyphonic musical signals. In Proc. COST-G6 Conference
on Digital Audio Effects, DAFx-00, Verona, Italy, 2000.
- Sillanpää, J., Klapuri, A., Seppänen, J., Virtanen, T.,
Recognition of acoustic noise mixtures by combined bottom-up and
top-down processing. Proceedings of the European Signal
Processing Conference EUSIPCO, 2000.
- Virtanen, T., Klapuri, A.,
Separation of Harmonic Sound Sources Using Sinusoidal Modeling,
IEEE International Conference on Acoustics, Speech and Signal Processing,
ICASSP 2000.
(demonstrations)
- Virtanen, Tuomas,
Audio Signal Modeling with Sinusoids Plus Noise, MSc
thesis, Tampere University of Technology, 2001.
(demonstrations 1, demonstrations 2)
IEEE-Copyrighted Material:
Personal use of this material is permitted. However, permission to
reprint/republish this material for advertising or promotional
purposes or for creating new collective works for resale or
redistribution to servers or lists, or to reuse any copyrighted
component of this work in other works, must be obtained from the
IEEE. Contact: Manager, Copyrights and Permissions / IEEE Service
Center / 445 Hoes Lane / P.O. Box 1331 / Piscataway, NJ 08855-1331,
USA. Telephone: +Intl. 908-562-3966.
- Tuomas Virtanen, tuomas.virtanen@tut.fi