Automatic transcription of the singing melody in polyphonic music, examples

Author: Matti Ryynänen
Last modified: April 2006

Transcription method

M. Ryynänen and A. Klapuri, "Transcription of the Singing Melody in Polyphonic Music," in Proc. 7th International Conference on Music Information Retrieval (ISMIR 2006), Victoria, Canada, October 2006.

This page contains examples of automatic transcription of singing melodies in polyphonic music. The acoustic music signal (e.g., a WAV file) is automatically converted into a MIDI file. The transcription method is based on multiple-F0 estimation and note onset detection, followed by acoustic and musicological modeling. The acoustic modeling employs one model for singing notes and another for no-melody segments. The musicological model uses key estimation and note bigrams to determine the transition probabilities between notes. A single Viterbi decoding pass produces a sequence of notes and rests as the transcription of the singing melody. The method was evaluated on the RWC popular music database, where it achieved a recall rate of 63% and a precision rate of 46%, a significant improvement over the reference method from the MIREX05 evaluations.
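The decoding step can be illustrated with a toy example. The following is a minimal sketch of Viterbi decoding over three states (a rest and two notes), not the method from the paper: the state set, the transition table standing in for the note bigrams, and the per-frame observation probabilities are all invented for illustration, and the actual acoustic and musicological models are not reproduced.

```python
import math

def viterbi(obs_loglik, log_trans, log_init):
    """Return the most likely state index sequence.

    obs_loglik[t][j]: log-likelihood of frame t under state j
    log_trans[i][j]:  log probability of moving from state i to state j
    log_init[j]:      log probability of starting in state j
    """
    n_states = len(log_init)
    # delta[j] is the best log score of any path ending in state j so far.
    delta = [log_init[j] + obs_loglik[0][j] for j in range(n_states)]
    backptr = []
    for frame in obs_loglik[1:]:
        new_delta, ptr = [], []
        for j in range(n_states):
            best_i = max(range(n_states), key=lambda i: delta[i] + log_trans[i][j])
            new_delta.append(delta[best_i] + log_trans[best_i][j] + frame[j])
            ptr.append(best_i)
        delta = new_delta
        backptr.append(ptr)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda j: delta[j])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

# Hypothetical toy model: all numbers below are assumptions, not values from the paper.
STATES = ["rest", "C4", "D4"]
STAY, SWITCH = 0.8, 0.1  # placeholder "bigram" table favoring staying in a state
log_trans = [[math.log(STAY if i == j else SWITCH) for j in range(3)] for i in range(3)]
log_init = [math.log(1.0 / 3.0)] * 3

# Per-frame observation probabilities; frame 3 is a noisy frame leaning toward D4.
obs = [
    [0.90, 0.05, 0.05],
    [0.05, 0.90, 0.05],
    [0.05, 0.90, 0.05],
    [0.05, 0.40, 0.55],
    [0.05, 0.90, 0.05],
    [0.05, 0.05, 0.90],
    [0.05, 0.05, 0.90],
]
obs_loglik = [[math.log(p) for p in frame] for frame in obs]

decoded = [STATES[s] for s in viterbi(obs_loglik, log_trans, log_init)]
print(decoded)  # ['rest', 'C4', 'C4', 'C4', 'C4', 'D4', 'D4']
```

In this toy setup the noisy fourth frame is absorbed into the surrounding C4 note because two extra state switches cost more than the small gain in observation likelihood, which is the kind of smoothing that transition probabilities contribute during decoding.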

The transcription method produces a MIDI file containing the transcribed melody. The following sound examples include the acoustic input signal in the left channel and the transcribed melody (synthesized directly from the output MIDI file) in the right channel.
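A left/right comparison file of this kind could be assembled roughly as follows. This is only a sketch using Python's standard `wave` and `struct` modules; the helper names `interleave_stereo` and `write_stereo_wav` are hypothetical, the samples are assumed to be 16-bit integers at a common sample rate, and the step that synthesizes the MIDI melody into samples is not shown.

```python
import io
import struct
import wave

def interleave_stereo(left, right):
    """Interleave two mono sample lists as L, R, L, R, ... (shorter one zero-padded)."""
    n = max(len(left), len(right))
    left = list(left) + [0] * (n - len(left))
    right = list(right) + [0] * (n - len(right))
    frames = []
    for l_sample, r_sample in zip(left, right):
        frames.extend((l_sample, r_sample))
    return frames

def write_stereo_wav(target, left, right, sample_rate=44100):
    """Write 16-bit stereo PCM; `target` is a filename or a binary file object."""
    samples = interleave_stereo(left, right)
    with wave.open(target, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))

# Tiny in-memory demonstration with made-up sample values:
# left = "input recording", right = "synthesized transcription".
buffer = io.BytesIO()
write_stereo_wav(buffer, [1000, 2000, 3000], [-1000, -2000])
buffer.seek(0)
with wave.open(buffer, "rb") as wav:
    channels, n_frames = wav.getnchannels(), wav.getnframes()
```

Keeping the two signals in separate channels lets a listener balance or mute either side to compare the original singing with the transcription.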

For each example, we show a score image of the transcribed singing melody. The score images are automatically generated from the transcription MIDI files using the LilyPond music score typesetter. Please note that the notes in the singing-melody transcriptions are not temporally quantized, so the score typesetting cannot be done unambiguously. Therefore, the score images are not very accurate, but they give some idea of the transcribed melodies.
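To see why unquantized note times make typesetting ambiguous, consider snapping onsets to a rhythmic grid. The sketch below is an illustration only and is not part of the transcription method: the onset values (in beats) are invented, the function name `quantize` is hypothetical, and converting seconds to beats would additionally require a tempo estimate that is assumed to be given.

```python
def quantize(onsets_in_beats, grid):
    """Snap each onset (in beats) to the nearest multiple of `grid` beats."""
    return [round(t / grid) * grid for t in onsets_in_beats]

# Made-up onsets of four notes, measured in beats from the start of a bar.
onsets = [0.0, 0.52, 1.30, 2.05]

eighth_grid = quantize(onsets, 0.5)      # grid of eighth notes
sixteenth_grid = quantize(onsets, 0.25)  # grid of sixteenth notes
print(eighth_grid)
print(sixteenth_grid)
```

The third note lands on beat 1.5 with the eighth-note grid but on beat 1.25 with the sixteenth-note grid, so the same performance yields different notated rhythms depending on the grid that the typesetter assumes.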

RWC music database: popular music

All of the following transcription examples are from the RWC (Real World Computing) popular music database, which contains 100 acoustic recordings of typical pop songs [1]. The same database was used in the development and evaluation of the transcription method. See the RWC music database homepage for additional information on the music performance examples. The authors of the RWC database kindly granted permission to make the following RWC input music files available.


Transcription examples

Song: Prologue (RWC-MDB-P-2001 No. 7)

Song: Ienai (RWC-MDB-P-2001 No. 11)

Song: Karehairo no Twilight (RWC-MDB-P-2001 No. 14)

Song: Tell me (RWC-MDB-P-2001 No. 25)

Song: Hajimari (RWC-MDB-P-2001 No. 45)

Song: Syodoubutsu (RWC-MDB-P-2001 No. 48)

Song: Modoranai natsu (RWC-MDB-P-2001 No. 51)

Song: Tenshi no utatane (RWC-MDB-P-2001 No. 59)

Song: Miageta sora wa (RWC-MDB-P-2001 No. 70)

Song: Toui machi e (RWC-MDB-P-2001 No. 75)

Song: Don't Lie To Me (RWC-MDB-P-2001 No. 97)