Author: Matti Ryynänen
Last modified: March 2008
M. Ryynänen, T. Virtanen, J. Paulus, and A. Klapuri, "Accompaniment Separation and Karaoke Application Based on Automatic Melody Transcription," in Proc. 2008 IEEE International Conference on Multimedia & Expo (ICME'08), Hannover, Germany, June 2008.
We propose a method for separating accompaniment from polyphonic music and its karaoke application, both based on automatic melody transcription. First, the method transcribes the lead-vocal melody of an existing polyphonic music piece, where the transcription consists of a MIDI note sequence and a detailed fundamental frequency (F0) trajectory for each note. Based on the note F0 trajectories, the method uses sinusoidal modeling to estimate, synthesize, and remove the lead vocals in the piece, thus producing a separated accompaniment of the piece.
In the karaoke application, the user sings along with the separated accompaniment, and the user's singing can be tuned to the transcribed melody. This helps non-professional singers produce more appealing karaoke performances. The quality of the separated accompaniments was quantitatively evaluated with approximately one hour of polyphonic music, including material from a commercial karaoke DVD.
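The separation step described above (estimate the lead vocals from the F0 trajectory, synthesize them, and subtract them from the mix) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a single constant F0 value per non-overlapping frame and fits harmonic sinusoids by least squares, whereas the paper works from detailed per-note F0 trajectories. The function name `remove_harmonic_source` and the parameters `frame_len` and `n_harmonics` are hypothetical, and the use of NumPy is an assumption.

```python
import numpy as np

def remove_harmonic_source(mix, f0_hz, sr, frame_len=1024, n_harmonics=10):
    """Subtract a least-squares harmonic fit from `mix`, frame by frame.

    Assumption: `f0_hz` holds one F0 value (in Hz) per non-overlapping
    frame of `frame_len` samples; a value <= 0 marks an unvoiced frame.
    """
    out = np.asarray(mix, dtype=float).copy()
    t = np.arange(frame_len) / sr  # time axis relative to the frame start
    for i, f0 in enumerate(f0_hz):
        start = i * frame_len
        frame = out[start:start + frame_len]
        if f0 <= 0 or len(frame) < frame_len:
            continue  # unvoiced or incomplete frame: leave untouched
        # Keep only the harmonics that lie below the Nyquist frequency.
        k = np.arange(1, n_harmonics + 1)
        k = k[k * f0 < sr / 2]
        if k.size == 0:
            continue
        # Cosine/sine pair per harmonic, so amplitude and phase are both free.
        phase = 2 * np.pi * np.outer(t, k * f0)
        basis = np.hstack([np.cos(phase), np.sin(phase)])
        # Estimate the harmonic weights, synthesize the vocals, and remove them.
        coef, *_ = np.linalg.lstsq(basis, frame, rcond=None)
        out[start:start + frame_len] = frame - basis @ coef
    return out
```

A frame-wise constant F0 with no overlap keeps the sketch short; a practical system would more likely use overlapping windows and time-varying sinusoid parameters to avoid audible frame-boundary artifacts.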
The following examples demonstrate the accompaniment separation on audio clips from commercial music recordings. The method takes the original song as an input and produces the separated accompaniment as an output, i.e., a version where the lead vocal melody has been suppressed.
|Song||Original song (input)||Separated accompaniment (output)|
|Audioslave: Shadow on the Sun||mp3||mp3|
|Gary Barlow: Love Won't Wait||mp3||mp3|
|Dredg: 18 People Live in Harmony||mp3||mp3|
|Jamiroquai: When You Gonna Learn||mp3||mp3|
|Laura Pausini: Surrender||mp3||mp3|
|Kelly Rowland: Stole||mp3||mp3|
|Roxette: It Must Have Been Love||mp3||mp3|
|Rush: Ghost Rider||mp3||mp3|
|Verve: Bitter Sweet Symphony||mp3||mp3|
The video (wmv, 4.8 MB) illustrates a prototype of the karaoke application. The user sings along with the separated accompaniment, and the user's pitch is shown as a green line. The red line shows the melody transcribed from the original recording. The user's singing can be tuned to the exact pitch of the original melody, or to the nearest octave of it to avoid unnatural pitch shifts of more than one octave.
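The two tuning modes mentioned above can be illustrated with a small pitch-shift calculation. The sketch below is one possible reading of "nearest octave", not the prototype's actual code: it moves the target by whole octaves until the required shift is at most half an octave. The function names `hz_to_midi` and `tuning_shift` are hypothetical, and the standard MIDI convention (note 69 = 440 Hz) is assumed.

```python
import math

def hz_to_midi(f_hz):
    """Convert a frequency in Hz to a (fractional) MIDI note number."""
    return 69.0 + 12.0 * math.log2(f_hz / 440.0)

def tuning_shift(user_hz, target_hz, nearest_octave=True):
    """Return the pitch shift, in semitones, to apply to the user's voice.

    With nearest_octave=False the voice is moved to the exact target pitch.
    With nearest_octave=True the target is transposed by whole octaves so
    that the shift stays within about +/- 6 semitones (an assumption).
    """
    shift = hz_to_midi(target_hz) - hz_to_midi(user_hz)
    if nearest_octave:
        shift -= 12.0 * round(shift / 12.0)
    return shift
```

For example, a user singing exactly one octave below the transcribed melody needs a 12-semitone shift in exact mode but no shift at all in nearest-octave mode, which keeps the processed voice closer to its natural timbre.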