Accompaniment Separation and Karaoke Application Based on Automatic Melody Transcription, examples

Author: Matti Ryynänen
Last modified: March 2008

M. Ryynänen, T. Virtanen, J. Paulus, and A. Klapuri, "Accompaniment Separation and Karaoke Application Based on Automatic Melody Transcription," in Proc. 2008 IEEE International Conference on Multimedia & Expo (ICME'08), Hannover, Germany, June 2008.

We propose a method for separating accompaniment from polyphonic music and its karaoke application, both based on automatic melody transcription. First, the method transcribes the lead-vocal melody of an existing polyphonic music piece, where the transcription consists of a MIDI note sequence and a detailed fundamental frequency (F0) trajectory for each note. Based on the note F0 trajectories, the method uses sinusoidal modeling to estimate, synthesize, and remove the lead vocals in the piece, thus producing a separated accompaniment of the piece.
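The removal step can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes a mono signal, one known F0 value per non-overlapping frame (standing in for the transcribed note F0 trajectories), and harmonics that stay stationary within a frame. Harmonic amplitudes and phases are fitted by least squares, which is one simple form of sinusoidal modeling. The function name `remove_harmonic_source` and the parameters `frame_len` and `n_harmonics` are hypothetical.

```python
import numpy as np


def remove_harmonic_source(x, sr, f0_per_frame, frame_len=1024, n_harmonics=10):
    """Estimate and subtract a harmonic source, frame by frame.

    x             mono signal (1-D array)
    sr            sample rate in Hz
    f0_per_frame  F0 in Hz for each frame; values <= 0 mark unvoiced frames
    Returns (estimated_source, residual), both the same length as x.
    """
    x = np.asarray(x, dtype=float)
    source = np.zeros_like(x)
    t = np.arange(frame_len) / sr

    for i, f0 in enumerate(f0_per_frame):
        start = i * frame_len
        frame = x[start:start + frame_len]
        if f0 <= 0 or frame.size == 0:
            continue  # unvoiced frame or past the end of the signal

        # Harmonic frequencies below the Nyquist limit.
        freqs = f0 * np.arange(1, n_harmonics + 1)
        freqs = freqs[freqs < sr / 2]
        if freqs.size == 0:
            continue

        # Cosine and sine columns let least squares fit amplitude and phase.
        arg = 2 * np.pi * np.outer(t[:frame.size], freqs)
        basis = np.hstack([np.cos(arg), np.sin(arg)])
        coeffs, *_ = np.linalg.lstsq(basis, frame, rcond=None)

        # Synthesize the harmonic part of this frame.
        source[start:start + frame.size] = basis @ coeffs

    return source, x - source
```

The residual plays the role of the separated accompaniment. A practical system would additionally need overlapping windows and F0 tracking within each frame, which this sketch leaves out.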

The user sings along with the separated accompaniment, as in karaoke, and the user's singing can be tuned to the transcribed melody. This helps non-professional singers produce more appealing karaoke performances. The quality of the separated accompaniments was quantitatively evaluated with approximately one hour of polyphonic music, including material from a commercial karaoke DVD.

Example from Evaluation

Song with vocals (mp3), i.e., the input signal for the method.
Estimated vocals in the input signal (mp3).
Separated accompaniment (mp3). This is the output of the method, produced by removing the estimated vocals from the input.
Original karaoke version (mp3). The separated accompaniment is compared with this signal for evaluation.
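This page does not say which measure was used for the comparison. Purely as an illustration, the sketch below computes a signal-to-noise ratio in decibels, treating the original karaoke version as the reference and the separated accompaniment as the estimate. The choice of SNR, the function name `snr_db`, and the requirement that both signals be time-aligned and equally long are assumptions.

```python
import math


def snr_db(reference, estimate):
    """Signal-to-noise ratio in dB of `estimate` against `reference`.

    Both arguments are equal-length, time-aligned sequences of samples.
    """
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")

    signal_energy = sum(r * r for r in reference)
    noise_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))

    if noise_energy == 0:
        return math.inf  # the estimate matches the reference exactly
    if signal_energy == 0:
        return -math.inf  # silent reference, non-silent error
    return 10.0 * math.log10(signal_energy / noise_energy)
```

A higher value means the separated accompaniment is closer to the reference karaoke track.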

Accompaniment Separation Examples on Commercial Recordings

The following examples demonstrate the accompaniment separation on audio clips from commercial music recordings. The method takes the original song as an input and produces the separated accompaniment as an output, i.e., a version where the lead vocal melody has been suppressed.

Song | Original song (input) | Separated accompaniment (output)
Audioslave: Shadow on the Sun | mp3 | mp3
Gary Barlow: Love Won't Wait | mp3 | mp3
Dredg: 18 People Live in Harmony | mp3 | mp3
Jamiroquai: When You Gonna Learn | mp3 | mp3
Laura Pausini: Surrender | mp3 | mp3
Kelly Rowland: Stole | mp3 | mp3
Roxette: It Must Have Been Love | mp3 | mp3
Rush: Ghost Rider | mp3 | mp3
Tool: Pot | mp3 | mp3
Verve: Bitter Sweet Symphony | mp3 | mp3

Example Video of the Karaoke Application

The video (wmv, 4.8 MB) illustrates a prototype of the karaoke application. The user sings along with the separated accompaniment, and the user's pitch is shown as a green line. The red line shows the melody transcribed from the original recording. The user's singing can be tuned either to the exact pitch of the original melody or to that pitch in the nearest octave, which avoids unnatural pitch shifts of more than one octave.
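The two tuning modes can be sketched in Python as follows. This is one plausible reading of the description, not the prototype's code: pitches are expressed as fractional MIDI note numbers, and "nearest octave" is taken to mean folding the required shift into the range of plus or minus six semitones by adding or removing whole octaves. The function names `hz_to_midi` and `tuning_shift_semitones` are hypothetical.

```python
import math


def hz_to_midi(f_hz):
    """Convert a frequency in Hz to a fractional MIDI note number (A4 = 440 Hz = 69)."""
    return 69.0 + 12.0 * math.log2(f_hz / 440.0)


def tuning_shift_semitones(user_midi, target_midi, nearest_octave=True):
    """Pitch shift, in semitones, that moves the user's pitch onto the melody.

    With nearest_octave=False the user is moved to the exact melody pitch.
    With nearest_octave=True the user is moved to the melody's pitch class
    in whichever octave is closest, so the shift lies in [-6, 6).
    """
    shift = target_midi - user_midi
    if nearest_octave:
        # Fold the shift into [-6, 6) by whole octaves (12 semitones).
        shift = (shift + 6.0) % 12.0 - 6.0
    return shift
```

For a user singing slightly sharp of A3 (MIDI 57.3) against a melody note A4 (MIDI 69), exact tuning requires a shift of about 11.7 semitones upward, while nearest-octave tuning only lowers the voice by about 0.3 semitones to A3.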