Author: Matti Ryynänen
Last modified: October 2007
Query by humming (QBH) refers to music information retrieval systems in which short audio clips of singing or humming act as queries. In a typical use case, a user wants to find a song in a large database of music recordings. If the user does not remember the name of the artist or the song and so cannot make a metadata query, a natural alternative is to sing, hum, or whistle part of the song's melody into a microphone and let a QBH system retrieve the song.
M. Ryynänen and A. Klapuri, "Query by Humming of MIDI and Audio Using Locality Sensitive Hashing," in Proc. 2008 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'08), Las Vegas, Nevada, USA, April 2008.
We propose a query by humming method based on locality sensitive hashing (LSH). The method constructs an index of melodic fragments by extracting pitch vectors from a database of melodies. In retrieval, the method automatically transcribes a sung query into notes and then extracts pitch vectors similarly to the index construction. For each query pitch vector, the method searches for similar melodic fragments in the database to obtain a list of candidate melodies. This is performed efficiently by using LSH. The candidate melodies are ranked by their distance to the entire query and returned to the user. To retrieve audio signals, we apply an automatic melody transcription method to construct the melody database directly from music recordings.
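The indexing and candidate search described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that a pitch vector is a mean-normalized window of MIDI note numbers (one value per note, so a transposed query still matches) and that the hash is a random-projection LSH for Euclidean distance. The names `pitch_vectors` and `LSHIndex`, and all parameter values, are hypothetical.

```python
import math
import random
from collections import defaultdict


def pitch_vectors(notes, window=4, hop=1):
    """Slide a window over a MIDI pitch sequence (assumed representation).

    Each fragment has its mean subtracted so that transposed copies of
    the same melody produce the same vector.
    """
    vectors = []
    for start in range(0, len(notes) - window + 1, hop):
        fragment = notes[start:start + window]
        mean = sum(fragment) / window
        vectors.append((start, [p - mean for p in fragment]))
    return vectors


class LSHIndex:
    """Random-projection LSH with several independent hash tables."""

    def __init__(self, dim, n_tables=8, n_proj=4, width=4.0, seed=0):
        rng = random.Random(seed)
        self.width = width
        self.tables = []
        for _ in range(n_tables):
            # Each projection is a random direction plus a random offset.
            projections = [
                ([rng.gauss(0.0, 1.0) for _ in range(dim)],
                 rng.uniform(0.0, width))
                for _ in range(n_proj)
            ]
            self.tables.append((projections, defaultdict(list)))

    def _key(self, projections, vec):
        # Quantize each projection into a bucket number; the tuple of
        # bucket numbers is the hash key for this table.
        return tuple(
            math.floor(
                (sum(a * x for a, x in zip(direction, vec)) + offset)
                / self.width
            )
            for direction, offset in projections
        )

    def add(self, vec, label):
        for projections, buckets in self.tables:
            buckets[self._key(projections, vec)].append(label)

    def query(self, vec):
        # Union of the labels sharing a bucket with vec in any table.
        found = set()
        for projections, buckets in self.tables:
            found.update(buckets.get(self._key(projections, vec), []))
        return found


# Toy database of two melodies given as MIDI note numbers (made-up data).
database = {
    "A": [60, 62, 64, 65, 67, 65, 64, 62],
    "B": [72, 67, 72, 67, 72, 67, 72, 67],
}
index = LSHIndex(dim=4)
for name, notes in database.items():
    for start, vec in pitch_vectors(notes):
        index.add(vec, (name, start))

# Query: the opening of melody "A", sung five semitones higher.
query_notes = [65, 67, 69, 70, 72]
candidates = set()
for _, vec in pitch_vectors(query_notes):
    candidates |= index.query(vec)
```

Here `candidates` holds `(melody, offset)` pairs whose fragments share a hash bucket with some query fragment. A full system would then rank these candidates by their distance to the entire query, as the abstract describes; that ranking step is omitted from this sketch.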
The following examples demonstrate QBH of music recordings with the proposed method. The method processes the user query and searches for similar melody segments in a database of 427 full music pieces. The retrieved melody segments are ranked according to their similarity to the query. The examples list the top three retrieved melody segments (the correct answer in bold face). In our experiments, the correct answer was returned first for 52% of 159 queries and within the top three for 58% of the queries. Due to melody transcription errors in the database construction, some of the retrieved melody segments may sound quite different from the query and surprising to human listeners.
Song searched for: Madonna - Frozen