Computational models of musical meter recognition

TAMPERE UNIVERSITY OF TECHNOLOGY
Department of Information Technology
Signal Processing Laboratory
SEPPÄNEN, JARNO: Computational models of musical meter recognition
Master of Science thesis, 61 pages, 11 enclosure pages
Examiners: Prof. Petri Haavisto, M.Sc. Anssi Klapuri, and M.Sc. Matti Hämäläinen
Funding: Nokia Oyj
November 2001
Keywords: music analysis, rhythm, meter, beat, tatum, phenomenal accent

Abstract

The thesis proposes an algorithm for the recognition of musical meter from acoustic signals of music. Musical meter is a part of rhythm that is constantly present in music, as it spans the musical time base. The proposed model is capable of finding metrical levels, including the beat and the tatum, in real time from a musical audio signal. The model comprises four main components: an onset detector, a tatum estimator, a phenomenal accent model, and a beat estimator. The onset detector finds distinct sound onsets from an acoustic signal, using multiband signal processing. After this, the tatum, which is the lowest metrical level, is computed from onset times. Phenomenal accents are computed from a set of 16 acoustic signal features using Bayesian pattern recognition. The tatum and the accents then yield the beat. The proposed model operates causally and is able to respond to tempo changes. The design of the model aims at generality in regard to musical genres, and thus the model is trained and tested using 330 music excerpts from multiple genres. The model performance varies according to the rhythmic difficulty of the input signal. Most pop/rock music poses no problems for the algorithm, while classical music and expressive jazz pieces are intractable. The model produces more errors than Eric Scheirer's beat tracker, but at the same time it follows more metrical levels than Scheirer's model. The results of this thesis are directly applicable in music production and post-processing. The access to musical time enables new levels of productivity and automation in both music software and hardware. Meter-synchronized comparison, mixing, and editing of pieces of music are possible. Robust meter recognition is a vital component of music information retrieval applications.

Documents

Jarno Seppänen, "Computational models of musical meter recognition", M.Sc. thesis, Tampere University of Technology, Tampere, Finland, November 2001.
(in PDF; in gzipped PostScript)



Tatum Grid Analysis of Musical Signals

Jarno Seppänen, jams@cs.tut.fi
Tampere University of Technology
Signal Processing Laboratory
Audio Research Group

Abstract

An algorithm for analyzing the rhythmic content of acoustic signals of polyphonic and multitimbral Western music is presented. The analysis consists of detecting sound onsets, computing an inter-onset interval (IOI) histogram, and estimating the duration of the shortest notes, i.e., the tatum period, from the histogram. Robustness against tempo changes has been explicitly built into the system by using short-term memory for the tatum grid estimation. The results are directly applicable to computational music processing for making a musically useful segmentation and computing a musical time base. The proposed algorithm works causally and a real-time software implementation is available on-line. The performance of the system was validated for 50 musical excerpts, and the algorithm was found to be capable of finding the tatum grid from music with a regular rhythm.
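The analysis chain summarized above (onsets, IOI histogram, tatum period) might be sketched roughly as follows, assuming the onset times are already available as a sorted list. The function names, the 10 ms bin width, the 2 s IOI limit, the search range, and the error tolerance are illustrative assumptions. The remainder-based criterion is a simple stand-in for an approximate greatest common divisor, not necessarily the estimator used in the paper, and the short-term memory is left out for brevity.

```python
from collections import Counter


def ioi_histogram(onsets, bin_width=0.01, max_ioi=2.0):
    """Histogram of inter-onset intervals between all onset pairs.

    onsets: sorted onset times in seconds.
    Returns a Counter mapping bin index (IOI / bin_width, rounded) to count.
    """
    hist = Counter()
    for i, t0 in enumerate(onsets):
        for t1 in onsets[i + 1:]:
            ioi = t1 - t0
            if ioi > max_ioi:
                break  # onsets are sorted, so later ones are even farther away
            hist[round(ioi / bin_width)] += 1
    return hist


def estimate_tatum(hist, bin_width=0.01, min_period=0.05,
                   max_period=0.5, tol=0.05):
    """Longest period (s) of which the IOIs are nearly integer multiples.

    For each candidate period, the error is the count-weighted mean distance
    of the IOI bins to the nearest multiple of the period, relative to the
    period. Candidates are tried from long to short, so the first one within
    the tolerance is the longest acceptable period.
    """
    total = sum(hist.values())
    if total == 0:
        return None
    lo = max(1, round(min_period / bin_width))
    hi = round(max_period / bin_width)
    for p in range(hi, lo - 1, -1):
        err = sum(c * min(b % p, p - b % p) for b, c in hist.items())
        if err / (total * p) <= tol:
            return p * bin_width
    return None
```

Searching from long to short periods matters because any integer fraction of the true tatum also divides the IOIs evenly, so the smallest-error period alone would not be unique.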

Introduction

Metrical structure is a fundamental property of music. It refers to the hierarchy of regular pulses that a listener intuitively attempts to infer from the timings and accents of perceived musical events. By far the most significant metrical level is the beat, or "tactus" [Lerdahl-Jackendoff1983a], and tapping along to the beat is a fundamental musical skill.

I intend to demonstrate that the tatum, or the lowest metrical level, is the next most important metrical level for computational music processing applications after the beat. The pulse intervals on all other metrical levels, including the beat, are integral multiples of the tatum, and this makes the tatum an ideal short-time segmentation for musical signals. The tatum may also work as a robust starting point for computational beat induction. (In some rudimentary cases the tatum may be equal to the beat.)

The term "metrical grid" is used to refer to a symbolic transcription of the whole metrical structure [Lerdahl-Jackendoff1983a]. The term "tatum grid" refers to the train of pulses on the lowest metrical level, and the term "tatum" refers to the period of the lowest-level pulse, i.e., the shortest notes present [Bilmes1993b].

This paper proposes a system for finding the tatum grid from acoustic musical signals. Prior to tatum estimation the audio signal is preprocessed with a sound onset detector. The onset detector tracks changes in the root-mean-square (RMS) amplitude envelope on multiple frequency bands and emits onset events at points of rapid level increase. The tatum estimator processes the stream of onsets causally, enabling the tracking of accelerandos and ritardandos by using an exponentially decaying window for past data. Rubatos and tempo changes are detected after a latency time dictated by the observation window length.
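The onset detection idea described above (emit an event when the RMS level rises rapidly) could look roughly like the following sketch. It works on a single full-band signal for brevity, whereas the system described here tracks multiple frequency bands. The function name, the 10 ms frame length, the 6 dB rise threshold, the 50 ms minimum gap, and the level floor are all assumptions made for illustration, not values taken from the paper.

```python
import math


def detect_onsets(samples, sample_rate, frame_len=0.01, rise_db=6.0,
                  min_gap=0.05, floor=1e-4):
    """Return onset times (s) where the frame RMS level jumps upward.

    samples: sequence of floats (mono audio).
    An onset is emitted at the start of a frame whose RMS level exceeds the
    previous frame's level by at least rise_db, provided that at least
    min_gap seconds have passed since the previous onset. Processing is
    causal: each decision uses only the current and the previous frame.
    """
    n = max(1, int(round(frame_len * sample_rate)))
    onsets = []
    prev_db = None
    last_onset = -math.inf
    for start in range(0, len(samples) - n + 1, n):
        frame = samples[start:start + n]
        rms = math.sqrt(sum(x * x for x in frame) / n)
        # Clamp to a floor so that digital silence has a finite level.
        level_db = 20.0 * math.log10(max(rms, floor))
        t = start / sample_rate
        if (prev_db is not None
                and level_db - prev_db >= rise_db
                and t - last_onset >= min_gap):
            onsets.append(t)
            last_onset = t
        prev_db = level_db
    return onsets
```

The onset times returned by such a detector would then be fed, one at a time, to the tatum estimator.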

The availability of the tatum grid makes it feasible to automatically measure musical time from a piece of music by using the tatum grid or some multiple of it as the time base. This enables applications which compute and handle metrical intervals between musical events in addition to computing absolute time intervals.
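As a small illustration of using the tatum grid as a time base, the sketch below maps absolute event times to the index of the nearest tatum pulse and measures the distance between two events in tatums. The grid is assumed to be given as a sorted, non-empty list of pulse times, so it may follow tempo changes. Both function names are hypothetical.

```python
from bisect import bisect_left


def to_tatum_index(event_time, grid):
    """Index of the tatum pulse nearest to event_time.

    grid: sorted, non-empty list of tatum pulse times in seconds.
    """
    i = bisect_left(grid, event_time)
    if i == 0:
        return 0
    if i == len(grid):
        return len(grid) - 1
    # Choose the closer of the two neighbouring pulses.
    if grid[i] - event_time < event_time - grid[i - 1]:
        return i
    return i - 1


def metrical_interval(t_a, t_b, grid):
    """Distance from the event at t_a to the event at t_b, in tatums."""
    return to_tatum_index(t_b, grid) - to_tatum_index(t_a, grid)
```

With a grid that speeds up slightly, two events that are 0.92 s apart in absolute time can still be reported as exactly four tatums apart, which is the kind of metrical interval the paragraph above refers to.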

The extraction of the tatum grid from a musical audio signal or an onset stream has not been discussed per se in any previous literature. Bilmes discusses creating a tatum grid that matches a performance, given complete metrical knowledge of the piece [Bilmes1993b]. The rule-based meter perception models of Lee [Lee1991a], Rosenthal [Rosenthal1992a], and Temperley and Sleator [Temperley-Sleator1999a] produce a transcription of the lowest metrical level as a by-product, but the principal focus of the models and the publications is on the beat.

The proposed system aims at generality in regard to musical genres. Experimental data comprising excerpts from jazz, rock, techno, classical, big band, and pop music was used to verify the performance of the system, with manually annotated beat positions as the metrical reference. Beat annotation was used because it is impossible to annotate the tatum grid in a real-time listening test.

Documents

Jarno Seppänen, "Tatum grid analysis of musical signals", Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, New York, Oct. 21--24, 2001. (in PDF)

Demonstrations

Please note that the system is causal: the first few clicks do not match the music until the system locks onto the correct grid. At tempo changes and similar points, the clicks adapt with a slight lag.

Original signal    Tatum added     Other
01orig1.wav        08click1.wav
03orig3.wav        10click3.wav
04orig4.wav        11click4.wav
05orig5.wav        12click5.wav
06orig6.wav        13click6.wav
07orig7.wav        14click7.wav    Remix!

Real-time implementation

The algorithm has been implemented in real time for the Pd computer music platform. You can download the real-time implementation from ftp://iem.kug.ac.at/pd/Externals/RHYTHM/, thanks to the IEM Graz team.



Last modified: Mon Feb 4 14:36:33 EET 2002 - Jarno Seppänen jams@cs.tut.fi