Research

Sound event detection in multisource environments

Active research topic 2010 onwards

The aim of sound event detection and classification is to recognize individual sounds in audio, allowing an understanding of what is happening in the scene. Individual sounds are referred to as sound events because they describe events taking place: a bird singing, a car passing by. As an audio content analysis problem, sound event detection aims to recognize both what is happening and when it is happening within an audio recording. The common approach in my work is supervised learning, where learning is based on annotations provided as a textual label with an associated onset and offset for each sound instance. This is called a strong label, and the system is expected to produce the same type of information for the test data. Methods I have developed for sound event detection include GMMs, HMMs, NMF, and, more recently, deep learning.
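For supervised learning with strong labels, the annotations are typically converted into frame-wise training targets. The sketch below assumes a simple (onset, offset, label) annotation format and hypothetical event names; it is a minimal illustration, not the exact pipeline used in the work described above.

```python
import numpy as np

# Hypothetical strong-label annotations: (onset_s, offset_s, label) per event.
# Overlapping events are allowed, which makes the task polyphonic.
annotations = [
    (0.0, 2.5, "bird_singing"),
    (1.0, 3.0, "car_passing_by"),
]

def strong_labels_to_event_roll(annotations, labels, duration_s, frame_s=0.02):
    """Convert strong labels into a binary frame-by-class target matrix
    (an "event roll") suitable for supervised training."""
    n_frames = int(np.ceil(duration_s / frame_s))
    roll = np.zeros((n_frames, len(labels)), dtype=np.int8)
    label_index = {label: i for i, label in enumerate(labels)}
    for onset, offset, label in annotations:
        start = int(np.floor(onset / frame_s))
        stop = int(np.ceil(offset / frame_s))
        roll[start:stop, label_index[label]] = 1
    return roll

labels = ["bird_singing", "car_passing_by"]
roll = strong_labels_to_event_roll(annotations, labels, duration_s=3.0)
print(roll.shape)   # (150, 2)
print(roll[75])     # frame at 1.5 s: both events active -> [1 1]
```

The same matrix format also serves as the system output to be produced for test data: the detector predicts frame-wise class activity, which can then be decoded back into labeled onset/offset segments.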

Semantics and annotation of sound scenes and events

Active research topic 2010 onwards

The audio annotation process is often slow and, depending on the required output, can be very difficult. For example, producing polyphonic strong labels requires listening to the audio sample repeatedly. Moreover, because humans are subjective in their choice of labels and time positioning, the resulting annotations contain a large variety of labels and inconsistent temporal boundaries for the annotated sound instances.
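One common way to deal with label variety is to map free-form annotator labels onto a fixed canonical vocabulary before training. The sketch below uses hypothetical label variants and class names purely for illustration:

```python
# Minimal sketch of label normalization: free-form annotator labels
# (hypothetical examples) are mapped onto a small canonical vocabulary
# so that training targets stay consistent across annotators.
CANONICAL = {
    "bird singing": "bird",
    "birdsong": "bird",
    "bird": "bird",
    "car passing by": "car",
    "car passes": "car",
    "passing car": "car",
}

def normalize_label(raw_label):
    """Lowercase, trim, and map to a canonical class; unknown labels are
    kept as-is so they can be reviewed manually."""
    key = raw_label.strip().lower()
    return CANONICAL.get(key, key)

raw = ["Bird singing", "birdsong ", "Car passing by", "dog barking"]
print([normalize_label(r) for r in raw])
# -> ['bird', 'bird', 'car', 'dog barking']
```

Inconsistent temporal boundaries are harder to fix automatically; they are usually handled instead by tolerant evaluation, for example segment-based metrics that compare annotation and system output over fixed-length segments rather than exact onsets and offsets.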

Acoustic Scene Classification

Active research topic 2010 onwards

Singing voice recognition, singer identification

Active research topic during 2006–2010