Automatic sound event detection aims to find the sound events in a continuous audio recording, label each instance and mark its onset and offset. Real-life recordings contain a large variety of sounds, often overlapping, making sound event detection a very challenging task.
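As a minimal sketch of what "marking onset and offset" can mean in practice, the snippet below turns a frame-level activity sequence for one sound class into a list of timed events. The function name `frames_to_events`, the boolean-per-frame input, and the `hop_seconds` parameter are assumptions made for illustration; they are not taken from any specific detection system. Overlapping sounds can be handled in this scheme by keeping one activity sequence per class and converting each independently.

```python
from typing import List, Tuple


def frames_to_events(active: List[bool], hop_seconds: float) -> List[Tuple[float, float]]:
    """Convert per-frame activity flags into (onset, offset) pairs in seconds.

    Hypothetical helper: `active[i]` says whether the class is detected in
    frame i, and `hop_seconds` is the assumed time step between frames.
    """
    events: List[Tuple[float, float]] = []
    onset_frame = None  # index of the frame where the current event started
    for i, is_active in enumerate(active):
        if is_active and onset_frame is None:
            # Inactive -> active transition: an event begins here.
            onset_frame = i
        elif not is_active and onset_frame is not None:
            # Active -> inactive transition: the event ends at this frame.
            events.append((onset_frame * hop_seconds, i * hop_seconds))
            onset_frame = None
    if onset_frame is not None:
        # The recording ended while the event was still active.
        events.append((onset_frame * hop_seconds, len(active) * hop_seconds))
    return events


# One activity track per class lets overlapping events coexist.
activity = {
    "speech": [False, True, True, False, True],
    "car": [True, True, True, False, False],
}
detected = {label: frames_to_events(flags, 0.5) for label, flags in activity.items()}
```

With a 0.5 s hop, the "speech" track above yields two events and the "car" track one, and the first speech event overlaps in time with the car event.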
Data collection and annotation form a delicate phase in the development of methods, because machine learning depends heavily on the type of data used in training. The alternatives range from expert manual annotation with strong or weak labels to crowdsourcing and active learning, and from a small number of target classes to a large, unrestricted number of freely chosen labels.
Singing voice is a central component of music, and when present, it immediately attracts the attention of the listener. Voice conveys a variety of information, including identity and semantic content, along with emotional and aesthetic elements. Music information retrieval tackles a variety of voice-related tasks, including singer identification and recognition of lyrics from singing.
Automatic classification of acoustic scenes aims at assigning a test audio recording to a category that describes its geographical or social context, for example, park, street, or meeting. While the classification problem itself is not highly difficult, the challenges brought by mismatched recording devices and open-set classification make this a very active research topic.