Annamaria Mesaros

Academy Research Fellow
(Assistant Professor)

My research area is Machine Listening, with a focus on the Detection and Classification of Acoustic Scenes and Events. My work covers environmental sound detection and classification tasks, data collection and annotation procedures, and evaluation methodology and metrics. I have been actively involved in promoting the DCASE Challenge and Workshop, both through coordination work and through the publication of open data and research.

Activity

RECRUITING!!

Open positions

I am recruiting one postdoc and two PhD researchers. Click on the news article for the link to more details and application information.

New publication and dataset for audio captioning

Publications

Our new paper “Diversity and Bias in Audio Captioning Datasets” by Irene Martin Morato and Annamaria Mesaros will be presented at the DCASE 2021 Workshop in November. The dataset is also available.

Presentation on reproducible research

Presentations

I had the pleasure of giving a presentation on the role of evaluation in reproducible research at a CNRS seminar. Slides are available.

Latest publications

Audio-Visual Scene Classification: Analysis of DCASE 2021 Challenge Submissions

Conf

Shanshan Wang, Annamaria Mesaros, Toni Heittola and Tuomas Virtanen

Diversity and Bias in Audio Captioning Datasets

Conf

Irene Martin Morato and Annamaria Mesaros

Low-Complexity Acoustic Scene Classification for Multi-Device Audio: Analysis of DCASE 2021 Challenge Systems

Conf

Irene Martin, Toni Heittola, Annamaria Mesaros and Tuomas Virtanen

Towards Sonification in Multimodal and User-friendly Explainable Artificial Intelligence

Conf

Björn Schuller, Tuomas Virtanen, Maria Riveiro, Georgios Rizos, Jing Han, Annamaria Mesaros and Konstantinos Drossos

Crowdsourcing Strong Labels for Sound Event Detection

Conf

Irene Martin Morato, Manu Harju and Annamaria Mesaros

All publications

Research

Sound event detection in multisource environments

Automatic sound event detection aims to find the sound events in a continuous audio recording, label each instance and mark its onset and offset. Real-life recordings contain a large variety of sounds, often overlapping, making sound event detection a very challenging task.

Semantics and annotation of sound scenes and events

Data collection and annotation are a delicate phase in the development of methods, as machine learning depends heavily on the type of data used in training. Different alternatives exist, from manual annotation by experts with strong or weak labels to crowdsourcing and active learning, using either a small number of target classes or a large, unrestricted number of freely chosen labels.

Singing voice recognition, singer identification

Singing voice is a central component of music and, when present, immediately attracts the listener's attention. Voice conveys a variety of information, including identity and semantic content, along with emotional and aesthetic elements. Music information retrieval tackles a variety of voice-related tasks, including singer identification and recognition of lyrics from singing.

Acoustic scene classification

Automatic classification of acoustic scenes aims to assign a test audio recording to a category that describes its geographical or social context, for example park, street, or meeting. While the classification problem itself is not especially difficult, the challenges posed by mismatched recording devices and open-set classification make this a very active research topic.

More about research
