Download

Audio dataset

Development datasets are currently available. Evaluation datasets without ground truth will be released shortly before the submission deadline.

1. Acoustic scene classification

In case you are using the provided baseline system, there is no need to download the dataset, as the system will automatically download the needed dataset for you.

In publications using the datasets, cite as:

Annamaria Mesaros, Toni Heittola, and Tuomas Virtanen, TUT database for acoustic scene classification and sound event detection, In 24th European Signal Processing Conference 2016 (EUSIPCO 2016). Budapest, Hungary, 2016. PDF

2. Sound event detection in synthetic audio

3. Sound event detection in real life audio

In case you are using the provided baseline system, there is no need to download the datasets, as the system will automatically download the needed datasets for you.

In publications using the datasets, cite as:

Annamaria Mesaros, Toni Heittola, and Tuomas Virtanen, TUT database for acoustic scene classification and sound event detection, In 24th European Signal Processing Conference 2016 (EUSIPCO 2016). Budapest, Hungary, 2016. PDF

4. Domestic audio tagging

In case you are using the provided baseline system, there is no need to download the datasets, as the system will automatically download the needed datasets for you.

Submissions

All submissions (system outputs and technical reports describing the systems) are published as one package. This package is meant to archive the DCASE2016 Challenge outcome and to enable later evaluation with additional evaluation metrics.

Baseline systems

The baseline systems are meant to implement a basic approach to the tasks and to provide a comparison point for participants while they develop their own systems.

1+3. Acoustic scene classification and sound event detection in real life audio

The baseline systems for task 1 and task 3 share the same code base and implement quite similar approaches for both tasks. The baseline systems automatically download the needed datasets and produce the reported baseline results when run with the default parameters.

Baseline systems are provided for both Python and Matlab. The Python implementation is regarded as the main implementation; the Matlab implementation replicates its code structure to allow easy switching between platforms. The implementations are not intended to produce exactly the same results: the differences come from the libraries used for MFCC extraction (RASTAMAT vs. Librosa) and for GMM modeling (VOICEBOX vs. scikit-learn).

Participants are allowed to build their systems on top of the given baseline systems. The systems include all the needed functionality for dataset handling, storing and accessing features and models, and evaluating results, which makes adapting them to one's needs rather easy. The baseline systems are also a good starting point for entry-level researchers.

The baseline systems also provide reference implementations of the evaluation metrics.
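As a rough sketch of the MFCC + GMM approach the baselines implement, the snippet below fits one diagonal Gaussian per scene (a single-component simplification of the GMM models) and classifies a recording by the highest average frame log-likelihood. The features here are synthetic random data and all function names are illustrative assumptions, not the baseline's actual API.

```python
import numpy as np

def fit_scene_model(frames):
    """Fit a diagonal Gaussian (single-component stand-in for a GMM)."""
    mean = frames.mean(axis=0)
    var = frames.var(axis=0) + 1e-6  # variance floor for numerical stability
    return mean, var

def avg_log_likelihood(frames, model):
    """Average per-frame log-likelihood under a diagonal Gaussian."""
    mean, var = model
    ll = -0.5 * (np.log(2.0 * np.pi * var) + (frames - mean) ** 2 / var)
    return ll.sum(axis=1).mean()

def classify(frames, models):
    """Pick the scene whose model explains the frames best."""
    return max(models, key=lambda scene: avg_log_likelihood(frames, models[scene]))

# Synthetic 12-dimensional "MFCC" features for two scenes
rng = np.random.default_rng(0)
train = {
    "beach": rng.normal(0.0, 1.0, size=(200, 12)),
    "office": rng.normal(3.0, 1.0, size=(200, 12)),
}
models = {scene: fit_scene_model(f) for scene, f in train.items()}

test_frames = rng.normal(3.0, 1.0, size=(50, 12))  # an office-like recording
print(classify(test_frames, models))
```

The actual baselines use multi-component GMMs (scikit-learn in Python, VOICEBOX in Matlab) over MFCC features, but the decision rule, picking the scene model with the highest accumulated log-likelihood, is the same.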

In publications using the datasets, cite as:

Annamaria Mesaros, Toni Heittola, and Tuomas Virtanen, TUT database for acoustic scene classification and sound event detection, In 24th European Signal Processing Conference 2016 (EUSIPCO 2016). Budapest, Hungary, 2016. PDF

Python implementation

Latest release (version 1.0.6) (.zip)

Matlab implementation

Latest release (version 1.0.5) (.zip)

2. Sound event detection in synthetic audio

Matlab implementation

Latest release (version 1.0.2)

4. Domestic audio tagging

Python implementation

Evaluation metric code

1. Acoustic scene classification

Code is available with the baseline system:

  • Python implementation: from src.evaluation import DCASE2016_SceneClassification_Metrics.
  • Matlab implementation: use the class src/evaluation/DCASE2016_SceneClassification_Metrics.m.
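At its core, scene classification evaluation compares predicted and reference labels. The stand-alone sketch below computes class-wise and overall accuracy; it is an illustration only, not the DCASE2016_SceneClassification_Metrics implementation.

```python
def class_wise_accuracy(references, estimates):
    """Per-class and overall accuracy for scene labels (illustrative,
    not the baseline's metrics class)."""
    classes = sorted(set(references))
    per_class = {}
    for c in classes:
        idx = [i for i, r in enumerate(references) if r == c]
        per_class[c] = sum(estimates[i] == c for i in idx) / len(idx)
    overall = sum(r == e for r, e in zip(references, estimates)) / len(references)
    return per_class, overall

ref = ["beach", "beach", "office", "office"]
est = ["beach", "office", "office", "office"]
print(class_wise_accuracy(ref, est))  # beach 0.5, office 1.0; overall 0.75
```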

2. Sound event detection in synthetic audio

Code is available with the baseline system. Use classes:

  • metrics/DCASE2016_EventDetection_SegmentBasedMetrics.m
  • metrics/DCASE2016_EventDetection_EventBasedMetrics.m

3. Sound event detection in real life audio

Code is available with the baseline system:

  • Python implementation: from src.evaluation import DCASE2016_EventDetection_SegmentBasedMetrics and from src.evaluation import DCASE2016_EventDetection_EventBasedMetrics.
  • Matlab implementation: use the classes src/evaluation/DCASE2016_EventDetection_SegmentBasedMetrics.m and src/evaluation/DCASE2016_EventDetection_EventBasedMetrics.m.
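To give an idea of what the segment-based metrics compute, the sketch below discretizes the timeline into fixed-length segments (1 s here) and scores label activity per segment. It is a simplified stand-in written for this page, not the DCASE2016_EventDetection_SegmentBasedMetrics class.

```python
def segment_based_f1(reference, estimated, duration, seg=1.0):
    """Segment-based F-score: label activity compared in fixed-length
    time segments. Events are (onset, offset, label) tuples."""
    def active(events, t0, t1):
        # Labels of all events overlapping the segment [t0, t1)
        return {label for onset, offset, label in events
                if onset < t1 and offset > t0}

    tp = fp = fn = 0
    t = 0.0
    while t < duration:
        ref = active(reference, t, t + seg)
        est = active(estimated, t, t + seg)
        tp += len(ref & est)
        fp += len(est - ref)
        fn += len(ref - est)
        t += seg
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

reference = [(0.5, 2.5, "speech"), (4.0, 6.0, "car")]
estimated = [(0.4, 2.4, "speech"), (5.0, 7.0, "car")]
print(segment_based_f1(reference, estimated, duration=8.0))
```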

sed_eval - Evaluation toolbox for Sound Event Detection

sed_eval contains the same metrics as the baseline system, and they have been tested to give the same values.

4. Domestic audio tagging

Equal error rate (EER).
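The equal error rate is the operating point where the false-accept rate equals the false-reject rate. The sketch below sweeps the decision threshold over per-clip tag scores to find that crossing; it is an illustrative implementation, not the challenge's official evaluation code.

```python
def equal_error_rate(scores, labels):
    """EER for a binary tagging task: sweep the threshold from high to
    low until the false-accept rate meets the false-reject rate."""
    pairs = sorted(zip(scores, labels), reverse=True)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = fp = 0
    for score, label in pairs:
        if label:
            tp += 1
        else:
            fp += 1
        # far >= frr, tested exactly with integers:
        # fp / n_neg >= (n_pos - tp) / n_pos
        if fp * n_pos >= (n_pos - tp) * n_neg:
            far = fp / n_neg               # false accepts among negatives
            frr = (n_pos - tp) / n_pos     # false rejects among positives
            return (far + frr) / 2
    return 1.0

scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]   # classifier confidence per clip
labels = [1, 1, 0, 1, 0, 0]               # 1 = tag present in reference
print(equal_error_rate(scores, labels))
```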

Repository (Python & Matlab)

Toolboxes

sed_eval - Evaluation toolbox for Sound Event Detection

sed_eval contains the same metrics as the baseline system, and they have been tested to give the same values. Use the parameters time_resolution=1 and t_collar=0.250 to align it with the baseline system results.
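The t_collar parameter above belongs to the event-based metrics, where a system event counts as correct if it matches a reference event of the same label within a time collar. The sketch below illustrates the idea with onset-only matching (the full metric also checks offsets); it is a simplified stand-in, not sed_eval or the baseline code.

```python
def event_based_f1(reference, estimated, t_collar=0.250):
    """Event-based F-score with onset collar matching. Events are
    (onset, offset, label) tuples; each reference matches at most once."""
    matched = set()
    tp = 0
    for onset, offset, label in estimated:
        for i, (r_on, r_off, r_label) in enumerate(reference):
            if i not in matched and label == r_label and abs(onset - r_on) <= t_collar:
                matched.add(i)
                tp += 1
                break
    precision = tp / len(estimated) if estimated else 0.0
    recall = tp / len(reference) if reference else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

reference = [(0.5, 2.5, "speech"), (4.0, 6.0, "car")]
estimated = [(0.6, 2.4, "speech"), (4.6, 6.0, "car")]  # car onset off by 0.6 s
print(event_based_f1(reference, estimated))
```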

sed_vis - Visualization toolbox for Sound Event Detection

sed_vis is a toolbox for visually inspecting sound event annotations and playing back the audio while following the annotations. The annotations are visualized with an event-roll.