Frequently Asked Questions

DCASE2016 Challenge

Organization

No, the workshop is an independent event. The challenge results will be presented there, but challenge participants do not have to attend the workshop. The results will also be published on this website.
No, the same paper can be submitted as a technical report to the DCASE2016 challenge.
It seems there is always room for a small extension. However, due to the interlaced scheduling of the challenge and the workshop, and the rather tight schedule of the DCASE2016 workshop review process, there will not be a long extension.

Datasets

No, only the provided development datasets can be used for training the system. We impose this restriction to ensure a fair comparison of the algorithms.
The use of real recordings to augment the training datasets is not allowed, as this would put participants in very different positions (some participants may have access to larger datasets than others). However, sampling data from a probability distribution (e.g. white noise, pink noise) and mixing it with the given training recordings is allowed.
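As an illustration, here is a minimal sketch of the allowed kind of augmentation: sampling white noise from a normal distribution and mixing it into a training recording at a chosen signal-to-noise ratio. The file name, the SNR value and the use of the soundfile library are placeholders, not part of the challenge setup.

```python
import numpy as np
import soundfile as sf  # any audio I/O library will do; this one is only an example

def mix_noise(audio, snr_db):
    """Mix white noise (sampled from a normal distribution) into a signal at the given SNR."""
    noise = np.random.randn(*audio.shape)
    signal_power = np.mean(audio ** 2)
    noise_power = np.mean(noise ** 2)
    # Scale the noise so that the resulting signal-to-noise ratio equals snr_db
    scale = np.sqrt(signal_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return audio + scale * noise

# Hypothetical usage: augment one development recording at 20 dB SNR
audio, fs = sf.read('example_recording.wav')   # placeholder file name
sf.write('example_recording_noisy.wav', mix_noise(audio, snr_db=20), fs)
```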
No, participants should provide only a single output per tested audio recording. If your system reports output for each channel separately, you must combine them into a single output for the submission.
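For example, if your classifier produces class probabilities per channel, a simple way to merge them is to average the probabilities and pick the most likely class. This is only a sketch under that assumption; the channel probabilities and class names below are hypothetical, and other fusion rules (e.g. majority voting) are equally acceptable.

```python
import numpy as np

def merge_channel_outputs(channel_probs, class_labels):
    """Combine per-channel class probabilities into a single decision per recording.

    channel_probs: array of shape (n_channels, n_classes), an assumed
    intermediate output of your system.
    """
    averaged = channel_probs.mean(axis=0)          # average over channels
    return class_labels[int(np.argmax(averaged))]  # report one label only

# Hypothetical usage with two channels and three scene classes
probs = np.array([[0.7, 0.2, 0.1],
                  [0.5, 0.3, 0.2]])
print(merge_channel_outputs(probs, ['beach', 'bus', 'cafe/restaurant']))
```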
Reference metadata has been published for all tasks. The data is available on the download page.

TUT Acoustic scenes 2016

Not necessarily, but we recommend using the provided cross-validation setup.

The original data collected for the dataset consisted of 3-5 minute long audio recordings, which were cut into 30-second segments. The partitioning of the data into the cross-validation subsets was done based on the recording location of the original recordings. In the provided setup, all segments obtained from the same original recording are included in a single subset, either the training subset or the test subset.

If you create folds without accounting for this, you may obtain over-optimistic results, because the system learns the acoustic characteristics of the scene at that specific location.

The file names for the segments indicate which files are part of the same longer recording, so if you use different cross-validation splits, you should partition the data based on this information.
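One way to honour this grouping when building your own folds is a group-aware splitter such as scikit-learn's GroupKFold, with the identifier of the original recording as the group. This is only a sketch: the file names, labels and the assumption that the token before the first underscore identifies the original recording are illustrative, so derive the actual grouping from the provided metadata.

```python
from sklearn.model_selection import GroupKFold

# Hypothetical file list; in practice read it from the development set metadata
files  = ['a001_10_40.wav', 'a001_40_70.wav', 'b052_0_30.wav', 'b052_30_60.wav']
labels = ['beach', 'beach', 'bus', 'bus']

# Assumption: the token before the first underscore identifies the original long
# recording; all segments sharing it must end up in the same subset.
groups = [name.split('_')[0] for name in files]

gkf = GroupKFold(n_splits=2)
for train_idx, test_idx in gkf.split(files, labels, groups=groups):
    train_files = [files[i] for i in train_idx]
    test_files = [files[i] for i in test_idx]
    print('train:', train_files, 'test:', test_files)  # no recording appears in both
```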

These errors affect a very small proportion of the total duration of the audio. In our tests with the development set folds, there was no statistically significant difference in performance when the baseline system was trained on clean audio (excluding the annotated error regions) and tested on clean audio (excluding the error regions when feeding the test samples to the system).

Annotations of these errors are provided with the development dataset, and you are allowed to take them into account when training your system. The errors are radio interference from mobile phones and temporary microphone failures, and they affect approximately 4% of the files. Time-wise, however, they affect only approximately 1% of the total duration of the audio.
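If you choose to use the annotations, one straightforward option is to drop the annotated regions before feature extraction. The sketch below assumes the error annotations have been parsed into (onset, offset) pairs in seconds; the annotation format and the example times are assumptions, not the official parsing code.

```python
import numpy as np

def remove_error_regions(audio, fs, error_segments):
    """Remove annotated error regions from a signal before feature extraction.

    error_segments: list of (onset, offset) pairs in seconds, assumed to be
    parsed from the error annotations shipped with the development set.
    """
    keep = np.ones(len(audio), dtype=bool)
    for onset, offset in error_segments:
        keep[int(onset * fs):int(offset * fs)] = False
    return audio[keep]

# Hypothetical usage: cut out a mobile phone interference burst at 12.3-13.1 s
# clean_audio = remove_error_regions(audio, fs, [(12.3, 13.1)])
```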

The evaluation data is selected such that there will be no errors in the audio.

