The list of datasets is currently maintained under DCASE Datalist
Introduction
Collecting audio data and annotating it manually are both tedious processes, and the lack of proper development datasets slows progress in environmental audio research. These tables collect datasets suitable for environmental audio research from a time when few datasets were available (before 2020). In addition to freely available datasets, proprietary and commercial datasets are listed for completeness.
Environmental audio
The datasets are divided into two tables. The Sound events table contains datasets suitable for research on automatic sound event detection and automatic sound tagging. The Acoustic scenes table contains datasets suitable for research on audio-based context recognition and acoustic scene classification.
Sound events
Provider | Name | Audio type | Data source | Acoustic scenes | Annotation type | Context count | Event classes | Event count | Instances per class | Audio files | Mins | Context classes | Licence | Link | Pub |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Dares | G1 | Live | Authors | Mixed | Events | 28 | 761 | 3214 | 4.2 | 123 | 123 | Free | http://www.daresounds.org/ | http://ieeexplore.ieee.org/document/6637761;Mesaros2013a, http://ieeexplore.ieee.org/document/6616142;Mesaros2013b | |
Dares | Amstel | Live | Authors | Single | Events | 1 | 13 | 1002 | 77.1 | 40 | 54 | train station | Free | http://www.daresounds.org/ | |
Sound Ideas | BBC Sound Effects Library | Isolated | Authors | Mixed | Description | 0 | 0 | 1655 | 1655 | 0 | Commercial | http://www.sound-ideas.com/sound-effects/bbc-1-40-cds-sound-effects-library.html | http://ieeexplore.ieee.org/document/6269853;Kim2012a, http://journals.cambridge.org/action/displayFulltext?type=6&fid=8779081&jid=SIP&volumeId=1&issueId=-1&aid=8779080&fulltextType=RA&fileId=S2048770312000078;Kim2012b, http://sail.usc.edu/~malandra/files/papers/interspeech2013.pdf;Malandrakis2013 | ||
TUT | CASA 2009 | Live | Authors | Mixed | Events | 10 | 208 | 10326 | 49.6 | 103 | 1133 | basketball game, beach, hallway, inside a bus, inside a car, office, restaurant, stadium with track and field event, street, supermarket | Proprietary | http://ieeexplore.ieee.org/document/6639360;Heittola2013a, http://asmp.eurasipjournals.com/content/2013/1/1/abstract;Heittola2013b, http://www.cs.tut.fi/~heittolt/pubs/chime2011_heittola.pdf;Heittola2011, http://www.cs.tut.fi/~mesaros/pubs/plsa.pdf;Mesaros2011, http://www.cs.tut.fi/~mesaros/pubs/acoustic_event_detection_1406.pdf;Mesaros2010, http://www.cs.tut.fi/~heittolt/pubs/eusipco2010_heittola.pdf;Heittola2010 | |
TUT | CASA 2010 | Live | Authors | Mixed | Events | 16 | 289 | 4173 | 14.4 | 160 | 535 | Proprietary | |||
ELRA | CHIL 2007 Evaluation Package | Live | Authors | Single | Events | 1 | 0 | 0 | 0 | 0 | meeting room | Commercial | http://catalog.elra.info/product_info.php?products_id=1092 | http://dl.acm.org/citation.cfm?id=1840231;Zhuang2010 | |
IEEE AASP Challenge 2013 | Event isolated | Isolated | Authors | Single | Tags | 1 | 16 | 639 | 39.9 | 320 | 19 | office | Free | http://c4dm.eecs.qmul.ac.uk/sceneseventschallenge/description.html | http://c4dm.eecs.qmul.ac.uk/sceneseventschallenge/;Results |
IEEE AASP Challenge 2013 | Event live | Live | Authors | Single | Events | 1 | 16 | 205 | 12.8 | 3 | 5 | office | Free | http://c4dm.eecs.qmul.ac.uk/sceneseventschallenge/description.html | http://c4dm.eecs.qmul.ac.uk/sceneseventschallenge/;Results |
IEEE AASP Challenge 2013 | Event synthetic | Live | Authors | Single | Events | 1 | 15 | 310 | 20.7 | 9 | 14 | office | Free | http://c4dm.eecs.qmul.ac.uk/sceneseventschallenge/description.html | http://c4dm.eecs.qmul.ac.uk/sceneseventschallenge/;Results |
NII-SRC | RWCP Sound Scene Database | Isolated | Authors | Mixed | Tags | 0 | 14 | 9722 | 694.4 | 9722 | 0 | Research and development use only | http://www.openslr.org/13/ | http://ieeexplore.ieee.org/document/1035570;Nishiura2002, http://link.springer.com/article/10.1007%2Fs00779-005-0045-4;Smith2006, http://ieeexplore.ieee.org/document/6637759;Dennis2013 | |
QMUL | Freefield1010 | Live | Freesound | Mixed | Tags | 0 | 0 | 7690 | 7690 | 1282 | Creative Commons | http://c4dm.eecs.qmul.ac.uk/rdr/handle/123456789/35 | http://arxiv.org/abs/1309.5275;Stowell2014 | ||
INRIA | NAR | Isolated | Authors | Mixed | Tags | 4 | 41 | 42 | 1.0 | 852 | 8 | Creative Commons | https://team.inria.fr/perception/nard/ | https://hal.inria.fr/hal-00768767/en;Janvier2014,https://hal.inria.fr/hal-00952092/en;Janvier2012 | |
NYU | UrbanSound | Live | Freesound | Single | Events | 1 | 10 | 3075 | 307.5 | 1302 | 1620 | street | Creative Commons | http://urbansounddataset.weebly.com/urbansound.html | http://www.justinsalamon.com/uploads/4/3/9/4/4394963/salamon_urbansound_acmmm14.pdf;Salamon2014 |
NYU | UrbanSound8K | Live | Freesound | Street | Tags | 1 | 10 | 8732 | 873.2 | 8732 | 525 | street | Creative Commons | http://urbansounddataset.weebly.com/urbansound8k.html | http://www.justinsalamon.com/uploads/4/3/9/4/4394963/salamon_urbansound_acmmm14.pdf;Salamon2014 |
ELRA | FBK-Irst database of isolated meeting-room acoustic events | Isolated | Authors | Single | Tags | 1 | 16 | 576 | 36.0 | 288 | 63 | meeting room | Free | http://catalog.elra.info/product_info.php?products_id=1093 | http://www.ee.columbia.edu/~dpwe/pubs/CottonE11-spectrotemporal.pdf;Cotton2011 |
ELRA | UPC-TALP database of isolated meeting-room acoustic events | Live | Authors | Single | Events | 1 | 14 | 1026 | 73.3 | 3 | 0 | meeting room | Commercial | http://catalog.elra.info/product_info.php?products_id=1053 | |
TU Dortmund | Acoustic event dataset | Isolated | Authors | Single | Events | 1 | 12 | 235 | 19.6 | 23 | 34 | meeting room | Free | http://patrec.cs.tu-dortmund.de/files/icassp2014aed_dortmund.zip | http://patrec.cs.tu-dortmund.de/pubs/papers/Plinge2014-BOF.pdf;Plinge2014 |
CICESE | Sound Events | Isolated | Authors | Single | Tags | 1 | 20 | 1367 | 68.3 | 1367 | 92 | Free | http://sound.natix.org/databases/allSounds.zip | http://www.sciencedirect.com/science/article/pii/S0167865515002925;Beltran2015 | |
MIVIA | Audio Events Data Set for Surveillance Applications | Live | Authors | Single | Events | 1 | 3 | 6000 | 2000.0 | 760 | 2279 | Free | http://mivia.unisa.it/datasets/audio-analysis/mivia-audio-events/ | http://www.sciencedirect.com/science/article/pii/S0167865515001981;Foggia2015 | |
MIVIA | Audio Events Data Set for Road Surveillance Applications | Live | Authors | Single | Events | 1 | 2 | 400 | 200.0 | 57 | 60 | Free | http://mivia.unisa.it/datasets/audio-analysis/mivia-road-audio-events-data-set/ | http://ieeexplore.ieee.org/document/6918643;Foggia2014 | |
Freiburg | Freiburg-106, Audio Data Set for Human Activity Recognition | Live | Authors | Single | Tags | 1 | 24 | 1524 | 63.5 | 1524 | 54 | kitchen | Free | http://www.csc.kth.se/~jastork/pages/datasets.html | http://ieeexplore.ieee.org/document/6343802;Stork2012, http://www.eurasip.org/Proceedings/Eusipco/Eusipco2015/papers/1570103447.pdf;Phan2015 |
ESC | ESC-50 | Live | Freesound | Mixed | Tags | 0 | 50 | 2000 | 40.0 | 2000 | 166 | Free | https://github.com/karoldvl/ESC-50 | http://karol.piczak.com/papers/Piczak2015-ESC-Dataset.pdf;Piczak2015, http://karol.piczak.com/papers/Piczak2015-ESC-ConvNet.pdf;Piczak2015b | |
ESC | ESC-10 | Live | Freesound | Mixed | Tags | 0 | 10 | 400 | 40.0 | 400 | 33 | Free | https://github.com/karoldvl/ESC-10 | http://karol.piczak.com/papers/Piczak2015-ESC-Dataset.pdf;Piczak2015, http://karol.piczak.com/papers/Piczak2015-ESC-ConvNet.pdf;Piczak2015b | |
ESC | Dataset for Environmental Sound Classification | Live | Freesound | Mixed | Tags | 0 | 50 | 2000 | 40.0 | 250000 | 20833 | Free | https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/YDEPUT | http://karol.piczak.com/papers/Piczak2015-ESC-Dataset.pdf;Piczak2015a, http://karol.piczak.com/papers/Piczak2015-ESC-ConvNet.pdf;Piczak2015b | |
TUT | TUT Sound events 2016, Development dataset | Live | Authors | Mixed | Events | 2 | 18 | 954 | 53.0 | 22 | 78 | home, residential area | Free | https://zenodo.org/record/45759 | |
TU Dortmund | Multi-channel acoustic event dataset | Isolated | Authors | Single | Events | 1 | 20 | 437 | 21.9 | 37 | 90 | conference room | Free | http://patrec.cs.tu-dortmund.de/files/datasets/dcase2016_multichannel-aed_dortmund.7z | http://patrec.cs.tu-dortmund.de/pubs/papers/Kuerby2016-BAE;Kurby2016 |
TUT | TUT-SED Synthetic 2016 | Synthetic | BBC Sound Effects | Single | Events | 1 | 16 | 100 | 566 | synthetic | Free | http://www.cs.tut.fi/sgn/arg/taslp2017-crnn-sed/tut-sed-synthetic-2016 | https://arxiv.org/pdf/1702.06286;Cakir2017,http://www.cs.tut.fi/~tuomasv/papers/end_to_end_sed_with_crnn_ijcnn_2018.pdf;Cakir2018 | ||
TUT | TUT Sound events 2017, Development dataset | Live | Authors | Mixed | Events | 2 | 6 | 729 | 121.5 | 24 | 92 | street | Free | https://zenodo.org/record/400516 | |
TUT | TUT Rare sound events 2017, Development dataset | Synthetic | Freesound, TUT Acoustic scenes 2016 | Single | Events | 1 | 3 | 1281 | 625 | synthetic | Free | https://zenodo.org/record/401395#.W9muKlVfi7A | |||
TUT | TUT Rare sound events 2017, Evaluation dataset | Synthetic | Freesound, TUT Acoustic scenes 2016 | Single | Events | 1 | 3 | 1500 | 1500.0 | 3000 | 1500 | synthetic | Free | https://zenodo.org/record/1160455#.W9muM1Vfi7A | |
Google | AudioSet | Live | Youtube | Mixed | Tags | 632 | 2084320 | 3297.9 | 2084320 | 347386 | Free | https://research.google.com/audioset/ | https://research.google.com/pubs/pub45857.html;Gemmeke2017 | |||
NYU | URBAN-SED | Synthetic | Freesound | Mixed | Events | 10 | 50000 | 5000.0 | 10000 | 1800 | Free | http://urbansed.weebly.com/ | http://www.justinsalamon.com/uploads/4/3/9/4/4394963/salamon_scaper_waspaa_2017.pdf;Salamon2017 | ||
KU Leuven | SINS database, DCASE 2018 task 5 development dataset | Live | Authors | Home | Tags | 9 | 72984 | 8109.0 | 72984 | 12000 | Free | https://zenodo.org/record/1247102#.W9mjfFVfiEI | http://dcase.community/documents/workshop2017/proceedings/DCASE2017Workshop_Dekkers_141.pdf;Dekkers2017 | ||
KU Leuven | SINS database, DCASE 2018 task 5 evaluation dataset | Live | Authors | Home | Tags | 9 | 72972 | 8108.0 | 72972 | 12000 | Free | https://zenodo.org/record/1291760#.W9mnelVfiEI | http://dcase.community/documents/workshop2017/proceedings/DCASE2017Workshop_Dekkers_141.pdf;Dekkers2017 | ||
DCASE | DCASE2017 task 4 development dataset | Live | Youtube | Mixed | Events, Tags | 17 | 56737 | 3337.5 | 51660 | 8460.3 | Youtube | https://github.com/ankitshah009/Task-4-Large-scale-weakly-supervised-sound-event-detection-for-smart-cars | |||
DCASE | DCASE2017 task 4 evaluation dataset | Live | Youtube | Mixed | Events, Tags | 17 | 1350 | 79.4 | 1103 | 183.8 | Youtube | http://dcase.community/challenge2017/task-large-scale-sound-event-detection#audio-dataset | |||
ETH | Acoustic Event Dataset | Live | Freesound | Mixed | Tags | 28 | 5223 | 186.5 | 5223 | 768.4 | Creative Commons | https://data.vision.ee.ethz.ch/cvl/ae_dataset/ | https://arxiv.org/abs/1604.07160#;Takahashi2016 |
Notation
Audio type is denoted as follows:
- Isolated, only one sound event is active per sample (no overlapping sounds).
- Live, real recordings where overlapping sounds may be present; start and end times are annotated.
Annotation type is denoted as follows:
- Events, timestamps (onset and offset times) are indicated for each sound event
- Tags, sample-wide (usually one word or short) textual label
- Description, sample-wide textual description of sounds / sound scene
Licence type is denoted as follows:
- Free, free for academic usage (non-commercial), usually released under a university-specific EULA
- Creative Commons, released under a Creative Commons licence
- Commercial, commercial licence
- Proprietary, dataset has not been published, but there are publications reporting results using it
- Research and development use only
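The Instances per class column is not defined in the notation. For many rows the listed value is consistent with event count divided by event classes in the sound events table, and with audio files divided by context count in the acoustic scenes table. The sketch below checks that reading against a few rows; the function name is hypothetical, and not every row follows the pattern (AucoDefr07, for example, lists 63.0 for 16 files in 4 classes).

```python
def instances_per_class(item_count, class_count):
    """Average number of items per class, rounded to one decimal.

    This is an assumed definition inferred from the table values,
    not one stated by the table's authors.
    """
    if class_count == 0:
        # The tables are inconsistent for rows without class information,
        # so no value is guessed here.
        return None
    return round(item_count / class_count, 1)


# (name, item count, class count, value listed in the table)
rows = [
    ("Dares G1", 3214, 761, 4.2),
    ("NYU UrbanSound", 3075, 10, 307.5),
    ("TUT CASA 2009", 10326, 208, 49.6),
    ("LITIS Rouen audio scene dataset", 3026, 19, 159.3),
    ("TUT Acoustic scenes 2016 Development dataset", 1170, 15, 78.0),
]

for name, items, classes, listed in rows:
    computed = instances_per_class(items, classes)
    print(f"{name}: computed {computed}, listed {listed}")
```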
Acoustic scenes
Provider | Name | Context count | Audio files | Instances per class | Mins | Context classes | Licence | Link | Pub |
---|---|---|---|---|---|---|---|---|---|
UEA | Noise DB / Series 1 | 10 | 10 | 1.0 | 40 | Free | http://lemur.cmp.uea.ac.uk/Research/noise_db/ | http://link.springer.com/article/10.1007%2Fs00779-005-0045-4;Smith2006 | |
UEA | Noise DB / Series 2 | 12 | 35 | 2.9 | 175 | Free | http://lemur.cmp.uea.ac.uk/Research/noise_db/ | http://link.springer.com/article/10.1007%2Fs00779-005-0045-4;Smith2006 | |
IEEE AASP Challenge 2013 | Scene | 10 | 100 | 10.0 | 50 | Free | http://c4dm.eecs.qmul.ac.uk/sceneseventschallenge/description.html | http://c4dm.eecs.qmul.ac.uk/sceneseventschallenge/;Results | |
TUT | CASA2009 | 10 | 103 | 10.3 | 1133 | Proprietary | http://www.cs.tut.fi/~heittolt/pubs/eusipco2010_heittola.pdf;Heittola2010 | ||
TUT | CASA2010 | 13 | 160 | 12.3 | 533 | Proprietary | |||
TUT | CASR | 27 | 225 | 8.3 | 0 | Proprietary | http://ieeexplore.ieee.org/document/1561288/;Eronen2006 | ||
Dares | G1 | 28 | 123 | 4.4 | 123 | Free | http://www.daresounds.org/ | ||
Inria | DEMAND | 0 | 18 | 0.0 | 0 | Creative Commons | http://parole.loria.fr/DEMAND/ | ||
LITIS | Rouen audio scene dataset | 19 | 3026 | 159.3 | 1513 | Free | https://sites.google.com/site/alainrakotomamonjy/home/audio-scene | http://ieeexplore.ieee.org/document/6971128;Rakotomamonjy2015 | |
TUT | TUT Acoustic scenes 2016 Development dataset | 15 | 1170 | 78.0 | 585 | Free | https://zenodo.org/record/45739 | http://www.cs.tut.fi/~mesaros/pubs/mesaros_eusipco2016-dcase.pdf;Mesaros2016 | |
TUT | TUT Acoustic scenes 2016 Evaluation dataset | 15 | 390 | 26.0 | 195 | Free | https://zenodo.org/record/165995 | http://www.cs.tut.fi/~mesaros/pubs/mesaros_eusipco2016-dcase.pdf;Mesaros2016 | |
TUT | TUT Acoustic scenes 2017 Development dataset | 15 | 4680 | 312.0 | 780 | Free | https://zenodo.org/record/400515 | http://dcase.community/documents/workshop2017/proceedings/DCASE2017Workshop_Mesaros_100.pdf; Mesaros2017 | |
TUT | TUT Acoustic scenes 2017 Evaluation dataset | 15 | 1620 | 108.0 | 270 | Free | https://zenodo.org/record/1040168 | http://dcase.community/documents/workshop2017/proceedings/DCASE2017Workshop_Mesaros_100.pdf; Mesaros2017 | |
AucoDefr07 | AucoDefr07 | 4 | 16 | 63.0 | 252 | Free | https://archive.org/details/defreville-Aucouturier_urbanDb | https://hal.archives-ouvertes.fr/hal-01082501v2;Lagrange2015 | |
UCSD | ExtraSensory Dataset | 51 | 302177 | 0.0 | 0 | Free | http://extrasensory.ucsd.edu/~datasets/extrasensory/ | http://extrasensory.ucsd.edu/~datasets/extrasensory/papers/vaizman2017a_pervasiveAcceptedVersion.pdf;Vaizman2017 | |
TUT | TUT Urban Acoustic Scenes 2018 Development dataset | 10 | 8640 | 864.0 | 1440 | Free | https://zenodo.org/record/1228142 | https://arxiv.org/pdf/1807.09840.pdf;Mesaros2018 | |
TUT | TUT Urban Acoustic Scenes 2018 Mobile Development dataset | 10 | 10080 | 1008.0 | 1680 | Free | https://zenodo.org/record/1228235 | https://arxiv.org/pdf/1807.09840.pdf;Mesaros2018 | |
TUT/TAU | TAU Urban Acoustic Scenes 2019 Development dataset | 10 | 14400 | 1440.0 | 2400 | Free | https://zenodo.org/record/2589280 | ||
TUT/TAU | TAU Urban Acoustic Scenes 2019 Mobile Development dataset | 10 | 16560 | 1656.0 | 2760 | Free | https://zenodo.org/record/2589332 | ||
TUT/TAU | TAU Urban Acoustic Scenes 2019 Openset, Development dataset | 14 | 15850 | 1440.0 | 2642 | Free | https://zenodo.org/record/2591503 | ||
Whisper/MERL | WSJ0 Hipster Ambient Mixtures noise dataset (WHAM) | 0 | 28000 | 28000.0 | 4800 | Creative Commons | https://storage.googleapis.com/whisper-public/wham_noise.zip | https://arxiv.org/pdf/1907.01160.pdf;Wichern2019 |
Online services
Isolated sounds
- Freesound, isolated sounds, tagged, creative commons
- BBC Sound Effects, isolated sounds, textual description, free for research purposes
- Findsounds, isolated sounds, tagged, mixed licensing
- British Library Sound Archive, isolated sounds and Live recordings, only available for UK universities, restricted licensing
Geotagged recordings
Related datasets
- Speech Datasets, ISCA Special Interest Group on Robust Speech Recognition
Libraries
Free sound effect libraries by commercial provider:
Tools
Annotation tools
Software
- Audacity, audio software with basic annotation capabilities. Use label tracks for the annotations, see more info here.
- Audio Labeler App in Matlab, Audio annotation tool introduced in Matlab version R2018b.
- Audio Annotator, Javascript web interface for annotating audio data.
- ELAN, a linguistic annotation tool to create the textual annotations for audio and video files
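Label tracks exported from Audacity are plain text with one label per line: start time, end time, and label text, separated by tabs, with times in seconds. A minimal sketch of reading such a file into onset/offset event annotations, and of collapsing them into sample-wide tags, could look like this. The class and function names are hypothetical, and lines starting with a backslash (which Audacity writes for frequency ranges of spectral selections) are simply skipped.

```python
from dataclasses import dataclass


@dataclass
class SoundEvent:
    onset: float   # seconds
    offset: float  # seconds
    label: str


def parse_audacity_labels(text):
    """Parse the contents of an exported Audacity label track."""
    events = []
    for line in text.splitlines():
        # Skip blank lines and the extra frequency-range lines.
        if not line.strip() or line.startswith("\\"):
            continue
        parts = line.split("\t")
        if len(parts) < 2:
            continue  # not a label line
        label = parts[2].strip() if len(parts) > 2 else ""
        events.append(SoundEvent(float(parts[0]), float(parts[1]), label))
    return events


def events_to_tags(events):
    """Collapse event annotations into a sorted list of unique tags."""
    return sorted({event.label for event in events})


example = "0.500000\t1.250000\tdog bark\n2.000000\t2.400000\tdoor slam\n"
for event in parse_audacity_labels(example):
    print(event)
```

The same event list covers both annotation types used in the tables: keeping onset and offset gives event annotations, while `events_to_tags` keeps only the labels.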
Prototypes
- Soundscape, a tool for soundscape annotation
- I-SED, an interactive sound event detector, see [Kim2017]
- BAT, BMAT Annotation Tool, see [Melendez-Catalan2017]
- audio-annotator, Audio-annotator, see [Cartwright2017]
Audio management tools
- Pumilio, a Web-Based Management System for Ecological Recordings
Audio augmentation tools
- muda, Annotation-aware musical data augmentation, partly applicable for environmental audio (pitch shifting, time stretching). Documentation
- librosa, See time stretching and pitch shifting effects.
- TSM toolbox, MATLAB implementations of various classical time-scale modification (TSM) algorithms.
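To illustrate what time-scale modification does, the following is a minimal sketch of plain overlap-add (OLA), one of the simplest classical TSM approaches. It is not the implementation used by muda, librosa, or the TSM toolbox, and it uses only the Python standard library; the function names and default frame length are assumptions made for the example. Frames are read from the input at one hop size and written to the output at another, which changes the duration without resampling. Plain OLA tends to produce audible artifacts on strongly harmonic sounds, so it is meant as an explanation rather than a production tool.

```python
import math


def periodic_hann(n):
    """Hann window whose half-overlapped copies sum to one."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def ola_time_stretch(samples, stretch, frame_len=1024):
    """Naive overlap-add time stretch.

    stretch > 1 makes the signal longer, stretch < 1 makes it shorter.
    """
    if stretch <= 0:
        raise ValueError("stretch must be positive")
    if frame_len % 2 or len(samples) < frame_len:
        raise ValueError("frame_len must be even and no longer than the signal")

    synthesis_hop = frame_len // 2                          # hop in the output
    analysis_hop = max(1, round(synthesis_hop / stretch))   # hop in the input
    window = periodic_hann(frame_len)

    n_frames = 1 + (len(samples) - frame_len) // analysis_hop
    out = [0.0] * ((n_frames - 1) * synthesis_hop + frame_len)

    for k in range(n_frames):
        src = k * analysis_hop    # where the frame is read from
        dst = k * synthesis_hop   # where the frame is written to
        for i in range(frame_len):
            out[dst + i] += window[i] * samples[src + i]
    return out


# One second of a 440 Hz tone at 8 kHz, stretched to roughly 1.5 seconds.
sample_rate = 8000
tone = [math.sin(2.0 * math.pi * 440.0 * n / sample_rate) for n in range(sample_rate)]
slower = ola_time_stretch(tone, stretch=1.5, frame_len=256)
print(len(tone), len(slower))
```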