Kaggle In-Class competition on acoustic scene classification

Toni Heittola

We have launched a Kaggle In-Class competition on acoustic scene classification as part of the TUT course SGN-41007 Pattern Recognition and Machine Learning. The competition is open to everybody, whether or not you are taking the TUT course.

The competition data is similar to the DCASE2017 Task 1 data: audio material from 15 scene classes, split into 10-second segments. In contrast to DCASE2017, where the audio signals were released, only acoustic feature matrices (mel-energies) extracted from the audio signals are released. This shifts the research focus toward the machine learning side of the problem and makes the problem more approachable to the Kaggle community.
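
To give a feel for what working directly on feature matrices can look like, here is a minimal sketch of a nearest-centroid classifier over mel-energy matrices. It is not the competition baseline or any participant's method. The data is synthetic, and the array shapes (40 mel bands, 100 frames, 8 clips per class) as well as the helper names `summarize`, `fit_centroids` and `predict` are assumptions made for illustration only; just the count of 15 classes comes from the description above.

```python
import numpy as np


def summarize(features):
    """Collapse (n_clips, n_bands, n_frames) mel-energy matrices into
    per-clip vectors of band-wise mean and standard deviation over time."""
    return np.concatenate([features.mean(axis=2), features.std(axis=2)], axis=1)


def fit_centroids(vectors, labels, n_classes):
    """Return one mean vector per class, shape (n_classes, n_dims)."""
    return np.stack([vectors[labels == c].mean(axis=0) for c in range(n_classes)])


def predict(vectors, centroids):
    """Assign each vector to the class with the nearest centroid (Euclidean)."""
    diffs = vectors[:, None, :] - centroids[None, :, :]
    distances = np.linalg.norm(diffs, axis=2)
    return distances.argmin(axis=1)


# Synthetic stand-in data: shapes and values are assumptions, not the real dataset.
rng = np.random.default_rng(0)
n_classes, n_bands, n_frames, clips_per_class = 15, 40, 100, 8
labels = np.repeat(np.arange(n_classes), clips_per_class)

# Each class gets its own per-band offset so that the classes are separable.
class_offsets = rng.normal(scale=5.0, size=(n_classes, n_bands, 1))
noise = rng.normal(size=(labels.size, n_bands, n_frames))
features = class_offsets[labels] + noise

vectors = summarize(features)
centroids = fit_centroids(vectors, labels, n_classes)
predictions = predict(vectors, centroids)
accuracy = (predictions == labels).mean()
```

The accuracy here is measured on the same synthetic clips used to fit the centroids, so it only checks that the pipeline runs end to end. On real data the features would be loaded from the released files, and the score would be measured on held-out clips.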


The results from the competition will be published at the International Workshop on Machine Learning for Signal Processing 2018 (MLSP2018):


Shayan Gharib, Honain Derrar, Daisuke Niizumi, Tuukka Senttula, Janne Tommola, Toni Heittola, Tuomas Virtanen, and Heikki Huttunen. Acoustic scene classification: a competition review. In 2018 IEEE 28th International Workshop on Machine Learning for Signal Processing (MLSP), 1–6. September 2018. doi:10.1109/MLSP.2018.8517000.


Acoustic Scene Classification: a competition review


In this paper we study the problem of acoustic scene classification, i.e., categorization of audio sequences into mutually exclusive classes based on their spectral content. We describe the methods and results discovered during a competition organized in the context of a graduate machine learning course; both by the students and external participants. We identify the most suitable methods and study the impact of each by performing an ablation study of the mixture of approaches. We also compare the results with a neural network baseline, and show the improvement over that. Finally, we discuss the impact of using a competition as a part of a university course, and justify its importance in the curriculum based on student feedback.


Acoustic Scene Classification, Data Augmentation, Kaggle, DCASE


The dataset used in this work has been published as an open dataset: