Method for Creating Location-Specific Audio Textures

Publication

Toni Heittola, Annamaria Mesaros, Dani Korpi, Antti Eronen, and Tuomas Virtanen. Method for creating location-specific audio textures. EURASIP Journal on Audio, Speech and Music Processing, 2014.

Abstract

An approach is proposed for creating location-specific audio textures for virtual location-exploration services. The presented approach creates audio textures by processing a small amount of audio recorded at a given location, providing a cost-effective way to produce a versatile audio signal that characterizes the location. The resulting texture is non-repetitive and conserves the location-specific characteristics of the audio scene, without the need to collect a large amount of audio from each location. The method consists of two stages: analysis and synthesis. In the analysis stage, the source audio recording is segmented into homogeneous segments. In the synthesis stage, the audio texture is created by randomly drawing segments from the source audio so that consecutive segments are timbrally similar near the segment boundaries. Results obtained in listening experiments show that there is no statistically significant difference in the audio quality or location-specificity of audio when the created audio textures are compared to excerpts of the original recordings. Therefore, the proposed audio textures could be utilized in virtual location-exploration services. Examples of source signals and audio textures created from them are available at www.cs.tut.fi/~heittolt/audiotexture.

Abstract

An approach is proposed for creating location-specific audio textures by reusing audio recorded at the given location. The approach provides a cost-effective way to produce a versatile audio signal that represents the location in virtual location-exploration services. It avoids the need to collect a large amount of audio from each location, which would otherwise be required for the audio to be non-repetitive while conserving the location-specific characteristics of each auditory scene. The method consists of two main stages: analysis and synthesis. In the analysis stage, the source audio recording is segmented into homogeneous segments. In the synthesis stage, the audio texture is created by randomly drawing segments from the source audio so that consecutive segments are timbrally similar near the segment boundaries. Results obtained in listening experiments show that there is no statistically significant difference in the audio quality or location-specificity of audio when the created audio textures are compared to excerpts of the original recordings.

Audio Texture

By an audio texture we mean a new, unique audio signal created from a source audio signal recorded at a certain location. The texture should provide an audio representation of the location from which the source signal comes, by having the same generic acoustic properties and the same types of sound events that are characteristic of the location.

The proposed audio texture creation method consists of two stages: analysis of the source audio signal and synthesis of the audio texture. The analysis stage performs clustering and segmentation of the source audio in an unsupervised manner. The goal of the clustering analysis is to automatically find homogeneous audio segments in the source signal. Ideally, the segments would be representative of individual sound events from the scene. In the synthesis stage, an audio texture is constructed by shuffling and concatenating these segments. The shuffling takes into account timbral continuity at the segment boundaries.
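
The two stages can be illustrated with a minimal Python sketch. It is not the published implementation: it assumes that frame-level cluster labels and timbral feature vectors have already been computed by some unsupervised clustering, and the function names `segment_by_labels` and `synthesize_order`, the Euclidean distance, and the `n_candidates` parameter are all hypothetical choices made for illustration.

```python
import math
import random
from itertools import groupby


def segment_by_labels(labels):
    """Group consecutive frames sharing a cluster label into (start, end) spans.

    `labels` is assumed to come from an unsupervised clustering of
    frame-level timbral features; `end` is exclusive.
    """
    spans, pos = [], 0
    for _, run in groupby(labels):
        length = len(list(run))
        spans.append((pos, pos + length))
        pos += length
    return spans


def synthesize_order(spans, features, n_segments, n_candidates=3, seed=None):
    """Draw a sequence of segment indices with timbrally similar joins.

    `features[i]` is the feature vector of frame i (an assumption of this
    sketch). The last frame of the current segment is compared with the
    first frame of every other segment, and the next segment is drawn at
    random among the `n_candidates` closest ones.
    """
    rng = random.Random(seed)
    current = rng.randrange(len(spans))
    order = [current]
    while len(order) < n_segments:
        end_feat = features[spans[current][1] - 1]
        # Rank all other segments by boundary distance (Euclidean here).
        others = [i for i in range(len(spans)) if i != current]
        others.sort(key=lambda i: math.dist(end_feat, features[spans[i][0]]))
        # Random draw among the best matches keeps the texture varied.
        current = rng.choice(others[:n_candidates])
        order.append(current)
    return order
```

The returned index sequence would then be used to concatenate the corresponding audio spans. Drawing among several close candidates, rather than always taking the single best match, is one way to trade boundary smoothness against repetitiveness in this sketch.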

Demonstration

Examples of generated audio textures for various locations are presented. Open the demonstration by clicking the image on the left. The segments used for the synthesis are shown in the lower panel.

Listening Tests

Listening tests were conducted to evaluate the quality of the synthesized audio textures and the possibility of using synthesized audio textures for representing the auditory scene of various locations.

Examples of audio samples used in the listening tests are available below.