This data listing is a DCASE Community effort to collect curated meta-information about DCASE-related datasets into a uniform structure. DCASE is a community for research on Detection and Classification of Acoustic Scenes and Events, and it offers a platform for discussing different perspectives and approaches, from algorithm development to practical applications and their commercial value.
The list focuses specifically on pre-packaged datasets rather than online data repositories. Datasets included in the list are well documented, packaged for easy usage, and have a free or open license. Many of the listed datasets have been used in DCASE Challenges or peer-reviewed academic papers.
Datasets are grouped at a high level into data collections based on the type of audio content analysis they mainly target. Some datasets can be used for multiple content analysis tasks, and in these cases they are placed into multiple collections.
The data listing is maintained through a GitHub repository. If you notice missing datasets or errors, or want to contribute to the data listings in some other way, you can raise an issue in the repository, or fork it and open a pull request with your edits. Proposals for new data collections are welcome as well.
The list is maintained by Toni Heittola.
This collection pools together all task-specific collections to ease searching for data across collections.
Task-specific data collections
An acoustic scene is a descriptor for the surrounding acoustic environment, defined by the physical and social situation in the scene. The acoustic scene is identified by a scene label, for example, “outdoor market”, “busy street”, or “office”. The goal of automatic acoustic scene classification is to classify a test recording into one of the predefined classes characterizing the environment in which it was recorded.
Audio captions are free-text descriptions of the audio recordings' content using natural language.
This data list pulls together various types of datasets containing everyday sounds. These datasets are suitable for research on sound event detection, sound event detection and localization, or audio tagging. A sound event corresponds to an audio segment that is attributed to a specific sound source and is perceived as an entity. A sound event has start and end timestamps along with a textual label related to the sound source. Datasets in this list contain either strong annotations (annotations with start and end timestamps) or weak annotations (annotations indicating sound presence at the clip or time-segment level).
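The distinction between strong and weak annotations can be illustrated with a minimal sketch. The `SoundEvent` class and `weak_tags` helper below are purely illustrative assumptions and do not correspond to any listed dataset's actual format; the sketch only shows how clip-level (weak) tags follow from timestamped (strong) annotations.

```python
from dataclasses import dataclass

@dataclass
class SoundEvent:
    """A strongly annotated sound event: a label plus onset/offset in seconds."""
    label: str
    onset: float
    offset: float

def weak_tags(events, clip_start, clip_end):
    """Collapse strong annotations into weak (clip-level) tags:
    the sorted set of labels whose events overlap the given clip."""
    return sorted({e.label for e in events
                   if e.onset < clip_end and e.offset > clip_start})

# Illustrative strong annotations for a 10-second clip
events = [
    SoundEvent("dog_bark", 0.5, 1.2),
    SoundEvent("car_passing", 2.0, 6.5),
    SoundEvent("dog_bark", 7.1, 7.9),
]

print(weak_tags(events, 0.0, 10.0))  # → ['car_passing', 'dog_bark']
```

Sound event detection systems predict the full strong annotation, while audio tagging systems predict only the weak tags, which is why weakly annotated datasets appear in this list alongside strongly annotated ones.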
- Voice datasets list maintained by Jim Schwoebel
- A categorization of robust speech processing datasets