Isolated urban sound database


General information
Label				Value	Description
	Name			Isolated urban sound database	Full dataset name
	ID			sounds/iusd	Datalist id for external indexing
	Abbreviation			IUSD	Official dataset abbreviation, e.g. one used in the original paper
	Year			2018	Dataset release year
	Modalities			Audio	Data modalities included in the dataset
	Collection name			IUSD	Common name for all related datasets, used to group datasets coming from same source
	Related datasets name			UrbanSound8K
	License			Creative Commons, CC BY 4.0
	Download			Download (2.3GB)
	Citation			[Gloaguen2018] Creation of a corpus of realistic urban sound scenes with controlled acoustic properties
Audio
Label				Value	Description
	Data
		Data type		Audio	Possible values: Audio \| Features
		File format
			File format type	Constant	Possible values: Constant \| Variable
			File format	wav	Possible value: wav \| aiff \| flac \| mp3 \| aac \| ogg
			Lossy compression	No	Whether the audio is compressed in a lossy manner
			Bit depth	16	Bit depth of audio, possible values: 8 \| 16 \| 24 \| 32
			Sampling rate (kHz)	44.1 kHz	Sampling rate in kHz, possible values: 8 \| 16 \| 22.05 \| 32 \| 44.1 \| 48
		Channels
			Setup	Variable	Possible values: Mono \| Stereo \| Binaural \| Ambisonic \| Array \| Multi-Channel \| Variable
		Material
			Source	UrbanSound8K, Freesound	Possible values: Original \| Youtube \| Freesound \| Online \| Crowdsourced \| [Dataset name]
	Content
		Content type		Isolated	Possible values: Freefield \| Synthetic \| Isolated
	Recording
		Setup		Unknown	Possible values: Near-field \| Far-field \| Mixed \| Uncontrolled \| Unknown
	Files
		Count		399 files	Total number of files
		Total duration (minutes)		288 min	Total duration of the dataset in minutes
		File length		Variable	Characterization of the file lengths, possible values: Constant \| Quasi-constant \| Variable
		File length (seconds)		1-60 sec	Approximate length of files
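
The format values listed above (wav, 16-bit, 44.1 kHz, variable channel setup) can be checked against a downloaded file. The following is a minimal sketch using Python's standard `wave` module; the function names and the returned dictionary are illustrative assumptions and are not part of any tooling shipped with the dataset.

```python
import wave

# Expected values copied from the Audio table; everything else here is a
# hypothetical helper written for illustration.
EXPECTED_SAMPLE_RATE_HZ = 44100
EXPECTED_BIT_DEPTH = 16


def describe_wav(path):
    """Read basic format properties from the header of a PCM wav file."""
    with wave.open(path, "rb") as wav_file:
        frame_count = wav_file.getnframes()
        sample_rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            # getsampwidth() returns bytes per sample, so convert to bits.
            "bit_depth": wav_file.getsampwidth() * 8,
            "sample_rate_hz": sample_rate,
            "duration_s": frame_count / sample_rate,
        }


def matches_listed_format(info):
    """Compare sample rate and bit depth only; channel setup is listed as Variable."""
    return (
        info["sample_rate_hz"] == EXPECTED_SAMPLE_RATE_HZ
        and info["bit_depth"] == EXPECTED_BIT_DEPTH
    )
```

Calling `matches_listed_format(describe_wav(path))` on each file would flag any file that deviates from the listed sampling rate or bit depth.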
Meta
Label				Value	Description
	Types			Tag	List of metadata types provided for the data, possible values: Event, Tag, Scene, Caption, Geolocation, Spatial location, Annotator, Timestamp, Presence, Proximity, etc.
	Scene
		Classes		10	Number of scene classes
		Balanced		False	Possible values: True \| False \| Almost
		Classes		bird, construction site, crowd, fountain, park, rain, school yard, traffic, ventilation, wind tree
	Event
		Classes		21	Number of event classes
		Balanced		False	Possible values: True \| False \| Almost
		Classes		bell, whistle, bird, sweeping broom, car horn, car, hammer and drill, coughing, dog barking, car door, house door, plane, siren, foot step, thunder, street noise, suitcase rolling, train passing, tramway passing, truck, voice
		Annotation
			Type	Weak	Possible values: Strong \| Weak \| Location \| None
			Source	Experts	Possible values: Experts \| Crowdsourced \| Synthetic \| Metadata \| Automatic
			Annotations per item	1	How many annotations there are available per item (possible multi-annotator setup)
			Labelled amount (%)	100 %	Percentage of all data, amount of data which is labelled
			Validated amount (%)	100 %	Percentage of all data, amount of data which is validated by human
			Strong annotations amount (%)	0 %	Percentage of all data, amount of data which has strong annotations
			Overlapping event instances	No
		Labeling
			Hierarchical	No
		Instance
			Count	231	Count of all event instances in the dataset
			Average instances per class	11.55	Average per class instance count
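
The counts listed above can be cross-checked with simple arithmetic. The sketch below uses only figures copied from the tables; the variable names are illustrative. Note that 231 instances over 21 classes gives 11.0, whereas the listed 11.55 equals 231 / 20, so the average may have been computed over 20 classes. The source does not say which.

```python
# Figures copied from the Files and Event tables above.
file_count = 399
total_duration_min = 288
event_class_count = 21
event_instance_count = 231

# Mean file length in seconds, derived from total duration and file count.
mean_file_length_s = total_duration_min * 60 / file_count

# Mean number of event instances per class, using the listed class count.
mean_instances_per_class = event_instance_count / event_class_count

print(f"mean file length: {mean_file_length_s:.1f} s")
print(f"mean instances per class: {mean_instances_per_class:.2f}")
```

The derived mean file length of roughly 43 seconds falls within the listed 1-60 sec range.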
Cross-validation setup
Label				Value	Description
		Provided		No
