| Info page | Name | Collection name | Related datasets | Provider | Abbreviation | D | S | License | Size | Year | Cite | Paper title | General domains | Data modalities | Total duration (min) | Files | Length consistency | File length (sec) | Content type | Scene content | Unique event instances in synthetic mixtures | Recording setup type | Recording setups | Recording spot type | Data type | Material source | Variability source | Audio type | Format | Lossy compression | Bit rate | Sampling rate | Channel setup | Channels | Meta types | Caption annotation source | Captions per item | Caption annotator count | Caption instance count | Caption languages | Scene classes | Class balance | Class list | Event classes | Event instance count | Event instance per class | Event class balance | Event annotation type | Event annotation source | Event annotations labelled (%) | Event annotations validated (%) | Event annotations strong (%) | Event labeling / hierarchical | Event labeling / ontology | Data split | Split sets | Split folds | Baseline | Baseline cite | Evaluation campaigns | Comments | 
|---|
Audio captions are free-text, natural-language descriptions of the content of audio recordings.