| General information | |||||
| Label | Value | Description | |||
|---|---|---|---|---|---|
| Name | ExtraSensory Dataset | Full dataset name | |||
| ID | scenes/extrasensory | Datalist id for external indexing | |||
| Abbreviation | ExtraSensory | Official dataset abbreviation, e.g. one used in the original paper | |||
| Provider | UCSD | ||||
| Year | 2017 | Dataset release year | |||
| Modalities | Audio | Data modalities included in the dataset | |||
| Collection name | ExtraSensory | Common name for all related datasets, used to group datasets coming from same source | |||
| Research domain | ASC, Mobile devices, Multiple sensors | Related domains, e.g., Scenes, Mobile devices, Audio-visual, Open set, Ambient noise, Unlabelled, Multiple sensors, SED, SELD, Tagging, FL, Strong annotation, Weak annotation, Multi-annotator | |||
| License | Free | ||||
| Download | Download (215MB) | ||||
| Companion site | Site | Link to the companion site for the dataset | |||
| Citation | [Vaizman2017] Recognizing Detailed Human Context In-the-Wild from Smartphones and Smartwatches | ||||
| Audio | |||||
| Label | Value | Description | |||
| Data | |||||
| Data type | Features | Possible values: Audio, Features | |||
| File format | |||||
| File format type | Constant | Possible values: Constant, Variable | |||
| Sampling rate (kHz) | 22.05 | Sampling rate in kHz, possible values: 8, 16, 22.05, 32, 44.1, 48 | |||
| Channels | |||||
| Setup | Mono | Possible values: Mono, Stereo, Binaural, Ambisonic, Array, Multi-Channel, Variable | |||
| Number of channels | 1 | ||||
| Material | |||||
| Source | Original | Possible values: Original, Youtube, Freesound, Online, Crowdsourced, [Dataset name] | |||
| Content | |||||
| Content type | Freefield | Possible values: Freefield, Synthetic, Isolated | |||
| Scene | Constant | Whether the scene class is constant within a single recording, possible values: Constant, Variable | |||
| Recording | |||||
| Setup | Uncontrolled | Possible values: Near-field, Far-field, Mixed, Uncontrolled, Unknown | |||
| Setup count | 1 | Number of different recording setups (microphone and recording device) used | |||
| Spot type | Moving | Possible values: Fixed, Moving, Unknown | |||
| Files | |||||
| Count | 302177 files | Total number of files | |||
| Meta | |||||
| Label | Value | Description | |||
| Types | Scene, Accelerometer, Gyroscope, Magnetometer, Location, Phone state, Air pressure, Proximity, Temperature, Humidity, Light | List of metadata types provided for the data, possible values: Event, Tag, Scene, Caption, Geolocation, Spatial location, Annotator, Timestamp, Presence, Proximity, etc. | |||
| Scene | |||||
| Classes | 51 | Number of scene classes | |||
| Balanced | False | Possible values: True, False, Almost | |||
| Annotation | |||||
| Type | Weak | Possible values: Strong, Weak, None | |||
| Source | Crowdsourced | Possible values: Experts, Crowdsourced, Synthetic, Location | |||
| Annotations per item | 1 | Number of annotations available per item (possible multi-annotator setup) | |||
| Labelled amount (%) | 100 | Percentage of the data that is labelled | |||
| Labeling | |||||
| Hierarchical | No | ||||
| Instance | |||||
| Count | 302177 | Count of all scene instances in the dataset | |||
| Info | |||||
| Label | Value | Description | |||
| Comments | Behavioral context recognition in-the-wild from mobile sensors | ||||