TAU Urban Audio-Visual Scenes 2021, Development

Audio-Visual Scene Classification

Results table for "Audio-Visual Scene Classification" task with "TAU Urban Audio-Visual Scenes 2021, Development" dataset.

| Rank | Publication | Year | ID | Publication / Title | Model identifier | Accuracy (%) | Logloss |
|---|---|---|---|---|---|---|---|
| | Naranjo-Alcazar2021 | 2021 | c8fb7b05910bbbe89aa00c8b3b00ef2c | Squeeze-Excitation Convolutional Recurrent Neural Networks for Audio-Visual Scene Classification | Multi-Modal (Late Fusion), Gammatone | 90.0 | |
| | Naranjo-Alcazar2021 | 2021 | 86e05ed25150c29251245e6eda38b2dc | Squeeze-Excitation Convolutional Recurrent Neural Networks for Audio-Visual Scene Classification | Multi-Modal (Early Fusion), Gammatone | 89.2 | |
| | Naranjo-Alcazar2021 | 2021 | af239bc38de7f2dfea9c2481b1fce164 | Squeeze-Excitation Convolutional Recurrent Neural Networks for Audio-Visual Scene Classification | Multi-Modal (Late Fusion), log-Mel | 88.7 | |
| | Naranjo-Alcazar2021 | 2021 | 3977cb2075a33f63981b7c7d50af363d | Squeeze-Excitation Convolutional Recurrent Neural Networks for Audio-Visual Scene Classification | Multi-Modal (Early Fusion), log-Mel | 88.5 | |
| | Naranjo-Alcazar2021 | 2021 | 005d74f476c41f13fd7b3cfe9d3e6748 | Squeeze-Excitation Convolutional Recurrent Neural Networks for Audio-Visual Scene Classification | Visual-Only, log-Mel | 87.0 | |
| | Naranjo-Alcazar2021 | 2021 | 4e0a72eb73e91540742d968bcb9c4790 | Squeeze-Excitation Convolutional Recurrent Neural Networks for Audio-Visual Scene Classification | Audio-Only, log-Mel | 68.4 | |
| | Naranjo-Alcazar2021 | 2021 | 70734aaf94506a2df347d4fd9fbe69e8 | Squeeze-Excitation Convolutional Recurrent Neural Networks for Audio-Visual Scene Classification | Visual-Only, Gammatone | 87.0 | |
| | Naranjo-Alcazar2021 | 2021 | 549060d6b669175ae0d87b6b69039d10 | Squeeze-Excitation Convolutional Recurrent Neural Networks for Audio-Visual Scene Classification | Audio-Only, Gammatone | 69.0 | |
| | Okazaki2021 | 2021 | 069fa08bfe899e3354f42eed0cfd5f67 | A Multi-Modal Fusion Approach for Audio-Visual Scene Classification Enhanced by CLIP Variants | E02 | 96.1 | 0.149 |
| | Okazaki2021 | 2021 | 82a18b92c57e21f04fe642d3a0d460e2 | A Multi-Modal Fusion Approach for Audio-Visual Scene Classification Enhanced by CLIP Variants | E01 | 95.8 | 0.238 |