AVQA: A Dataset for Audio-Visual Question Answering on Videos.

[PDF] AVQA: A Dataset for Audio-Visual Question Answering on Videos

Formally, given a stream of video, audio-visual question answering aims to answer natural language questions by integrating information from both audio and ...

AVQA: A Dataset for Audio-Visual Question Answering on Videos

dl.acm.org › doi

Oct 10, 2022 · Audio-visual question answering aims to answer questions regarding both audio and visual modalities in a given video, and has drawn ...

AlyssaYoung/AVQA: ACM MM 2022 paper_AVQA - GitHub

github.com › AlyssaYoung › AVQA

AVQA is an audio-visual question answering dataset for the multimodal understanding of audio-visual objects and activities in real-life scenarios on videos.

AVQA Dataset - Papers With Code

paperswithcode.com › dataset › avqa

AVQA is an audio-visual question answering dataset for the multimodal understanding of audio-visual objects and activities in real-life scenarios on videos.

AVQA Dataset

mn.cs.tsinghua.edu.cn › avqa

Oct 9, 2022 · Audio-visual question answering aims to answer questions regarding both audio and visual modalities in a given video. For example, given a video ...

MUSIC-AVQA Dataset - Papers With Code

paperswithcode.com › dataset › music-av...

The large-scale MUSIC-AVQA dataset of musical performance contains 45,867 question-answer pairs, distributed in 9,288 videos for over 150 hours.

AVQA: A Dataset for Audio-Visual Question Answering on Videos

www.connectedpapers.com › main › graph

Apr 24, 2024 · Connected Papers is a visual tool to help researchers and applied scientists find academic papers relevant to their field of work.

Pano-AVQA - ICCV 2021 Open Access Repository

openaccess.thecvf.com › ICCV2021 › html

We propose a novel benchmark named Pano-AVQA as a large-scale grounded audio-visual question answering dataset on panoramic videos. Using 5.4K 360° video ...

Object-aware Adaptive-Positivity Learning for Audio-Visual Question ...

arxiv.org › cs

Dec 20, 2023 · This paper focuses on the Audio-Visual Question Answering (AVQA) task that aims to answer questions derived from untrimmed audible videos. To ...

GeWu-Lab/MUSIC-AVQA - GitHub

github.com › GeWu-Lab › MUSIC-AVQA

The large-scale MUSIC-AVQA dataset of musical performance, which contains 45,867 question-answer pairs, distributed in 9,288 videos for over 150 hours. All QA ...