2 days ago · ActivityNet-QA: A dataset for understanding complex web videos via question answering. ... Pano-AVQA: Grounded audio-visual question answering on 360° videos.
2 days ago · We introduce a novel visual question answering dataset comprising videos captured with a wearable 360-degree camera, aiming to address common challenges ...
4 days ago · Audio-visual question answering (AVQA) is a challenging task that requires multistep spatio-temporal reasoning over multimodal contexts. Recent works rely on ...
5 days ago · Pano-AVQA: Grounded Audio-Visual Question Answering on 360° Videos ... Datasets for Audio-Visual Video Representation Learning ... YouTube-8M is the largest video ...