Multi-granularity Generator for Temporal Action Proposal.

In this paper, we propose a multi-granularity generator (MGG) to perform the temporal action proposal from different granularity perspectives, relying on the video visual features equipped with the position ... Temporal action proposal generation is an important task, aiming to localize the video segments containing human actions in an untrimmed video. ... This work was supported in part by the Natural Science Foundation of Jiangsu under Grant BK20151102, in part by the State Key Laboratory for Novel Software Technology, Nanjing University under Grant KFKT2017B17 ...

arXiv:1811.11524v2 fatcat:owd62korfrdkrem4odekuzuzke

To address the aforementioned problems, we propose a novel multi-granular spatio-temporal graph network for skeleton-based action classification that jointly models the coarse- and fine-grained skeleton ... The task of skeleton-based action recognition remains a core challenge in human-centred scene understanding due to the multiple granularities and large variation in human motion. ... ACKNOWLEDGMENTS This research is funded through the EPSRC Centre for Doctoral Training in Digital Civics (EP/L016176/1). ...

arXiv:2108.04536v1 fatcat:yjzuazukyrbdrlimtlkopjz6bm

In this paper, we make the most of different granular classifiers and propose to detect action from fine to coarse granularity, which is also in line with the people's detection habits. ... Temporal action detection in long, untrimmed videos is an important yet challenging task that requires not only recognizing the categories of actions in videos, but also localizing the start and end times ... Then, we propose a discriminative temporal selective search to generate the action proposal candidates with variable lengths. ...

doi:10.3390/app8101924 fatcat:fnkew4ohkbbxdarbynmpgn6b6u

Specifically, we model each granularity as a single stream by 2D (for frame and motion streams) or 3D (for clip and video streams) convolutional neural networks (CNNs). ... The framework therefore consists of multi-stream 2D or 3D CNNs to learn both the spatial and temporal representations. ... Figure 2 shows the overall framework of our proposed multi-granular architecture for action recognition. ...

doi:10.1145/2911996.2912001 dblp:conf/mir/LiQYMRL16 fatcat:hxhwr2qegfh2dfhsasuc5lpucy

Built on a bidirectional LSTM network, the proposed method possesses between-granularity links which encourage feature sharing as well as cross-feature consistency between both global and local granularity ... In contrast, in this work, we propose a multigranularity interaction prediction network which integrates both global motion and detailed local action. ... [8] proposed to use multi-granularity topics to generate features for short text, and this method can significantly reduce the classification errors. ...

doi:10.1109/cvpr.2018.00239 dblp:conf/cvpr/YaoWNWY18 fatcat:cefmhqzxcbc2hm45a2v3shpcfa

However, candidate moments generated with a fixed temporal granularity may be suboptimal to handle the large variation in moment lengths. ... Specifically, each stage of PLN has a localization branch, and focuses on candidate moments that are generated with a specific temporal granularity. ... For multi-stage approaches, they split the localization task into multiple steps mainly involving proposal generation, classifying whether actions of interest happen in proposals, and proposal boundary ...

arXiv:2102.01282v2 fatcat:7i2dm6t2y5bzzjgoz6wmd67bja

The coarse stream captures varied temporal dynamics by modeling multi-granularity temporal contexts. ... The fine stream achieves complex plots understanding by reasoning the dependency between the multi-granularity temporal contexts from the coarse stream and adaptively integrates them into fine-grained ... We notice that the coarse-fine two stream architecture for adaptively multi-granularity temporal dynamics reasoning can also benefit the accurate temporal action localization, improving R@0.7 from 11.4% ...

arXiv:2208.01954v1 fatcat:yesdc65rkzbsbjn7umhuyo5xly

Secondly, deviating from existing hierarchical approaches (individual-to-social-to-global activity), we introduce a dual-path architecture for multi-granular activity recognition. ... PAR presents two major challenges: 1) recognizing the nuanced interactions among numerous individuals and 2) understanding multi-granular human activities. ... In contrast to these works that focus on identifying collective activities, we propose Dual Path Activity Transformer (DPATr) for recognizing multi-granular activities including individual actions, social ...

arXiv:2403.14113v1 fatcat:liicsm7wjnbbtosmgtendspopq

In this paper, to train a supervised temporal action localizer, we adopt Temporal Context Aggregation Network (TCANet) to generate high-quality action proposals through "local and global" temporal context ... As for the WSTAL, a novel framework is proposed to handle the poor quality of CAS generated by simple classification network, which can only focus on local discriminative parts, rather than locate the ... For the weakly supervised learning track, we propose a unified network named as transferable knowledge based Multi-Granularity Fusion Network (KT-MGFN) for WSTAL. ...

arXiv:2107.12618v1 fatcat:c5xm6eixyjfojobuegdam3duli

In this paper, we propose a generalized pyramid matching kernel (GPMK) for recognizing human actions in realistic videos, based on a multi-channel "bag of words" representation constructed from local spatial-temporal ... As an extension to the spatial-temporal pyramid matching (STPM) kernel, the GPMK leverages heterogeneous visual cues in multiple feature descriptor types and spatial-temporal grid granularity levels, to ... descriptor types and spatial-temporal grid granularity levels. ...

doi:10.3390/s131114398 pmid:24284771 pmcid:PMC3871056 fatcat:czvyitfno5csfiqcszbhulmrui

In this paper, we propose an attentive feature aggregation module, namely Multi-Granularity Reference-aided Attentive Feature Aggregation (MG-RAFA), to delicately aggregate spatio-temporal features into ... Moreover, to exploit the semantics of different levels, we propose to learn multi-granularity attentions based on the relations captured at different granularities. ... Different from action recognition where the motion/temporal evolution is important, the temporal motion and evolution in general has no discriminative information for person ReID while the appearances ...

arXiv:2003.12224v1 fatcat:lcucrn7k6faixbw5xpssfj7zcu

In this paper, we propose an attentive feature aggregation module, namely Multi-Granularity Reference-aided Attentive Feature Aggregation (MG-RAFA), to delicately aggregate spatio-temporal features into ... Moreover, to exploit the semantics of different levels, we propose to learn multi-granularity attentions based on the relations captured at different granularities. ... of the proposed multi-granularity design. ...

doi:10.1109/cvpr42600.2020.01042 dblp:conf/cvpr/ZhangLZ020 fatcat:7wid2yqrzfb5bf6bjex2lznj3m

In this work, we propose a novel graph-based framework, namely Multi-Granular Hypergraph (MGH), to pursue better representational capabilities by modeling spatiotemporal dependencies in terms of multiple ... In each hypergraph, different temporal granularities are captured by hyperedges that connect a set of graph nodes (i.e., part-based features) across different temporal ranges. ... . 2) Multi-granularity of temporal clues. ...

doi:10.1109/cvpr42600.2020.00297 dblp:conf/cvpr/YanQC0ZT020 fatcat:6aqwoqvbtfgqpep375jnncsecu

In this paper, we propose a Long-term Multi-granularity Deep Framework to detect driver drowsiness in driving videos containing the frontal faces. ... granularities, and extracts facial representations effectively for large variation of head pose, furthermore, it can flexibly fuse both detailed appearance clues of the main parts and local to global ... [21] proposed temporal multi-granularity approach of action recognition. ...

arXiv:1801.02325v1 fatcat:ddinzxqr6feehloadeffojizni

This embedded module enhances the RapNet in terms of its multi-granularity temporal proposal generation ability, given predefined anchor boxes. ... To this end, we propose a Relation-aware pyramid Network (RapNet) to generate highly accurate temporal action proposals. ... any two temporal locations, which relates the present content to both the past and future for augmenting multi-granularity temporal proposal generation. 3) Our RapNet achieves the state-of-the-art performance ...

doi:10.1609/aaai.v34i07.6711 fatcat:vkwd3xwplzcfhefv36za46imci

Multi-granularity Generator for Temporal Action Proposal [article]

Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition [article]

Temporal Action Detection in Untrimmed Videos from Fine to Coarse Granularity

Action Recognition by Learning Deep Multi-Granular Spatio-Temporal Video Representation

Multiple Granularity Group Interaction Prediction

Progressive Localization Networks for Language-based Moment Localization [article]

Dilated Context Integrated Network with Cross-Modal Consensus for Temporal Emotion Localization in Videos [article]

Spatio-Temporal Proximity-Aware Dual-Path Model for Panoramic Activity Recognition [article]

Transferable Knowledge-Based Multi-Granularity Aggregation Network for Temporal Action Localization: Submission to ActivityNet Challenge 2021 [article]

A Generalized Pyramid Matching Kernel for Human Action Recognition in Realistic Videos

Multi-Granularity Reference-Aided Attentive Feature Aggregation for Video-based Person Re-identification [article]

Multi-Granularity Reference-Aided Attentive Feature Aggregation for Video-Based Person Re-Identification

Learning Multi-Granular Hypergraphs for Video-Based Person Re-Identification

Long-term Multi-granularity Deep Framework for Driver Drowsiness Detection [article]

Accurate Temporal Action Proposal Generation with Relation-Aware Pyramid Network
