Multi-granularity Generator for Temporal Action Proposal [article]

Yuan Liu, Lin Ma, Yifeng Zhang, Wei Liu, Shih-Fu Chang
2019 arXiv   pre-print
In this paper, we propose a multi-granularity generator (MGG) to perform the temporal action proposal from different granularity perspectives, relying on the video visual features equipped with the position  ...  Temporal action proposal generation is an important task, aiming to localize the video segments containing human actions in an untrimmed video.  ...  This work was supported in part by the Natural Science Foundation of Jiangsu under Grant BK20151102, in part by the State Key Laboratory for Novel Software Technology, Nanjing University under Grant KFKT2017B17  ... 
arXiv:1811.11524v2 fatcat:owd62korfrdkrem4odekuzuzke

Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition [article]

Tailin Chen, Desen Zhou, Jian Wang, Shidong Wang, Yu Guan, Xuming He, Errui Ding
2021 arXiv   pre-print
To address the aforementioned problems, we propose a novel multi-granular spatio-temporal graph network for skeleton-based action classification that jointly models the coarse- and fine-grained skeleton  ...  The task of skeleton-based action recognition remains a core challenge in human-centred scene understanding due to the multiple granularities and large variation in human motion.  ...  ACKNOWLEDGMENTS This research is funded through the EPSRC Centre for Doctoral Training in Digital Civics (EP/L016176/1).  ... 
arXiv:2108.04536v1 fatcat:yjzuazukyrbdrlimtlkopjz6bm

Temporal Action Detection in Untrimmed Videos from Fine to Coarse Granularity

Guangle Yao, Tao Lei, Xianyuan Liu, Ping Jiang
2018 Applied Sciences  
In this paper, we make the most of different granular classifiers and propose to detect action from fine to coarse granularity, which is also in line with the people's detection habits.  ...  Temporal action detection in long, untrimmed videos is an important yet challenging task that requires not only recognizing the categories of actions in videos, but also localizing the start and end times  ...  Then, we propose a discriminative temporal selective search to generate the action proposal candidates with variable lengths.  ... 
doi:10.3390/app8101924 fatcat:fnkew4ohkbbxdarbynmpgn6b6u

Action Recognition by Learning Deep Multi-Granular Spatio-Temporal Video Representation

Qing Li, Zhaofan Qiu, Ting Yao, Tao Mei, Yong Rui, Jiebo Luo
2016 Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval - ICMR '16  
Specifically, we model each granularity as a single stream by 2D (for frame and motion streams) or 3D (for clip and video streams) convolutional neural networks (CNNs).  ...  The framework therefore consists of multi-stream 2D or 3D CNNs to learn both the spatial and temporal representations.  ...  Figure 2 shows the overall framework of our proposed multi-granular architecture for action recognition.  ... 
doi:10.1145/2911996.2912001 dblp:conf/mir/LiQYMRL16 fatcat:hxhwr2qegfh2dfhsasuc5lpucy

Multiple Granularity Group Interaction Prediction

Taiping Yao, Minsi Wang, Bingbing Ni, Huawei Wei, Xiaokang Yang
2018 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition  
Built on a bidirectional LSTM network, the proposed method possesses between granularities links which encourage feature sharing as well as cross-feature consistency between both global and local granularity  ...  In contrast, in this work, we propose a multigranularity interaction prediction network which integrates both global motion and detailed local action.  ...  [8] proposed to use multi-granularity topics to generate features for short text, and this method can significantly reduce the classification errors.  ... 
doi:10.1109/cvpr.2018.00239 dblp:conf/cvpr/YaoWNWY18 fatcat:cefmhqzxcbc2hm45a2v3shpcfa

Progressive Localization Networks for Language-based Moment Localization [article]

Qi Zheng, Jianfeng Dong, Xiaoye Qu, Xun Yang, Yabing Wang, Pan Zhou, Baolong Liu, Xun Wang
2022 arXiv   pre-print
However, candidate moments generated with a fixed temporal granularity may be suboptimal to handle the large variation in moment lengths.  ...  Specifically, each stage of PLN has a localization branch, and focuses on candidate moments that are generated with a specific temporal granularity.  ...  For multi-stage approaches, they split the localization task into multiple steps mainly involving proposal generation, classifying whether actions of interest happen in proposals, and proposal boundary  ... 
arXiv:2102.01282v2 fatcat:7i2dm6t2y5bzzjgoz6wmd67bja

Dilated Context Integrated Network with Cross-Modal Consensus for Temporal Emotion Localization in Videos [article]

Juncheng Li, Junlin Xie, Linchao Zhu, Long Qian, Siliang Tang, Wenqiao Zhang, Haochen Shi, Shengyu Zhang, Longhui Wei, Qi Tian, Yueting Zhuang
2022 arXiv   pre-print
The coarse stream captures varied temporal dynamics by modeling multi-granularity temporal contexts.  ...  The fine stream achieves complex plots understanding by reasoning the dependency between the multi-granularity temporal contexts from the coarse stream and adaptively integrates them into fine-grained  ...  We notice that the coarse-fine two stream architecture for adaptively multi-granularity temporal dynamics reasoning can also benefit the accurate temporal action localization, improving R@0.7 from 11.4%  ... 
arXiv:2208.01954v1 fatcat:yesdc65rkzbsbjn7umhuyo5xly

Spatio-Temporal Proximity-Aware Dual-Path Model for Panoramic Activity Recognition [article]

Sumin Lee, Yooseung Wang, Sangmin Woo, Changick Kim
2024 arXiv   pre-print
Secondly, deviating from existing hierarchical approaches (individual-to-social-to-global activity), we introduce a dual-path architecture for multi-granular activity recognition.  ...  PAR presents two major challenges: 1) recognizing the nuanced interactions among numerous individuals and 2) understanding multi-granular human activities.  ...  In contrast to these works that focus on identifying collective activities, we propose Dual Path Activity Transformer (DPATr) for recognizing multi-granular activities including individual actions, social  ... 
arXiv:2403.14113v1 fatcat:liicsm7wjnbbtosmgtendspopq

Transferable Knowledge-Based Multi-Granularity Aggregation Network for Temporal Action Localization: Submission to ActivityNet Challenge 2021 [article]

Haisheng Su, Peiqin Zhuang, Yukun Li, Dongliang Wang, Weihao Gan, Wei Wu, Yu Qiao
2021 arXiv   pre-print
In this paper, to train a supervised temporal action localizer, we adopt Temporal Context Aggregation Network (TCANet) to generate high-quality action proposals through "local and global" temporal context  ...  As for the WSTAL, a novel framework is proposed to handle the poor quality of CAS generated by simple classification network, which can only focus on local discriminative parts, rather than locate the  ...  For the weakly supervised learning track, we propose a unified network named as transferable knowledge based Multi-Granularity Fusion Network (KT-MGFN) for WSTAL.  ... 
arXiv:2107.12618v1 fatcat:c5xm6eixyjfojobuegdam3duli

A Generalized Pyramid Matching Kernel for Human Action Recognition in Realistic Videos

Jun Zhu, Quan Zhou, Weijia Zou, Rui Zhang, Wenjun Zhang
2013 Sensors  
In this paper, we propose a generalized pyramid matching kernel (GPMK) for recognizing human actions in realistic videos, based on a multi-channel "bag of words" representation constructed from local spatial-temporal  ...  As an extension to the spatial-temporal pyramid matching (STPM) kernel, the GPMK leverages heterogeneous visual cues in multiple feature descriptor types and spatial-temporal grid granularity levels, to  ...  descriptor types and spatial-temporal grid granularity levels.  ... 
doi:10.3390/s131114398 pmid:24284771 pmcid:PMC3871056 fatcat:czvyitfno5csfiqcszbhulmrui

Multi-Granularity Reference-Aided Attentive Feature Aggregation for Video-based Person Re-identification [article]

Zhizheng Zhang, Cuiling Lan, Wenjun Zeng, Zhibo Chen
2020 arXiv   pre-print
In this paper, we propose an attentive feature aggregation module, namely Multi-Granularity Reference-aided Attentive Feature Aggregation (MG-RAFA), to delicately aggregate spatio-temporal features into  ...  Moreover, to exploit the semantics of different levels, we propose to learn multi-granularity attentions based on the relations captured at different granularities.  ...  Different from action recognition where the motion/temporal evolution is important, the temporal motion and evolution in general has no discriminative information for person ReID while the appearances  ... 
arXiv:2003.12224v1 fatcat:lcucrn7k6faixbw5xpssfj7zcu

Multi-Granularity Reference-Aided Attentive Feature Aggregation for Video-Based Person Re-Identification

Zhizheng Zhang, Cuiling Lan, Wenjun Zeng, Zhibo Chen
2020 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
In this paper, we propose an attentive feature aggregation module, namely Multi-Granularity Reference-aided Attentive Feature Aggregation (MG-RAFA), to delicately aggregate spatio-temporal features into  ...  Moreover, to exploit the semantics of different levels, we propose to learn multi-granularity attentions based on the relations captured at different granularities.  ...  of the proposed multi-granularity design.  ... 
doi:10.1109/cvpr42600.2020.01042 dblp:conf/cvpr/ZhangLZ020 fatcat:7wid2yqrzfb5bf6bjex2lznj3m

Learning Multi-Granular Hypergraphs for Video-Based Person Re-Identification

Yichao Yan, Jie Qin, Jiaxin Chen, Li Liu, Fan Zhu, Ying Tai, Ling Shao
2020 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
In this work, we propose a novel graph-based framework, namely Multi-Granular Hypergraph (MGH), to pursue better representational capabilities by modeling spatiotemporal dependencies in terms of multiple  ...  In each hypergraph, different temporal granularities are captured by hyperedges that connect a set of graph nodes (i.e., part-based features) across different temporal ranges.  ...  . 2) Multi-granularity of temporal clues.  ... 
doi:10.1109/cvpr42600.2020.00297 dblp:conf/cvpr/YanQC0ZT020 fatcat:6aqwoqvbtfgqpep375jnncsecu

Long-term Multi-granularity Deep Framework for Driver Drowsiness Detection [article]

Jie Lyu, Zejian Yuan, Dapeng Chen
2018 arXiv   pre-print
In this paper, we propose a Long-term Multi-granularity Deep Framework to detect driver drowsiness in driving videos containing the frontal faces.  ...  granularities, and extracts facial representations effectively for large variation of head pose, furthermore, it can flexibly fuse both detailed appearance clues of the main parts and local to global  ...  [21] proposed temporal multi-granularity approach of action recognition.  ... 
arXiv:1801.02325v1 fatcat:ddinzxqr6feehloadeffojizni

Accurate Temporal Action Proposal Generation with Relation-Aware Pyramid Network

Jialin Gao, Zhixiang Shi, Guanshuo Wang, Jiani Li, Yufeng Yuan, Shiming Ge, Xi Zhou
This embedded module enhances the RapNet in terms of its multi-granularity temporal proposal generation ability, given predefined anchor boxes.  ...  To this end, we propose a Relation-aware pyramid Network (RapNet) to generate highly accurate temporal action proposals.  ...  any two temporal locations, which relates the present content to both the past and future for augmenting multi-granularity temporal proposal generation. 3) Our RapNet achieves the state-of-the-art performance  ... 
doi:10.1609/aaai.v34i07.6711 fatcat:vkwd3xwplzcfhefv36za46imci