7,901 Hits in 3.5 sec

Hierarchical spatio-temporal context modeling for action recognition

Ju Sun, Xiao Wu, Shuicheng Yan, Loong-Fah Cheong, Tat-Seng Chua, Jintao Li
2009 2009 IEEE Conference on Computer Vision and Pattern Recognition  
In this paper, we propose to model the spatio-temporal context information in a hierarchical way, where three levels of context are exploited in ascending order of abstraction: 1) point-level context (  ...  Building on the multichannel nonlinear SVMs, we validate this proposed hierarchical framework on the realistic action (HOHA) and event (LSCOM) recognition databases, and achieve 27% and 66% relative performance  ...  Schematic diagram on hierarchical spatio-temporal context modeling.  ... 
doi:10.1109/cvpr.2009.5206721 dblp:conf/cvpr/SunWYCCL09 fatcat:v6ww72arfjfdbokklaxe5qu5vq

Hierarchical spatio-temporal context modeling for action recognition

Ju Sun, Xiao Wu, Shuicheng Yan, Loong-Fah Cheong, Tat-Seng Chua, Jintao Li
2009 2009 IEEE Conference on Computer Vision and Pattern Recognition  
In this paper, we propose to model the spatio-temporal context information in a hierarchical way, where three levels of context are exploited in ascending order of abstraction: 1) point-level context (  ...  Building on the multichannel nonlinear SVMs, we validate this proposed hierarchical framework on the realistic action (HOHA) and event (LSCOM) recognition databases, and achieve 27% and 66% relative performance  ...  Schematic diagram on hierarchical spatio-temporal context modeling.  ... 
doi:10.1109/cvprw.2009.5206721 fatcat:d2qg4be4czfwdbznw7x7oruwum

Multi-stage Factorized Spatio-Temporal Representation for RGB-D Action and Gesture Recognition [article]

Yujun Ma, Benjia Zhou, Ruili Wang, Pichao Wang
2023 arXiv   pre-print
To address the above issues, we propose an innovative heuristic architecture called Multi-stage Factorized Spatio-Temporal (MFST) for RGB-D action and gesture recognition.  ...  Specifically, the CDC-Stem module captures bottom-level spatio-temporal features and passes them successively to the following spatio-temporal factored stages to capture the hierarchical spatial and temporal  ...  With these designs, our MFST model effectively learns spatio-temporal features within each modality for action and gesture recognition.  ... 
arXiv:2308.12006v2 fatcat:uajfhoofpbg23ac54tnctjbdvy

TAN: Temporal Aggregation Network for Dense Multi-label Action Recognition [article]

Xiyang Dai, Bharat Singh, Joe Yue-Hei Ng, Larry S. Davis
2018 arXiv   pre-print
Experiments show that our model is well suited for dense multi-label action recognition, which is a challenging sub-topic of action recognition that requires predicting multiple action labels in each frame  ...  By stacking spatial and temporal convolutions repeatedly, TAN forms a deep hierarchical representation for capturing spatio-temporal information in videos.  ...  This demonstrates that our model progressively constructs hierarchical spatio-temporal representations for recognizing human actions.  ... 
arXiv:1812.06203v1 fatcat:7allnptupzalbnxdoshweo5rpu

Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition [article]

Syed Talal Wasim, Muhammad Uzair Khattak, Muzammal Naseer, Salman Khan, Mubarak Shah, Fahad Shahbaz Khan
2023 arXiv   pre-print
Recent video recognition models utilize Transformer models for long-range spatio-temporal context modeling.  ...  We extensively explore the design space of focal modulation-based spatio-temporal context modeling and demonstrate our parallel spatial and temporal encoding design to be the optimal choice.  ...  Conclusion To learn spatio-temporal representations that can effectively model both local and global contexts, this paper introduces Video-FocalNets for video action recognition tasks.  ... 
arXiv:2307.06947v4 fatcat:4mz2hvnhzbbhbmaji4ehjjkupa

A Multi-Scale Hierarchical Codebook Method for Human Action Recognition in Videos Using a Single Example

Mehrsan Javan Roshtkhari, Martin D. Levine
2012 2012 Ninth Conference on Computer and Robot Vision  
At the second level of the hierarchy, a large contextual region containing many STVs (Ensemble of STVs) is considered in order to construct a probabilistic model of STVs and their spatio-temporal compositions  ...  in the cases of a single training example and cross-dataset action recognition.  ...  Recently, local STVs have been used in the context of BOV models and have shown promising results for action recognition [2], [3], [7]-[13].  ... 
doi:10.1109/crv.2012.32 dblp:conf/crv/RoshtkhariL12 fatcat:woqfp4exlbbytbpkydgsjfni5i

Study on Recent Approaches for Human Action Recognition in Real Time

R. Rajitha Jasmine, Dr. K. K. Thyagharajan
2015 International Journal of Engineering Research and  
The challenge is to recognize human actions with more accuracy and efficiency in recognition time.  ...  The important area in computer vision is human understanding and recognizing actions. The main aim of action recognition is an automatic analysis of various actions from video data.  ...  Ryoo and Aggarwal propose a spatio-temporal relationship matching strategy for human action recognition.  ... 
doi:10.17577/ijertv4is080577 fatcat:hikmv56t6jc5la7ipcny5u4kha

Action Recognition using Temporal Bag-of-Words from Depth Maps

Parul Shukla, Kanad Kishore Biswas, Prem K. Kalra
2013 IAPR International Workshop on Machine Vision Applications  
In this paper, we present a methodology for human action recognition from a sequence of depth maps obtained using Microsoft Kinect.  ...  In order to make the representation insensitive to temporal sequence misalignment, we propose using the Temporal Bag-of-Words model in a hierarchical manner by recursively partitioning the depth maps sequence  ...  Spatio-temporal features Local spatio-temporal features have shown good performance in action recognition task for color video stream [3], [8].  ... 
dblp:conf/mva/ShuklaBK13 fatcat:oragei32zfdxbbjjumt73uwbne

A Comprehensive Study of Group Activity Recognition Methods in Video

S. A. Vahora, N. C. Chauhan
2017 Indian Journal of Science and Technology  
, person-group interaction, group-group interaction, uses of temporal information, and recognition of group activity frame wise or video wise.  ...  Findings: Different models of group activity recognition are characterized as per the capabilities of the defined model considering individual pose of person, atomic activity of person, person-person interaction  ...  as they uses context model like Action Context (AC) descriptor, Relative Action context (RAC) descriptor, Spatio-Temporal Volume (STV). 11, 12 Person-person interaction model like distance based attraction  ... 
doi:10.17485/ijst/2017/v10i23/113996 fatcat:5ltu45vqmvdgxgeasifu3fipry

Learning Dynamic Spatio-Temporal Relations for Human Activity Recognition

Zhenyu Liu, Yaqiang Yao, Yan Liu, Yuening Zhu, Zhenchao Tao, Lei Wang, Yuhong Feng
2020 IEEE Access  
Finally, a hierarchical decomposition of the human body is introduced to obtain a discriminative representation for a single action.  ...  INDEX TERMS Human activity recognition, qualitative spatio-temporal graph, vector quantization, discrete HMMs.  ...  of the hierarchical decomposition of the human body for activity recognition, we constructed spatio-temporal graphs for actions based on three alternative decompositions: the whole body (no hierarchical  ... 
doi:10.1109/access.2020.3009136 fatcat:skmwviewnjg2rigyztji55z6ga

Spatio-Temporal Proximity-Aware Dual-Path Model for Panoramic Activity Recognition [article]

Sumin Lee, Yooseung Wang, Sangmin Woo, Changick Kim
2024 arXiv   pre-print
Secondly, deviating from existing hierarchical approaches (individual-to-social-to-global activity), we introduce a dual-path architecture for multi-granular activity recognition.  ...  First, while previous works often focus on spatial distance among individuals within an image, we argue to consider the spatio-temporal proximity.  ...  For comparisons, we model three types of transformer structure: parallel, hierarchical, and reverse hierarchical.  ... 
arXiv:2403.14113v1 fatcat:liicsm7wjnbbtosmgtendspopq

Spatio-Temporal Triangular-Chain CRF for Activity Recognition

Congqi Cao, Yifan Zhang, Hanqing Lu
2015 Proceedings of the 23rd ACM international conference on Multimedia - MM '15  
This paper addresses the problem of complex activity recognition with a unified hierarchical model. We expand triangular-chain CRFs (TriCRFs) to the spatial dimension.  ...  The proposed architecture can be perceived as a spatio-temporal version of the TriCRFs, in which the labels of actions and activity are modeled jointly and their complex dependencies are exploited.  ...  ACKNOWLEDGEMENTS We thank the anonymous reviews for their valuable comments.  ... 
doi:10.1145/2733373.2806304 dblp:conf/mm/CaoZL15 fatcat:47oporwe5fc47ousn5ispxjzba

Human activity recognition in videos using a single example

Mehrsan Javan Roshtkhari, Martin D. Levine
2013 Image and Vision Computing  
This paper presents a novel approach for action recognition, localization and video matching based on a hierarchical codebook model of local spatio-temporal video volumes.  ...  The hierarchical algorithm codes a video as a compact set of spatio-temporal volumes, while considering their spatio-temporal compositions in order to account for spatial and temporal contextual information  ...  In this paper we present a hierarchical probabilistic codebook method for action recognition and localization in videos.  ... 
doi:10.1016/j.imavis.2013.08.005 fatcat:oyvytzkhnfalnlxqdp2zkqtwr4

Spatio-temporal action detection and localization using a hierarchical LSTM

Akshaya Ramaswamy, Karthik Seemakurthy, Jayavardhana Gubbi, Balamuralidhar Purushothaman
2020 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)  
Inspired by the way the human visual system operates, we propose a hierarchical architecture to capture the spatio-temporal information from a given input video at different time scales.  ...  The proposed network is used for video action detection and localization application that is the foundational element for video analysis.  ...  [7] proposes transformer networks for video action recognition, which can be used to extract the spatio-temporal context of the person. This can be used for human activity classification.  ... 
doi:10.1109/cvprw50498.2020.00390 dblp:conf/cvpr/RamaswamySGP20 fatcat:crmhhmo5krfdzj6uyzboqojncq

A Hierarchical Context Model for Event Recognition in Surveillance Video

Xiaoyang Wang, Qiang Ji
2014 2014 IEEE Conference on Computer Vision and Pattern Recognition  
Experiments on VIRAT 1.0 and 2.0 Ground Datasets demonstrate the effectiveness of the proposed hierarchical context model for improving the event recognition performance even under great challenges like  ...  To tackle the learning and inference challenges brought in by the model hierarchy, we develop complete learning and inference algorithms for the proposed hierarchical context model based on variational  ...  In addition, approaches like [26, 23] also utilize hierarchical probabilistic models for event and action recognition.  ... 
doi:10.1109/cvpr.2014.328 dblp:conf/cvpr/WangJ14 fatcat:vsbin2nebfgapbnv2lxgf7ekai