2D/3D Pose Estimation and Action Recognition Using Multitask Deep Learning.

In this work, we propose a multitask framework for jointly 2D and 3D pose estimation from still images and human action recognition from video sequences. ... Action recognition and human pose estimation are closely related but both problems are generally handled as distinct tasks in the literature. ... Figure 1. The proposed multitask approach for pose estimation and action recognition. Our method provides 2D/3D pose estimation from single images or frame sequences. ...

arXiv:1802.09232v2 fatcat:qwactqazanccrb6toshqo7mvxq

In this work, we propose a multitask framework for jointly 2D and 3D pose estimation from still images and human action recognition from video sequences. ... Action recognition and human pose estimation are closely related but both problems are generally handled as distinct tasks in the literature. ... The proposed multitask approach for pose estimation and action recognition. Our method provides 2D/3D pose estimation from single images or frame sequences. ...

doi:10.1109/cvpr.2018.00539 dblp:conf/cvpr/LuvizonPT18 fatcat:cnjhs4rie5ea3fm3vr27ltk4aa

We present a deep learning-based multitask framework for joint 3D human pose estimation and action recognition from RGB sensors using simple cameras. The approach proceeds along two stages. ... In particular, the experimental results show that by using a monocular RGB sensor, we can develop a 3D pose estimation and human action recognition approach that reaches the performance of RGB-depth sensors ... There are four hypotheses that motivate us to build a deep learning framework for human action recognition from 3D poses. ...

doi:10.3390/s20071825 pmid:32218350 pmcid:PMC7180926 fatcat:ilbvk55rcvhx7k2wkvgdgu4fjm

Human pose estimation and action recognition are related tasks since both problems are strongly dependent on the human body representation and analysis. ... In this work, we propose a multi-task framework for jointly estimating 2D or 3D human poses from monocular color images and classifying human actions from video sequences. ... Acknowledgments: This work was partially supported by the Brazilian National Council for Scientific and Technological Development (CNPq) - Grant 233342/2014-1. ...

doi:10.1109/tpami.2020.2976014 pmid:32091993 fatcat:47mkuvrc3nauxbmfqzksxsl2xa

Human pose estimation and action recognition are related tasks since both problems are strongly dependent on the human body representation and analysis. ... In this work, we propose a multi-task framework for jointly estimating 2D or 3D human poses from monocular color images and classifying human actions from video sequences. ... Acknowledgment: This work was partially supported by the Brazilian National Council for Scientific and Technological Development (CNPq) - Grant 233342/2014-1. ...

arXiv:1912.08077v1 fatcat:2kth5hzd55hvvipst44kc4442u

Recently, benefited from the deep learning technologies, a significant amount of research efforts have greatly advanced the monocular human pose estimation both in 2D and 3D areas. ... We believe this survey will provide the readers with a deep and insightful understanding of monocular human pose estimation. ... [24] propose a multi-task framework for jointly 2D/3D pose estimation and human action recognition from video sequences. ...

arXiv:2104.11536v1 fatcat:tdag2jq2vjdrjekwukm5nu7l6a

Recently, benefiting from the deep learning technologies, a significant amount of research efforts have advanced the monocular human pose estimation both in 2D and 3D areas. ... We believe this survey will provide the readers (researchers, engineers, developers, etc.) with a deep and insightful understanding of monocular human pose estimation. ... [123] propose a multi-task framework for jointly 2D/3D pose estimation and human action recognition from video sequences. ...

doi:10.1145/3524497 fatcat:4pbvntngrnfp7lqhcpjmy7p2fq

A novel publicly available dataset named HARPET (Hockey Action Recognition Pose Estimation, Temporal) was created, composed of sequences of annotated actions and pose of hockey players including their ... Third, pose and optical flow streams are fused and passed to fully-connected layers to estimate the hockey player's action. ... [14] also use action recognition for 3D pose estimation. Luvizon et al. [31] use a multitask framework for joint action recognition and 2D/3D pose estimation. Wang et al. ...

arXiv:1812.09533v1 fatcat:yc3vxgo2wvbljfa7wf46i7sd4a

(3) For pose estimation, a bigger and more general dataset, MSCOCO, is successfully used for transfer learning to a smaller and more specific dataset, HARPET, achieving a PCKh of 87%. ... A novel publicly available dataset named HARPET (Hockey Action Recognition Pose Estimation, Temporal) was created, composed of sequences of annotated actions and pose of hockey players including their ... [14] also use action recognition for 3D pose estimation. Luvizon et al. [32] use a multitask framework for joint action recognition and 2D/3D pose estimation. Wang et al. ...

doi:10.1109/cvprw.2019.00310 dblp:conf/cvpr/CaiNVCZ19 fatcat:oyjmp3fnq5asviu3p4yslxt4wy

Action Machine can benefit from the multi-task training of action recognition and pose estimation, the fusion of predictions from RGB images and poses. ... It extends the Inflated 3D ConvNet (I3D) by adding a branch for human pose estimation and a 2D CNN for pose-based action recognition, being fast to train and test. ... Thus, action recognition in videos can be naturally formulated as a multi-task learning problem including RGB-based action recognition, pose estimation and pose-based action recognition. ...

arXiv:1812.05770v2 fatcat:r3d32bxhwrgprbuuvm6cntowmq

to 3D action recognition methods. ... In this paper, we propose to benchmark action recognition methods in such absence of context and introduce a novel dataset, Mimetics, consisting of mimed actions for a subset of 50 classes from the Kinetics ... In other words, we transfer the features learned for 2D-3D pose estimation to action recognition: they typically contain information about the human poses without explicitly representing them as body keypoint ...

arXiv:1912.07249v3 fatcat:wio3wwk7indztg737ls5hysfmi

With the transition of facial expression recognition (FER) from laboratory-controlled to challenging in-the-wild conditions and the recent success of deep learning techniques in various fields, deep neural ... Recent deep FER systems generally focus on two important issues: overfitting caused by a lack of sufficient training data and expression-unrelated variations, such as illumination, head pose and identity ... [224] proposed a deep fusion CNN (DF-CNN) to explore multi-modal 2D+3D FER. ...

arXiv:1804.08348v2 fatcat:katpvrizybha5bgy6bepfi3xpe

Various deep learning techniques have been proposed to solve the single-view 2D-to-3D pose estimation problem. ... We further apply the proposed technique on the skeleton-based action recognition task and also achieve state-of-the-art performance. ... using multitask deep learning. ...

arXiv:2108.07181v2 fatcat:4agxj3njkngqzfny4oqagbhhkq

In this paper, we propose SportsCap -- the first approach for simultaneously capturing 3D human motions and understanding fine-grained actions from monocular challenging sports video input. ... mapping block to assemble various correlated action attributes into a high-level action label for the overall detailed understanding of the whole sequence, so as to enable various applications like action ... Acknowledgements This work was supported by NSFC programs (61976138, 61977047), the National Key Research and Development Program (2018YFB2100500), STCSM (2015F0203-000-06) and SHMEC (2019-01-07-00-01 ...

arXiv:2104.11452v4 fatcat:kq3q5vs2ajdnjf2y43apoi75hm

A model's accuracy is improved by removing redundant data and focusing on useful visual information in a higher resolution. ... Using hard attention, the essential regions of videos are identified and separated from the non-informative parts of the data. ... ., & Tabia, H. (2018). 2d/3d pose estimation and action recognition using multitask deep learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 5137-5146). ...

arXiv:2202.02212v4 fatcat:roum46ecdbhnti4ui5zs5y53j4

2D/3D Pose Estimation and Action Recognition using Multitask Deep Learning [article]

2D/3D Pose Estimation and Action Recognition Using Multitask Deep Learning

A Unified Deep Framework for Joint 3D Pose Estimation and Action Recognition from a Single RGB Camera

Multi-task Deep Learning for Real-Time 3D Human Pose Estimation and Action Recognition

Multi-task Deep Learning for Real-Time 3D Human Pose Estimation and Action Recognition [article]

Recent Advances in Monocular 2D and 3D Human Pose Estimation: A Deep Learning Perspective [article]

Recent Advances of Monocular 2D and 3D Human Pose Estimation: A Deep Learning Perspective

Temporal Hockey Action Recognition via Pose and Optical Flows [article]

Temporal Hockey Action Recognition via Pose and Optical Flows

Action Machine: Rethinking Action Recognition in Trimmed Videos [article]

Mimetics: Towards Understanding Human Actions Out of Context [article]

Deep Facial Expression Recognition: A Survey [article]

Learning Skeletal Graph Neural Networks for Hard 3D Pose Estimation [article]

SportsCap: Monocular 3D Human Motion Capture and Fine-grained Understanding in Challenging Sports Videos [article]

Video Violence Recognition and Localization Using a Semi-Supervised Hard Attention Model [article]
