Real-time Action Recognition by Spatiotemporal Semantic and Structural Forests
2010
Proceedings of the British Machine Vision Conference 2010
Whereas most existing action recognition methods require computationally demanding feature extraction and/or classification, this paper presents a novel real-time solution that utilises local appearance ...
Semantic texton forests (STFs) are applied to local space-time volumes as a powerful discriminative codebook. ...
The method called bag of semantic textons (BOST) developed for image classification [25] is applied to analyse local space-time appearance. ...
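The bag-of-semantic-textons idea, counting which forest nodes the local space-time patches fall into, can be sketched as follows. The threshold-based `texton_leaf` function is a hypothetical stand-in for one trained semantic texton tree; a real STF learns its split tests from data:

```python
import numpy as np

def texton_leaf(voxel_patch, thresholds=(0.3, 0.6)):
    # Hypothetical stand-in for one semantic texton tree: route a patch
    # to one of three leaves by comparing its mean intensity to splits.
    m = voxel_patch.mean()
    if m < thresholds[0]:
        return 0
    return 1 if m < thresholds[1] else 2

def bost_histogram(volume, patch=4, n_leaves=3):
    # Bag of semantic textons over a space-time volume (T, H, W):
    # slide a patch over every frame, collect leaf ids, and normalise
    # the leaf-occurrence histogram into a codebook descriptor.
    hist = np.zeros(n_leaves)
    T, H, W = volume.shape
    for t in range(T):
        for y in range(0, H - patch + 1, patch):
            for x in range(0, W - patch + 1, patch):
                hist[texton_leaf(volume[t, y:y+patch, x:x+patch])] += 1
    return hist / hist.sum()

rng = np.random.default_rng(0)
volume = rng.random((5, 16, 16))       # toy space-time volume
h = bost_histogram(volume)
print(h.shape, round(float(h.sum()), 6))   # (3,) 1.0
```

A real implementation would also keep the counts of internal tree nodes, not just leaves, which is what makes the BOST descriptor hierarchical.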
doi:10.5244/c.24.52
dblp:conf/bmvc/YuKC10
fatcat:6olwm3mygvca5e5c6raetxkrg4
Analytical Review on Textual Queries Semantic Search based Video Retrieval
2014
IJARCCE
A semantically meaningful graph is constructed to capture the semantic structure, matching the nouns, verbs, and adverbs detected in the video frames, and also detecting the action and position of the object by using ...
Textual-query semantic search can contain temporal and spatial information about multiple objects, such as trees and buildings, present in the scene. ...
Objects are detected from video frames via the semantic graph. Concept-based retrieval is another method for retrieving videos; it is based on a set of concept detectors that bridge the semantic query ...
doi:10.17148/ijarcce.2017.64163
fatcat:d6hyjjzudbgg3g5dwnbjbildwq
Application of Video Scene Semantic Recognition Technology in Smart Video
2018
Tehnički Vjesnik
In order to improve the semantic understanding capacity and efficiency of video segments, this paper adopts a 3-layer semantic recognition approach based on key frame extraction. ...
For example, most existing behaviour recognition methods use the video frames obtained by even segmentation and fixed sampling as the input, which may lose important information between sampling intervals ...
Current methods for fusion of semantic themes express visual features of each image as a visual "Bag-of-words". ...
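A minimal sketch of content-driven key-frame extraction, assuming simple frame differencing against the last kept frame (the paper's 3-layer pipeline is more elaborate). Unlike fixed-interval sampling, this keeps frames exactly where the content changes:

```python
import numpy as np

def key_frames(frames, threshold=10.0):
    # Keep the first frame, then every frame whose mean absolute
    # difference from the most recently kept frame exceeds a threshold.
    # This avoids losing abrupt changes between fixed sampling points.
    keep = [0]
    for i in range(1, len(frames)):
        if np.abs(frames[i] - frames[keep[-1]]).mean() > threshold:
            keep.append(i)
    return keep

# Toy grey-level "video": static scene with an abrupt change at frame 3.
frames = np.zeros((6, 8, 8))
frames[3:] = 100.0
print(key_frames(frames))   # [0, 3]
```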
doi:10.17559/tv-20180620082101
fatcat:srsipbjh4jcnrlxkwmlsxufhga
Embedding Visual Words into Concept Space for Action and Scene Recognition
2010
Proceedings of the British Machine Vision Conference 2010
In the proposed space, the distances between words represent the semantic distances which are used for constructing a discriminative and semantically meaningful vocabulary. ...
We use the latent semantic models, such as LSA and pLSA, in order to define semantically-rich features and embed the visual features into a semantic space. ...
Considering these drawbacks, we propose a method for action and scene recognition based on a semantic visual vocabulary that uses latent aspect models to embed visual words into a rich semantic space which ...
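A toy sketch of embedding visual words into a latent semantic space with LSA: the truncated SVD of a word/video co-occurrence matrix gives each word a low-dimensional vector whose distances reflect co-occurrence rather than raw appearance. The matrix below is made up and exactly rank 2, so the 2-D embedding preserves the cosines:

```python
import numpy as np

# Toy visual-word / video co-occurrence counts (5 words x 4 videos).
# Words 0-1 co-occur in the same videos, words 2-3 in different ones.
X = np.array([[2., 2., 0., 0.],
              [1., 1., 0., 0.],
              [0., 0., 2., 2.],
              [0., 0., 1., 1.],
              [1., 1., 1., 1.]])

U, S, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
word_embed = U[:, :k] * S[:k]     # one k-dim semantic vector per word

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Words 0 and 1 always appear together, so they coincide in the embedded
# space; words 0 and 2 never co-occur and remain orthogonal.
print(round(cosine(word_embed[0], word_embed[1]), 3),
      round(abs(cosine(word_embed[0], word_embed[2])), 3))   # 1.0 0.0
```

With pLSA the embedding would instead be each word's distribution over latent topics, but the effect, a semantically meaningful vocabulary, is the same in spirit.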
doi:10.5244/c.24.15
dblp:conf/bmvc/KhademFRS10
fatcat:he62aquf35gftmma3mswpfuwum
Human Behavior Recognition based on Conditional Random Field and Bag-Of-Visual-Words Semantic Model
2015
International Journal of Signal Processing, Image Processing and Pattern Recognition
In this paper we improve the CRF model and the bag-of-visual-words semantic model, combining the advantages of both to build a hierarchical model for behaviour recognition. First we create a hierarchical
semantic-mark CRF model, divided into upper and lower layers, together with a gymnastic-image filter based on the bag-of-visual-words semantic model. ...
Figure 5. The High-Level Semantic Tree Constructs
Table 1. Accuracy of Three Types of Classification Algorithms (%)
doi:10.14257/ijsip.2015.8.1.03
fatcat:3qco5rezvfg4zitg7ov3qraw6y
Video Content Analysis of Human Sports under Engineering Management Incorporating High-Level Semantic Recognition Models
2022
Computational Intelligence and Neuroscience
In this paper, a high-level semantic recognition model is used to parse the video content of human sports under engineering management, and the stream shape of the previous layer is embedded in the convolutional ...
The method is applied to image classification, and the experimental results show that the method can extract image features more effectively, thus improving the accuracy of feature classification. ...
images and optical streams to obtain effective video semantic concept features, and the confidence fusion classification method for the category score results of the two-stream SoftMax layer can more ...
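The confidence-fusion step can be illustrated roughly as follows. Weighting each stream's softmax scores by that stream's confidence (its maximum probability) is one plausible reading of "confidence fusion", not necessarily the paper's exact rule:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def confidence_fusion(rgb_logits, flow_logits):
    # Weight each stream's class scores by that stream's confidence
    # (its maximum softmax probability), then renormalise.
    p_rgb, p_flow = softmax(rgb_logits), softmax(flow_logits)
    fused = p_rgb.max() * p_rgb + p_flow.max() * p_flow
    return fused / fused.sum()

rgb = np.array([2.0, 0.5, 0.1])    # appearance stream favours class 0
flow = np.array([0.2, 0.1, 1.5])   # motion stream favours class 2
p = confidence_fusion(rgb, flow)
print(p.argmax(), round(float(p.sum()), 6))   # 0 1.0
```

Here the more confident appearance stream wins the disagreement, which is exactly the behaviour a confidence-weighted rule is meant to produce.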
doi:10.1155/2022/6761857
pmid:35069724
pmcid:PMC8769854
fatcat:dosm2tf7ijccjkr4s6um24slh4
Study on Recent Approaches for Human Action Recognition in Real Time
2015
International Journal of Engineering Research and
Major video analysis systems include action detection and classification, action tracking, action recognition, and behaviour understanding. ...
Although traditional methods have achieved considerable success on several human actions, recognizing human actions in general remains a challenging problem. ...
MODERN METHODS FOR HUMAN ACTION RECOGNITION: In the following sections, we discuss the various methods for action recognition. ...
doi:10.17577/ijertv4is080577
fatcat:hikmv56t6jc5la7ipcny5u4kha
A Survey of Content-Aware Video Analysis for Sports
2018
IEEE transactions on circuits and systems for video technology (Print)
Previous surveys have focused on the methodologies of sports video analysis from the spatiotemporal viewpoint instead of a content-based viewpoint, and few of these studies have considered semantics. ...
Finally, the paper summarizes the future trends and challenges for sports video analysis. ...
[223] presented a method for text localization and segmentation for images and videos, and for extracting information used for semantic indexing. Noll et al. ...
doi:10.1109/tcsvt.2017.2655624
fatcat:rwqzu46sgfb7tpkcav4ysmh6ae
Learning discriminative features for fast frame-based action recognition
2013
Pattern Recognition
In this paper we present an instant action recognition method, which is able to recognize an action in real-time from only two continuous video frames. ...
For the sake of instantaneity, we employ two types of computationally efficient but perceptually important features -optical flow and edges -to capture motion and shape characteristics of actions. ...
Acknowledgement: The authors would like to acknowledge the support of research grants 973-2009CB320904 and the National Science Foundation of China (NSFC-61272027, 61272321, 61103087). ...
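A rough sketch of two-frame features in the spirit of this paper: a gradient-orientation histogram for shape, and a temporal-difference histogram standing in for optical flow (the paper uses real optical flow; the difference image is only a cheap proxy):

```python
import numpy as np

def edge_orientation_histogram(frame, bins=8):
    # Shape cue: magnitude-weighted histogram of gradient orientations
    # (a crude HOG-like edge descriptor).
    gy, gx = np.gradient(frame.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % np.pi          # orientation in [0, pi)
    hist, _ = np.histogram(ang, bins=bins, range=(0, np.pi), weights=mag)
    return hist / (hist.sum() + 1e-9)

def motion_histogram(prev_frame, frame, bins=8):
    # Motion cue from only two consecutive frames: a histogram of
    # temporal-difference magnitudes stands in for optical flow here.
    diff = np.abs(frame.astype(float) - prev_frame.astype(float))
    hist, _ = np.histogram(diff, bins=bins, range=(0, 255))
    return hist / (hist.sum() + 1e-9)

rng = np.random.default_rng(1)
f0, f1 = rng.integers(0, 256, (2, 32, 32))
feat = np.concatenate([edge_orientation_histogram(f1), motion_histogram(f0, f1)])
print(feat.shape)   # (16,)
```

Both cues are cheap per-pixel operations, which is what makes a two-frame, frame-rate recognition pipeline feasible.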
doi:10.1016/j.patcog.2012.08.016
fatcat:3h3l5d6k2jevdahkfogcvdv7xy
Enhancing human action recognition through spatio-temporal feature learning and semantic rules
2013
2013 13th IEEE-RAS International Conference on Humanoid Robots (Humanoids)
The results show the benefits of the two-stage method: the accuracy of action recognition was significantly improved compared to a single-stage method (above 87%, compared to a human expert). ...
The proposed method was evaluated under two complex and challenging scenarios: making a pancake and making a sandwich. ...
This algorithm needs to be adapted for the video domain. The inputs to the network are 3D video blocks instead of image patches, i.e. we flatten the sequence of patches into a vector. ...
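Flattening 3D video blocks into vectors, as described above, might look like this (the block and stride sizes below are arbitrary illustrative choices):

```python
import numpy as np

def video_blocks_to_vectors(video, block=(4, 8, 8), stride=(4, 8, 8)):
    # Adapt a patch-based feature learner to video: cut the clip into
    # 3D space-time blocks and flatten each block into a vector, the
    # same way 2D image patches are flattened for the original network.
    T, H, W = video.shape
    bt, bh, bw = block
    st, sh, sw = stride
    vecs = []
    for t in range(0, T - bt + 1, st):
        for y in range(0, H - bh + 1, sh):
            for x in range(0, W - bw + 1, sw):
                vecs.append(video[t:t+bt, y:y+bh, x:x+bw].ravel())
    return np.stack(vecs)

video = np.random.default_rng(2).random((8, 16, 16))
V = video_blocks_to_vectors(video)
print(V.shape)   # (8, 256) -> 8 blocks, each 4*8*8 values
```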
doi:10.1109/humanoids.2013.7030014
fatcat:rie3v6j5efexdkt5n37ohqy3kq
Improving bag-of-features action recognition with non-local cues
2010
Proceedings of the British Machine Vision Conference 2010
Local space-time features have recently shown promising results within Bag-of-Features (BoF) approach to action recognition in video. ...
For this purpose, we decompose video into region classes and augment local features with corresponding region-class labels. ...
Our main focus in this paper has been to demonstrate a notion of semantic level video segmentation and its advantage for action recognition. ...
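Augmenting local features with region-class labels amounts to keeping one word histogram per region class and concatenating them; a minimal sketch, with hypothetical codebook ids and region labels:

```python
import numpy as np

def channel_bof(word_ids, region_labels, n_words, n_regions):
    # Bag-of-features with non-local cues: one word histogram per
    # region class, concatenated, so the same visual word occurring in
    # different regions contributes to different channels.
    hist = np.zeros((n_regions, n_words))
    for w, r in zip(word_ids, region_labels):
        hist[r, w] += 1
    hist /= max(hist.sum(), 1)
    return hist.ravel()

words   = [0, 2, 2, 1, 3]      # codebook ids of local space-time features
regions = [0, 0, 1, 1, 1]      # hypothetical region-class label per feature
h = channel_bof(words, regions, n_words=4, n_regions=2)
print(h.shape, float(h.sum()))   # (8,) 1.0
```

The descriptor is twice as long as a plain BoF histogram, but it lets the classifier distinguish, for example, motion words on the person from identical words in the background.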
doi:10.5244/c.24.95
dblp:conf/bmvc/UllahPL10
fatcat:y4v5xwsnejcohidfuyb76jlchy
From Isolated Islands to Pangea: Unifying Semantic Space for Human Action Understanding
[article]
2024
arXiv
pre-print
By aligning the classes of previous datasets to our semantic space, we gather (image/video/skeleton/MoCap) datasets into a unified database in a unified label system, i.e., bridging "isolated islands" ...
To this end, we design a structured action semantic space given verb taxonomy hierarchy and covering massive actions. ...
6.2.1 Verb Node Classification: To evaluate the verb node classification, we build a Pangea test set with 178 K images. ...
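The label-alignment step, mapping dataset-specific action labels into a shared verb hierarchy, can be sketched with a toy taxonomy (all names below are invented; the paper's actual semantic space is a large structured verb taxonomy):

```python
# Toy verb taxonomy: parent -> children. A stand-in for the paper's
# verb hierarchy; the verbs here are invented for illustration.
TAXONOMY = {
    "move": ["walk", "run"],
    "manipulate": ["grasp", "throw"],
}
PARENT = {child: parent for parent, children in TAXONOMY.items() for child in children}

# Dataset-specific labels mapped to canonical verbs (also invented).
SYNONYMS = {"jog": "run", "pick_up": "grasp"}

def align(label):
    # Normalise a dataset label, then return its path in the taxonomy,
    # i.e. the node of the unified semantic space it lands on.
    verb = SYNONYMS.get(label, label)
    if verb not in PARENT:
        raise KeyError(f"unmapped label: {label}")
    return [PARENT[verb], verb]

print(align("jog"), align("pick_up"))   # ['move', 'run'] ['manipulate', 'grasp']
```

Once every dataset's classes resolve to taxonomy nodes, heterogeneous image, video, skeleton, and MoCap datasets share one label system.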
arXiv:2304.00553v4
fatcat:tmyw4ubxgjaszod3oxkivkzyf4
A Study of Actor and Action Semantic Retention in Video Supervoxel Segmentation
[article]
2013
arXiv
pre-print
Our findings suggest that a significant amount of semantic information is well retained in video supervoxel segmentation and can be used for further video analysis. ...
In this paper, we conduct a systematic study of how well the actor and action semantics are retained in video supervoxel segmentation. ...
For example, the state of the art methods on the ImageNet Large Scale Visual Recognition Challenge [3] have accuracies near 20% [4] and a recent work achieves a mean average precision of 0.16 on a ...
arXiv:1311.3318v1
fatcat:ld2ntfh3jnc5nehkplfb7p5kry
Skeleton Image Representation for 3D Action Recognition based on Tree Structure and Reference Joints
[article]
2019
arXiv
pre-print
In recent years, the computer vision research community has studied how to model temporal dynamics in videos for 3D human action recognition. ...
The proposed representation has the advantage of combining the use of reference joints and a tree structure skeleton. ...
We improve the representation of skeleton joints for 3D action recognition encoding temporal dynamics by combining the use of reference joints [15] and a tree structure skeleton [18] . ...
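A minimal sketch of combining a tree-structure joint ordering with a reference joint, assuming a made-up 5-joint skeleton: joints are laid out in depth-first tree order and expressed relative to the reference joint, producing a T x J x 3 "skeleton image" a 2D CNN can consume:

```python
import numpy as np

# Hypothetical 5-joint skeleton: hip -> spine -> {head, l_hand, r_hand},
# with the spine acting as the reference joint.
TREE = {0: [1], 1: [2, 3, 4]}
REF = 1

def tree_order(root=0):
    # Depth-first traversal so adjacent columns of the skeleton image
    # correspond to physically connected joints.
    order, stack = [], [root]
    while stack:
        j = stack.pop()
        order.append(j)
        stack.extend(reversed(TREE.get(j, [])))
    return order

def skeleton_image(frames):
    # frames: (T, J, 3) joint positions. Rows are frames; columns are
    # joints in tree order, relative to the reference joint.
    rel = frames - frames[:, REF:REF+1, :]
    return rel[:, tree_order(), :]

frames = np.arange(2 * 5 * 3, dtype=float).reshape(2, 5, 3)
img = skeleton_image(frames)
print(img.shape, tree_order())   # (2, 5, 3) [0, 1, 2, 3, 4]
```

Subtracting the reference joint gives translation invariance; the tree ordering keeps spatial adjacency meaningful for the convolution kernels.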
arXiv:1909.05704v1
fatcat:bsezywfwsbfaheucuoneubafki
Video Captioning Using Weak Annotation
[article]
2020
arXiv
pre-print
To this end, we propose a progressive visual reasoning method that progressively generates fine sentences from weak annotations by inferring more semantic concepts and their dependency relationships for ...
There now exists an enormous number of videos with weak annotation that contains only semantic concepts such as actions and objects. ...
There are many video datasets with weak annotation that contain only semantic concepts (i.e., actions and objects), such as object classification datasets and action recognition datasets ...
arXiv:2009.01067v1
fatcat:55wjdfuvz5h4vojswkvneuwj7e
Showing results 1 — 15 out of 12,413 results