Open-Set Recognition: an Inexpensive Strategy to Increase DNN Reliability.

To address these challenges, we propose Seeker, a novel approach to efficiently execute DNN inferences for Human Activity Recognition (HAR) tasks, using both an EH-WSN and a host mobile device. ... There is an increasing demand for intelligent processing on emerging ultra-low-power internet of things (IoT) devices, and recent works have shown substantial efficiency boosts by executing inference tasks ... To quantify the scope of this challenge, we perform experiments on the MHEALTH data-set [24], [25] (see Section V for data-set details) using the DNNs proposed in [26], [27], an energy harvesting ...

arXiv:2204.13106v1 fatcat:ryfubvnskrbarjj7kwdalia2f4

The latter disturbances severely hamper the intelligibility of a speech signal, making Distant Speech Recognition (DSR) one of the major open challenges in the field. ... Deep learning is an emerging technology that is considered one of the most promising directions for reaching higher levels of artificial intelligence. ... The improved network capacity, however, increases the number of parameters, inherently requiring more data to reliably estimate them. ...

arXiv:1712.06086v1 fatcat:2b7ymqmihjan5nkxeqrxq52wki

This paper offers an empirical investigation on which aspects of DNN acoustic model design are most important for speech recognition system performance. ... This larger corpus allows us to more thoroughly examine performance of large DNN models -- with up to ten times more parameters than those typically used in speech recognition systems. ... Our work aims to address these concerns by systematically exploring several strategies to improve DNN acoustic models. ...

arXiv:1406.7806v2 fatcat:4oc3szk3sbewlgstnnaq2s25ca

This work explores the use of constant-Q transform based modulation spectral features (CQT-MSF) for speech emotion recognition (SER). ... Finally, we perform Grad-CAM analysis to visually inspect the contribution of constant-Q modulation features over SER. ... Up-to-date design and inclusion of an extensive emotion set with varying intensities make this an important database for SER. ...

doi:10.1016/j.specom.2022.11.005 fatcat:xudmgcpirncd3k2qujlqmpk4z4

We propose an approach to reverberant speech recognition adopting deep learning in the front-end as well as back-end of a reverberant speech recognition system, and a novel method to improve the dereverberation ... The DNN-HMM system trained on the multi-condition training set achieved a conspicuously higher word accuracy compared to the MLLR-adapted GMM-HMM system trained on the same data. ... The PC decode hard feature is expected to be more reliable than the PC hard feature which is computed frame by frame because it takes advantage of the initial recognition results generated using triphone ...

doi:10.1186/s13634-015-0246-6 fatcat:fxywhwrakjbe7jlmvmvv7p2tyq

In this paper, we attempt to unravel three aspects related to the robustness of DNNs for face recognition: (i) assessing the impact of deep architectures for face recognition in terms of vulnerabilities ... Our experimental evaluation using multiple open-source DNN-based face recognition networks, including OpenFace and VGG-Face, and two publicly available databases (MEDS and PaSC) demonstrates that the performance ... To address this challenge without increasing the failure to process rate (by rejecting the samples), the third contribution of this research is a novel technique of selective dropout in the DNN to mitigate ...

arXiv:1803.00401v1 fatcat:giq5ojssbvc65errmgfpkrc7fe

We began right at the start of the Kinect™ revolution when inexpensive infrared cameras providing image depth recordings became available. ... This paper surveys the state of the art on multimodal gesture recognition and introduces the JMLR special topic on gesture recognition 2011-2015. ... We thank our co-organizers of ChaLearn gesture and action recognition challenges: Miguel Reyes, Jordi Gonzalez, Xavier Baro, Jamie Shotton, Victor Ponce, Miguel Angel Bautista, and Hugo Jair Escalante. ...

doi:10.1007/978-3-319-57021-1_1 fatcat:vfeijghqtvffllogw2tium3pwa

OpenCV from Intel is a free and open-source image and video processing library. It is related to computer vision in terms of feature and object recognition, as well as machine learning. ... , and recognition of a person's gender and age. ... The device is inexpensive, easy to set up, and use. The Attitude Tracking Algorithm is highly precise and reliable. ...

doi:10.5120/ijca2022922085 fatcat:wgftgmto55h2rmairdkrqkdmnm

As an RGB-D camera scans a cluttered scene, image-based instance-level semantic segmentation creates semantic object masks that enable real-time object recognition and the creation of an object-level representation ... MaskFusion takes full advantage of using instance-level semantic segmentation to enable semantic labels to be fused into an object-aware map, unlike recent semantics enabled SLAM systems that perform voxel-level ... accurate and robust SLAM system that can deal with arbitrary dynamic and non-rigid scenes remains an open challenge. ...

arXiv:1804.09194v2 fatcat:uinjm6mor5aehp34sydtx2snbm

We began right at the start of the Kinect™ revolution when inexpensive infrared cameras providing image depth recordings became available. ... This paper surveys the state of the art on multimodal gesture recognition and introduces the JMLR special topic on gesture recognition 2011-2015. ... We thank our co-organizers of ChaLearn gesture and action recognition challenges: Miguel Reyes, Jordi Gonzalez, Xavier Baro, Jamie Shotton, Victor Ponce, Miguel Angel Bautista, and Hugo Jair Escalante. ...

dblp:journals/jmlr/EscaleraAG16 fatcat:r4q2iywy7balhjlh2vpknltrde

We then comment on the recent literature related to representation learning and emotion recognition from images of emotionally expressive gestures. ... We present a new comprehensive survey hoping to boost research in the field. ... Combining features into multi-modal sets resulted in a large increase in the recognition rates, by more than 10% when compared to the most successful unimodal system. ...

arXiv:1801.07481v1 fatcat:7x7zhe77rvbqtpedefmwwqisqm

In this context, this paper describes an approach for real-time human action recognition from raw depth image-sequences, provided by an RGB-D camera. ... Human action recognition is a fundamental task in artificial vision that has gained great importance in recent years due to its multiple applications in different areas. ... Despite the numerous works dealing with the recognition of actions, this is still an open issue in real scenarios, with open problems such as the different viewpoints in videos, changing lighting ...

arXiv:2006.07743v1 fatcat:2yea5ajtbnhrzkbrw4r3kp25x4

An Unmanned Aerial Vehicle (UAV), commonly called a drone, is an aircraft without a human pilot aboard. ... Furthermore, we conduct empirical research studies to assess several factors that might influence the efficiency of human detection and action recognition techniques in UAVs. ... There has been an increasing rate of attention paid to training the generated activity recognition model utilizing multi-task learning. ...

doi:10.18280/ts.380515 fatcat:mjtvcswnuvc3dk7hmni4oolkyi

We then comment on the recent literature related to representation learning and emotion recognition from images of emotionally expressive gestures. ... At present, most researchers agree that words serve primarily to convey information and the body movements to form relationships and sometimes even to substitute the verbal communication (e.g. lethal ... Exercising open-handed gestures during conversation can give the impression of a more reliable person. ...

doi:10.1109/taffc.2018.2874986 fatcat:zjnr2w4orje7vj2bhmia4f5qki

As an RGB-D camera scans a cluttered scene, image-based instance-level semantic segmentation creates semantic object masks that enable realtime object recognition and the creation of an object-level representation ... MaskFusion takes full advantage of using instance-level semantic segmentation to enable semantic labels to be fused into an object-aware map, unlike recent semantics enabled SLAM systems that perform voxel-level ... accurate and robust SLAM system that can deal with arbitrary dynamic and non-rigid scenes remains an open challenge. ...

doi:10.1109/ismar.2018.00024 dblp:conf/ismar/RunzBA18 fatcat:bjyv6fxeqnexvoslr4bila4tn4

Seeker: Synergizing Mobile and Energy Harvesting Wearable Sensors for Human Activity Recognition [article]

Deep Learning for Distant Speech Recognition [article]

Building DNN Acoustic Models for Large Vocabulary Speech Recognition [article]

Modulation spectral features for speech emotion recognition using deep neural networks

Reverberant speech recognition combining deep neural networks and deep autoencoders augmented with a phone-class feature

Unravelling Robustness of Deep Learning based Face Recognition Against Adversarial Attacks [article]

Challenges in Multi-modal Gesture Recognition [chapter]

Face Detection and Recognition using OpenCV

MaskFusion: Real-Time Recognition, Tracking and Reconstruction of Multiple Moving Objects [article]

Challenges in multimodal gesture recognition

Survey on Emotional Body Gesture Recognition [article]

3DFCNN: Real-Time Action Recognition using 3D Deep Neural Networks with Raw Depth Information [article]

Challenges and Limitations in Human Action Recognition on Unmanned Aerial Vehicles: A Comprehensive Survey

Survey on Emotional Body Gesture Recognition

MaskFusion: Real-Time Recognition, Tracking and Reconstruction of Multiple Moving Objects
