522 Hits in 7.2 sec

Seeker: Synergizing Mobile and Energy Harvesting Wearable Sensors for Human Activity Recognition [article]

Cyan Subhra Mishra, Jack Sampson, Mahmut Taylan Kandemir, Vijaykrishnan Narayanan
2022 arXiv   pre-print
To address these challenges, we propose Seeker, a novel approach to efficiently execute DNN inferences for Human Activity Recognition (HAR) tasks, using both an EH-WSN and a host mobile device.  ...  There is an increasing demand for intelligent processing on emerging ultra-low-power internet of things (IoT) devices, and recent works have shown substantial efficiency boosts by executing inference tasks  ...  To quantify the scope of this challenge, we perform experiments on the MHEALTH data-set [24], [25] (see Section V for data-set details) using the DNNs proposed in [26], [27], an energy harvesting  ...
arXiv:2204.13106v1 fatcat:ryfubvnskrbarjj7kwdalia2f4

Deep Learning for Distant Speech Recognition [article]

Mirco Ravanelli
2017 arXiv   pre-print
The latter disturbances severely hamper the intelligibility of a speech signal, making Distant Speech Recognition (DSR) one of the major open challenges in the field.  ...  Deep learning is an emerging technology that is considered one of the most promising directions for reaching higher levels of artificial intelligence.  ...  The improved network capacity, however, increases the number of parameters, inherently requiring more data to reliably estimate them.  ... 
arXiv:1712.06086v1 fatcat:2b7ymqmihjan5nkxeqrxq52wki

Building DNN Acoustic Models for Large Vocabulary Speech Recognition [article]

Andrew L. Maas, Peng Qi, Ziang Xie, Awni Y. Hannun, Christopher T. Lengerich, Daniel Jurafsky, Andrew Y. Ng
2015 arXiv   pre-print
This paper offers an empirical investigation on which aspects of DNN acoustic model design are most important for speech recognition system performance.  ...  This larger corpus allows us to more thoroughly examine performance of large DNN models -- with up to ten times more parameters than those typically used in speech recognition systems.  ...  Our work aims to address these concerns by systematically exploring several strategies to improve DNN acoustic models.  ... 
arXiv:1406.7806v2 fatcat:4oc3szk3sbewlgstnnaq2s25ca

Modulation spectral features for speech emotion recognition using deep neural networks

Premjeet Singh, Md Sahidullah, Goutam Saha
2022 Speech Communication  
This work explores the use of constant-Q transform based modulation spectral features (CQT-MSF) for speech emotion recognition (SER).  ...  Finally, we perform Grad-CAM analysis to visually inspect the contribution of constant-Q modulation features over SER.  ...  Up-to-date design and inclusion of an extensive emotion set with varying intensities make this an important database for SER.  ... 
doi:10.1016/j.specom.2022.11.005 fatcat:xudmgcpirncd3k2qujlqmpk4z4

Reverberant speech recognition combining deep neural networks and deep autoencoders augmented with a phone-class feature

Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara
2015 EURASIP Journal on Advances in Signal Processing  
We propose an approach to reverberant speech recognition adopting deep learning in the front-end as well as back-end of a reverberant speech recognition system, and a novel method to improve the dereverberation  ...  The DNN-HMM system trained on the multi-condition training set achieved a conspicuously higher word accuracy compared to the MLLR-adapted GMM-HMM system trained on the same data.  ...  The PC decode hard feature is expected to be more reliable than the PC hard feature, which is computed frame by frame, because it takes advantage of the initial recognition results generated using triphone  ...
doi:10.1186/s13634-015-0246-6 fatcat:fxywhwrakjbe7jlmvmvv7p2tyq

Unravelling Robustness of Deep Learning based Face Recognition Against Adversarial Attacks [article]

Gaurav Goswami, Nalini Ratha, Akshay Agarwal, Richa Singh, Mayank Vatsa
2018 arXiv   pre-print
In this paper, we attempt to unravel three aspects related to the robustness of DNNs for face recognition: (i) assessing the impact of deep architectures for face recognition in terms of vulnerabilities  ...  Our experimental evaluation using multiple open-source DNN-based face recognition networks, including OpenFace and VGG-Face, and two publicly available databases (MEDS and PaSC) demonstrates that the performance  ...  To address this challenge without increasing the failure to process rate (by rejecting the samples), the third contribution of this research is a novel technique of selective dropout in the DNN to mitigate  ... 
arXiv:1803.00401v1 fatcat:giq5ojssbvc65errmgfpkrc7fe

Challenges in Multi-modal Gesture Recognition [chapter]

Sergio Escalera, Vassilis Athitsos, Isabelle Guyon
2017 Gesture Recognition  
We began right at the start of the Kinect™ revolution when inexpensive infrared cameras providing image depth recordings became available.  ...  This paper surveys the state of the art on multimodal gesture recognition and introduces the JMLR special topic on gesture recognition 2011-2015.  ...  We thank our co-organizers of ChaLearn gesture and action recognition challenges: Miguel Reyes, Jordi Gonzalez, Xavier Baro, Jamie Shotton, Victor Ponce, Miguel Angel Bautista, and Hugo Jair Escalante.  ...
doi:10.1007/978-3-319-57021-1_1 fatcat:vfeijghqtvffllogw2tium3pwa

Face Detection and Recognition using OpenCV

Ajay Kumar, Shivansh Chaudhary, Sonik Sangal, Raj Dhama
2022 International Journal of Computer Applications  
OpenCV from Intel is a free and open-source image and video processing library. It is related to computer vision in terms of feature and object recognition, as well as machine learning.  ...  , and recognition of a person's gender and age.  ...  The device is inexpensive, easy to set up, and use. The Attitude Tracking Algorithm is highly precise and reliable.  ... 
doi:10.5120/ijca2022922085 fatcat:wgftgmto55h2rmairdkrqkdmnm

MaskFusion: Real-Time Recognition, Tracking and Reconstruction of Multiple Moving Objects [article]

Martin Rünz, Maud Buffier, Lourdes Agapito
2018 arXiv   pre-print
As an RGB-D camera scans a cluttered scene, image-based instance-level semantic segmentation creates semantic object masks that enable real-time object recognition and the creation of an object-level representation  ...  MaskFusion takes full advantage of using instance-level semantic segmentation to enable semantic labels to be fused into an object-aware map, unlike recent semantics enabled SLAM systems that perform voxel-level  ...  accurate and robust SLAM system that can deal with arbitrary dynamic and non-rigid scenes remains an open challenge.  ... 
arXiv:1804.09194v2 fatcat:uinjm6mor5aehp34sydtx2snbm

Challenges in multimodal gesture recognition

Sergio Escalera, Vassilis Athitsos, Isabelle Guyon
2016 Journal of machine learning research  
We began right at the start of the Kinect™ revolution when inexpensive infrared cameras providing image depth recordings became available.  ...  This paper surveys the state of the art on multimodal gesture recognition and introduces the JMLR special topic on gesture recognition 2011-2015.  ...  We thank our co-organizers of ChaLearn gesture and action recognition challenges: Miguel Reyes, Jordi Gonzalez, Xavier Baro, Jamie Shotton, Victor Ponce, Miguel Angel Bautista, and Hugo Jair Escalante.  ...
dblp:journals/jmlr/EscaleraAG16 fatcat:r4q2iywy7balhjlh2vpknltrde

Survey on Emotional Body Gesture Recognition [article]

Fatemeh Noroozi and Ciprian Adrian Corneanu and Dorota Kamińska and Tomasz Sapiński and Sergio Escalera and Gholamreza Anbarjafari
2018 arXiv   pre-print
We then comment on the recent literature related to representation learning and emotion recognition from images of emotionally expressive gestures.  ...  We present a new comprehensive survey hoping to boost research in the field.  ...  Combining features into multi-modal sets resulted in a large increase in the recognition rates, by more than 10% when compared to the most successful unimodal system.  ...
arXiv:1801.07481v1 fatcat:7x7zhe77rvbqtpedefmwwqisqm

3DFCNN: Real-Time Action Recognition using 3D Deep Neural Networks with Raw Depth Information [article]

Adrian Sanchez-Caballero, Sergio de López-Diz, David Fuentes-Jimenez, Cristina Losada-Gutiérrez, Marta Marrón-Romera, David Casillas-Perez, Mohammad Ibrahim Sarker
2020 arXiv   pre-print
In this context, this paper describes an approach for real-time human action recognition from raw depth image-sequences, provided by an RGB-D camera.  ...  Human actions recognition is a fundamental task in artificial vision, that has earned a great importance in recent years due to its multiple applications in different areas. surveillance.  ...  Despite the numerous works dealing with the recognition of actions, this is then still an open issue in real scenarios, with open problems such as the different viewpoints in videos, changing lighting  ... 
arXiv:2006.07743v1 fatcat:2yea5ajtbnhrzkbrw4r3kp25x4

Challenges and Limitations in Human Action Recognition on Unmanned Aerial Vehicles: A Comprehensive Survey

Nashwan Adnan Othman, Ilhan Aydin
2021 Traitement du signal  
An Unmanned Aerial Vehicle (UAV), commonly called a drone, is an aircraft without a human pilot aboard.  ...  Furthermore, we conduct empirical research studies to assess several factors that might influence the efficiency of human detection and action recognition techniques in UAVs.  ...  There has been an increasing rate of attention paid to training the generated activity recognition model utilizing multi-task learning.  ... 
doi:10.18280/ts.380515 fatcat:mjtvcswnuvc3dk7hmni4oolkyi

Survey on Emotional Body Gesture Recognition

Fatemeh Noroozi, Dorota Kaminska, Ciprian Corneanu, Tomasz Sapinski, Sergio Escalera, Gholamreza Anbarjafari
2019 IEEE Transactions on Affective Computing  
We then comment on the recent literature related to representation learning and emotion recognition from images of emotionally expressive gestures.  ...  At present, most researchers agree that words serve primarily to convey information and the body movements to form relationships and sometimes even to substitute the verbal communication (e.g. lethal  ...  Exercising open-handed gestures during conversation can give the impression of a more reliable person.  ...
doi:10.1109/taffc.2018.2874986 fatcat:zjnr2w4orje7vj2bhmia4f5qki

MaskFusion: Real-Time Recognition, Tracking and Reconstruction of Multiple Moving Objects

Martin Runz, Maud Buffier, Lourdes Agapito
2018 2018 IEEE International Symposium on Mixed and Augmented Reality (ISMAR)  
As an RGB-D camera scans a cluttered scene, image-based instance-level semantic segmentation creates semantic object masks that enable realtime object recognition and the creation of an object-level representation  ...  MaskFusion takes full advantage of using instance-level semantic segmentation to enable semantic labels to be fused into an object-aware map, unlike recent semantics enabled SLAM systems that perform voxel-level  ...  accurate and robust SLAM system that can deal with arbitrary dynamic and non-rigid scenes remains an open challenge.  ... 
doi:10.1109/ismar.2018.00024 dblp:conf/ismar/RunzBA18 fatcat:bjyv6fxeqnexvoslr4bila4tn4