1,478 Hits in 5.2 sec

Visually Supervised Speaker Detection and Localization via Microphone Array [article]

Davide Berghi, Adrian Hilton, Philip J. B. Jackson
2022 arXiv   pre-print
Our solution extends the audio front-end using a microphone array.  ...  Monaural audio may successfully detect the presence of speech activity but fails in localizing the speaker due to the lack of spatial cues.  ...  ACKNOWLEDGMENT Thanks to Marco Volino, Mohd Azri Mohd Izhar, Hansung Kim, Charles Malleson and actors for audio-visual recordings.  ... 
arXiv:2203.03291v1 fatcat:szqx5vlthrbe7lkslyyxwsphee

2020 Index IEEE/ACM Transactions on Audio, Speech, and Language Processing Vol. 28

2020 IEEE/ACM Transactions on Audio Speech and Language Processing  
., +, TASLP 2020 1755-1766 Focusing and Frequency Smoothing for Arbitrary Arrays With Application to Speaker Localization.  ...  Jo, B., +, TASLP 2020 1692-1705 Focusing and Frequency Smoothing for Arbitrary Arrays With Application to Speaker Localization.  ...  T Target tracking Multi-Hypothesis Square-Root Cubature Kalman Particle Filter for Speaker Tracking in Noisy and Reverberant Environments. Zhang, Q., +, TASLP 2020 1183-1197  ... 
doi:10.1109/taslp.2021.3055391 fatcat:7vmstynfqvaprgz6qy3ekinkt4

Multi-Modal Localization and Enhancement of Multiple Sound Sources from a Micro Aerial Vehicle

Ricardo Sanchez-Matilla, Lin Wang, Andrea Cavallaro
2017 Proceedings of the 2017 ACM on Multimedia Conference - MM '17  
array and a video camera.  ...  We first perform audiovisual calibration via camera resectioning, audio-visual temporal alignment and geometrical alignment to jointly use the features in the audio and video streams, which are independently  ...  As our camera has its own built-in microphone, we only need to detect the time offset between the audio sequences from the array microphone and the camera microphone.  ... 
doi:10.1145/3123266.3123412 dblp:conf/mm/Sanchez-Matilla17 fatcat:gd4iptvcizb7dnjdi47cpfv5qu

Table of Contents

2021 IEEE/ACM Transactions on Audio Speech and Language Processing  
Hasegawa-Johnson, and S. Thomas Multiple Acoustic Source Localization in Microphone Array Networks. J. Yang, X. Zhong, W. Chen, and W.  ...  Liu, and H. Meng Towards Duration Robust Weakly Supervised Sound Event Detection. H. Dinkel, M. Wu, and K.  ... 
doi:10.1109/taslp.2021.3137064 fatcat:rpka3f2bhjh37c7pkhiowyndhm

Audio-Visual Speaker Diarization Based on Spatiotemporal Bayesian Fusion

Israel D. Gebru, Sileye Ba, Xiaofei Li, Radu Horaud
2018 IEEE Transactions on Pattern Analysis and Machine Intelligence  
than facing the cameras and the microphones.  ...  The latter is solved within a novel audio-visual fusion method on the following grounds: binaural spectral features are first extracted from a microphone pair, then a supervised audio-visual alignment  ...  They combine voice activity detection with sound-source localization using a linear microphone array which provides horizontal (azimuth) speech directions.  ... 
doi:10.1109/tpami.2017.2648793 pmid:28103192 fatcat:cn6tcdf5n5dp7leyrrrevtxln4

Self-supervised Neural Audio-Visual Sound Source Localization via Probabilistic Spatial Modeling [article]

Yoshiki Masuyama, Yoshiaki Bando, Kohei Yatabe, Yoko Sasaki, Masaki Onishi, Yasuhiro Oikawa
2020 arXiv   pre-print
Our system for localizing sound source objects in the image is composed of audio and visual DNNs. The visual DNN is trained to localize sound source candidates within an input image.  ...  We also demonstrate that the visual DNN detected objects including talking visitors and specific exhibits from real data recorded in a science museum.  ...  Yusuke Date and Dr. Yu Hoshina for their support in the experiment in Miraikan. This study was partially supported by JSPS KAKENHI No. 18H06490 for funding.  ... 
arXiv:2007.13976v1 fatcat:k4sho4ggnbafbfc3wyhngmg76a

2021 Index IEEE/ACM Transactions on Audio, Speech, and Language Processing Vol. 29

2021 IEEE/ACM Transactions on Audio Speech and Language Processing  
The primary entry includes the coauthors' names, the title of the paper or other item, and its location, specified by the publication abbreviation, year, and inclusive pagination.  ...  -that appeared in this periodical during 2021, and items from previous years that were commented upon or corrected in 2021.  ...  ., +, TASLP 2021 1864-1880 Multiple Acoustic Source Localization in Microphone Array Networks.  ... 
doi:10.1109/taslp.2022.3147096 fatcat:7nl52k7sjfalbhpxtum3y5nmje

Multi-Speaker DOA Estimation Using Deep Convolutional Networks Trained with Noise Signals [article]

Soumitro Chakrabarty, Emanuël A. P. Habets
2018 arXiv   pre-print
Supervised learning based methods for source localization, being data driven, can be adapted to different acoustic conditions via training and have been shown to be robust to adverse acoustic environments  ...  Through additional empirical investigation, it is also shown that with an array of M microphones our proposed framework yields the best localization performance with M-1 convolution layers.  ...  For the eight microphone array, 6 CNNs are trained, whereas for the six microphones and the four microphone array, 4 and 2 CNNs are trained, respectively.  ... 
arXiv:1807.11722v1 fatcat:xv5uuacz6zdlvdmitxvtzxdzam

2019 Index IEEE/ACM Transactions on Audio, Speech, and Language Processing Vol. 27

2019 IEEE/ACM Transactions on Audio Speech and Language Processing  
Microphone Arrays.  ...  ., +, TASLP April 2019 679-691 On Mainlobe Orientation of the First-and Second-Order Differential Microphone Arrays.  ... 
doi:10.1109/taslp.2020.2971902 fatcat:j66uwjyqlfbmtgda6zhzlswpva

Acoustic sensor networks for woodpecker localization

H. Wang, C. E. Chen, A. Ali, S. Asgari, R. E. Hudson, K. Yao, D. Estrin, C. Taylor, Franklin T. Luk
2005 Advanced Signal Processing Algorithms, Architectures, and Implementations XV  
In this paper, we investigate design, analysis, and testing of acoustic arrays for localizing acorn woodpeckers using their vocalizations. 1, 2 Each acoustic array consists of four microphones arranged  ...  Woodpecker localization experiments using robust array element spacing in different types of environments are conducted and compared.  ...  We appreciate the assistance of Kathy Griffith, Joe Wise, Chih-Kai Chen, and Hyunggon Park in conducting various experiments.  ... 
doi:10.1117/12.617983 fatcat:j43b62vfhfbg3kh5umxspk4r24

Tracking the Active Speaker Based on a Joint Audio-Visual Observation Model

Israel D. Gebru, Sileye Ba, Georgios Evangelidis, Radu Horaud
2015 2015 IEEE International Conference on Computer Vision Workshop (ICCVW)  
We here cast the diarization problem into a tracking formulation whereby the active speaker is detected and tracked over time.  ...  A probabilistic tracker exploits the on-image (spatial) coincidence of visual and auditory observations and infers a single latent variable which represents the identity of the active speaker.  ...  They combine voice activity detection with sound-source localization using a linear microphone array. The latter can only provide the azimuth (horizontal) sound direction.  ... 
doi:10.1109/iccvw.2015.96 dblp:conf/iccvw/GebruBEH15 fatcat:lruasrz6sfgn7imwdwdd7gne2y

Table of Contents

2021 IEEE/ACM Transactions on Audio Speech and Language Processing  
in Microphone Array Networks. J.  ...  Chin Towards Duration Robust Weakly Supervised Sound Event Detection. H. Dinkel, M. Wu, and K.  ...  Speech Enhancement and Separation  ... 
doi:10.1109/taslp.2021.3137066 fatcat:ocit27xwlbagtjdyc652yws4xa

Data-Driven Multi-Microphone Speaker Localization on Manifolds

Bracha Laufer-Goldshtein, Ronen Talmon, Sharon Gannot
2020 Foundations and Trends® in Signal Processing  
Data-Driven Localization and Tracking Learning-based approaches have been proposed for both microphone array and binaural localization.  ...  We present two localization algorithms that were designed for a single microphone array of two microphones.  ... 
doi:10.1561/2000000098 fatcat:a7et5bmprvcvxajwsx73j3lywy

ChildBot: Multi-Robot Perception and Interaction with Children [article]

Niki Efthymiou, Panagiotis P. Filntisis, Petros Koutras, Antigoni Tsiami, Jack Hadfield, Gerasimos Potamianos, Petros Maragos
2020 arXiv   pre-print
In this paper we present an integrated robotic system capable of participating in and performing a wide range of educational and entertainment tasks, in collaboration with one or more children.  ...  The system, called ChildBot, features multimodal perception modules and multiple robotic agents that monitor the interaction environment, and can robustly coordinate complex Child-Robot Interaction use-cases  ...  Asimenia Papoulidi for their help in designing the use-cases, supervising and evaluating the experiments with the children, and their useful remarks.  ... 
arXiv:2008.12818v1 fatcat:au33jpbqpnfr5foaiivmd76n3y

Audiovisual Information Fusion in Human–Computer Interfaces and Intelligent Environments: A Survey

Shankar T. Shivappa, Mohan Manubhai Trivedi, Bhaskar D. Rao
2010 Proceedings of the IEEE  
Microphones and cameras have been extensively used to observe and detect human activity and to facilitate natural modes of interaction between humans and intelligent systems.  ...  Intelligent systems with audio-visual sensors should be capable of achieving similar goals. The audio-visual information fusion strategy is a key component in designing such systems.  ...  ACKNOWLEDGMENT We would like to thank our main sponsors, CALIT2 at UC San Diego, NSF's RESCUE project and the UC Discovery program.  ... 
doi:10.1109/jproc.2010.2057231 fatcat:lfzgfmn2hjdq7h6o5txva3oapq