Visually Supervised Speaker Detection and Localization via Microphone Array
[article]
2022
arXiv
pre-print
Our solution extends the audio front-end using a microphone array. ...
Monaural audio may successfully detect the presence of speech activity but fails in localizing the speaker due to the lack of spatial cues. ...
ACKNOWLEDGMENT Thanks to Marco Volino, Mohd Azri Mohd Izhar, Hansung Kim, Charles Malleson and actors for audio-visual recordings. ...
arXiv:2203.03291v1
fatcat:szqx5vlthrbe7lkslyyxwsphee
2020 Index IEEE/ACM Transactions on Audio, Speech, and Language Processing Vol. 28
2020
IEEE/ACM Transactions on Audio Speech and Language Processing
., +, TASLP 2020 1755-1766 Focusing and Frequency Smoothing for Arbitrary Arrays With Application to Speaker Localization. ...
Jo, B., +, TASLP 2020 1692-1705 Focusing and Frequency Smoothing for Arbitrary Arrays With Application to Speaker Localization. ...
T Target tracking Multi-Hypothesis Square-Root Cubature Kalman Particle Filter for Speaker Tracking in Noisy and Reverberant Environments. Zhang, Q., +, TASLP 2020 1183-1197 ...
doi:10.1109/taslp.2021.3055391
fatcat:7vmstynfqvaprgz6qy3ekinkt4
Multi-Modal Localization and Enhancement of Multiple Sound Sources from a Micro Aerial Vehicle
2017
Proceedings of the 2017 ACM on Multimedia Conference - MM '17
array and a video camera. ...
We first perform audiovisual calibration via camera resectioning, audio-visual temporal alignment and geometrical alignment to jointly use the features in the audio and video streams, which are independently ...
As our camera has its own built-in microphone, we only need to detect the time offset between the audio sequences from the array microphone and the camera microphone. ...
doi:10.1145/3123266.3123412
dblp:conf/mm/Sanchez-Matilla17
fatcat:gd4iptvcizb7dnjdi47cpfv5qu
Table of Contents
2021
IEEE/ACM Transactions on Audio Speech and Language Processing
Hasegawa-Johnson, and S. Thomas Multiple Acoustic Source Localization in Microphone Array Networks ... J. Yang, X. Zhong, W. Chen, and W. ...
Liu, and H. Meng Towards Duration Robust Weakly Supervised Sound Event Detection ... H. Dinkel, M. Wu, and K. ...
doi:10.1109/taslp.2021.3137064
fatcat:rpka3f2bhjh37c7pkhiowyndhm
Audio-Visual Speaker Diarization Based on Spatiotemporal Bayesian Fusion
2018
IEEE Transactions on Pattern Analysis and Machine Intelligence
than facing the cameras and the microphones. ...
The latter is solved within a novel audio-visual fusion method on the following grounds: binaural spectral features are first extracted from a microphone pair, then a supervised audio-visual alignment ...
They combine voice activity detection with sound-source localization using a linear microphone array which provides horizontal (azimuth) speech directions. ...
doi:10.1109/tpami.2017.2648793
pmid:28103192
fatcat:cn6tcdf5n5dp7leyrrrevtxln4
Self-supervised Neural Audio-Visual Sound Source Localization via Probabilistic Spatial Modeling
[article]
2020
arXiv
pre-print
Our system for localizing sound source objects in the image is composed of audio and visual DNNs. The visual DNN is trained to localize sound source candidates within an input image. ...
We also demonstrate that the visual DNN detected objects including talking visitors and specific exhibits from real data recorded in a science museum. ...
Yusuke Date and Dr. Yu Hoshina for their support in the experiment in Miraikan. This study was partially supported by JSPS KAKENHI No. 18H06490 for funding. ...
arXiv:2007.13976v1
fatcat:k4sho4ggnbafbfc3wyhngmg76a
2021 Index IEEE/ACM Transactions on Audio, Speech, and Language Processing Vol. 29
2021
IEEE/ACM Transactions on Audio Speech and Language Processing
The primary entry includes the coauthors' names, the title of the paper or other item, and its location, specified by the publication abbreviation, year, and inclusive pagination. ...
-that appeared in this periodical during 2021, and items from previous years that were commented upon or corrected in 2021. ...
., +, TASLP 2021 1864-1880 Multiple Acoustic Source Localization in Microphone Array Networks. ...
doi:10.1109/taslp.2022.3147096
fatcat:7nl52k7sjfalbhpxtum3y5nmje
Multi-Speaker DOA Estimation Using Deep Convolutional Networks Trained with Noise Signals
[article]
2018
arXiv
pre-print
Supervised learning based methods for source localization, being data driven, can be adapted to different acoustic conditions via training and have been shown to be robust to adverse acoustic environments ...
Through additional empirical investigation, it is also shown that with an array of M microphones our proposed framework yields the best localization performance with M-1 convolution layers. ...
For the eight microphone array, 6 CNNs are trained, whereas for the six microphones and the four microphone array, 4 and 2 CNNs are trained, respectively. ...
arXiv:1807.11722v1
fatcat:xv5uuacz6zdlvdmitxvtzxdzam
2019 Index IEEE/ACM Transactions on Audio, Speech, and Language Processing Vol. 27
2019
IEEE/ACM Transactions on Audio Speech and Language Processing
Microphone Arrays. ...
., +, TASLP April 2019 679-691 On Mainlobe Orientation of the First- and Second-Order Differential Microphone Arrays. ...
doi:10.1109/taslp.2020.2971902
fatcat:j66uwjyqlfbmtgda6zhzlswpva
Acoustic sensor networks for woodpecker localization
2005
Advanced Signal Processing Algorithms, Architectures, and Implementations XV
In this paper, we investigate design, analysis, and testing of acoustic arrays for localizing acorn woodpeckers using their vocalizations. 1, 2 Each acoustic array consists of four microphones arranged ...
Woodpecker localization experiments using robust array element spacing in different types of environments are conducted and compared. ...
We appreciate the assistance of Kathy Griffith, Joe Wise, Chih-Kai Chen, and Hyunggon Park in conducting various experiments. ...
doi:10.1117/12.617983
fatcat:j43b62vfhfbg3kh5umxspk4r24
Tracking the Active Speaker Based on a Joint Audio-Visual Observation Model
2015
2015 IEEE International Conference on Computer Vision Workshop (ICCVW)
We here cast the diarization problem into a tracking formulation whereby the active speaker is detected and tracked over time. ...
A probabilistic tracker exploits the on-image (spatial) coincidence of visual and auditory observations and infers a single latent variable which represents the identity of the active speaker. ...
They combine voice activity detection with sound-source localization using a linear microphone array. The latter can only provide the azimuth (horizontal) sound direction. ...
doi:10.1109/iccvw.2015.96
dblp:conf/iccvw/GebruBEH15
fatcat:lruasrz6sfgn7imwdwdd7gne2y
Table of Contents
2021
IEEE/ACM Transactions on Audio Speech and Language Processing
in Microphone Array Networks ... J. ...
Chin Towards Duration Robust Weakly Supervised Sound Event Detection ... H. Dinkel, M. Wu, and K. ...
Speech Enhancement and Separation ...
doi:10.1109/taslp.2021.3137066
fatcat:ocit27xwlbagtjdyc652yws4xa
Data-Driven Multi-Microphone Speaker Localization on Manifolds
2020
Foundations and Trends® in Signal Processing
Data-Driven Localization and Tracking Learning-based approaches have been proposed for both microphone array and binaural localization. ...
We present two localization algorithms that were designed for a single microphone array of two microphones. ...
doi:10.1561/2000000098
fatcat:a7et5bmprvcvxajwsx73j3lywy
ChildBot: Multi-Robot Perception and Interaction with Children
[article]
2020
arXiv
pre-print
In this paper we present an integrated robotic system capable of participating in and performing a wide range of educational and entertainment tasks, in collaboration with one or more children. ...
The system, called ChildBot, features multimodal perception modules and multiple robotic agents that monitor the interaction environment, and can robustly coordinate complex Child-Robot Interaction use-cases ...
Asimenia Papoulidi for their help in designing the use-cases, supervising and evaluating the experiments with the children, and their useful remarks. ...
arXiv:2008.12818v1
fatcat:au33jpbqpnfr5foaiivmd76n3y
Audiovisual Information Fusion in Human–Computer Interfaces and Intelligent Environments: A Survey
2010
Proceedings of the IEEE
Microphones and cameras have been extensively used to observe and detect human activity and to facilitate natural modes of interaction between humans and intelligent systems. ...
Intelligent systems with audio-visual sensors should be capable of achieving similar goals. The audio-visual information fusion strategy is a key component in designing such systems. ...
ACKNOWLEDGMENT We would like to thank our main sponsors, CALIT2 at UC San Diego, NSF's RESCUE project and the UC Discovery program. ...
doi:10.1109/jproc.2010.2057231
fatcat:lfzgfmn2hjdq7h6o5txva3oapq