Multi-Microphone Complex Spectral Mapping for Speech Dereverberation
[article]
2020
arXiv
pre-print
This study proposes a multi-microphone complex spectral mapping approach for speech dereverberation on a fixed array geometry. ...
Experimental results on multi-channel speech dereverberation demonstrate the effectiveness of the proposed approach. ...
With multiple microphones, spatial information can be leveraged in addition to spectral cues to improve speech enhancement and audio source separation. ...
arXiv:2003.01861v1
fatcat:7yaudakexrddzcyq5gpbm6flka
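The snippet above only names the approach. As a rough illustration (not the paper's code; every function name and shape here is an assumption), complex spectral mapping feeds a DNN with the real and imaginary STFT components of all microphones stacked along a channel axis:

```python
import numpy as np

def stft(x, frame=256, hop=128):
    """Naive STFT: Hann-windowed frames -> one-sided FFT per frame."""
    win = np.hanning(frame)
    n_frames = 1 + (len(x) - frame) // hop
    frames = np.stack([x[i*hop:i*hop+frame] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=-1)          # (n_frames, frame//2 + 1)

def multi_mic_features(mics):
    """Stack real and imaginary STFT parts of every microphone
    along a leading channel axis: (2*M, n_frames, n_bins)."""
    specs = [stft(x) for x in mics]
    return np.stack([part for s in specs for part in (s.real, s.imag)])

rng = np.random.default_rng(0)
mics = [rng.standard_normal(4000) for _ in range(6)]   # hypothetical 6-mic array
feats = multi_mic_features(mics)
print(feats.shape)  # (12, 30, 129)
```

The network then predicts the real and imaginary components of the target, so no explicit phase model is needed.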
Resource-Efficient Speech Mask Estimation for Multi-Channel Speech Enhancement
[article]
2020
arXiv
pre-print
This speech mask is used to obtain either the Minimum Variance Distortionless Response (MVDR) or Generalized Eigenvalue (GEV) beamformer. ...
In particular, we use reduced-precision DNNs for estimating a speech mask from noisy, multi-channel microphone observations. ...
We use the 6 channel data processed with 32-, 8-, 4-, and 1-bit DNNs for speech mask estimation using GEV-PAN, GEV-BAN and MVDR beamformers. ...
arXiv:2007.11477v1
fatcat:67ldkced3rf5nj2evkbejmr66y
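The mask-then-beamform recipe in this entry is a standard one: a DNN mask marks speech-dominant and noise-dominant time-frequency bins, per-bin spatial covariances are accumulated from them, and an MVDR filter is derived per frequency. A minimal NumPy sketch of that generic recipe (not the paper's reduced-precision pipeline; the mask below is random purely for illustration):

```python
import numpy as np

def mvdr_weights(Phi_n, steering):
    """MVDR solution: w = Phi_n^{-1} d / (d^H Phi_n^{-1} d)."""
    num = np.linalg.solve(Phi_n, steering)
    return num / (steering.conj() @ num)

def mask_based_mvdr(Y, speech_mask):
    """Y: (M, T, F) complex STFT of M mics; speech_mask: (T, F) in [0, 1].
    Returns a beamformed (T, F) spectrogram, one MVDR filter per bin."""
    M, T, F = Y.shape
    out = np.empty((T, F), dtype=complex)
    for f in range(F):
        Yf = Y[:, :, f]                                        # (M, T)
        Phi_s = (speech_mask[:, f] * Yf) @ Yf.conj().T / T     # speech cov.
        Phi_n = ((1 - speech_mask[:, f]) * Yf) @ Yf.conj().T / T
        Phi_n += 1e-6 * np.eye(M)                              # diagonal loading
        # steering vector: principal eigenvector of the speech covariance
        d = np.linalg.eigh(Phi_s)[1][:, -1]
        out[:, f] = mvdr_weights(Phi_n, d).conj() @ Yf
    return out

rng = np.random.default_rng(1)
Y = rng.standard_normal((6, 50, 17)) + 1j * rng.standard_normal((6, 50, 17))
mask = rng.uniform(size=(50, 17))
S = mask_based_mvdr(Y, mask)
print(S.shape)  # (50, 17)
```

The GEV beamformer mentioned in the entry replaces this closed-form solution with the principal generalized eigenvector of the (speech, noise) covariance pair, followed by a normalization such as BAN or PAN.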
Backward Compatible Spatialized Teleconferencing based on Squeezed Recordings
[chapter]
2011
Advances in Sound Localization
In echoic conditions (Fig. 12(b)) ... The final set of results, in Fig. 13, is for recordings of two simultaneous speech sources of equal power (SNR of 0 dB), separated by an angle of 45° and at a distance ...
Average results for recordings in diffuse noise across all noise sources for an SNR of 0 dB (a) Anechoic recordings. (b) Echoic recordings. ...
This chapter is distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License, which permits use, distribution and reproduction for non-commercial purposes, provided ...
doi:10.5772/14413
fatcat:hocbqdiqrndihjpx3odimqcxoa
A Consolidated Perspective on Multimicrophone Speech Enhancement and Source Separation
2017
IEEE/ACM Transactions on Audio Speech and Language Processing
Research in speech enhancement and separation has followed two convergent paths, starting with microphone array processing and blind source separation, respectively. ...
In addition, they are crucial pre-processing steps for noise-robust automatic speech and speaker recognition. Many devices now have two to eight microphones. ...
Moreover, several source separation methods were used as pre-processing for speech recognition within a series of speech separation and recognition challenges [305] . ...
doi:10.1109/taslp.2016.2647702
fatcat:ltfmmoguxngk5jrvzy7azzufae
Hybrid Neural Networks for On-Device Directional Hearing
2022
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI-22)
On-device directional hearing requires audio source separation from a given direction while achieving stringent human-imperceptible latency requirements. ...
Further, our real-time hybrid model runs in 8 ms on mobile CPUs designed for low-power wearable devices and achieves an end-to-end latency of 17.5 ms. ...
Acknowledgments We thank Les Atlas, Steve Seitz, Laura Trutoiu and Ludwig Schmidt for their important feedback on this work. ...
doi:10.1609/aaai.v36i10.21394
fatcat:d6bukcg6efgejbouu2gb3fezbm
Leveraging Low-Distortion Target Estimates for Improved Speech Enhancement
[article]
2021
arXiv
pre-print
A promising approach for multi-microphone speech separation involves two deep neural networks (DNNs), where the predicted target speech from the first DNN is used to compute signal statistics for time-invariant minimum variance distortionless response (MVDR) beamforming, and the MVDR result is then used as extra features for the second DNN to predict target speech. ...
Shinji Watanabe for helpful discussions. ...
arXiv:2110.00570v1
fatcat:zpuhd7gxezaynkwgwisrs2dqxm
Clustered Blind Beamforming From Ad-Hoc Microphone Arrays
2011
IEEE Transactions on Audio, Speech, and Language Processing
Therefore, it is hypothesised that using a cluster of microphones (i.e., a sub-array), closely located both to each other and to the desired speech source, may in fact provide more robust speech enhancement ...
A similar, but distinct, scenario has been investigated in the NIST meeting transcription evaluations, in which data is recorded using a variety of randomly placed table-top microphones as well as small ...
B. Cluster Proximity Ranking For the subsequent speech enhancement and recognition evaluation, the rule-based clustering was used to obtain microphone clusters for Data Sets B, C and D. ...
doi:10.1109/tasl.2010.2055560
fatcat:n4inflyvdvdl5ehla3snopitd4
Multi-microphone Complex Spectral Mapping for Utterance-wise and Continuous Speech Separation
[article]
2021
arXiv
pre-print
components of multiple microphones. ...
State-of-the-art separation performance is obtained on the simulated two-talker SMS-WSJ corpus and the real-recorded LibriCSS dataset. ...
MISO1-BF: We use the initial separation results by MISO1 to compute an MVDR beamformer for each source (denoted as MISO1-BF). ...
arXiv:2010.01703v2
fatcat:huvvxizr2jhjlhugtwk4kr7kze
Hybrid Neural Networks for On-device Directional Hearing
[article]
2021
arXiv
pre-print
On-device directional hearing requires audio source separation from a given direction while achieving stringent human-imperceptible latency requirements. ...
Further, our real-time hybrid model runs in 8 ms on mobile CPUs designed for low-power wearable devices and achieves an end-to-end latency of 17.5 ms. ...
Acknowledgments We thank Les Atlas, Steve Seitz, Laura Trutoiu and Ludwig Schmidt for their important feedback on this work. ...
arXiv:2112.05893v1
fatcat:eahy332puzdttmriiqx65bzy3q
A Source Separation Evaluation Method In Object-Based Spatial Audio
2015
Zenodo
Fig. 3. Setup for real-room speech recordings. ...
The 48-channel microphone array was hung right above the dummy head, to record concurrent speech signals coming from position pairs (A,B), (A,C) and (A,D). ...
doi:10.5281/zenodo.38891
fatcat:yufnxyhigzdcvhfkrbnkkqxdpy
Comparing Binaural Pre-processing Strategies III
2015
Trends in Hearing
The individual SRTs measured without pre-processing and individual benefits were objectively estimated using the binaural speech intelligibility model. ...
The comparable benefit observed for both groups suggests a possible application of noise reduction schemes for listeners with different hearing status. ...
Funding The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was funded by the Cluster of Excellence "Hearing4All ...
doi:10.1177/2331216515618609
pmid:26721922
pmcid:PMC4771033
fatcat:h4javfvdlvfx7edf5dnhl4iemm
A Framework for Speech Enhancement With Ad Hoc Microphone Arrays
2016
IEEE/ACM Transactions on Audio Speech and Language Processing
reference when more than one subarray is used. ...
When perceptual quality or intelligibility of the speech are the ultimate goals, there are turning points where the MVDR and the LCMV are superior to Wiener-based methods. ...
One sample (3-5 seconds) for each speaker type is used from TSP speech audio recordings available in SMARD. ...
doi:10.1109/taslp.2016.2537202
fatcat:3xb4plnpkjcgfgahqrft524cj4
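This entry contrasts MVDR/LCMV beamforming with Wiener-based methods. For orientation, a minimal sketch of the single-channel Wiener gain that such methods apply per time-frequency bin (illustrative only, not the paper's framework; the PSDs below are synthetic, and the a-priori SNR is crudely estimated by power subtraction):

```python
import numpy as np

def wiener_gain(noisy_psd, noise_psd, floor=1e-3):
    """Wiener gain per TF bin: G = SNR_prio / (1 + SNR_prio),
    with the a-priori SNR estimated by power subtraction and floored."""
    snr_prio = np.maximum(noisy_psd / noise_psd - 1.0, 0.0)
    return np.maximum(snr_prio / (1.0 + snr_prio), floor)

rng = np.random.default_rng(3)
noisy = rng.uniform(0.5, 4.0, size=(100, 129))   # |Y|^2, hypothetical values
noise = np.full((100, 129), 1.0)                  # estimated noise PSD
G = wiener_gain(noisy, noise)
print(G.min() >= 1e-3 and G.max() < 1.0)          # gains stay in (0, 1)
```

Unlike MVDR, which is distortionless toward the look direction, this gain trades target distortion against noise reduction, which is the turning point the entry alludes to.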
TaylorBeamformer: Learning All-Neural Beamformer for Multi-Channel Speech Enhancement from Taylor's Approximation Theory
[article]
2022
arXiv
pre-print
In an attempt to fill this gap, we propose a novel neural beamformer inspired by Taylor's approximation theory, called TaylorBeamformer, for multi-channel speech enhancement. ...
While existing end-to-end beamformers achieve impressive performance in various front-end speech processing tasks, they usually encapsulate the whole process into a black box and thus lack adequate interpretability ...
Introduction Multi-channel speech enhancement (MCSE) aims at extracting target speech from multiple noisy-reverberant microphone recording signals. ...
arXiv:2203.07195v2
fatcat:n5if7hfkubeotkmavdqh2f2rgm
Multiresolution Speech Enhancement Based on Proposed Circular Nested Microphone Array in Combination with Sub-Band Affine Projection Algorithm
2020
Applied Sciences
In this article, we first propose a uniform circular nested microphone array (CNMA) for data recording. ...
The multi-channel methods increase the speech enhancement performance by providing more information with the use of more microphones. ...
Acknowledgments: This work was supported by the Vicerrectoría de Investigación y Postgrado of the Universidad Tecnológica Metropolitana, the Vicerrectoría de Investigación y Postgrado, and Faculty of Engineering ...
doi:10.3390/app10113955
fatcat:ixkzoa6vwnedzcvzwewiervq6m
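The sub-band affine projection algorithm in this entry builds on the classic full-band APA, which generalizes NLMS by projecting the filter update onto the last few input regressors. A full-band sketch (illustrative; the paper's sub-band decomposition and nested array are not reproduced, and all parameter values here are assumptions), shown identifying an unknown FIR system from its noiseless output:

```python
import numpy as np

def apa(x, d, order=8, proj=4, mu=0.5, delta=1e-4):
    """Affine projection algorithm: adapt an FIR filter w so that w^T x ~ d,
    using the `proj` most recent input regressors per update."""
    w = np.zeros(order)
    for n in range(order + proj - 1, len(x)):
        # X: (proj, order) matrix of the most recent input regressors,
        # row k holding [x[n-k], x[n-k-1], ..., x[n-k-order+1]]
        X = np.stack([x[n-k-order+1:n-k+1][::-1] for k in range(proj)])
        e = d[n-proj+1:n+1][::-1] - X @ w            # errors at times n..n-proj+1
        w += mu * X.T @ np.linalg.solve(X @ X.T + delta * np.eye(proj), e)
    return w

rng = np.random.default_rng(2)
h = rng.standard_normal(8)              # unknown system to identify
x = rng.standard_normal(5000)
d = np.convolve(x, h)[:len(x)]          # noiseless desired signal
w = apa(x, d)
print(np.allclose(w, h, atol=1e-2))     # True: w converges to h
```

With `proj=1` this reduces to (regularized) NLMS; the sub-band variant runs such an update independently in each analysis band, which is what speeds up convergence for colored signals like speech.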
Minimum Variance Distortionless Response Beamformer with Enhanced Nulling Level Control via Dynamic Mutated Artificial Immune System
2014
The Scientific World Journal
Minimum variance distortionless response (MVDR) beamforming is capable of determining the weight vectors for beam steering; however, its nulling level on the interference sources remains unsatisfactory ...
Hence, in this paper, a new dynamic mutated artificial immune system (DM-AIS) is proposed to enhance MVDR beamforming for controlling the null steering of interference and increase the signal to interference ...
Acknowledgment This work was supported in part by MOSTI (Ministry of Science, Technology and Innovation, Malaysia) with Project no. 01-02-03-SF0202. ...
doi:10.1155/2014/164053
pmid:25003136
pmcid:PMC4070486
fatcat:jcb57f7mpvajdn3or5dlevqvyi
Showing results 1 — 15 out of 117 results