117 Hits in 4.2 sec

Multi-Microphone Complex Spectral Mapping for Speech Dereverberation [article]

Zhong-Qiu Wang, DeLiang Wang
2020 arXiv   pre-print
This study proposes a multi-microphone complex spectral mapping approach for speech dereverberation on a fixed array geometry.  ...  Experimental results on multi-channel speech dereverberation demonstrate the effectiveness of the proposed approach.  ...  With multiple microphones, spatial information can be leveraged in addition to spectral cues to improve speech enhancement and audio source separation.  ... 
arXiv:2003.01861v1 fatcat:7yaudakexrddzcyq5gpbm6flka
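
As a rough illustration of the complex spectral mapping setup described in this entry, the following Python sketch stacks the real and imaginary (RI) STFT components of all microphones into a single real-valued feature matrix; the trained DNN that maps these features to the target RI components is omitted, and all function names and parameter values are illustrative assumptions rather than details from the paper.

import numpy as np
from scipy.signal import stft

def stack_ri_features(mics, fs=16000, nfft=512, hop=256):
    # mics: (n_mics, n_samples) time-domain signals from a fixed array.
    # Returns a (n_frames, 2 * n_mics * n_freqs) real-valued feature matrix.
    feats = []
    for x in mics:
        _, _, X = stft(x, fs=fs, nperseg=nfft, noverlap=nfft - hop)  # X: (freq, time), complex
        feats.append(np.real(X))
        feats.append(np.imag(X))
    return np.concatenate(feats, axis=0).T

# Example: six microphones, one second of audio at 16 kHz.
mics = np.random.randn(6, 16000)
print(stack_ri_features(mics).shape)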

Resource-Efficient Speech Mask Estimation for Multi-Channel Speech Enhancement [article]

Lukas Pfeifenberger, Matthias Zöhrer, Günther Schindler, Wolfgang Roth, Holger Fröning, Franz Pernkopf
2020 arXiv   pre-print
This speech mask is used to obtain either the Minimum Variance Distortionless Response (MVDR) or Generalized Eigenvalue (GEV) beamformer.  ...  In particular, we use reduced-precision DNNs for estimating a speech mask from noisy, multi-channel microphone observations.  ...  We use the 6 channel data processed with 32-, 8-, 4-, and 1-bit DNNs for speech mask estimation using GEV-PAN, GEV-BAN and MVDR beamformers.  ... 
arXiv:2007.11477v1 fatcat:67ldkced3rf5nj2evkbejmr66y
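
The mask-then-beamform pipeline in this entry follows a standard recipe: a speech mask (here assumed given) weights the multi-channel STFT to form speech and noise spatial covariance matrices, from which GEV weights are taken as the principal generalized eigenvector. The sketch below shows that generic recipe in NumPy/SciPy; it is not the paper's reduced-precision implementation.

import numpy as np
from scipy.linalg import eigh

def mask_to_covariances(Y, mask, eps=1e-6):
    # Y: (n_mics, n_freq, n_frames) complex STFT; mask: (n_freq, n_frames) in [0, 1].
    Phi_s = np.einsum('ft,mft,nft->fmn', mask, Y, Y.conj())
    Phi_n = np.einsum('ft,mft,nft->fmn', 1.0 - mask, Y, Y.conj())
    Phi_s /= np.maximum(mask.sum(-1), eps)[:, None, None]
    Phi_n /= np.maximum((1.0 - mask).sum(-1), eps)[:, None, None]
    return Phi_s, Phi_n

def gev_beamformer(Phi_s, Phi_n, eps=1e-6):
    # Per-frequency max-SNR (GEV) weights: principal generalized eigenvector of (Phi_s, Phi_n).
    # In practice a postfilter (e.g., BAN, as in the GEV-BAN variant above) resolves the
    # arbitrary scaling of GEV weights.
    n_freq, n_mics, _ = Phi_s.shape
    W = np.zeros((n_freq, n_mics), dtype=complex)
    for f in range(n_freq):
        _, vecs = eigh(Phi_s[f], Phi_n[f] + eps * np.eye(n_mics))
        W[f] = vecs[:, -1]  # eigenvector for the largest eigenvalue
    return W

# Enhanced spectrum: S_hat[f, t] = W[f]^H Y[:, f, t]
# S_hat = np.einsum('fm,mft->ft', W.conj(), Y)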

Backward Compatible Spatialized Teleconferencing based on Squeezed Recordings [chapter]

Christian H. Ritz, Muawiyath Shujau, Xiguang Zheng, Bin Cheng, Eva Cheng, Ian S. Burnett
2011 Advances in Sound Localization  
In echoic conditions (Fig. 12(b)) ... The final set of results in Fig. 13 is for recordings of two simultaneous speech sources of equal power (SNR of 0 dB) separated by an angle of 45° and at a distance  ...  Average results for recordings in diffuse noise across all noise sources for an SNR of 0 dB: (a) anechoic recordings, (b) echoic recordings.  ...  This chapter is distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License, which permits use, distribution and reproduction for non-commercial purposes, provided  ... 
doi:10.5772/14413 fatcat:hocbqdiqrndihjpx3odimqcxoa

A Consolidated Perspective on Multimicrophone Speech Enhancement and Source Separation

Sharon Gannot, Emmanuel Vincent, Shmulik Markovich-Golan, Alexey Ozerov
2017 IEEE/ACM Transactions on Audio Speech and Language Processing  
Research in speech enhancement and separation has followed two convergent paths, starting with microphone array processing and blind source separation, respectively.  ...  In addition, they are crucial pre-processing steps for noise-robust automatic speech and speaker recognition. Many devices now have two to eight microphones.  ...  Moreover, several source separation methods were used as pre-processing for speech recognition within a series of speech separation and recognition challenges [305] .  ... 
doi:10.1109/taslp.2016.2647702 fatcat:ltfmmoguxngk5jrvzy7azzufae

Hybrid Neural Networks for On-Device Directional Hearing

Anran Wang, Maruchi Kim, Hao Zhang, Shyamnath Gollakota
2022 Proceedings of the AAAI Conference on Artificial Intelligence  
On-device directional hearing requires audio source separation from a given direction while achieving stringent human-imperceptible latency requirements.  ...  Further, our real-time hybrid model runs in 8 ms on mobile CPUs designed for low-power wearable devices and achieves an end-to-end latency of 17.5 ms.  ...  Acknowledgments We thank Les Atlas, Steve Seitz, Laura Trutoiu and Ludwig Schmidt for their important feedback on this work.  ... 
doi:10.1609/aaai.v36i10.21394 fatcat:d6bukcg6efgejbouu2gb3fezbm

Leveraging Low-Distortion Target Estimates for Improved Speech Enhancement [article]

Zhong-Qiu Wang and Gordon Wichern and Jonathan Le Roux
2021 arXiv   pre-print
A promising approach for multi-microphone speech separation involves two deep neural networks (DNNs), where the predicted target speech from the first DNN is used to compute signal statistics for time-invariant minimum variance distortionless response (MVDR) beamforming, and the MVDR result is then used as extra features for the second DNN to predict target speech.  ...  Shinji Watanabe for helpful discussions.  ... 
arXiv:2110.00570v1 fatcat:zpuhd7gxezaynkwgwisrs2dqxm
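
The two-stage idea summarized in this entry (first-network estimate, then beamforming statistics, then a time-invariant MVDR whose output feeds the second network) can be sketched as below. The Souden-style MVDR is one common choice; the crude noise estimate Y - S_hat and all variable names are assumptions for illustration, not details from the paper.

import numpy as np

def mvdr_from_estimate(Y, S_hat, ref=0, eps=1e-6):
    # Y, S_hat: (n_mics, n_freq, n_frames) mixture and first-pass target estimates (complex STFT).
    # Time-invariant MVDR in the Souden formulation; the beamformed output can then be
    # concatenated with the mixture as extra input features for a second network (omitted here).
    N_hat = Y - S_hat                                     # crude noise estimate
    Phi_s = np.einsum('mft,nft->fmn', S_hat, S_hat.conj())
    Phi_n = np.einsum('mft,nft->fmn', N_hat, N_hat.conj())
    n_mics = Y.shape[0]
    out = np.zeros(Y.shape[1:], dtype=complex)
    for f in range(Phi_s.shape[0]):
        num = np.linalg.solve(Phi_n[f] + eps * np.eye(n_mics), Phi_s[f])
        w = num[:, ref] / (np.trace(num) + eps)           # MVDR weights for the reference channel
        out[f] = w.conj() @ Y[:, f, :]
    return out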

Clustered Blind Beamforming From Ad-Hoc Microphone Arrays

I Himawan, I McCowan, S Sridharan
2011 IEEE Transactions on Audio, Speech, and Language Processing  
Therefore, it is hypothesised that using a cluster of microphones (i.e., a sub-array), closely located both to each other and to the desired speech source, may in fact provide more robust speech enhancement  ...  A similar, but distinct, scenario has been investigated in the NIST meeting transcription evaluations, in which data is recorded using a variety of randomly placed table-top microphones as well as small  ...  B. Cluster Proximity Ranking For the subsequent speech enhancement and recognition evaluation, the rule-based clustering was used to obtain microphone clusters for Data Sets B, C and D.  ... 
doi:10.1109/tasl.2010.2055560 fatcat:n4inflyvdvdl5ehla3snopitd4
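
The paper's rule-based clustering and proximity ranking are not reproduced here; as a loose stand-in, the sketch below groups ad-hoc microphones by average pairwise magnitude-squared coherence, on the assumption that microphones close to each other (and to the source) are more coherent. The threshold and the greedy grouping rule are arbitrary illustrative choices.

import numpy as np
from scipy.signal import coherence

def coherence_matrix(mics, fs=16000):
    # Mean magnitude-squared coherence between every microphone pair.
    n = len(mics)
    C = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            _, coh = coherence(mics[i], mics[j], fs=fs, nperseg=512)
            C[i, j] = C[j, i] = coh.mean()
    return C

def threshold_clusters(C, tau=0.5):
    # Greedy grouping: a seed microphone plus every microphone whose coherence with it exceeds tau.
    unassigned, clusters = set(range(C.shape[0])), []
    while unassigned:
        seed = unassigned.pop()
        group = {seed} | {j for j in list(unassigned) if C[seed, j] > tau}
        unassigned -= group
        clusters.append(sorted(group))
    return clusters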

Multi-microphone Complex Spectral Mapping for Utterance-wise and Continuous Speech Separation [article]

Zhong-Qiu Wang and Peidong Wang and DeLiang Wang
2021 arXiv   pre-print
components of multiple microphones.  ...  State-of-the-art separation performance is obtained on the simulated two-talker SMS-WSJ corpus and the real-recorded LibriCSS dataset.  ...  MISO1-BF We use the initial separation results by MISO1 to compute an MVDR beamformer for each source (denoted as MISO1-BF).  ... 
arXiv:2010.01703v2 fatcat:huvvxizr2jhjlhugtwk4kr7kze

Hybrid Neural Networks for On-device Directional Hearing [article]

Anran Wang, Maruchi Kim, Hao Zhang, Shyamnath Gollakota
2021 arXiv   pre-print
On-device directional hearing requires audio source separation from a given direction while achieving stringent human-imperceptible latency requirements.  ...  Further, our real-time hybrid model runs in 8 ms on mobile CPUs designed for low-power wearable devices and achieves an end-to-end latency of 17.5 ms.  ...  Acknowledgments We thank Les Atlas, Steve Seitz, Laura Trutoiu and Ludwig Schmidt for their important feedback on this work.  ... 
arXiv:2112.05893v1 fatcat:eahy332puzdttmriiqx65bzy3q

A Source Separation Evaluation Method In Object-Based Spatial Audio

Trevor Cox, Philip Jackson, Qingju Liu, Wenwu Wang
2015 Zenodo  
Fig. 3 . 3 Setup for real-room speech recordings.  ...  The 48-channel microphone array was hung right above the dummy head, to record concurrent speech signals coming from position pairs (A,B), (A,C) and (A,D).  ... 
doi:10.5281/zenodo.38891 fatcat:yufnxyhigzdcvhfkrbnkkqxdpy

Comparing Binaural Pre-processing Strategies III

Christoph Völker, Anna Warzybok, Stephan M. A. Ernst
2015 Trends in Hearing  
The individual SRTs measured without pre-processing and individual benefits were objectively estimated using the binaural speech intelligibility model.  ...  The comparable benefit observed for both groups suggests a possible application of noise reduction schemes for listeners with different hearing status.  ...  Funding The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was funded by the Cluster of Excellence "Hearing4All  ... 
doi:10.1177/2331216515618609 pmid:26721922 pmcid:PMC4771033 fatcat:h4javfvdlvfx7edf5dnhl4iemm

A Framework for Speech Enhancement With Ad Hoc Microphone Arrays

Vincent Mohammad Tavakoli, Jesper Rindom Jensen, Mads Græsbøll Christensen, Jacob Benesty
2016 IEEE/ACM Transactions on Audio Speech and Language Processing  
reference when more than one subarray are used.  ...  When perceptual quality or intelligibility of the speech are the ultimate goals, there are turning points where the MVDR and the LCMV are superior to Wiener-based methods.  ...  One sample (3-5 seconds) for each speaker type is used from TSP speech audio recordings available in SMARD.  ... 
doi:10.1109/taslp.2016.2537202 fatcat:3xb4plnpkjcgfgahqrft524cj4
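
For reference, the MVDR and LCMV beamformers compared in this entry have the standard textbook closed forms below (notation is generic, not taken from the paper): \Phi_v is the noise spatial covariance matrix, \mathbf{d} the target steering vector or relative transfer function, \mathbf{C} a constraint matrix whose columns are steering vectors of the target and the interferers, and \mathbf{g} the vector of desired responses (e.g., 1 for the target, 0 for interferers).

\mathbf{w}_{\mathrm{MVDR}} = \frac{\Phi_v^{-1}\mathbf{d}}{\mathbf{d}^{H}\Phi_v^{-1}\mathbf{d}},
\qquad
\mathbf{w}_{\mathrm{LCMV}} = \Phi_v^{-1}\mathbf{C}\left(\mathbf{C}^{H}\Phi_v^{-1}\mathbf{C}\right)^{-1}\mathbf{g}.

The LCMV reduces to the MVDR when \mathbf{C} = \mathbf{d} and \mathbf{g} = 1, which is why the two behave similarly until additional (e.g., interference-nulling) constraints are imposed.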

TaylorBeamformer: Learning All-Neural Beamformer for Multi-Channel Speech Enhancement from Taylor's Approximation Theory [article]

Andong Li, Guochen Yu, Chengshi Zheng, Xiaodong Li
2022 arXiv   pre-print
As an attempt to fill the blank, we propose a novel neural beamformer inspired by Taylor's approximation theory called TaylorBeamformer for multi-channel speech enhancement.  ...  While existing end-to-end beamformers achieve impressive performance in various front-end speech processing tasks, they usually encapsulate the whole process into a black box and thus lack adequate interpretability  ...  Introduction Multi-channel speech enhancement (MCSE) aims at extracting target speech from multiple noisy-reverberant microphone recording signals.  ... 
arXiv:2203.07195v2 fatcat:n5if7hfkubeotkmavdqh2f2rgm
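
The "Taylor's approximation theory" in the title refers to the usual expansion below; on a hedged reading of the abstract, the TaylorBeamformer keeps a conventional spatial-filtering term as the 0th-order output and lets trainable neural modules play the role of the higher-order correction terms, which is what restores some interpretability to the otherwise end-to-end mapping.

f(x) \approx \sum_{n=0}^{N} \frac{f^{(n)}(x_0)}{n!}\,(x - x_0)^{n}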

Multiresolution Speech Enhancement Based on Proposed Circular Nested Microphone Array in Combination with Sub-Band Affine Projection Algorithm

Ali Dehghan Firoozabadi, Pablo Irarrazaval, Pablo Adasme, David Zabala-Blanco, Hugo Durney, Miguel Sanhueza, Pablo Palacios-Játiva, Cesar Azurdia-Meza
2020 Applied Sciences  
In this article, we first propose a uniform circular nested microphone array (CNMA) for data recording.  ...  The multi-channel methods increase the speech enhancement performance by providing more information with the use of more microphones.  ...  Acknowledgments: This work was supported by the Vicerrectoría de Investigación y Postgrado of the Universidad Tecnológica Metropolitana, the Vicerrectoría de Investigación y Postgrado, and Faculty of Engineering  ... 
doi:10.3390/app10113955 fatcat:ixkzoa6vwnedzcvzwewiervq6m
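
The sub-band affine projection stage in this entry builds on the standard affine projection algorithm (APA); the single-band NumPy sketch below shows that core update. The circular nested array, the sub-band analysis/synthesis, and the choice of desired signal are all omitted, and the parameter values are illustrative only; in an enhancement setting the desired signal d would typically come from a (delayed) reference microphone.

import numpy as np

def affine_projection(x, d, L=64, P=4, mu=0.5, delta=1e-3):
    # Adapt an L-tap filter w so that filtering x approximates d,
    # reusing the last P input vectors per update (projection order P).
    w = np.zeros(L)
    y = np.zeros(len(d))
    for n in range(max(L, P) + P, len(d)):
        # X: (L, P) matrix whose columns are the P most recent input vectors.
        X = np.column_stack([x[n - p - L + 1:n - p + 1][::-1] for p in range(P)])
        e = d[n - P + 1:n + 1][::-1] - X.T @ w                    # (P,) a-priori errors
        w += mu * X @ np.linalg.solve(X.T @ X + delta * np.eye(P), e)
        y[n] = w @ x[n - L + 1:n + 1][::-1]
    return w, y

# Toy usage: identify a 16-tap FIR system from white noise (illustrative).
rng = np.random.default_rng(0)
x = rng.standard_normal(4000)
h = rng.standard_normal(16)
d = np.convolve(x, h)[:len(x)]
w, _ = affine_projection(x, d, L=16)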

Minimum Variance Distortionless Response Beamformer with Enhanced Nulling Level Control via Dynamic Mutated Artificial Immune System

Tiong Sieh Kiong, S. Balasem Salem, Johnny Koh Siaw Paw, K. Prajindra Sankar, Soodabeh Darzi
2014 The Scientific World Journal  
Minimum variance distortionless response (MVDR) beamforming is capable of determining the weight vectors for beam steering; however, its nulling level on the interference sources remains unsatisfactory  ...  Hence, in this paper, a new dynamic mutated artificial immune system (DM-AIS) is proposed to enhance MVDR beamforming for controlling the null steering of interference and increase the signal to interference  ...  Acknowledgment This work was supported in part by MOSTI (Ministry of Science, Technology and Innovation, Malaysia) with Project no. 01-02-03-SF0202.  ... 
doi:10.1155/2014/164053 pmid:25003136 pmcid:PMC4070486 fatcat:jcb57f7mpvajdn3or5dlevqvyi
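
The nulling behaviour that the DM-AIS step is meant to improve can be checked with a plain narrowband MVDR on a uniform linear array, as in the sketch below; the array geometry, the angles, and the interference-to-noise ratio are made-up illustrative values, and the artificial immune system optimisation itself is not reproduced.

import numpy as np

def ula_steering(theta_deg, m=8, d=0.5):
    # Steering vector of an m-element uniform linear array, spacing d in wavelengths,
    # for a source at angle theta (degrees from broadside).
    k = np.arange(m)
    return np.exp(-2j * np.pi * d * k * np.sin(np.deg2rad(theta_deg)))

def mvdr_weights(theta_sig, theta_int, m=8, inr=100.0, noise=1.0):
    # Classical narrowband MVDR weights w = R^{-1} a_s / (a_s^H R^{-1} a_s).
    a_s = ula_steering(theta_sig, m)
    a_i = ula_steering(theta_int, m)
    R = inr * np.outer(a_i, a_i.conj()) + noise * np.eye(m)   # interference-plus-noise covariance
    Ri_a = np.linalg.solve(R, a_s)
    return Ri_a / (a_s.conj() @ Ri_a)

w = mvdr_weights(0.0, 40.0)
for theta in (0.0, 40.0):
    resp = 20 * np.log10(abs(w.conj() @ ula_steering(theta, 8)))
    print(f"response at {theta:5.1f} deg: {resp:6.1f} dB")     # ~0 dB toward the target, deep null toward the interferer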
Showing results 1 — 15 out of 117 results