Dual-path Transformer Based Neural Beamformer for Target Speech Extraction [article]

Aoqi Guo and Sichong Qian and Baoxiang Li and Dazhi Gao
2023 arXiv   pre-print
Neural beamformers, which integrate both pre-separation and beamforming modules, have demonstrated impressive effectiveness in target speech extraction.  ...  Initially, we employ the cross-attention mechanism in the time domain to extract crucial spatial information related to beamforming from the noisy covariance matrix.  ...  INTRODUCTION Neural beamformers have demonstrated exceptional capabilities in the realm of multi-channel target speech extraction [1].  ... 
arXiv:2308.15990v2 fatcat:47v3t6d6j5d73eqllzzt2hz63y

A Study on Online Source Extraction in the Presence of Changing Speaker Positions [chapter]

Jens Heitkaemper, Thomas Fehér, Michael Freitag, Reinhold Haeb-Umbach
2019 Lecture Notes in Computer Science  
Assuming an enrollment utterance of the target speaker is available, the so-called SpeakerBeam concept has been recently proposed to extract the target speaker from a speech mixture.  ...  In this contribution we investigate different approaches to exploit such spatial information.  ...  Computational resources were provided by the Paderborn Center for Parallel Computing.  ... 
doi:10.1007/978-3-030-31372-2_17 fatcat:36mscnxvnbhnzlqobfij5socoe

Speech Enhancement Based on Beamforming and Post-Filtering by Combining Phase Information

Rui Cheng, Changchun Bao
2020 Interspeech 2020  
The spatial features and phase information of target speech are incorporated into the beamforming by neural network, and a neural network based single-channel postfiltering with the phase correction is  ...  With the development of microphone array signal processing technology and deep learning, the beamforming combined with neural network has provided a more diverse solution for this field.  ...  After the beamforming with spatial and phase information, this post-filtering can effectively enhance the speech.  ... 
doi:10.21437/interspeech.2020-0990 dblp:conf/interspeech/ChengB20 fatcat:3hjxi5b4indyheyv7ly6ayxzzy

Spatial Attention for Far-field Speech Recognition with Deep Beamforming Neural Networks [article]

Weipeng He, Lu Lu, Biqiao Zhang, Jay Mahadeokar, Kaustubh Kalgaonkar, Christian Fuegen
2020 arXiv   pre-print
In this paper, we introduce spatial attention for refining the information in multi-direction neural beamformer for far-field automatic speech recognition.  ...  However, the features extracted by such methods contain redundant information, as only the direction of the target speech is relevant.  ...  The spatial attention, computed from multi-directional features, indicates how informative each direction is for recognizing the target speech.  ... 
arXiv:1911.02115v2 fatcat:pqwcc4stwzaa5iviudac57zu2u

Towards Unified All-Neural Beamforming for Time and Frequency Domain Speech Separation [article]

Rongzhi Gu, Shi-Xiong Zhang, Yuexian Zou, Dong Yu
2022 arXiv   pre-print
Recently, frequency domain all-neural beamforming methods have achieved remarkable progress for multichannel speech separation.  ...  This study proposes a novel all-neural beamforming method in time domain and makes an attempt to unify the all-neural beamforming pipelines for time domain and frequency domain multichannel speech separation  ...  Although the formulation of conventional MCWF only involves the target speech estimation at the reference channel (Eq. 23), the spatial information of the target speech will be neglected during the all-neural  ... 
arXiv:2212.08348v2 fatcat:xearmdcnq5bcrnggtgofdhblke

All-Neural Multi-Channel Speech Enhancement

Zhong-Qiu Wang, DeLiang Wang
2018 Interspeech 2018  
This study proposes a novel all-neural approach for multichannel speech enhancement, where robust speaker localization, acoustic beamforming, post-filtering and spatial filtering are all done using deep  ...  Next, the directional features are combined with the spectral features extracted from the beamformed signal to achieve further enhancement.  ...  With multiple microphones, spatial information can be exploited to complement spectral information for better de-noising and dereverberation.  ... 
doi:10.21437/interspeech.2018-1664 dblp:conf/interspeech/WangW18a fatcat:5ovmvjayuzcszhuxljeuggsdmq

Embedding and Beamforming: All-neural Causal Beamformer for Multichannel Speech Enhancement [article]

Andong Li, Wenzhe Liu, Chengshi Zheng, Xiaodong Li
2021 arXiv   pre-print
For EM, instead of estimating spatial covariance matrix explicitly, the 3-D embedding tensor is learned with the network, where both spectral and spatial discriminative information can be represented.  ...  The spatial covariance matrix has been considered to be significant for beamformers.  ...  INTRODUCTION Speech enhancement (SE) attempts to extract the target speech from the mixture signals.  ... 
arXiv:2109.00265v2 fatcat:vq4ozejaszfotcm6sy3o2zu2ym

Multi-channel end-to-end neural network for speech enhancement, source localization, and voice activity detection [article]

Yuan Chen, Yicheng Hsu, Mingsian R. Bai
2022 arXiv   pre-print
Simulation results show that the proposed neural beamformer is effective in enhancing speech signals, with speech quality well preserved.  ...  In this study, a neural beamformer consisting of a beamformer and a novel multi-channel DCCRN is proposed for speech enhancement and source localization.  ...  INTRODUCTION The goal of speech enhancement is to extract the target speech from the noisy signal.  ... 
arXiv:2206.09728v1 fatcat:xssjul7bfbhbvdz4krezdkc55m

A Neural Beam Filter for Real-time Multi-channel Speech Enhancement [article]

Wenzhe Liu, Andong Li, Chengshi Zheng, Xiaodong Li
2022 arXiv   pre-print
After that, the neural spatial filter is learned by simultaneously modeling the spatial and spectral discriminability of the speech and the interference, so as to extract the desired speech coarsely in  ...  To handle these problems, this paper designs a causal neural beam filter that fully exploits the spatial-spectral information in the beam domain.  ...  In this paper, we design a neural beam filter for real-time multichannel speech enhancement.  ... 
arXiv:2202.02500v1 fatcat:677rkbgysvhovo67pzaaxxyxje

A Pre-Separation and All-Neural Beamformer Framework for Multi-Channel Speech Separation

Wupeng Xie, Xiaoxiao Xiang, Xiaojuan Zhang, Guanghong Liu
2023 Symmetry  
In this study, a pre-separation and all-neural beamformer framework is proposed for multi-channel speech separation without following the solutions of the conventional beamformers, such as the minimum  ...  Furthermore, this method can be used for symmetrical stereo speech.  ...  Introduction Speech separation can extract target speaker information from speech signals corrupted by interference and reverberation, and it can improve the quality of communication between people.  ... 
doi:10.3390/sym15020261 fatcat:6cwujw7i6rdxdnvbeh3kzzddha

Multi-Channel Overlapped Speech Recognition with Location Guided Speech Extraction Network

Zhuo Chen, Xiong Xiao, Takuya Yoshioka, Hakan Erdogan, Jinyu Li, Yifan Gong
2018 2018 IEEE Spoken Language Technology Workshop (SLT)  
Then a neural network is trained using all features with a target of the clean speech of the required speaker.  ...  In the proposed system, three different features are formed for each target speaker, namely, spectral, spatial, and angle features.  ...  Beamforming utilizes the spatial information collected from multiple microphones to enhance the target speech, while the neural networks learn the regularities in speech magnitude spectra to separate speakers  ... 
doi:10.1109/slt.2018.8639593 dblp:conf/slt/ChenXYELG18 fatcat:lwfz7dkatzejhmc72uhxj4aufy

Multi-Channel Block-Online Source Extraction Based on Utterance Adaptation

Juan M. Martín-Doñas, Jens Heitkaemper, Reinhold Haeb-Umbach, Angel M. Gomez, Antonio M. Peinado
2019 Interspeech 2019  
This paper deals with multi-channel speech recognition in scenarios with multiple speakers.  ...  In this work we present two variants of speaker-aware neural networks, which exploit both spectral and spatial information to allow better discrimination between target and interfering speakers.  ...  In this work we propose two novel multi-channel low-latency speech extraction systems, which retrieve spatial and spectral information from an AU to force a neural network to focus on the speech signal  ... 
doi:10.21437/interspeech.2019-2244 dblp:conf/interspeech/Martin-DonasHHG19 fatcat:pglsy45aera5tn6s2ulx26r5xm

Implicit Neural Spatial Filtering for Multichannel Source Separation in the Waveform Domain [article]

Dejan Markovic, Alexandre Defossez, Alexander Richard
2022 arXiv   pre-print
We divide the scene into two spatial regions containing, respectively, the target and the interfering sound sources.  ...  We evaluate the proposed model on a real-world dataset and show that the model matches the performance of an oracle beamformer followed by a state-of-the-art single-channel enhancement network.  ...  For example, enhancement post-filter that follows a beamformer does not have access to spatial information that's lost after spatial filtering.  ... 
arXiv:2206.15423v1 fatcat:c3gloch46zemrnfxianoa62laq

Neural Spatial Filter: Target Speaker Speech Separation Assisted with Directional Information

Rongzhi Gu, Lianwu Chen, Shi-Xiong Zhang, Jimeng Zheng, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu
2019 Interspeech 2019  
direction, for target speaker separation.  ...  In this paper, integrated with the power spectra and inter-channel spatial features at the input level, we explore to leverage directional features, which imply the speaker source from the desired target  ...  With the direction of arrival (DOA) information, beamforming techniques [13, 14] can be applied to enhance the speaker from the desired direction.  ... 
doi:10.21437/interspeech.2019-2266 dblp:conf/interspeech/GuCZZXYSZ019 fatcat:ebrxte7o2fhvzdoybevt57dpvm

Insights into Deep Non-linear Filters for Improved Multi-channel Speech Enhancement [article]

Kristina Tesch, Timo Gerkmann
2022 arXiv   pre-print
The key advantage of using multiple microphones for speech enhancement is that spatial filtering can be used to complement the tempo-spectral processing.  ...  However, the internal mechanisms that lead to good performance of such data-driven filters for multi-channel speech enhancement are not well understood.  ...  Berger and Rohde&Schwarz SwissQual AG for their support with POLQA.  ... 
arXiv:2206.13310v2 fatcat:bneiifs4xnbofokutyyxhhlueu