Method of Estimating Signal-to-Noise Ratio Based on Optimal Design for Sub-band Voice Activity Detection
2017
Journal of Information Hiding and Multimedia Signal Processing
Here, we propose a gSNR estimation framework that mainly consists of sub-band processing, voice activity detection (VAD), and threshold optimization. ...
The global signal to noise ratio (gSNR) is the ratio of concurrent powers between speech and noise in a noisy speech signal. ...
We designed power-level-based VAD to detect speech and noise periods in each sub-band. ...
dblp:journals/jihmsp/MoritaLUA17
fatcat:fbdwdfvf5jgvpplqwm4axtpksu
Efficient voice activity detection algorithm based on sub-band temporal envelope and sub-band long-term signal variability
2014
The 9th International Symposium on Chinese Spoken Language Processing
The proposed algorithm could apply to speech-based systems. Index Terms: voice activity detection, sub-band temporal envelope, sub-band long-term signal variability, fusion decision ...
Voice activity detection (VAD), an important pre-processing step, is widely used in various speech-based systems. This paper proposes a robust voice activity detection algorithm. ...
However, developing a VAD for noisy environments with low signal-to-noise ratios (SNR) or for any non-stationary noise is still very challenging. Many methods have been proposed for speech detection. ...
doi:10.1109/iscslp.2014.6936602
dblp:conf/iscslp/LiuTMLWL14
fatcat:mhwzjpalz5cjbobbke6zso3zf4
Speech Signal Recovery Based on Source Separation and Noise Suppression
2014
Journal of Computer and Communications
In this paper, a speech signal recovery algorithm is presented for a personalized voice command automatic recognition system in vehicle and restaurant environments. ...
This novel algorithm is able to separate a mixed speech source from multiple speakers, detect presence/absence of speakers by tracking the higher magnitude portion of speech power spectrum and adaptively ...
Acknowledgements The authors would like to thank STMicroelectronics Asia Pacific Pte Ltd for providing speech dataset and experiment environment. ...
doi:10.4236/jcc.2014.29015
fatcat:jgnbw3fidvdxjey24wgxj3ahnm
Robust voice activity detection directed by noise classification
2013
Signal, Image and Video Processing
Keywords Voice activity detection · Perceptual wavelet packet transform · Noise classification · Support vector machine Introduction Voice activity detection (VAD) is a process, which can detect speech ...
A multiclass SVM is also used to classify background noises in order to select SVM model for VAD. ...
Taking into account more noise types in the proposed VAD can improve the performance in real-world applications. Future work should be done on these promising issues. ...
doi:10.1007/s11760-013-0479-5
fatcat:hvikpaxptnfixg6pnkum3q5jta
A Novel Approach Based on Adaptive Long-Term Sub-Band Entropy and Multi-Thresholding Scheme for Detecting Speech Signal
2012
IEICE transactions on information and systems
The proposed ALT-SubEnpy-based VAD method is shown to be an effective method while working at variable noise-level condition. key words: voice activity detection, long-term spectral analysis, sub-band ...
Here we propose a novel concept of adaptive long-term sub-band entropy (ALT-SubEnpy) measure and combine it with a multi-thresholding scheme for voice activity detection. ...
for detecting speech with a low signal-to-noise ratio (SNR). ...
doi:10.1587/transinf.e95.d.2732
fatcat:rigvac4knvf5nk2rrjq3eu7sde
Far-field continuous speech recognition system based on speaker Localization and sub-band Beamforming
2008
2008 IEEE/ACS International Conference on Computer Systems and Applications
To localize the speaker an algorithm based on Steered Response Power by utilizing harmonic structures of speech signal is proposed. ...
This new scheme has the ability of speaker verification by fundamental frequency variation; therefore it can be utilized in the design of a speech recognition system for verified speakers. ...
These sub-bands are summarized in table 1.
Voice activity detection Speaker localization performance is improved by the detection and removal of non-speech frames from the localization process. ...
doi:10.1109/aiccsa.2008.4493578
dblp:conf/aiccsa/AsaeiTS08
fatcat:y3lckuw6prastcdvjn3iedev5q
Performance Improvement of Digital Hearing Aid Systems
2015
Journal of Communications Technology Electronics and Computer Science
In this study, an adaptive spectral subtraction algorithm is implemented using the noise-estimation algorithm for highly non-stationary noisy environments instead of the voice activity detection (VAD) ...
Also, signal to residual spectrum ratio (SR) is implemented in order to control the amplification distortion for speech intelligibility improvement. ...
Implementation of SS requires the estimation of the spectral magnitude of the received noisy signal and the noise spectrum for regions that are considered as "noise-only" using voice activity detection ...
doi:10.22385/jctecs.v1i0.15
fatcat:gf6xjgfpznhe3k2vrzekm2zmkm
An Adaptive Wavelet-Based Denoising Algorithm for Enhancing Speech in Non-stationary Noise Environment
2010
IEICE transactions on information and systems
In this paper, a simple method of noise estimation employing the use of a voice activity detector is proposed. ...
On the contrary, the WCT tends to increase in lower-band if the speech is categorized as unvoiced. ...
On the other hand, the method based on higher-order statistics has been proposed to utilize the statistics of speech signals for VAD [32]. ...
doi:10.1587/transinf.e93.d.341
fatcat:6ldodptxbzcrvhhgej6sr4qsqq
Automatic detection of African elephant (Loxodonta africana) infrasonic vocalisations from recordings
2010
Biosystems Engineering
Important signal characteristics of elephant vocalisations were identified from spectrograms and a technique, based on the principles of existing voice activity detection algorithms, was developed to exploit ...
Manual analysis of recordings (by listening to these and by visual inspection of spectrograms) to locate vocalisations is tedious. The automatic detection of vocalisations in recordings is explored. ...
For signal detection in noise, a matched filter is optimal. ...
doi:10.1016/j.biosystemseng.2010.04.001
fatcat:wtpcztrp7rdgrokicdpugkhc4m
Noise Estimation and Noise Removal Techniques for Speech Recognition in Adverse Environment
[chapter]
2010
IFIP Advances in Information and Communication Technology
This paper focuses on voice activity detection, noise estimation, removal techniques and an optimal filter. ...
With such a formulation, the core issue of noise reduction becomes how to design an optimal filter that can significantly suppress noise without noticeable speech distortion. ...
The simplest approach is to estimate and update the noise spectrum during the silent (pauses) segments of the signal using a voice-activity detection (VAD) [4]. ...
doi:10.1007/978-3-642-16327-2_40
fatcat:i47sokh7zbeazjf7tfyuqjprc4
Noise reduction algorithm with the soft thresholding based on the Shannon entropy and bone-conduction speech cross-correlation bands
2018
Technology and Health Care
BACKGROUND: The conventional methods of speech enhancement, noise reduction, and voice activity detection are based on the suppression of noise or non-speech components of the target air-conduction signals ...
METHODS: A new algorithm for speech detection and noise reduction is proposed, which makes use of the Shannon entropy principle and cross-correlation with the bone-conduction speech signals to threshold ...
The traditional speech enhancement, noise reduction, and voice activity detection algorithms are based on linear processing techniques and general air-conduction speech signals [2]. ...
doi:10.3233/thc-174615
pmid:29710756
pmcid:PMC6004965
fatcat:wht7guphsjhbjd2xf3l4xgoe6u
Adaptive Fuzzy Filter for Speech Enhancement
[chapter]
2010
Lecture Notes in Computer Science
First an amplified voice activity detector is designed to improve performance on SNR lower than 5dB. Then an adaptive threshold decision module based on fuzzy inference system is proposed. ...
In this fuzzy inference system overall relations between speech and noise are summarized into seven fuzzy rules and four linguistic variables, which are used to detect the state of signals. ...
This amplified voice activity detector employs the full wavelet packet transform (instead of the perceptual wavelet packet transform) to decompose the input speech signal into critical sub-band signals ...
doi:10.1007/978-3-642-12179-1_42
fatcat:s3sqmtr22fganc725eeuu35pvq
On the Use of Adaptive Fuzzy Wavelet Filter in the Speech Enhancement
2014
Journal of Computers
An amplified voice activity detector in the proposed hybrid filter is designed to improve performance when the signal-to-noise ratio (SNR) is lower than 5 dB. ...
This paper proposes an adaptive fuzzy wavelet filter that is based on a fuzzy inference system for enhancing speech signals and improving the accuracy of speech recognition. ...
Amplified Voice Activity Detection Voice activity detection (VAD) is used to distinguish speech from contaminated speech signals and is required in various speech communication systems [16]. ...
doi:10.4304/jcp.9.11.2501-2513
fatcat:aqltk4itxvbbnkzylqbe6jixr4
A Novel Voice Sensor for the Detection of Speech Signals
2013
Sensors
The proposed TD-PBEE-based VAD algorithm is evaluated for four types of noises and five signal-to-noise ratio (SNR) levels. ...
In order to develop a novel voice sensor to detect human voices, the use of features which are more robust to noise is an important issue. Voice sensor is also called voice activity detection (VAD). ...
Conflicts of Interest The authors declare no conflict of interest. ...
doi:10.3390/s131216533
pmid:24316566
pmcid:PMC3892860
fatcat:ldaxnvwsmze6vifx5pydvyxjgy
2020 Index IEEE/ACM Transactions on Audio, Speech, and Language Processing Vol. 28
2020
IEEE/ACM Transactions on Audio Speech and Language Processing
., +, TASLP 2020 2349-2363 Correlation methods Frequency-Sliding Generalized Cross-Correlation: A Sub-Band Time Delay Estimation Approach. ...
Beit-On, H., +, TASLP 2020 2184-2193 Frequency-Sliding Generalized Cross-Correlation: A Sub-Band Time Delay Estimation Approach. ...
T Target tracking Multi-Hypothesis Square-Root Cubature Kalman Particle Filter for Speaker Tracking in Noisy and Reverberant Environments. Zhang, Q., +, TASLP 2020 1183-1197 ...
doi:10.1109/taslp.2021.3055391
fatcat:7vmstynfqvaprgz6qy3ekinkt4