Disentangling Voice and Content with Self-Supervision for Speaker Recognition
[article]
2023
arXiv
pre-print
For speaker recognition, it is difficult to extract an accurate speaker representation from speech because of its mixture of speaker traits and content. ...
The efficacy of the proposed framework is validated via experiments conducted on the VoxCeleb and SITW datasets with 9.56% and 8.24% average reductions in EER and minDCF, respectively. ...
Generalization ability improvement of speaker representation and anti-interference for speaker verification. ...
arXiv:2310.01128v3
fatcat:opcbs5wqs5h4hlnmhmpkimnfbi
A review on state-of-the-art Automatic Speaker verification system from spoofing and anti-spoofing perspective
2021
Indian Journal of Science and Technology
The performance of any anti-spoofing speaker verification system may be evaluated using standard objective measures such as Equal Error Rate, false positive ratios, and graphical plots. ...
Background/Objectives: The anti-spoofing measures are blooming with an aim to protect the Automatic Speaker Verification systems from susceptible spoofing attacks. ...
The increasing developments in utilizing DNNs in the speaker verification scenario are owing to their accurate results and, of course, their ability to discriminate between speakers (46). ...
doi:10.17485/ijst/v14i40.1279
fatcat:qwn2ntd5lvanho4ktxs4oxwrsu
Uncovering the Deceptions: An Analysis on Audio Spoofing Detection and Future Prospects
[article]
2023
arXiv
pre-print
Lastly, the work aims to accentuate the need for building more robust and generalizable methods, the integration of automatic speaker verification and countermeasure systems, and better evaluation protocols ...
Audio has become an increasingly crucial biometric modality due to its ability to provide an intuitive way for humans to interact with machines. ...
Vatsa is also partially supported through the SwarnaJayanti Fellowship by the Government of India. ...
arXiv:2307.06669v1
fatcat:vtxil2fnbfhczfmoc6lys4j2m4
Multi-task deep cross-attention networks for far-field speaker verification and keyword spotting
2023
EURASIP Journal on Audio, Speech, and Music Processing
Personalized voice triggering involves keyword spotting (KWS) and speaker verification (SV). Conventional approaches to this task include developing KWS and SV systems separately. ...
Abstract: Personalized voice triggering is a key technology in voice assistants and serves as the first step for users to activate the voice assistant. ...
The authors read and approved the final manuscript.
Authors' information Not applicable. ...
doi:10.1186/s13636-023-00293-8
fatcat:rqrnrkvj7rg4dfhaeh3i466pkq
Voice Activated E-Learning System for the Visually Impaired
2011
International Journal of Computer Applications
In the speaker verification subsystem, Mel-Frequency Cepstral Coefficients (MFCC) is used for Feature extraction and Vector Quantization (VQ) algorithm is used for codebook generation. ...
This system consists of two major subsystems; namely Speaker Verification and Speech Recognition subsystem. ...
The current paper examines the reasons for speaker verification failures and as a result of this analysis proposes a novel technique to produce a new and improved result on the speaker verification challenges ...
doi:10.5120/1892-2514
fatcat:qxn7uwqrtbfataq5b2ueow4yqy
Anchor voiceprint recognition in live streaming via RawNet-SA and gated recurrent unit
2021
EURASIP Journal on Audio, Speech, and Music Processing
Therefore, we propose an anchor voiceprint recognition in live streaming via RawNet-SA and gated recurrent unit (GRU) for anchor identification of live platform. ...
Then, the feature sequence of anchor voiceprint is generated from the speech waveform with the self-attention network RawNet-SA. ...
However, ... enhance the representation ability of the model. ...
doi:10.1186/s13636-021-00234-3
fatcat:foq5ya2o25hh7euakt5o2thyoa
A New Non-uniform Sampling & Quantization by using a Modified Correlation
2013
International Journal of Software Engineering and Its Applications
It focuses on the naturalness and intelligibility of speech synthesis applications and the compression and signal-to-noise ratio of speech transmission applications. ...
Moreover, from the viewpoint of speech recognition, higher frequency band components have low correlation, while the positive and the negative signals are separated to reconstruct the high-intelligible ...
or speaker verification. ...
doi:10.14257/ijseia.2013.7.6.33
fatcat:j4zk2yq5mrg6nh5n4dzhmjekby
On A New Hybrid Speech Coder using Variables LPF
2013
International Journal of Software Engineering and Its Applications
However, it is well known that when conventional sampling methods are applied directly to speech signal, the required amount of data is comparable to or more than that of uniform sampling method. ...
signal and harmonics, which is used to get high quality speech in higher bandwidth. ...
or speaker verification. ...
doi:10.14257/ijseia.2013.7.5.13
fatcat:hw2m23twmff4bhmex3ktjzma5a
Audio Deepfake Detection: A Survey
[article]
2023
arXiv
pre-print
In addition, we perform a unified comparison of representative features and classifiers on ASVspoof 2021, ADD 2023 and In-the-Wild datasets for audio deepfake detection, respectively. ...
For each aspect, the basic techniques, advanced developments and major challenges are discussed. ...
Despite partly improving the anti-attack ability of the detection model via fake game, the methods of game are simple and lack intelligence. ...
arXiv:2308.14970v1
fatcat:xenftby2cfamnnskww5qy5loe4
On-Device Voice Authentication with Paralinguistic Privacy
[article]
2023
arXiv
pre-print
prefer for their data. ...
Our objective is to design and develop a new voice input-based system that offers the following specifications: local authentication to reduce the need for sharing raw voice data, local privacy preservation ...
Anti-spoofing works as a verification system by comparing a pair of inputs, namely the enrollment and testing inputs as X = (X_enroll, X_test), where X_enroll denotes set of samples corresponding to ...
arXiv:2205.14026v2
fatcat:547n6o6y3vbg5n3j5btwln7kwe
A Survey on Symmetrical Neural Network Architectures and Applications
2022
Symmetry
improve the element base of such neural networks. ...
Such approaches demonstrate impressive results, both for recognition practice, and for understanding of data transformation processes in various feature spaces. ...
Conflicts of Interest: The authors declare no conflict of interest. ...
doi:10.3390/sym14071391
fatcat:k2x2nqebvfbipkka55ztfmthvq
Adversarial Attack and Defense on Deep Neural Network-Based Voice Processing Systems: An Overview
2021
Applied Sciences
Finally, we propose a systematic classification of adversarial attacks and defense methods, with which we hope to provide a better understanding of the classification and structure for beginners in this ...
This review presents a detailed introduction to the background knowledge of adversarial attacks, including the generation of adversarial examples, psychoacoustic models, and evaluation indicators. ...
Conflicts of Interest: The authors declare no conflict of interest. ...
doi:10.3390/app11188450
fatcat:zjige7gepbdvnpk2i3qwyqv2oe
Biometrics Systems and Technologies: A survey
2016
International Journal of Computers Communications & Control
Biometric based authentication is becoming increasingly appealing and common for most of the human-computer interaction devices. ...
This chapter does not intend to cover a comprehensive and detailed list of biometric techniques. ...
The cloned voice "borrowed" the features of the authentic voice and the authors successfully fooled the speaker authentication system that was based on the Bob Spear Speaker Verification System [50]. ...
doi:10.15837/ijccc.2016.3.2556
fatcat:xfchlv7mdfdw5bpwou3agazjgu
Preventing Fake Information Generation Against Media Clone Attacks
2021
IEICE transactions on information and systems
We focus on 1) fake information generation in the physical world, 2) anonymization and abstraction in the cyber world, and 3) modeling of media clone attacks. ...
This paper describes some research results of the Media Clone project, in particular, various methods for protecting personal information against generating fake information. ...
(B) Verification of the ability to generate various types of media clones for audio, visual, and text derived from fake information. ...
doi:10.1587/transinf.2020mui0001
fatcat:dzmvdl4pvvbvlpaxx5gf7tsbgq
Single-Microphone Speech Enhancement and Separation Using Deep Learning
[article]
2018
arXiv
pre-print
The cocktail party problem comprises the challenging task of understanding a speech signal in a complex acoustic environment, where multiple speakers and background noise signals simultaneously interfere ...
Additionally, we show that uPIT works well for joint speech separation and enhancement without explicit prior knowledge about the noise type or number of speakers. ...
Acknowledgment The authors would like to thank Asger Heidemann Andersen for providing software used to conduct the SI tests, and NVIDIA Corporation for the donation of a Titan X GPU. ...
arXiv:1808.10620v2
fatcat:kzk357xdbjcsfn5c75qe4cb65q