Al-Kaltakchi, M.T. S., Woo, Wai Lok, Dlay, Satnam and Chambers, Jonathon (2016) Study of fusion strategies and exploiting the combination of MFCC and PNCC features for robust biometric speaker identification. In: 2016 4th International Conference on Biometrics and Forensics (IWBF). IEEE. ISBN 978-1-4673-9448-2
Full text not available from this repository.

Abstract
In this paper, a new combination of features and normalization methods is investigated for robust biometric speaker identification. Mel Frequency Cepstral Coefficients (MFCC) are efficient for speaker identification in clean speech, while Power Normalized Cepstral Coefficients (PNCC) are robust in noisy environments. Therefore, combining the two feature types performs better than using either one individually. In addition, Cepstral Mean and Variance Normalization (CMVN) and Feature Warping (FW) are used to mitigate possible channel effects and handset mismatch in voice measurements. Speaker modelling is based on a Gaussian Mixture Model (GMM) with a Universal Background Model (UBM). Coupled parameter learning between the speaker models and the UBM is utilized to improve performance. Finally, maximum, mean and weighted sum fusions of model scores are used to enhance the Speaker Identification Accuracy (SIA). Evaluations conducted on the TIMIT database, with and without noise, confirm the performance improvement.
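The normalization and score-fusion steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `cmvn`, `fuse_scores` and `identify` are hypothetical, per-utterance CMVN is assumed, and the per-speaker scores are assumed to be log-likelihood-style values (higher is better) that are already on a comparable scale across the MFCC and PNCC systems.

```python
import numpy as np


def cmvn(features):
    """Per-utterance cepstral mean and variance normalization.

    features: array of shape (frames, coefficients).
    Returns an array in which each coefficient has zero mean and unit variance.
    """
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    # Guard against division by zero for constant coefficients.
    return (features - mean) / np.maximum(std, 1e-10)


def fuse_scores(scores_a, scores_b, method="mean", weight=0.5):
    """Fuse per-speaker scores from two systems (e.g. MFCC- and PNCC-based).

    method: "max", "mean" or "weighted".
    weight: weight given to scores_a in the weighted sum (assumed in [0, 1]);
            scores_b receives 1 - weight.
    """
    a = np.asarray(scores_a, dtype=float)
    b = np.asarray(scores_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("score vectors must have the same shape")
    if method == "max":
        return np.maximum(a, b)
    if method == "mean":
        return (a + b) / 2.0
    if method == "weighted":
        return weight * a + (1.0 - weight) * b
    raise ValueError(f"unknown fusion method: {method}")


def identify(fused_scores):
    """Return the index of the speaker with the highest fused score."""
    return int(np.argmax(fused_scores))
```

For example, `identify(fuse_scores(mfcc_scores, pncc_scores, method="weighted", weight=0.1))` would pick the speaker favoured mostly by the PNCC-based system. How the weight should be chosen is not stated in the abstract, so it is left as a free parameter here.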
Item Type: | Book Section |
---|---|
Uncontrolled Keywords: | Robust biometric speaker identification and robust recognition, Gaussian mixture model, universal background model, maximum a posteriori probability adaptation, score fusion |
Subjects: | G400 Computer Science |
Department: | Faculties > Engineering and Environment > Computer and Information Sciences |
Depositing User: | Becky Skoyles |
Date Deposited: | 09 Apr 2019 12:10 |
Last Modified: | 10 Oct 2019 20:16 |
URI: | http://nrl.northumbria.ac.uk/id/eprint/38863 |