Wavelet based dynamic Mel Frequency Cepstral Coefficients (MFCC) and block truncation techniques for efficient speaker identification under narrowband noise conditions

S. Selva Nidhyananthan; R. Shantha Selva Kumara; D. S. Roland

doi:10.5897/IJPS2013.3955

International Journal of Physical Sciences

Abbreviation: Int. J. Phys. Sci.
Language: English
ISSN: 1992-1950
DOI: 10.5897/IJPS
Start Year: 2006
Published Articles: 2572

Full Length Research Paper

S. Selva Nidhyananthan*, R. Shantha Selva Kumara and D. S. Roland

Department of ECE, Mepco Schlenk Engineering College, Sivakasi-626005, India

Article Number - A3DDEE522938
Vol. 8(35), pp. 1746-1752, September 2013
https://doi.org/10.5897/IJPS2013.3955

Accepted: 09 September 2013
Published: 23 September 2013

Copyright © 2024 Author(s) retain the copyright of this article.
This article is published under the terms of the Creative Commons Attribution License 4.0.

Abstract

Speaker identification systems perform well on clean speech, but their performance degrades when the speech samples are corrupted by narrowband noise. Block truncation of the cepstral coefficients ensures that not all features are affected by narrowband noise, but it cannot reduce the extent of the degradation. This work aims to improve the performance of speaker identification systems by block truncating features that have first been subjected to wavelet processing. Wavelet decomposition divides the energy spectrum of the speech signal into bands, their number determined by the number of decomposition levels, thereby segregating the noise-affected bands from the others. In addition, the wavelet filters smooth the noisy speech signal, which improves identification of the correct speaker. Dynamic Mel filtering of these wavelet coefficients, followed by block truncation, provides better identification by exploiting the fact that some filter bank coefficients remain unaffected by narrowband noise. The features are modeled by a Gaussian mixture model - Universal background model (GMM-UBM), which serves as a generic model that is trained only once. A speaker identification accuracy of 97.23% is achieved with this wavelet-based dynamic MFCC technique, a 7.58% improvement in speaker identification accuracy over the non-wavelet block truncation method.
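
As a rough illustration of two of the stages named above, the following Python sketch shows a multi-level wavelet decomposition and a block-wise DCT of Mel filter log energies. It is not the authors' implementation: the Haar wavelet, the function names (haar_dwt, wavelet_bands, dct2, block_truncated_cepstra), the block size, and the number of coefficients kept per block are all assumptions made for the example, and "block truncation" is read here as applying the DCT to separate blocks of filter bank outputs so that a disturbance in one block cannot leak into the features of another. Dynamic Mel filtering and GMM-UBM modeling are not covered.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def wavelet_bands(signal, levels):
    """Multi-level decomposition, returned as [A_L, D_L, ..., D_1].

    Each extra level splits the current low-frequency band again, so
    `levels` decomposition levels yield `levels + 1` sub-bands.
    """
    bands = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        bands.insert(0, detail)
    bands.insert(0, approx)
    return bands


def dct2(values):
    """Unnormalised DCT-II of a short list of values."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
            for i, v in enumerate(values))
        for k in range(n)
    ]


def block_truncated_cepstra(log_energies, block_size, keep):
    """DCT each block of log filter bank energies and keep `keep` coefficients.

    Because every block is transformed on its own, a narrowband disturbance
    confined to one block only changes that block's coefficients.
    """
    features = []
    for start in range(0, len(log_energies), block_size):
        block = log_energies[start:start + block_size]
        features.extend(dct2(block)[:keep])
    return features


# Hypothetical usage: 8 log energies, one of them disturbed by narrowband noise.
clean = [1.0] * 8
noisy = list(clean)
noisy[6] = 5.0  # disturbance falls in the second block only
clean_features = block_truncated_cepstra(clean, block_size=4, keep=2)
noisy_features = block_truncated_cepstra(noisy, block_size=4, keep=2)
```

In this toy case the features from the first block are identical for the clean and noisy inputs, while those from the second block differ. A single DCT over all eight energies would instead spread the disturbance across every coefficient.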

Key words: Wavelet decomposition, block truncation, Dynamic Mel Filtering Cepstral Coefficients (DMFCC), Gaussian mixture model - Universal background model (GMM-UBM), speaker identification.

Abbreviations

MFCC, Mel Frequency Cepstral Coefficients; UBM, Universal Background Model; DCT, Discrete Cosine Transform; MFLE, Mel Filter Log Energies.
