International Journal of Physical Sciences

  • Abbreviation: Int. J. Phys. Sci.
  • Language: English
  • ISSN: 1992-1950
  • DOI: 10.5897/IJPS
  • Start Year: 2006

Full Length Research Paper

Wavelet based dynamic Mel Frequency Cepstral Coefficients (MFCC) and block truncation techniques for efficient speaker identification under narrowband noise conditions

S. Selva Nidhyananthan*, R. Shantha Selva Kumara and D. S. Roland
Department of ECE, Mepco Schlenk Engineering College, Sivakasi-626005, India
Email: [email protected]

  •  Accepted: 09 September 2013
  •  Published: 23 September 2013

Abstract

Speaker identification systems perform convincingly on clean speech, but their performance degrades when the speech samples are corrupted by narrowband noise. Block truncation of the cepstral coefficients ensures that not all features are affected by narrowband noise, but it cannot reduce the extent of the degradation. This work aims to improve the performance of speaker identification systems by block truncating features that have first been subjected to wavelet processing. Wavelet decomposition divides the entire energy spectrum of the speech signal into bands corresponding to the number of decomposition levels performed in the wavelet transformation, thereby segregating the noise-affected bands from the other bands. In addition, the wavelet filters smooth the noisy speech signal, which improves identification of the correct speaker. Dynamic Mel filtering of these wavelet coefficients followed by block truncation provides better identification, taking advantage of the fact that some filter bank coefficients remain unaffected by narrowband noise. The features are modeled by a Gaussian mixture model - Universal background model (GMM-UBM), which serves as a generic model that is trained only once. A speaker identification efficiency of 97.23% is achieved with this wavelet based dynamic MFCC technique, a 7.58% improvement in speaker identification accuracy over the non-wavelet based block truncation method.
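
The claim that block-wise processing keeps some features unaffected by narrowband noise can be illustrated with a minimal sketch. It is not the authors' implementation: the log-energy values, the 12-filter bank, the 4-4-4 block layout and the helper names (`dct2`, `block_dct`, `changed`) are all assumptions chosen for illustration. The sketch compares a single DCT over all Mel filter log energies (MFLE) with a DCT applied separately to each block, after one filter is disturbed.

```python
import math

def dct2(x):
    """Unnormalised DCT-II of a sequence."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * k * (i + 0.5) / n) for i in range(n))
            for k in range(n)]

def block_dct(log_energies, block_sizes):
    """Apply the DCT separately to consecutive blocks of log energies."""
    assert sum(block_sizes) == len(log_energies)
    coeffs, start = [], 0
    for size in block_sizes:
        coeffs.extend(dct2(log_energies[start:start + size]))
        start += size
    return coeffs

def changed(a, b, tol=1e-9):
    """Indices at which two coefficient vectors differ by more than tol."""
    return [i for i, (p, q) in enumerate(zip(a, b)) if abs(p - q) > tol]

# Hypothetical Mel filter log energies for 12 filters (illustrative values only).
clean = [2.1, 2.4, 2.0, 1.8, 1.9, 2.2, 2.5, 2.3, 1.7, 1.6, 1.5, 1.4]
noisy = list(clean)
noisy[2] += 3.0  # narrowband noise raises the energy of one filter only

BLOCKS = [4, 4, 4]  # assumed block layout, not taken from the paper

# Full-band DCT: the disturbance spreads over the cepstral coefficients.
full_changed = changed(dct2(clean), dct2(noisy))
# Block-wise DCT: only the block containing the noisy filter is disturbed.
block_changed = changed(block_dct(clean, BLOCKS), block_dct(noisy, BLOCKS))

print(len(full_changed), block_changed)
```

With these assumed values, the full-band DCT changes all 12 coefficients, while the block-wise DCT changes only the four coefficients of the first block, so the remaining eight features stay clean.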

 

Key words: Wavelet decomposition, block truncation, Dynamic Mel Filtering Cepstral Coefficients (DMFCC), Gaussian mixture model - Universal background model (GMM-UBM), speaker identification.

Abbreviations

MFCC, Mel Frequency Cepstral Coefficients; UBM, Universal Background Model; DCT, Discrete Cosine Transform; MFLE, Mel Filter Log Energies.