Learning statistically efficient features for speaker recognition

Gil Jin Jang, Te Won Lee, Yung Hwan Oh

Research output: Contribution to journal › Article › peer-review

28 Scopus citations

Abstract

We apply independent component analysis to extract an optimal basis for the problem of finding efficient features for representing the speech signals of a given speaker. The speech segments are assumed to be generated by a linear combination of the basis functions; the distribution of a speaker's speech segments is thus modeled by adapting the basis functions so that each source component is statistically independent. The learned basis functions are oriented and localized in both space and frequency, bearing a resemblance to Gabor wavelets. These features are speaker-dependent characteristics, and to assess their efficiency we performed speaker recognition experiments and compared our results with the conventional Fourier basis. Our results show that the proposed features are more efficient than conventional Fourier-based features, in that they obtain a higher recognition rate and coding efficiency.
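The generative model described in the abstract (segments as a linear combination of basis functions, with statistically independent source components) can be illustrated with a minimal sketch. This is not the authors' algorithm: it swaps in a generic symmetric FastICA update with a tanh nonlinearity, and it runs on synthetic Laplacian sources rather than speech. The function name `learn_ica_basis`, the mixing matrix `A`, and all parameter values are hypothetical choices made for illustration only.

```python
import numpy as np

def learn_ica_basis(frames, n_iter=200, seed=0):
    """Estimate an ICA basis from row-wise segments (symmetric FastICA, tanh).

    Illustrative stand-in only; the paper's own learning rule is not reproduced here.
    """
    rng = np.random.default_rng(seed)
    X = frames - frames.mean(axis=0)              # center each dimension
    cov = X.T @ X / X.shape[0]
    eigval, eigvec = np.linalg.eigh(cov)
    whiten = eigvec @ np.diag(eigval ** -0.5) @ eigvec.T   # symmetric whitening matrix
    Z = X @ whiten.T                              # whitened segments, one per row
    n = Z.shape[1]
    W = np.linalg.qr(rng.standard_normal((n, n)))[0]       # random orthogonal start
    for _ in range(n_iter):
        Y = Z @ W.T                               # current source estimates
        G = np.tanh(Y)
        # Fixed-point update: E[z g(w^T z)] - E[g'(w^T z)] w, for every row w of W
        W_new = G.T @ Z / Z.shape[0] - np.diag((1.0 - G ** 2).mean(axis=0)) @ W
        U, _, Vt = np.linalg.svd(W_new)           # symmetric decorrelation
        W = U @ Vt
    unmix = W @ whiten                            # sources: s = unmix @ x
    basis = np.linalg.inv(unmix)                  # columns are the learned basis functions
    return basis, unmix

# Synthetic check: sparse (Laplacian) sources mixed by a known, assumed matrix A.
rng = np.random.default_rng(1)
S = rng.laplace(size=(20000, 4))
A = np.array([[1.0, 0.5, 0.2, 0.0],
              [0.3, 1.0, 0.4, 0.1],
              [0.0, 0.2, 1.0, 0.6],
              [0.4, 0.0, 0.3, 1.0]])
frames = S @ A.T                                  # each row is x = A s
basis, unmix = learn_ica_basis(frames)
```

If the sources are recovered, `unmix @ A` should be close to a scaled permutation matrix, since ICA identifies components only up to order, sign, and scale. On real speech, `frames` would instead hold fixed-length windowed segments from one speaker, and the columns of `basis` would play the role of the speaker-dependent features.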

Original language: English
Pages (from-to): 329-348
Number of pages: 20
Journal: Neurocomputing
Volume: 49
Issue number: 1-4
State: Published - Dec 2002

Keywords

  • Feature extraction
  • Generalized Gaussian mixture model
  • Independent component analysis
  • Speaker recognition
  • Speech coding
