Abstract
We apply independent component analysis for extracting an optimal basis to the problem of finding efficient features for representing the speech signals of a given speaker. Speech segments are assumed to be generated by a linear combination of basis functions; the distribution of a speaker's speech segments is therefore modeled by adapting the basis functions so that the source components are statistically independent. The learned basis functions are oriented and localized in both space and frequency, bearing a resemblance to Gabor wavelets. These features are speaker-dependent characteristics; to assess their efficiency, we performed speaker recognition experiments and compared the results with those of the conventional Fourier basis. Our results show that the proposed method is more efficient than conventional Fourier-based features, in that it obtains a higher recognition rate and coding efficiency.
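
The generative model summarized above can be sketched in standard ICA notation. The symbols here are assumptions introduced for illustration and are not taken from the abstract: $\mathbf{x}$ is a speech segment, $\mathbf{a}_i$ are the basis functions (columns of $A$), $s_i$ are the source components, and $A$ is assumed square and invertible.

```latex
\begin{align}
\mathbf{x} &= \sum_{i=1}^{N} \mathbf{a}_i \, s_i = A\,\mathbf{s}
  && \text{(segment as a linear combination of basis functions)} \\
\mathbf{s} &= W\,\mathbf{x}, \qquad W = A^{-1}
  && \text{(source components recovered by the inverse basis)} \\
p(\mathbf{x}) &= \lvert \det W \rvert \prod_{i=1}^{N} p_i(s_i)
  && \text{(statistical independence of the sources)}
\end{align}
```

Under these assumptions, adapting the basis amounts to choosing $W$ to maximize the log-likelihood $\sum_t \log p(\mathbf{x}_t)$ over a speaker's segments. One common choice, not necessarily the one used in the paper, is the natural-gradient update $\Delta W \propto \left(I - \varphi(\mathbf{s})\,\mathbf{s}^{\top}\right) W$ with $\varphi_i(s_i) = -\,\partial \log p_i(s_i) / \partial s_i$.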
Original language | English |
---|---|
Pages (from-to) | 329-348 |
Number of pages | 20 |
Journal | Neurocomputing |
Volume | 49 |
Issue number | 1-4 |
State | Published - Dec 2002 |
Keywords
- Feature extraction
- Generalized Gaussian mixture model
- Independent component analysis
- Speaker recognition
- Speech coding