Abstract
Singularities in the parameter spaces of hierarchical learning machines are known to be a main cause of slow convergence in gradient descent learning. The EM algorithm, another learning algorithm that yields a maximum likelihood estimator, also suffers from slow convergence, which often appears when the component overlap is large. We analyze the dynamics of the EM algorithm for Gaussian mixtures around singularities and show that a slow manifold arises from the singular structure and is closely related to the slow convergence of the EM algorithm. We also conduct numerical simulations to confirm the theoretical analysis. Through these simulations, we compare the dynamics of the EM algorithm with those of the gradient descent algorithm and show that their slow dynamics are caused by the same singular structure, so the two algorithms behave in the same way around singularities.
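The link between component overlap and slow EM convergence can be illustrated with a small sketch. It is not the paper's experimental setup: the model (two one-dimensional Gaussians with equal weights and unit variances, only the means estimated), the sample size, the separations, the starting points, and the names `sample_mixture`, `em_two_means`, and `run` are all assumptions chosen for illustration.

```python
import math
import random


def sample_mixture(n, delta, rng):
    """Draw n points from 0.5*N(-delta, 1) + 0.5*N(+delta, 1)."""
    return [rng.gauss(delta if rng.random() < 0.5 else -delta, 1.0)
            for _ in range(n)]


def em_two_means(xs, m1, m2, tol=1e-8, max_iter=2000):
    """EM for 0.5*N(m1, 1) + 0.5*N(m2, 1); only the two means are fitted.

    Returns the final means and the log-likelihood recorded at the start
    of every iteration.
    """
    history = []
    for _ in range(max_iter):
        w1 = w1x = w2 = w2x = 0.0
        loglik = 0.0
        for x in xs:
            # E-step: responsibility of component 1 for this point.
            p1 = math.exp(-0.5 * (x - m1) ** 2)
            p2 = math.exp(-0.5 * (x - m2) ** 2)
            r = p1 / (p1 + p2)
            w1 += r
            w1x += r * x
            w2 += 1.0 - r
            w2x += (1.0 - r) * x
            loglik += math.log(0.5 * (p1 + p2)) - 0.5 * math.log(2.0 * math.pi)
        history.append(loglik)
        # M-step: each mean becomes a responsibility-weighted average.
        new_m1, new_m2 = w1x / w1, w2x / w2
        step = max(abs(new_m1 - m1), abs(new_m2 - m2))
        m1, m2 = new_m1, new_m2
        if step < tol:
            break
    return m1, m2, history


def run(delta, seed=0, n=300):
    """Fit data with true means -delta and +delta, starting from -1 and +1."""
    rng = random.Random(seed)
    xs = sample_mixture(n, delta, rng)
    return em_two_means(xs, -1.0, 1.0)


# Well-separated components versus heavily overlapping components.
fast_m1, fast_m2, fast_hist = run(delta=4.0)
slow_m1, slow_m2, slow_hist = run(delta=0.25)

print("separated:   iterations =", len(fast_hist))
print("overlapping: iterations =", len(slow_hist))
```

Under these assumptions, the overlapping case sits close to the singular region where the two means coincide, and EM needs many more iterations there than in the well-separated case. The recorded log-likelihood never decreases in either run, as expected of EM.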
Original language | English |
---|---|
Pages (from-to) | 45-59 |
Number of pages | 15 |
Journal | Neural Processing Letters |
Volume | 29 |
Issue number | 1 |
State | Published - Feb 2009 |
Keywords
- EM algorithm
- Gradient descent learning
- Learning dynamics
- Singularity
- Slow convergence