Our data can be represented as an $X_{n\times d}$ matrix, where we have $n$ data points lying in $\mathbb{R}^d$. We assume that there are $k$ underlying Gaussian models, $\mathcal{N}_d(\mu_j, \Sigma_j)$, from which we could have drawn these points. In the Expectation-Maximization algorithm for finding these parameters (the probability of belonging to a certain distribution and the distribution parameters), we have four main calculations at every iteration (a rough sketch of the shapes follows the list):
- $W_{n\times k}$ - belief that $x_i\in X$ belongs to distribution $j$
- $\Pi_{k\times 1}$ - probability distribution of drawing from one of the $k$ models
- $M_{k\times d}$ - means of the $k$ distributions
- $\Sigma_{k\times d\times d}$ - covariance matrices ($d\times d$) for each of the $k$ distributions
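
For concreteness, this is roughly how I store everything in NumPy (the array names and sizes are just placeholders for illustration):

```python
import numpy as np

n, d, k = 500, 3, 4                    # illustrative sizes only

X = np.random.randn(n, d)              # data: one point per row
W = np.full((n, k), 1.0 / k)           # responsibilities: belief that point i came from model j
Pi = np.full(k, 1.0 / k)               # mixing proportions
M = np.random.randn(k, d)              # component means, one per row
Sigma = np.stack([np.eye(d)] * k)      # one d x d covariance matrix per component
```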
I'm wondering if there are closed-form solutions for all four of these calculations. Two of them seem evident (see the NumPy sketch after this list):
- $\Pi = (W^\tau \vec{1})/n$
- $M = ((W^\tau X)\odot\Pi^{-1})/n$, where $\Pi^{-1}$ is the elementwise reciprocal broadcast across the rows of $W^\tau X$
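
In NumPy terms, continuing with the placeholder arrays above, these two updates are single matrix expressions:

```python
# Pi = (W^T 1)/n : column sums of the responsibilities, normalized by n
Pi = W.sum(axis=0) / n                 # shape (k,)

# M = (W^T X) scaled row-wise by 1/(n * pi_j)
M = (W.T @ X) / (n * Pi[:, None])      # shape (k, d)
```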
but the others are calculated by iterating over values:
- $W_{ij} = \frac{\pi_jf(x_i;\mu_j,\Sigma_j)}{\sum_{\ell=1}^{k}\pi_\ell f(x_i;\mu_\ell,\Sigma_\ell)}$, where $f$ is the probability density function
- $\Sigma_j' = \frac{1}{n\pi_j'}\sum_{i=1}^{n}W_{ij}(x_i-\mu_j')^\tau(x_i-\mu_j')$
Here the $'$ indicates that we found that value during the current iteration. Is there a different way to store these to get closed-form solutions, or should I just give up and sum them out? For reference, a rough sketch of how I currently compute these two is below.
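
This sketch still uses the placeholder arrays from above and loops explicitly over the $k$ components; `multivariate_normal.pdf` is SciPy's density function:

```python
from scipy.stats import multivariate_normal as mvn

# E-step: W_ij = pi_j f(x_i; mu_j, Sigma_j) / sum_l pi_l f(x_i; mu_l, Sigma_l)
dens = np.column_stack([
    Pi[j] * mvn.pdf(X, mean=M[j], cov=Sigma[j]) for j in range(k)
])                                               # shape (n, k)
W = dens / dens.sum(axis=1, keepdims=True)

# (in a full iteration, Pi and M would be updated here, as above)

# M-step: Sigma_j = (1/(n pi_j)) sum_i W_ij (x_i - mu_j)^T (x_i - mu_j)
for j in range(k):
    diff = X - M[j]                              # shape (n, d)
    Sigma[j] = (W[:, j, None] * diff).T @ diff / (n * Pi[j])
```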