
I am slightly confused about how to calculate the Mahalanobis distance for a given set of data. I have tried asking my tutor for help, but he does not seem interested in helping whatsoever and I am continuously insulted, so I thought I would turn to the community.

I have a set of data and have already performed the distance calculation once, using Euclidean distance, to group the data. Now I want to calculate distances using the Mahalanobis distance. I have calculated the means and a pooled covariance matrix, but I am unsure what I need to do from here to calculate the distance for each point.

Data grouped into 3 clusters, after using Euclidean distance to place the points into initial groups: [image]

Pooled covariance matrix: $$\begin{bmatrix}1.394&1.702\\1.702&6.62\end{bmatrix}$$

Inverse pooled covariance matrix: $$\begin{bmatrix}1.046&-0.269\\-0.269&0.221\end{bmatrix}$$

ASH

1 Answer


Try something like this: $$d^2_{(1,1)}=\begin{bmatrix}1& 1\end{bmatrix}\begin{bmatrix}1.046&-0.269\\-0.269&0.221\end{bmatrix}\begin{bmatrix}1\\ 1\end{bmatrix}$$

Andrei
  • Thanks for the reply! I am slightly confused: do I need to calculate a pooled covariance matrix for each cluster? Also, you mention (1,1), but would I not need to subtract the mean from the point, i.e. 1-2.7 and 1-5.6? – ASH Mar 22 '22 at 20:08
  • Sorry, you are right. You need to subtract the mean from each point. And you need to do it for each cluster – Andrei Mar 22 '22 at 21:16
  • Thanks for getting back. I wanted to make sure I was calculating my pooled covariance matrix correctly. For each cluster/group we first calculate the covariance matrix. Then we weight it by the number of samples in that cluster, e.g. cluster 1 has 7 samples, so we take 7/15 * its covariance matrix, and combine these to obtain the pooled covariance matrix? Finally, I would invert the pooled covariance matrix and use the inverse to determine the distances of the points. – ASH Mar 23 '22 at 06:44
  • Seems reasonable to me. This seems to me like the same approach as https://blogs.sas.com/content/iml/2020/07/01/pooled-covariance-between-group.html – Andrei Mar 23 '22 at 12:41
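For anyone who wants to check the arithmetic, here is a minimal sketch in Python/NumPy of the steps discussed in the comments above. It uses the inverse pooled covariance matrix from the question and the mean (2.7, 5.6) mentioned in the comments; which cluster that mean belongs to is not stated, so treat it as an illustration. The function names `mahalanobis_sq` and `pooled_covariance` are made up for this example. The pooling function weights each cluster covariance by $(n_i - 1)/(N - k)$, which is a common convention and an assumption here; the $n_i/N$ weighting described in the comments is a close variant.

```python
import numpy as np

def mahalanobis_sq(point, mean, inv_cov):
    """Squared Mahalanobis distance (x - mu)^T S^{-1} (x - mu)."""
    diff = np.asarray(point, dtype=float) - np.asarray(mean, dtype=float)
    return float(diff @ inv_cov @ diff)

def pooled_covariance(groups):
    """Pool per-cluster sample covariances, weighting each by (n_i - 1) / (N - k).

    `groups` is a list of (n_i x p) arrays, one per cluster.
    This weighting is an assumption; other conventions exist.
    """
    groups = [np.asarray(g, dtype=float) for g in groups]
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    scatter = sum((len(g) - 1) * np.cov(g, rowvar=False) for g in groups)
    return scatter / (n_total - k)

# Inverse pooled covariance matrix from the question.
inv_pooled = np.array([[1.046, -0.269],
                       [-0.269, 0.221]])

# Mean taken from the comments (1 - 2.7, 1 - 5.6).
cluster_mean = np.array([2.7, 5.6])

# Distance of the point (1, 1) from that mean.
d2 = mahalanobis_sq([1.0, 1.0], cluster_mean, inv_pooled)
d = np.sqrt(d2)
print(d2, d)
```

To assign a point to a cluster you would repeat the `mahalanobis_sq` call with each cluster's mean (and the same inverse pooled matrix) and pick the cluster with the smallest distance.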