
Let's say we have the following ordinal data with one subject and five observers:

Q1
1 
1 
1 
1 
1 

Krippendorff's alpha turns out to be $1$, which means we have perfect agreement (as expected). However, if we introduce a little disagreement by changing one answer to $2$:

Q1
2 
1 
1 
1 
1 

We get $\alpha = -0.188$. What does this negative value mean?

Similarly, if we introduce more subjects:

Q1; Q2
1; 1
1; 1
1; 2
1; 1
1; 1

$\alpha = -0.0833$, despite four observers agreeing on their answers ($80\%$ agreement).

Why is this happening and how are these negative values supposed to be interpreted?

Karla

1 Answer


You get such results with Krippendorff's alpha for the following reasons:

  1. First recall that: $$ \alpha = 1 - \frac{D_{o}}{D_{e}} $$ where $D_{o}$ is the observed disagreement among values assigned to units of analysis: $$ D_{o} = \frac{1}{n} \sum_{c} \sum_{k} o_{ck}\; {}_{\text{metric}}\delta_{ck}^{2} $$ and $D_{e}$ is the disagreement one would expect when the coding of units is attributable to chance rather than to the properties of these units: $$ D_{e} = \frac{1}{n(n-1)} \sum_{c} \sum_{k} n_{c} \cdot n_{k}\; {}_{\text{metric}}\delta_{ck}^{2} $$ The arguments in the two disagreement measures, $o_{ck}$, $n_{c}$, $n_{k}$ and $n$, refer to the frequencies of values in the coincidence matrix (always using the ordinal metric ${}_{\text{ordinal}}\delta_{ck}^{2}$ in your case).

  2. Whenever all your data points agree, you get $\alpha = 1$, but only because $D_o = 0$. $D_e$, in contrast, is always positive, because it is computed under the assumption that coding disagreement happens by pure chance (so with more than one observer there will always be some disagreement expected by chance).

  3. Your dataset is very small, so changing a single value in the table makes a big difference in the observed disagreement $D_o$. On the other hand, $D_e$ stays very small, because with so few data points there is very little opportunity for disagreement by chance. This leads to a somewhat paradoxical conclusion: as soon as you change one data point, you get $D_o \gg D_e$ and thus $\alpha = 1 - \frac{D_o}{D_e} \ll 0$. The situation improves a bit when you double the data points, because you also double the opportunity for disagreement by chance, so $D_e$ approximately doubles. You still get a negative $\alpha$, because one disagreeing pair has a great impact in so small a table, but the increased $D_e$ compensates $D_o$ more.
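The mechanics above can be sketched in a few lines of Python. This is a minimal illustration, not your exact computation: it uses the nominal metric (${}_{\text{nominal}}\delta_{ck}^{2} = 1$ for $c \neq k$) for brevity, since the ordinal metric only changes the $\delta$ values, and the function name and toy data are made up for the example.

```python
from collections import Counter
from itertools import permutations

def alpha_nominal(units):
    """Krippendorff's alpha with the nominal metric (delta^2 = 1 if c != k).

    `units` is a list of lists: one inner list per unit of analysis,
    containing one value per observer.
    """
    # Coincidence matrix o_ck: each unit with m values contributes all
    # ordered pairs of its values, each weighted by 1/(m - 1).
    o = Counter()
    for values in units:
        m = len(values)
        if m < 2:
            continue  # a unit with a single value is not pairable
        for c, k in permutations(values, 2):
            o[(c, k)] += 1.0 / (m - 1)

    # Marginal frequencies n_c and total number of pairable values n.
    n_c = Counter()
    for (c, k), w in o.items():
        n_c[c] += w
    n = sum(n_c.values())

    D_o = sum(w for (c, k), w in o.items() if c != k) / n
    D_e = sum(n_c[c] * n_c[k] for c in n_c for k in n_c if c != k) / (n * (n - 1))
    if D_e == 0:
        return 1.0  # only one category in use: no disagreement possible by chance
    return 1 - D_o / D_e

perfect = [[1, 1], [2, 2]]  # two observers agree on every unit: D_o = 0
opposed = [[1, 2], [2, 1]]  # two observers disagree systematically: D_o > D_e
print(alpha_nominal(perfect))  # -> 1.0
print(alpha_nominal(opposed))  # -> -0.5
```

The second toy dataset shows exactly the effect discussed in point 3: only four data points, so the two disagreeing pairs dominate $D_o$ while $D_e$ stays small, and $\alpha$ goes well below zero.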

I think this case is hardly interpretable, because it exposes a misbehavior of the index $\alpha$ whenever the dataset is very small and the change you introduce is substantial in relative terms (here, a single data point amounts to $20\%$ of the data).
