How is a coincidence matrix constructed for computing Krippendorff's alpha?
I am working through two documents on constructing coincidence matrices in order to better understand Krippendorff's alpha. These are the two:
There seems to me to be a discrepancy between the two. There probably isn't one in reality, so I'd like help figuring out whether my understanding is wrong or whether the two sources really do disagree.
In link 1, I am looking at section B ("Nominal data, 2 observers, no missing data"), where the coincidence matrix is presented. In link 2, I am looking at the section "Coincidence matrices".
Consider the reliability matrix presented in link 1:
In order to calculate the elements of the coincidence matrix, we have the following definition in link 2:
$$o_{vv'}=\sum_{u=1}^{N}\frac{\sum_{i\neq i'}^{m}I(v_{iu}=v)I(v_{i'u}=v')}{m_u-1}=o_{v'v},$$
where $u$ indexes the units (the columns of the reliability matrix) and $m_u$ is the number of values actually present in column $u$.
This seems simple enough. For element $o_{aa}$ (or $o_{11}$) we should have:
$$ o_{aa}=\frac{I(a=a)I(b=a)}{2-1}+\frac{I(a=a)I(a=a)}{2-1}+\frac{I(b=a)I(b=a)}{2-1}+\frac{I(b=a)I(b=a)}{2-1}+\frac{I(d=a)I(b=a)}{2-1}+\frac{I(c=a)I(c=a)}{2-1}+\dots $$
and so on. Clearly, only one term in the sum is non-zero, namely the second one. Hence $$o_{aa}=1.$$
Using the same formula/logic, we arrive at $$o_{bb}=2.$$
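To check my arithmetic, here is a minimal sketch of how I am reading the summation: within each unit (column), each pair of values is counted once and weighted by $1/(m_u-1)$. The six units below are hypothetical, read off from the indicator terms above, and are not the full reliability matrix from link 1:

```python
from collections import Counter

def coincidence_matrix(units):
    """Coincidence counts o_{vv'} under my reading of the formula:
    within each unit u, each pair of values is counted once and
    weighted by 1/(m_u - 1), where m_u is the number of values in u."""
    o = Counter()
    for unit in units:
        m_u = len(unit)
        if m_u < 2:
            continue  # a unit with a single value forms no pairs
        for i in range(m_u):
            for j in range(i + 1, m_u):
                o[(unit[i], unit[j])] += 1 / (m_u - 1)
    return o

# Hypothetical data read off from the indicator terms above (truncated;
# NOT the full reliability matrix from link 1): each tuple is one column.
units = [("a", "b"), ("a", "a"), ("b", "b"), ("b", "b"), ("d", "b"), ("c", "c")]
o = coincidence_matrix(units)
print(o[("a", "a")], o[("b", "b")])  # prints 1.0 2.0 under this reading
```

Under this reading I get $o_{aa}=1$ and $o_{bb}=2$, matching my hand calculation above.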
But link 1 reports exactly double these values for the coincidence-matrix elements, and I don't understand why. I don't even follow link 1's own explanation of how it arrives at those values.
Can somebody help?