How is a coincidence matrix constructed for computing Krippendorff's alpha?
I am working through two documents on constructing coincidence matrices in order to better understand Krippendorff's alpha. These are the two:
There seems to me to be a discrepancy between the two. There probably isn't one in reality, so I'd like help figuring out whether my understanding is wrong or whether the two sources really do disagree.
In link 1, I am looking at section B ("Nominal data, 2 observers, no missing data"), where the coincidence matrix is presented. In link 2, I am looking at the section "Coincidence matrices".
Consider the reliability matrix presented in link 1:
In order to calculate the elements of the coincidence matrix, we have the following definition in link 2:
$$o_{vv'}=\sum_{u=1}^{N}\frac{\sum_{i\neq i'}^{m}I(v_{iu}=v)I(v_{i'u}=v')}{m_u-1}=o_{v'v},$$
where $u$ indexes the units (the columns of the reliability matrix) and $m_u$ is the number of values actually present in column $u$.
This seems simple enough. For element $o_{aa}$ (or $o_{11}$) we should have:
$$ o_{aa}=\frac{I(a=a)I(b=a)}{2-1}+\frac{I(a=a)I(a=a)}{2-1}+\frac{I(b=a)I(b=a)}{2-1}+\frac{I(b=a)I(b=a)}{2-1}+\frac{I(d=a)I(b=a)}{2-1}+\frac{I(c=a)I(c=a)}{2-1}+\dots $$
and so on. Clearly, only one term in the sum is non-zero, namely the second one. Hence $$o_{aa}=1.$$
Using the same formula/logic, we arrive at $$o_{bb}=2.$$
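To check my arithmetic, here is a minimal sketch of how I am reading the summation: within each unit (column), each pair of values is counted once and weighted by $1/(m_u-1)$. The six units below are hypothetical, read off from the indicator terms above, and are not the full reliability matrix from link 1:

```python
from collections import Counter

def coincidence_matrix(units):
    """Coincidence counts o_{vv'} under my reading of the formula:
    within each unit u, each pair of values is counted once and
    weighted by 1/(m_u - 1), where m_u is the number of values in u."""
    o = Counter()
    for unit in units:
        m_u = len(unit)
        if m_u < 2:
            continue  # a unit with a single value forms no pairs
        for i in range(m_u):
            for j in range(i + 1, m_u):
                o[(unit[i], unit[j])] += 1 / (m_u - 1)
    return o

# Hypothetical data read off from the indicator terms above (truncated;
# NOT the full reliability matrix from link 1): each tuple is one column.
units = [("a", "b"), ("a", "a"), ("b", "b"), ("b", "b"), ("d", "b"), ("c", "c")]
o = coincidence_matrix(units)
print(o[("a", "a")], o[("b", "b")])  # prints 1.0 2.0 under this reading
```

Under this reading I get $o_{aa}=1$ and $o_{bb}=2$, matching my hand calculation above.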
But link 1 reports exactly double these values for the coincidence-matrix elements, and I don't understand why. I don't even follow link 1's own explanation of how it arrives at those values.
Can somebody help?