Let us now consider a hypothetical situation in which examiners do exactly that, i.e., assign grades by tossing a coin: heads = pass, tails = fail [Table 1, situation 2]. In that case, one would expect 25% (= 0.50 × 0.50) of the students to receive a pass grade from both examiners and 25% to receive a fail grade from both, giving a total "expected" agreement rate for "pass" or "fail" of 50% (= 0.25 + 0.25 = 0.50). Therefore, the observed agreement rate (80% in situation 1) must be interpreted keeping in mind that 50% agreement was expected purely by chance. The examiners could have improved on chance by at most 50% (maximum possible agreement minus chance-expected agreement = 100% − 50% = 50%), but achieved only 30% (observed agreement minus chance-expected agreement = 80% − 50% = 30%). Thus, their real performance in terms of agreement is 30%/50% = 60%. Cohen's κ can also be used when the same rater evaluates the same patients at two points in time (e.g., 2 weeks apart) or, in the example above, re-marks the same answer sheets after 2 weeks. Its limitations are: (i) it does not take into account the magnitude of the differences, making it unsuitable for ordinal data; (ii) it cannot be used when there are more than two raters; and (iii) it does not distinguish between agreement on positive and on negative findings, which can be important in clinical situations (e.g., wrongly diagnosing a disease and wrongly excluding it may have different consequences). The two quantities referred to above can be written as: Cohen's κ = (observed agreement [Po] − expected agreement [Pe]) / (1 − expected agreement [Pe]); limits of agreement = mean of the observed differences ± 1.96 × standard deviation of the observed differences.
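The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration using the counts from the worked example (Table 1, situation 1, and the coin-toss chance agreement of 0.50); the function and variable names are ours, not from the article.

```python
def cohens_kappa(po: float, pe: float) -> float:
    """Cohen's kappa = (Po - Pe) / (1 - Pe)."""
    return (po - pe) / (1 - pe)

n = 20
both_pass, both_fail = 8, 8          # concordant counts from Table 1, situation 1

# Observed agreement Po
po = (both_pass + both_fail) / n     # 16/20 = 0.80

# Chance-expected agreement Pe: each examiner passes half the students,
# so Pe = 0.5*0.5 (both pass) + 0.5*0.5 (both fail) = 0.50
pe = 0.5 * 0.5 + 0.5 * 0.5

kappa = cohens_kappa(po, pe)         # (0.80 - 0.50) / (1 - 0.50) = 0.60
print(f"Po = {po:.2f}, Pe = {pe:.2f}, kappa = {kappa:.2f}")
```

The value 0.60 is exactly the "real performance in agreement" (30%/50%) computed step by step in the text.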

Cohen's kappa (κ) calculates the agreement between observers while taking into account the agreement expected by chance, as follows. Consider two examiners, A and B, who each mark the answer sheets of the same 20 students in a class and grade each sheet as "pass" or "fail", with each examiner passing half of the students. Table 1 presents three different situations that can occur. In situation 1 of this table, eight students receive a pass grade from both examiners, eight receive a fail grade from both, and four receive a pass grade from one examiner but a fail grade from the other (two from A and two from B). Thus, the results of the two examiners agree for 16 of the 20 students (agreement = 16/20 = 0.80, disagreement = 4/20 = 0.20). This looks good. However, it does not take into account that some of the grades could have been guesses and that the agreement could have occurred merely by chance. As mentioned above, correlation is not synonymous with agreement. Correlation refers to the presence of a relationship between two different variables, whereas agreement looks at the concordance between two measurements of a single variable.
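In the general case, both the observed agreement and the chance-expected agreement can be read off a 2 × 2 table of the two examiners' grades. The following sketch assumes the situation 1 counts (8, 2, 2, 8); the helper function is illustrative, not part of the article.

```python
def agreement_from_table(pp: int, pf: int, fp: int, ff: int):
    """Observed and chance-expected agreement from a 2x2 table.

    pp = both pass, pf = A pass / B fail, fp = A fail / B pass, ff = both fail.
    """
    n = pp + pf + fp + ff
    observed = (pp + ff) / n
    # Chance agreement from each examiner's marginal pass rate
    a_pass = (pp + pf) / n
    b_pass = (pp + fp) / n
    expected = a_pass * b_pass + (1 - a_pass) * (1 - b_pass)
    return observed, expected

po, pe = agreement_from_table(8, 2, 2, 8)
print(po, pe)   # 0.8 0.5
```

With each examiner passing exactly half the students, the marginal rates are 0.5 and 0.5, so the chance-expected agreement comes out at 0.50, as stated in the text.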

Two sets of observations that are strongly correlated may nevertheless show poor agreement; however, if two sets of values agree, they will certainly be strongly correlated. For example, in the hemoglobin example, the correlation coefficient between the values obtained by the two methods is high (r = 0.98), even though the agreement is poor [Figure 2].
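This distinction is easy to demonstrate numerically. The sketch below uses made-up hemoglobin-like readings (not the article's data), with method B reading systematically about 2 g/dL higher than method A: the two methods correlate almost perfectly yet agree poorly, as the Bland-Altman limits of agreement make explicit.

```python
from statistics import mean, stdev

# Illustrative (invented) paired readings from two methods, in g/dL
a = [10.0, 11.2, 12.5, 13.1, 14.0, 15.3, 9.4, 12.0]
b = [x + 2.0 + 0.1 * i for i, x in enumerate(a)]   # near-linear, offset by ~2

def pearson_r(x, y):
    mx, my = mean(x), mean(y)
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = (sum((xi - mx) ** 2 for xi in x) * sum((yi - my) ** 2 for yi in y)) ** 0.5
    return num / den

# Limits of agreement: mean difference +/- 1.96 * SD of the differences
diffs = [bi - ai for ai, bi in zip(a, b)]
loa_low = mean(diffs) - 1.96 * stdev(diffs)
loa_high = mean(diffs) + 1.96 * stdev(diffs)

print(f"r = {pearson_r(a, b):.3f}")                              # close to 1
print(f"mean difference = {mean(diffs):.2f} g/dL")
print(f"limits of agreement: {loa_low:.2f} to {loa_high:.2f} g/dL")
```

The correlation is near-perfect, yet every difference sits around 2 g/dL; a reader who treated the two methods as interchangeable on the strength of r alone would be misled, which is precisely the point made above.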