# The measurement of observer agreement for categorical data.

@article{Landis1977TheMO, title={The measurement of observer agreement for categorical data.}, author={J Richard Landis and Gary G. Koch}, journal={Biometrics}, year={1977}, volume={33}, number={1}, pages={159-174} }

This paper presents a general statistical methodology for the analysis of multivariate categorical data arising from observer reliability studies. The procedure essentially involves the construction of functions of the observed proportions which are directed at the extent to which the observers agree among themselves and the construction of test statistics for hypotheses involving these functions. Tests for interobserver bias are presented in terms of first-order marginal homogeneity and…
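The "functions of the observed proportions" mentioned in the abstract include kappa-type agreement statistics, which the citing works listed further down this page discuss by name. As a rough illustration only, the sketch below computes the simplest such statistic, unweighted Cohen's kappa for two raters, as (observed agreement - chance agreement) / (1 - chance agreement). It is not the paper's full methodology, which also covers hypothesis tests and multiple observers; the function name `cohen_kappa` and the example ratings are hypothetical and do not come from the paper.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters labelling the same subjects."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)

    # Observed proportion of subjects on which the two raters agree.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Agreement expected by chance if the ratings were independent,
    # computed from each rater's marginal category counts.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical ratings of ten subjects by two examiners (illustrative data only).
rater_1 = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg", "neg", "pos"]
rater_2 = ["pos", "neg", "neg", "neg", "pos", "neg", "pos", "pos", "neg", "pos"]

kappa = cohen_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the raters agree on 8 of 10 subjects and chance agreement is 0.5, so kappa comes out at about 0.6.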

#### Paper Mentions

Observational Clinical Trial

This study will investigate the reproducibility of a clinical diagnostic classification
system for groin pain between two different examiners.

Conditions | Adductor Tendinitis, Groin Injury, Hip Pain Chronic, (+2 more) |
---|---|
Intervention | Diagnostic Test |

Observational Clinical Trial

This study will investigate the reproducibility of clinical palpation, resistance and
stretching tests which are currently being used for the diagnosis of longstanding groin pain…

Conditions | Adductor Strains, Groin Injury, Iliopsoas Syndrome |
---|---|
Intervention | Diagnostic Test |

Observational Clinical Trial

BACKGROUND and RATIONALE Colorectal cancer, with 49,000 new diagnoses expected in 2019
(27,000 in men and 22,000 in women) represents, in Italy, the third neoplasm in men (14%) and…

Conditions | Rectal Cancer |
---|---|

Observational Clinical Trial

Haemolytic uremic syndrome (HUS) is defined by the presence of the classic triad of
non-immune microangiopathic hemolytic anemia (negative direct Coombs), thrombocytopenia and…

Conditions | Hemolytic-Uremic Syndrome |
---|---|

#### 56,544 Citations

Measurement of observer agreement.

- Medicine
- Radiology
- 2003

The review concentrates on the chance-corrected indices, kappa and weighted kappa, which are used in diagnostic imaging for expressing observer agreement in regard to categorical data.

An application of hierarchical kappa-type statistics in the assessment of majority agreement among multiple observers.

- Mathematics, Medicine
- Biometrics
- 1977

A subset of observers who demonstrate a high level of interobserver agreement can be identified by using pairwise agreement statistics between each observer and the internal majority standard opinion on each subject.

A Modification of Kappa for Interobserver Bias

- Mathematics
- 1984

The kappa index was developed by Cohen and others for measuring nominal-scale agreement between two raters. This statistic measures the distance from the null hypothesis of independent ratings of two…

Measures of interrater agreement.

- Medicine
- Journal of thoracic oncology : official publication of the International Association for the Study of Lung Cancer
- 2011

The weighted kappa, for use when the outcome is ordinal, and the intraclass correlation, for assessing agreement when the data are measured on a continuous scale, are introduced.

A Measure of Agreement for Interval or Nominal Multivariate Observations

- Mathematics
- 2001

This article addresses the problem of accounting overall chance-corrected interobserver agreement among the multivariate ratings of several judges. Modifying an approach by Berry and Mielke, a…

Homogeneity of kappa statistics in multiple samples

- Mathematics, Medicine
- Comput. Methods Programs Biomed.
- 2000

The measurement of intra-observer agreement when the data are categorical has been the subject of several investigators since Cohen first proposed the kappa (κ) as a chance-corrected coefficient…

Measures of clinical agreement for nominal and categorical data: the kappa coefficient.

- Mathematics, Medicine
- Computers in biology and medicine
- 1992

A simple computer program written in PASCAL is presented that can be used in a clinical environment to quickly determine the reliability of nominal or categorical data.

Statistical methods in epidemiology. v. Towards an understanding of the kappa coefficient

- Mathematics, Medicine
- Disability and rehabilitation
- 2000

The kappa coefficient is recommended for measuring agreement in observer variation studies, and some pointers are given for sample size estimation.

Measurement of reliability for categorical data in medical research

- Mathematics, Medicine
- Statistical methods in medical research
- 1992

The problem of measuring reliability of categorical measurements, particularly diagnostic categorizations, is addressed and a general model is proposed, leading to definition of reliability indices.

Understanding the calculation of the kappa statistic: A measure of inter-observer reliability

- Computer Science
- 2016

The aim is that health care personnel may better understand the purpose of the kappa statistic and how to calculate it by providing a stepwise approach that is supplemented with an example.

#### References

SHOWING 1-10 OF 44 REFERENCES

An application of hierarchical kappa-type statistics in the assessment of majority agreement among multiple observers.

- Mathematics, Medicine
- Biometrics
- 1977

A subset of observers who demonstrate a high level of interobserver agreement can be identified by using pairwise agreement statistics between each observer and the internal majority standard opinion on each subject.

A general methodology for the measurement of observer agreement when the data are categorical

- Computer Science
- 1975

This dissertation reviews research situations in medicine, epidemiology, and psychiatry, in psychological measurement and testing, and in sample surveys in which the observer can be an important source of measurement error, and a general statistical methodology is proposed for the analysis of multivariate categorical data arising from observer reliability studies.

A review of statistical methods in the analysis of data arising from observer reliability studies (Part II)

- Mathematics
- 1975

Summary: This paper reviews research situations in medicine, epidemiology and psychiatry, in psychological measurement and testing, and in sample surveys in which the observer (rater or interviewer)…

A general methodology for the analysis of experiments with repeated measurement of categorical data.

- Computer Science, Medicine
- Biometrics
- 1977

This paper is concerned with the analysis of multivariate categorical data which are obtained from repeated measurement experiments and appropriate test statistics are developed through the application of weighted least squares methods.

The analysis of categorical data from mixed models

- Mathematics
- 1971

This paper is concerned with contingency tables which are analogous to the well-known mixed model in analysis of variance. The corresponding experimental situation involves exposing each of n…

A computer program for the generalized chi-square analysis of categorical data using weighted least squares (GENCAT).

- Mathematics, Medicine
- Computer programs in biomedicine
- 1976

GENCAT is a computer program which implements an extremely general methodology for the analysis of multivariate categorical data which produces minimum modified chi-square statistics, obtained by partitioning the sums of squares as in ANOVA.

Reliability of measurements for studies of cerebrovascular atherosclerosis.

- Mathematics, Medicine
- Biometrics
- 1972

The development of the methodology for assessing reliability of data is presented, and within- and between-coder variability is estimated.

Large sample standard errors of kappa and weighted kappa.

- Mathematics
- 1969

The statistics kappa (Cohen, 1960) and weighted kappa (Cohen, 1968) were introduced to provide coefficients of agreement between two raters for nominal scales. Kappa is appropriate when all…

An analysis for compounded functions of categorical data.

- Mathematics, Medicine
- Biometrics
- 1973

In each of the previously mentioned papers, primary emphasis was given to the formulation of models and the problems of analysis under various conditions of "no interaction" (see Roy and Kastenbaum [1956] or Bhapkar).

On the analysis of contingency tables with a quantitative response.

- Mathematics
- 1968

This paper illustrates tests for some suitable hypotheses in analysis of contingency tables when some characters are quantitative. For a two-dimensional table tests are given for the hypothesis of…