cloud-based inter-rater reliability analysis, Cohen's kappa, Gwet's AC1/AC2, Krippendorff's alpha, Brennan-Prediger, Fleiss generalized kappa, intraclass correlation coefficients
Analyzing a Contingency Table
Cumulative Probability Threshold
Confidence Level (%):
Sampling Fraction (%):
Key in your custom weights here!
Analyzing 2-Rater Flat List Frequency Data
To subset your data, highlight the target area on the data grid and click the appropriate dark red button
Cumulative Probability Threshold
Confidence Level (%):
Sampling Fraction (%):
Key in your custom weights here!
Analyzing 2-Rater Raw Scores
To subset your data, highlight the target area on the data grid and click the appropriate dark red button
Rater 1's Data Range:
Rater 2's Data Range:
Cumulative Probability Threshold
Confidence Level (%):
Sampling Fraction (%):
Confidence Level (%):
Type of Intraclass Correlation Coefficient
Key in your custom weights here!
Analyzing Raw Scores for 3 Raters or More
To subset your data, highlight the target area on the data grid and click the appropriate dark red button
Cumulative Probability Threshold
Confidence Level (%):
Sampling Fraction (%):
Confidence Level (%):
Type of Intraclass Correlation Coefficient
Key in your custom weights here!
Analyzing the Distribution of Raters by Subject & Category
To subset your data, highlight the target area on the data grid and click the appropriate dark red button
Cumulative Probability Threshold
Confidence Level (%):
Sampling Fraction (%):
Key in your custom weights here!
Paired and Unpaired t-Tests: Testing the Difference Between 2 Coefficients for Statistical Significance
About this App
AgreeStat360 is an App that implements various methods for evaluating the extent of agreement among 2 or more raters. These methods are discussed in detail in the 2 volumes that comprise the 5th edition of the book "Handbook of Inter-Rater Reliability" by Kilem L. Gwet. Both volumes are available as printable PDF files and can be obtained here, among other books.
For 2 raters, organize your data either as a contingency table (for categorical ratings only) or as a two-column table of raw categorical or quantitative ratings. For 3 raters or more, your data can be in the form of columns of raw scores (categorical or quantitative ratings) or, alternatively, a distribution of raters by subject and response category (for categorical ratings only).
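To illustrate the 2-rater contingency-table layout, here is a minimal sketch (not AgreeStat360's own code) of how Cohen's unweighted kappa is computed from such a table: observed agreement on the diagonal, minus chance agreement from the raters' marginal proportions.

```python
# Illustrative sketch only: Cohen's unweighted kappa from a 2-rater
# contingency table. table[i][j] = number of subjects that rater 1
# classified into category i and rater 2 into category j.

def cohen_kappa(table):
    n = sum(sum(row) for row in table)          # total number of subjects
    k = len(table)                              # number of categories
    # Observed agreement: proportion of subjects on the diagonal.
    po = sum(table[i][i] for i in range(k)) / n
    # Chance agreement: sum of products of the two raters' marginals.
    row_m = [sum(table[i]) / n for i in range(k)]
    col_m = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    pe = sum(row_m[j] * col_m[j] for j in range(k))
    return (po - pe) / (1 - pe)

# Hypothetical 3-category example table for two raters.
t = [[20, 5, 0],
     [3, 15, 2],
     [0, 4, 11]]
print(round(cohen_kappa(t), 4))  # → 0.6426
```

Weighted kappa, Gwet's AC1/AC2, and the other coefficients the App offers differ only in the weighting scheme and the chance-agreement term, not in this overall structure.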
Check the "Load a test dataset" checkbox to populate the data grid with test data, and click on the Execute button to see the results.
Your data can be captured in 2 ways: key in the ratings directly in the data grid, or import them from a CSV text file or an MS Excel file. Whichever method you choose, you can highlight the portion of the grid you want to analyze and click on the red action button. The selected data range will be described below the associated red action button.
Author
Kilem L. Gwet, PhD
gwet@agreestat.com