Kendall’s W

Kendall’s W, also known as Kendall’s coefficient of concordance, is a statistical measure of agreement among multiple raters or observers who each rank the same set of items or subjects. It quantifies the degree to which the raters agree on those rankings.

Kendall’s W ranges from 0 to 1, where 0 indicates no agreement among the raters and 1 indicates perfect agreement. It is calculated from how far each item’s rank sum deviates from the mean rank sum: the larger the deviations, the stronger the agreement.
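Concretely, for m raters ranking the same n items, let R_i be the sum of the ranks given to item i across all raters. Ignoring ties (a tie correction exists but is omitted here), the coefficient is:

```latex
W = \frac{12 \sum_{i=1}^{n} \left( R_i - \bar{R} \right)^{2}}{m^{2}\left( n^{3} - n \right)},
\qquad \bar{R} = \frac{m(n + 1)}{2}
```

The numerator measures how far the rank sums spread out from their common mean; the denominator is the largest spread possible, which is reached only when every rater submits the identical ranking, so W equals 1 exactly in that case.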

Kendall’s W is a useful measure when dealing with ordinal data or rankings, for example to assess the reliability or consistency of ratings across multiple raters or observers. A high value indicates a high level of agreement among the raters, while a low value suggests poor agreement.

Example of when and how to use Kendall’s W

Let’s say you and your friends are trying to rank your favorite pizza toppings, and you each have different opinions. You all write down your top five favorite toppings, and then compare your rankings.

After some discussion, you realize that you all have different opinions on what makes a great pizza. Some of you prefer classic toppings like pepperoni and cheese, while others prefer more exotic toppings like pineapple and bacon.

The data you collected should look something like this:

          1st Topping   2nd Topping   3rd Topping   4th Topping   5th Topping
Me        Pepperoni     Mushrooms     Onions        Sausage       Olives
Friend 1  Ham           Bell Pepper   Tomato        Pepperoni     Pineapple
Friend 2  Pepperoni     Onions        Olives        Mushrooms     Chicken
Friend 3  Mushrooms     Onions        Ham           Pineapple     Artichoke
Friend 4  Sausage       Bell Pepper   Artichoke     Bacon         Ham

You decide to calculate Kendall’s W to see how much agreement there is among your group. Because each list draws on a different pool of toppings, you first merge them into one shared set and have everyone rank every topping in it. The value turns out to be 0.2, which means that there is very little agreement among your group.
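As a minimal sketch of the calculation in Python, assuming the top-5 lists have already been converted into complete rankings of one shared set of toppings (the rank matrix below is hypothetical, not derived from the table above):

```python
import numpy as np

def kendalls_w(ranks):
    """Kendall's coefficient of concordance W (no tie correction).

    ranks: (m, n) array where row r holds rater r's ranks 1..n
    of the same n items.
    """
    ranks = np.asarray(ranks, dtype=float)
    m, n = ranks.shape
    rank_sums = ranks.sum(axis=0)            # R_i: total rank of item i
    mean_rank_sum = m * (n + 1) / 2          # mean of the rank sums
    s = ((rank_sums - mean_rank_sum) ** 2).sum()  # squared deviations
    return 12 * s / (m**2 * (n**3 - n))

# Hypothetical rank matrix: 5 raters, 6 shared toppings.
ranks = np.array([
    [1, 2, 3, 4, 5, 6],
    [2, 1, 4, 3, 6, 5],
    [1, 3, 2, 5, 4, 6],
    [3, 2, 1, 5, 6, 4],
    [6, 4, 5, 1, 2, 3],
])
print(round(kendalls_w(ranks), 2))
```

If every row of the matrix were identical, the function would return exactly 1; the more the rankings scatter, the closer the result falls to 0.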

A heatmap is a good way to visualize the agreement that underlies Kendall’s W. It shows the pairwise agreement among the raters: each cell represents the agreement between two raters, and its color indicates the level of agreement (e.g., green for high agreement, red for low agreement). This can help identify patterns or clusters of raters who tend to agree with each other.
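One way to build such a heatmap is sketched below. The text does not specify a pairwise agreement measure, so this sketch assumes Kendall’s tau (via scipy.stats.kendalltau); the rater names and the rank matrix are the hypothetical ones from the previous sketch:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import kendalltau

raters = ["Me", "Friend 1", "Friend 2", "Friend 3", "Friend 4"]
ranks = np.array([            # hypothetical (m, n) rank matrix from above
    [1, 2, 3, 4, 5, 6],
    [2, 1, 4, 3, 6, 5],
    [1, 3, 2, 5, 4, 6],
    [3, 2, 1, 5, 6, 4],
    [6, 4, 5, 1, 2, 3],
])

m = len(raters)
agreement = np.ones((m, m))   # tau of a ranking with itself is 1
for i in range(m):
    for j in range(i + 1, m):
        tau, _ = kendalltau(ranks[i], ranks[j])
        agreement[i, j] = agreement[j, i] = tau

fig, ax = plt.subplots()
im = ax.imshow(agreement, cmap="RdYlGn", vmin=-1, vmax=1)  # green high, red low
ax.set_xticks(range(m))
ax.set_xticklabels(raters, rotation=45, ha="right")
ax.set_yticks(range(m))
ax.set_yticklabels(raters)
fig.colorbar(im, ax=ax, label="Pairwise Kendall's tau")
fig.tight_layout()
plt.show()
```

Rows and columns that show up mostly green identify a cluster of raters whose rankings track each other, even when the overall W is low.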