A Pragmatic Approach to Spearman's Rank Correlation Coefficient

A full understanding of correlation requires an appreciation of bivariate distributions, but rank correlation coefficients are increasingly being used as a measure of agreement with pupils for whom such an appreciation is not possible. How can we justify the formula used?

Although the formula for Spearman's Coefficient of Rank Correlation is being increasingly used in school courses in Geography and other subjects, the justification for its use is rarely available. That the Spearman formula is the result of finding the product moment correlation for the ranks, although bestowing some credibility on the formula, is not helpful, since the product moment coefficient is not usually known at this level. The Schools Council publication Mathematics across the Curriculum (Blackie) remarks (p. 104) "Kendall's coefficient has an advantage for teaching purposes over Spearman's, in that it is more easily explained as a reasonable measure". Whatever the validity of that remark, Spearman's coefficient is the one which is commonly used, and in this article I try to explain how its algebraic structure arises.

Assuming that readers understand the principle of ranking, I propose that it is desirable that any coefficient of correlation should both give an indication of the extent to which two sets of ranks differ (or agree) and also should be standardised so as to be consistent with other measures of correlation in that its range should be between -1 and +1.

An Example

Suppose we measure two characteristics A and B of eight towns. Let A be the density of public houses and B the density of places of worship, in each case given as the number per 10000 of the population.
Town   P   Q   R   S   T   U   V   W
A     41  36  26  45  48  35  51  43
B     22   7  14  21  13  11  17  20
Ranking these in ascending order of magnitude gives
Town P Q R S T U V W
Rank of A 4 3 1 6 7 2 8 5
Rank of B 8 1 4 7 3 2 5 6
It is convenient and conventional to reorder the pairs so that one characteristic (in this case A) is placed in ascending rank order. Thus
Town R U Q P W S T V
Rank of A 1 2 3 4 5 6 7 8
Rank of B 4 2 1 8 6 7 3 5
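
The ranking and reordering steps can be sketched in Python. This is only an illustration: the names `ascending_ranks`, `a_values` and `b_values` are introduced here, and the helper assumes there are no tied values, which is true of this example.

```python
# Densities per 10000 of the population for the eight towns in the example.
towns = ["P", "Q", "R", "S", "T", "U", "V", "W"]
a_values = [41, 36, 26, 45, 48, 35, 51, 43]  # characteristic A
b_values = [22, 7, 14, 21, 13, 11, 17, 20]   # characteristic B


def ascending_ranks(values):
    """Rank values from 1 (smallest) upwards; assumes there are no ties."""
    position = {value: rank for rank, value in enumerate(sorted(values), start=1)}
    return [position[value] for value in values]


rank_a = ascending_ranks(a_values)
rank_b = ascending_ranks(b_values)

# Reorder the towns so that the ranks of A run from 1 to 8.
reordered = sorted(zip(rank_a, rank_b, towns))
for ra, rb, town in reordered:
    print(town, ra, rb)
```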
One measure of the difference between these ranks is obtained by summing the squares of the differences, $d^2$, between the corresponding ranks.

$\sum d^2 = 3^2 + 0^2 + 2^2 + 4^2 + 1^2 + 1^2 + 4^2 + 3^2 = 56$
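
As a quick check of this arithmetic, a minimal Python sketch follows (the variable names are introduced here for illustration). The differences are signed, but the signs disappear on squaring.

```python
rank_a = [1, 2, 3, 4, 5, 6, 7, 8]
rank_b = [4, 2, 1, 8, 6, 7, 3, 5]

# Difference in rank for each town, then the sum of the squared differences.
differences = [a - b for a, b in zip(rank_a, rank_b)]
sum_d_squared = sum(d * d for d in differences)

print(differences)
print(sum_d_squared)
```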

(It is reasonable to ask whether other treatments of the differences in ranks could provide a suitable coefficient, e.g. the sum of the absolute values $\sum |d|$, but that is another article.)

In general this measure is small when there is high agreement between the ranks, and only for complete agreement does it take its minimum value (it is obvious that $\sum d^2$ cannot be negative and that 0 will be its smallest value).

  • With complete agreement
    Rank of A 1 2 3 4 5 6 7 8
    Rank of B 1 2 3 4 5 6 7 8
    So the minimum $\sum d^2 = 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 = 0$

This measure is large when there is strong disagreement between the ranks, and only for complete disagreement does it take its maximum value (this is not obvious, although intuitively reasonable).

  • With complete disagreement
    Rank of A 1 2 3 4 5 6 7 8
    Rank of B 8 7 6 5 4 3 2 1
    So the maximum $\sum d^2 = 7^2 + 5^2 + 3^2 + 1^2 + 1^2 + 3^2 + 5^2 + 7^2 = 168$

Thus our measure does seem to discriminate between different degrees of agreement by taking values in the range 0 to 168.
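
The claim that the maximum occurs only for complete disagreement can at least be checked by brute force for eight pairs. The sketch below, with names introduced here for illustration, tries every possible ordering of the ranks of B (8! = 40320 of them) and records which orderings give the smallest and largest sums. It is a check for n = 8, not a proof for general n.

```python
from itertools import permutations

n = 8
rank_a = tuple(range(1, n + 1))


def sum_d_squared(rank_b):
    """Sum of squared rank differences against the fixed ranks of A."""
    return sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))


# Evaluate the measure for every possible ordering of the ranks of B.
totals = {rank_b: sum_d_squared(rank_b) for rank_b in permutations(rank_a)}
smallest = min(totals.values())
largest = max(totals.values())
at_min = [p for p, t in totals.items() if t == smallest]
at_max = [p for p, t in totals.items() if t == largest]

print(smallest, at_min)
print(largest, at_max)
```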

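In general, although the minimum value is always zero, the maximum value depends on the number of pairs of ranks, so standardisation to the range -1 to 1 is desirable. We also need to reverse the order, since -1 has to mean complete disagreement. This can be achieved as follows: dividing $\sum d^2$ by its maximum of 168 gives a value between 0 and 1, doubling that gives a value between 0 and 2, and subtracting the result from 1 gives a value between 1 (complete agreement) and -1 (complete disagreement). For the example,

    $1 - \frac{2 \sum d^2}{168} = 1 - \frac{2 \times 56}{168} = \frac{1}{3}$

The General Formula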

We can use this approach to generate Spearman's formula for n pairs of values, but first we need to calculate the maximum value of $\sum d^2$ which, as we have seen, occurs when there is complete disagreement.
    Rank of A 1 2 3  ...  ...  ... n-1 n
    Rank of B n n-1 n-2  ...  ...  ... 2 1
    Here $\sum d^2 = (n-1)^2 + (n-3)^2 + \dots + (n-3)^2 + (n-1)^2 = (n^3 - n)/3$
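
The closed form can be checked directly. As a sketch, write $i$ for the rank of A (notation introduced here). Under complete disagreement the rank of B is $n + 1 - i$, so $d_i = 2i - (n+1)$, and the standard sums $\sum i = n(n+1)/2$ and $\sum i^2 = n(n+1)(2n+1)/6$ give:

```latex
\begin{align*}
\sum_{i=1}^{n} d_i^2
  &= \sum_{i=1}^{n} \bigl(2i - (n+1)\bigr)^2 \\
  &= 4\sum_{i=1}^{n} i^2 - 4(n+1)\sum_{i=1}^{n} i + n(n+1)^2 \\
  &= \tfrac{2}{3}\,n(n+1)(2n+1) - 2n(n+1)^2 + n(n+1)^2 \\
  &= n(n+1)\left(\frac{2(2n+1)}{3} - (n+1)\right) \\
  &= \frac{n(n+1)(n-1)}{3} = \frac{n^3 - n}{3}.
\end{align*}
```

For $n = 8$ this gives $(512 - 8)/3 = 168$, agreeing with the example.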

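Standardisation takes place as follows: divide $\sum d^2$ by its maximum value $(n^3 - n)/3$, double the result so that it lies between 0 and 2, and subtract it from 1 so that complete agreement gives 1 and complete disagreement gives -1.

Thus Spearman's Coefficient of Rank Correlation for n pairs of values is

    $1 - \frac{2 \sum d^2}{(n^3 - n)/3}$   or   $1 - \frac{6 \sum d^2}{n^3 - n}$   or   $1 - \frac{6 \sum d^2}{n(n^2 - 1)}$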
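
To see the standardised coefficient at work, here is a minimal Python sketch using the usual form of Spearman's formula, 1 - 6Σd²/(n³ - n). The function name `spearman_rank_coefficient` is introduced here for illustration, and the sketch assumes untied ranks and at least two pairs.

```python
def spearman_rank_coefficient(rank_a, rank_b):
    """Spearman's coefficient 1 - 6*sum(d^2)/(n^3 - n) for untied ranks, n >= 2."""
    n = len(rank_a)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * sum_d_squared / (n ** 3 - n)


rank_a = [1, 2, 3, 4, 5, 6, 7, 8]

# The towns example, then the two extreme cases.
example = spearman_rank_coefficient(rank_a, [4, 2, 1, 8, 6, 7, 3, 5])
agreement = spearman_rank_coefficient(rank_a, rank_a)
disagreement = spearman_rank_coefficient(rank_a, rank_a[::-1])

print(round(example, 3), agreement, disagreement)  # 0.333 1.0 -1.0
```

For the towns example the coefficient is 1 - (6 × 56)/504 = 1/3, a weak positive agreement between the two sets of ranks.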
