Slide 1: Spearman's rank correlation coefficient
Slide 2: In statistics, Spearman's rank correlation coefficient or Spearman's rho, named after Charles Spearman and often denoted by the Greek letter ρ (rho) or as r_s, is a non-parametric measure of statistical dependence between two variables. It assesses how well the relationship between two variables can be described using a monotonic function. If there are no repeated data values, a perfect Spearman correlation of +1 or −1 occurs when each of the variables is a perfect monotone function of the other.
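To illustrate the monotonic-function point, here is a minimal sketch that computes Spearman's coefficient as the Pearson correlation of the ranks, the approach this deck recommends when ties are present. The helper names (`average_ranks`, `pearson`, `spearman`) and the sample data are illustrative assumptions, and giving tied values the average of their positions is a common convention assumed here rather than something these slides define.

```python
import math

def average_ranks(values):
    """Rank values 1..n; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of equal values that starts at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(a, b):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)

def spearman(a, b):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))

x = [1, 2, 3, 4, 5, 6]
y_up = [math.exp(v) for v in x]   # increasing but nonlinear in x
y_down = [-(v ** 3) for v in x]   # decreasing but nonlinear in x

rho_up = spearman(x, y_up)        # monotone increasing: rho is +1
rho_down = spearman(x, y_down)    # monotone decreasing: rho is -1
r_up = pearson(x, y_up)           # Pearson on the raw values stays below 1
print(rho_up, rho_down, r_up)
```

Because only the ordering of the values matters, any strictly increasing transformation of `x` gives a coefficient of +1, even though the Pearson coefficient of the raw values is lower.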
Slide 3: Spearman's rank correlation coefficient lets you identify the strength of correlation within a data set of two variables, and whether the correlation is positive or negative (whether the slope of the corresponding line is positive or negative). This guide will help you calculate it without too much difficulty.

Formula:

$\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$

d_i = difference between the ranks of paired items in the two series
n = number of pairs of ranks
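As a small sketch of the standard no-ties formula, ρ = 1 − 6 Σ d² / (n(n² − 1)), the function below takes the rank differences directly. The function name and the two toy inputs are hypothetical, chosen only to show the two extreme cases.

```python
def spearman_rho_from_differences(d):
    """Spearman's rho from rank differences d_i, assuming no tied ranks."""
    n = len(d)
    if n < 2:
        raise ValueError("need at least two pairs of ranks")
    sum_d_squared = sum(di * di for di in d)
    return 1 - 6 * sum_d_squared / (n * (n * n - 1))

# Identical rankings: every rank difference is zero.
same_order = spearman_rho_from_differences([0, 0, 0, 0, 0])
# Fully reversed rankings of 5 items: ranks 1..5 paired with 5..1.
reversed_order = spearman_rho_from_differences([-4, -2, 0, 2, 4])
print(same_order, reversed_order)  # 1.0 -1.0
```

The two calls show the extremes: identical rankings give +1 and exactly reversed rankings give −1.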
Slide 4: Example: find the relation between IQ and the number of hours of TV watched per week, using the following data:

IQ, X_i   Hours of TV per week, Y_i
106        7
 86        0
100       27
101       50
 99       28
103       29
 97       20
113       12
112        6
110       17

First, we must find the value of the term d_i^2. To do so we use the following steps, reflected in the table below.

1. Sort the data by the first column (X_i). Create a new column x_i and assign it the ranked values 1, 2, 3, ..., n.
2. Next, sort the data by the second column (Y_i). Create a fourth column y_i and similarly assign it the ranked values 1, 2, 3, ..., n.
3. Create a fifth column d_i to hold the differences between the two rank columns (x_i and y_i).
4. Create one final column d_i^2 to hold the value of column d_i squared.
Slide 5:

IQ, X_i   Hours of TV per week, Y_i   rank x_i   rank y_i   d_i   d_i^2
 86        0                           1          1          0     0
 97       20                           2          6         −4    16
 99       28                           3          8         −5    25
100       27                           4          7         −3     9
101       50                           5         10         −5    25
103       29                           6          9         −3     9
106        7                           7          3          4    16
110       17                           8          5          3     9
112        6                           9          2          7    49
113       12                          10          4          6    36
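The ranking steps can be sketched in a few lines of Python. The data values are taken from the table above; the variable names and the helper `ranks_without_ties` are illustrative, and the helper assumes no value is repeated.

```python
iq = [106, 86, 100, 101, 99, 103, 97, 113, 112, 110]
tv_hours = [7, 0, 27, 50, 28, 29, 20, 12, 6, 17]

def ranks_without_ties(values):
    """Rank 1 for the smallest value up to n for the largest (assumes no repeats)."""
    if len(set(values)) != len(values):
        raise ValueError("repeated values: this simple ranking does not apply")
    position = {v: r for r, v in enumerate(sorted(values), start=1)}
    return [position[v] for v in values]

rank_x = ranks_without_ties(iq)
rank_y = ranks_without_ties(tv_hours)

# Build rows (X_i, Y_i, rank x_i, rank y_i, d_i, d_i^2), sorted by IQ as in the table.
rows = sorted(
    (x, y, rx, ry, rx - ry, (rx - ry) ** 2)
    for x, y, rx, ry in zip(iq, tv_hours, rank_x, rank_y)
)
for row in rows:
    print(*row, sep="\t")
```

Ranking each column independently and then pairing the ranks gives the same result as sorting the whole table twice, which is why the sketch never reorders the data until it prints.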
Slide 6: With the d_i^2 values found, we can add them to find

$\sum d_i^2 = 194$

The value of n is 10. These values can now be substituted back into the equation,

$\rho = 1 - \frac{6 \times 194}{10(10^2 - 1)}$

which evaluates to ρ = −0.175757575... with a P-value = 0.627188 (using the t distribution).

The magnitude of ρ is small and the P-value is large, so the data give little evidence of a correlation between IQ and hours spent watching TV.

In the case of ties in the original values, this formula should not be used. Instead, the Pearson correlation coefficient should be calculated on the ranks (where tied values are each given the average of the ranks they would otherwise occupy).
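The final substitution and the significance test can be checked with a short standard-library sketch. It assumes the usual test statistic t = ρ·sqrt((n − 2) / (1 − ρ²)) with n − 2 degrees of freedom and a two-sided test; the slides only say that the t distribution is used, so those details are assumptions. The tail area is obtained by simple numerical integration (Simpson's rule) instead of a statistics library, and all function names are illustrative.

```python
import math

def spearman_rho(sum_d_squared, n):
    """No-ties formula: rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    return 1 - 6 * sum_d_squared / (n * (n * n - 1))

def t_density(u, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + u * u / df) ** (-(df + 1) / 2)

def two_sided_p_value(rho, n, steps=2000):
    """Two-sided P-value for rho; steps must be even, and |rho| must be below 1."""
    df = n - 2
    t = abs(rho) * math.sqrt(df / (1 - rho * rho))
    # Simpson's rule for the area under the t density between 0 and |t|.
    h = t / steps
    area = t_density(0, df) + t_density(t, df)
    for k in range(1, steps):
        area += (4 if k % 2 else 2) * t_density(k * h, df)
    area *= h / 3
    return 1 - 2 * area

d_squared = [0, 16, 25, 9, 25, 9, 16, 9, 49, 36]  # last column of the table
n = len(d_squared)
rho = spearman_rho(sum(d_squared), n)
p_value = two_sided_p_value(rho, n)
print(round(rho, 6), round(p_value, 4))
```

With these inputs the sketch gives ρ ≈ −0.1758 and, under the stated assumptions, a two-sided P-value of about 0.627.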