How Much Difference Does it Make?


Presentation Description

Understanding, Using and Calculating Effect Sizes for Schools

Presentation Transcript

How Much Difference Does it Make? : 

Understanding, Using and Calculating Effect Sizes for Schools

Basic ideas : 

Test a class, and retest later: 12 point rise in scores
Another teacher, different test: 25 point rise in scores
Need to take account of the ‘spread’ in scores for each test
A common measure of spread is the ‘standard deviation’

Standard deviation : 

[Figure: Standard deviation]

Options for Standardising Test Results : 

Change test scores so they all have the same mean and standard deviation, e.g. IQ tests: (100, 15); international tests: (500, 100)
Divide score differences by the standard deviation to get a ‘dimensionless’ result
This is called an ‘effect size’
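A minimal Python sketch of the first option: rescaling a set of scores so that they have a chosen mean and standard deviation. The function name `rescale` and the raw scores are made up for illustration and do not come from the presentation.

```python
from statistics import mean, pstdev

def rescale(scores, new_mean=100.0, new_sd=15.0):
    """Linearly rescale scores to a chosen mean and standard deviation."""
    m = mean(scores)
    s = pstdev(scores)  # SD of this particular set of scores
    return [new_mean + new_sd * (x - m) / s for x in scores]

raw = [42, 55, 61, 48, 70]        # hypothetical raw test scores
iq_style = rescale(raw, 100, 15)  # same scores on an IQ-style (100, 15) scale
print([round(x, 1) for x in iq_style])
```

Rescaling changes the units but not the ordering of students or their relative distances from one another.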

Effect sizes can help us to: : 

Investigate differences between groups on a common scale
See how much change a teaching approach makes
Compare different approaches across schools and classrooms
Know about the uncertainty in our estimates

Simple example : 

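Class A: Change in scores = 12; standard deviation = 10; effect size = 12/10 = 1.2
Class B: Change in scores = 25; standard deviation = 30; effect size = 25/30 = 0.83
The smaller raw rise (Class A) is the larger effect once the spread of each test is taken into account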
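The same arithmetic as a short Python sketch. The function name `effect_size` is hypothetical; the numbers are the ones on the slide.

```python
def effect_size(change, sd):
    """Effect size = change in score divided by the standard deviation."""
    return change / sd

class_a = effect_size(12, 10)  # 12 point rise, SD of 10
class_b = effect_size(25, 30)  # 25 point rise, SD of 30
print(round(class_a, 2), round(class_b, 2))  # 1.2 0.83
```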

In this presentation : 

Getting standard deviations
Comparisons using effect sizes
Estimating uncertainty in effect sizes
How do we know differences are real?
How big should an effect size be?
What are the cautions and caveats in using effect sizes?

Getting a standard deviation : 

Standard deviations can be estimated from the data
These will vary according to the group of students studied
Better to use a fixed value for all groups using the same measure
Estimate from a large population
OR – cheat!

How to cheat : 

Suppose a test is meant to have a mean score of 50 and most students score between 10 and 90
Expect about 95% of cases to lie within ±2 standard deviations of the mean
So guess standard deviation = (90-10)/4 = 20
8 point scale – guess standard deviation = about 2
Etc.
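The range-over-four rule as a Python sketch. The function name `guess_sd` is hypothetical, and treating the 8 point scale as running from 1 to 8 is an assumption (the slide does not give its endpoints).

```python
def guess_sd(low, high):
    """Rough SD guess: most scores span about 4 SDs (mean ± 2 SD)."""
    return (high - low) / 4

print(guess_sd(10, 90))  # 20.0, the slide's worked example
print(guess_sd(1, 8))    # 1.75, i.e. "about 2" (assumes a 1 to 8 scale)
```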

Advantages of a fixed value : 

The standard deviation used to calculate effect sizes is basically a scaling parameter
Better if this is fixed (even if not precisely estimated) for consistency and uniformity
Can be estimated from data (e.g. pooled standard deviation) but subject to random fluctuations
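For comparison, a sketch of the pooled standard deviation mentioned above, using the standard pooled-variance formula. The function name `pooled_sd` is hypothetical; the inputs are the two group standard deviations (1.45 and 1.43, ten students each) quoted in the Scenario 1 slide.

```python
import math

def pooled_sd(sd1, n1, sd2, n2):
    """Pooled SD of two groups, weighting each variance by its degrees of freedom."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return math.sqrt(pooled_var)

pooled = pooled_sd(1.45, 10, 1.43, 10)
print(round(pooled, 2))  # 1.44
```

With only ten students per group this estimate would move around from sample to sample, which is the random fluctuation the slide warns about; the scenarios in this presentation divide by a fixed value of 2.0 instead.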

Comparisons using effect sizes : 

Differences between 2 different groups (boys/girls; Maori/Pakeha)
Changes in scores for the same group, measured twice (before/after)
Relationships between different factors and scores, all considered together (regression)

Uncertainty in effect sizes : 

All statistical calculations are subject to uncertainty, including effect sizes
‘Standard error’ (SE) = standard deviation of the uncertainty around an effect size
About 95% probability that the ‘true’ effect size is within ±2 SEs of the calculated value
SE is inversely proportional to the square root of the number of cases
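A small sketch of the last point, using the usual formula for the standard error of a mean (SD divided by √N). The function name and the SD of 2.0 are illustrative assumptions.

```python
import math

def se_of_mean(sd, n):
    """Standard error of a mean: SD divided by the square root of the sample size."""
    return sd / math.sqrt(n)

# Quadrupling the number of cases halves the standard error.
for n in (10, 40, 160):
    print(n, round(se_of_mean(2.0, n), 3))
```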

Example data : 

[Table: Example data]

Scenario 1: Separate Groups : 

SE for Group A mean = standard deviation divided by √10 = 1.45/3.16 = 0.46
SE for Group B mean = 1.43/3.16 = 0.45
SE for difference B-A = √(0.46² + 0.45²) = 0.64
Effect size = (4.5-3.1)/2.0 = 0.70
SE of ES = 0.64/2.0 = 0.32
95% confidence interval = 0.70 ± 1.96 × 0.32 = 0.07 to 1.33
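The Scenario 1 calculation as a Python sketch, using the summary figures from the slide. The variable names are hypothetical, and the 1.96 multiplier is an assumption: it is the value that reproduces the interval 0.07 to 1.33 (the rounder ±2 would give 0.06 to 1.34).

```python
import math

n = 10                     # students in each group
sd_a, sd_b = 1.45, 1.43    # group standard deviations
mean_a, mean_b = 3.1, 4.5  # group means
fixed_sd = 2.0             # fixed SD used as the scaling parameter

se_a = sd_a / math.sqrt(n)
se_b = sd_b / math.sqrt(n)
se_diff = math.sqrt(se_a**2 + se_b**2)  # SE of the difference between independent means

es = (mean_b - mean_a) / fixed_sd
se_es = se_diff / fixed_sd
ci = (es - 1.96 * se_es, es + 1.96 * se_es)

print(round(es, 2), round(se_es, 2), [round(x, 2) for x in ci])
```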

Scenario 2: Change over time : 

Group B = Group A re-measured
Mean difference = 1.40
Standard deviation of difference = 1.26
SE of mean difference = 1.26/√10 = 0.40
Effect size = 1.40/2.0 = 0.70
SE of effect size = 0.40/2.0 = 0.20
95% confidence interval = 0.70 ± 1.96 × 0.20 = 0.31 to 1.09
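The Scenario 2 calculation as a Python sketch, again from the slide's summary figures. Variable names are hypothetical and the 1.96 multiplier is an assumption that matches the quoted interval.

```python
import math

n = 10            # students measured twice
mean_diff = 1.40  # mean of the individual before/after differences
sd_diff = 1.26    # standard deviation of those differences
fixed_sd = 2.0    # fixed SD used as the scaling parameter

se_mean_diff = sd_diff / math.sqrt(n)
es = mean_diff / fixed_sd
se_es = se_mean_diff / fixed_sd
ci = (es - 1.96 * se_es, es + 1.96 * se_es)

print(round(es, 2), round(se_es, 2), [round(x, 2) for x in ci])
```

The effect size is the same as in Scenario 1, but the interval is narrower because each student acts as their own comparison.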

A simple approximation for SE : 

Due to Edith Hodgen of NZCER
2 separate samples: SE = √(1/N1 + 1/N2) = 0.45 (with N1 = N2 = 10)
Same sample re-tested: SE = √(1/N) = 0.32 (with N = 10)
Gives a quick way of estimating SEs directly from sample sizes
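The approximation as a Python sketch, with hypothetical function names. One way to read it: it is what the full calculation gives when the spread in the data equals the fixed SD used for scaling, so it comes out larger than the 0.32 and 0.20 calculated in the two scenarios, where the observed SDs (about 1.44 and 1.26) are smaller than 2.0.

```python
import math

def approx_se_two_samples(n1, n2):
    """Approximate SE of an effect size comparing two separate samples."""
    return math.sqrt(1 / n1 + 1 / n2)

def approx_se_retest(n):
    """Approximate SE of an effect size for one sample tested twice."""
    return math.sqrt(1 / n)

print(round(approx_se_two_samples(10, 10), 2))  # 0.45
print(round(approx_se_retest(10), 2))           # 0.32
```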

How do we know effect sizes are real? : 

If we estimate an effect size, how do we know it is not really zero?
Use the SE to give a 95% confidence interval = estimated value ± 2 × SE
If the confidence interval includes zero, we say the effect size is not significant
Can plot effect sizes and confidence intervals in a ‘Star Wars Plot’
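A sketch of the significance check described above, using the ±2 × SE rule. The function names are hypothetical; the inputs are the effect size of 0.70 with the calculated SE (0.32) and the quick approximate SE (0.45) for two separate groups of ten.

```python
def confidence_interval(es, se, multiplier=2.0):
    """Approximate 95% confidence interval: estimate ± 2 standard errors."""
    return es - multiplier * se, es + multiplier * se

def is_significant(es, se, multiplier=2.0):
    """An effect is 'significant' if its confidence interval excludes zero."""
    low, high = confidence_interval(es, se, multiplier)
    return not (low <= 0.0 <= high)

print(is_significant(0.70, 0.32))  # True: interval is about 0.06 to 1.34
print(is_significant(0.70, 0.45))  # False: interval is about -0.20 to 1.60
```

With samples this small, the conclusion can depend on which SE estimate is used, which is one reason to check the standard errors before reading much into an effect size.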

Star Wars Plot : 

[Figure: Star Wars Plot]

How Big is Big? : 

How big should an effect size be to be educationally significant?
Some people say, e.g.: < 0.2 = ‘small’, about 0.4 = ‘medium’, > 0.6 = ‘large’
These can be misleading
You can get ‘large’ effect sizes by focusing on small homogeneous groups
Or by testing immediately for an impact of something you’ve just taught

More importantly… : 

Small effect sizes can be important, if applied to the whole population (e.g. 0.1 = 10 point rise in PISA scores)
Small can be good if it’s cheap and easy to obtain
Need to evaluate factors such as costs, whether impact is ongoing or temporary, and whether it applies to the whole population or just a sub-group

Caveats and Heffalump Traps : 

Effect sizes are a handy way of looking at data, but not a magic bullet
Regression to the mean – if you pick the lowest performing students, do anything, and test them again, you will almost always find they have improved more than the others!
Always check the standard errors
However, if something is making a difference you should be able to measure it
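Regression to the mean can be illustrated with a small simulation. This is a sketch under made-up assumptions, not data from the presentation: each student has a fixed ability (mean 50, SD 10), each test adds independent noise (SD 10), and nothing at all is done between the two tests.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the run is repeatable
n_students = 10_000

# Hypothetical model: observed score = stable ability + test-day noise.
ability = [random.gauss(50, 10) for _ in range(n_students)]
test1 = [a + random.gauss(0, 10) for a in ability]
test2 = [a + random.gauss(0, 10) for a in ability]  # no intervention in between

# Select the lowest-scoring 10% on the first test.
cutoff = sorted(test1)[n_students // 10]
low = [i for i in range(n_students) if test1[i] < cutoff]
rest = [i for i in range(n_students) if test1[i] >= cutoff]

gain_low = mean([test2[i] - test1[i] for i in low])
gain_rest = mean([test2[i] - test1[i] for i in rest])
print(round(gain_low, 1), round(gain_rest, 1))
```

Under these assumptions the lowest group shows a clear average gain on the retest and the rest do not, even though no teaching took place, so a before/after effect size for a group selected on low scores needs a comparison group.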

Good luck! : 

Good luck!