lectures Natalya. Added: December 03, 2007. License: All Rights Reserved. Presentation Transcript
PSY1128 Statistics and Research Participation: Introduction, Version 1.2
Exam bits and pieces: You can only bring a pen, pencil, eraser, rule and simple calculator (foreign language dictionaries: consult tutor). Copies of the orange stats handbook will be available. Disallowed calculators: all scientific, programmable or graphical calculators; all palmtops, laptops, mobile phones etc. Functions you’ll need: add, subtract, divide, multiply, decimal point, square root.
The Birthday question: What are the chances two of these people share a birthday?
Statistics and psychology: “Women have a larger vocabulary than men”, says leading scientist.
Experimental Design: Ask for definitions of a list of words: Bed, Ship, Penny, Winter, Repair, Breakfast, Fabric, Slice, Assemble, Conceal, Enormous, Hasten, Sentence, Regulate, Commence, Ponder, Cavern, Designate, Domestic, Consume, Terminate, Obstruct, Remorse, Sanctuary, Matchless, Reluctant, Calamity, Fortitude, Tranquil, Edifice, Compassion, Tangible, Perimeter, Audacious, Ominous, Tirade, Encumber, Plagiarise, Impale, Travesty. Collect scores from a random sample of men and women.
Data:
Averages (Means): Women: (34 + 29 + 40 + 16 + 38 + 31) / 6 = 31.3. Men: (32 + 35 + 24 + 18 + 12 + 22) / 6 = 23.8. Should we be convinced?
Key Concepts: 1: Probability of occurrence by chance; statistical significance; the .05 convention; statistical test; statistical tables.
Key concepts: 1: Sample size (N); tails; null hypothesis.
Recipes: 1. Wilcoxon Rank-Sum test: Wilcoxon rank-sum is a test for a significant difference between two different groups (e.g. of people). Compared to later tests, it is relatively easy to calculate, it is happy with small data sets, and it makes few assumptions about the data.
Recipes: 1. Wilcoxon Rank-Sum test: 1. Rank all the data. 2. Calculate the sum of the ranks of the group with lower n. If groups are of equal n, calculate the sum of the ranks for each group, and take the smaller.
Recipes: 1. Wilcoxon Rank-Sum test: Women, W = 11 + 7 + 8 + 12 + 6 + 10 = 54. Men, W = 2 + 5 + 9 + 3 + 1 + 4 = 24. W = 24.
Recipes: 1. Wilcoxon Rank-Sum test: The result is significant if W is smaller than or equal to the appropriate value in a Wilcoxon rank-sum table. N1 = n for the smaller group.
Maths refresher - GCSE: Negative numbers: -3 x 4 = -12; -3 + 4 = 1; -8 x -2 = 16. Brackets: 2 + 3 x 2 = 8; (2 + 3) x 2 = 10; 2(2 + 3) = 10. Squares and square roots: $3^2 = 9$; $\sqrt{16} = 4$. Algebra: 2y = 4 + 6, so y = 5.
Slide16: Repeated measures
Recipes: 1. Wilcoxon Rank-Sum test: Wilcoxon rank-sum is a test for a significant difference between two different groups (e.g. of people). Compared to later tests...
… it is relatively easy to calculate, … it is happy with small data sets, … makes few assumptions about the data.
Key concepts: 2: Between-subjects design (unrelated). Within-subjects design (related, repeated measures).
Recipes: 2. Wilcoxon Matched-pairs test: Wilcoxon matched-pairs is a test for a significant difference between a set of measurements taken on two different occasions (e.g. a group of people, each tested twice). Compared to later tests, it is relatively easy to calculate, is happy with small data sets, and makes few assumptions.
Recipes: 2. Wilcoxon Matched-pairs test: Alcohol consumption increases reaction times. Simple two-choice reaction time task, before and after consumption of 6 units of alcohol.
Recipes: 2. Wilcoxon Matched-pairs test: 1. Calculate the difference between each pair. 2. Remove pairs whose difference is zero, and reduce n accordingly. 3. Rank the differences, ignoring the sign. 4. Calculate the sum of the ranks of the positive differences: T+ = 5 + 6 + 1 + 2 + 4 = 18. 5. Calculate the sum of the ranks of the negative differences: T- = 3. 6. Let T be the smaller of T+ and T- (T = 3). 7. The result is significant if T is smaller than or equal to the appropriate value in a Wilcoxon matched-pairs table.
Tied Ranks: To rank a set of numbers such as (23, 25, 25, 30, 87), you give each of the ‘tied’ numbers the average rank: (1, 2.5, 2.5, 4, 5).
Slide26: Variance
Beyond the mean: Effects of teaching programme on exam performance. Is there a difference?
Tutorials: 65, 96, 84, 30, 27. Lectures: 64, 60, 47, 76, 55.
Variance: Means are identical, but there’s an important difference: distance from the mean. Variance is (basically) the average squared distance from the mean. Tutorials: 65, 96, 84, 30, 27 (mean 60.4). Lectures: 64, 60, 47, 76, 55 (mean 60.4).
Maths - New, possibly: Use of Greek alphabet: Σ, σ, μ [capital sigma, small sigma, small mu]. Mostly used in algebra (e.g. μ = 9 - 4). Σ is special: it means “sum of”, e.g. X might represent a set of numbers {1, 2, 4, 2}, so ΣX = 1 + 2 + 4 + 2 = 9.
Sample & population: Population: the entire set of measurements in which the investigator is interested. Sample: the sub-set of measurements actually collected by the investigator.
Variance of a sample: $s^2 = \sum (X - \bar{X})^2 / (N - 1)$. Note: N-1, not N. Dividing by N underestimates the variance of the population. N-1 does not. More on this on my website.
Calculating variance: Lectures: 64, 60, 47, 76, 55 (mean 60.4). $(X - \bar{X})$: 3.6, -0.4, -13.4, 15.6, -5.4. $(X - \bar{X})^2$: 12.96, 0.16, 179.56, 243.36, 29.16. $\sum (X - \bar{X})^2 = 465.2$. $s^2 = 465.2 / 4 = 116.3$.
Homogeneity of variance: Do the groups differ significantly in variance? Variance test: divide larger variance by smaller. F = 971.3 / 116.3 = 8.35. If F exceeds the appropriate value in the F-table, the difference is significant. Tutorials: 65, 96, 84, 30, 27 (mean 60.4, variance 971.3). Lectures: 64, 60, 47, 76, 55 (mean 60.4, variance 116.3).
Degrees of freedom: “Degrees of freedom” is the number of numbers free to vary given what we know about them. When we calculate variance, we have to know the mean. If we know the mean, only N-1 numbers can freely vary, e.g. 1, 2, X with a mean of 2: X has to be 3. Hence, each variance has d.f.
of N-1.
Practice session: Attempt problems W-3, W-4, V-1, V-2.
Slide36: Laws of Probability
Understanding Probability: P(Heads) = 0.5. P(Tails) = 0.5. P(Heads) + P(Tails) = 1. Flipping a coin is a “Bernoulli trial”: two outcomes, p and q; the sum of the 2 probabilities = 1.
Multiplicative Law: Five coin flips, each with P(H) = 0.5. Probability of all 5 being heads? 0.5 x 0.5 x 0.5 x 0.5 x 0.5 = $0.5^5$ = 0.03125. Multiplicative Law: Probability of X independent events all happening is the product of the individual probabilities.
Multiplicative Law: P = 0.03125. Next: Prob. of 3 heads and 2 tails? As a casino manager, what odds would you offer?
Additive Law: Ten ways of throwing 3 heads and 2 tails. Each of the 10 ways, P = 0.03125. Any one of these 10 things could happen. P(3 heads, 2 tails) = 0.03125 x 10 = 0.31. Additive Law: Probability of any one of X mutually exclusive outcomes occurring is the sum of the individual probabilities.
Combinations: Additive Law can get quite tricky, e.g. accumulator bets: exactly 4 of the next 7 cards turned over will be red (ignoring Jokers). $0.5^7 = 0.008$. Writing out all the possibilities would take ages and is prone to error. Mathematicians have given us a short cut.
“Factorial”: Three factorial: 3!. 7! = 7 x 6 x 5 x 4 x 3 x 2 x 1; 3! = 3 x 2 x 1; 2! = 2 x 1; 1! = 1; 0! = 1.
Combination Rule: 7 cards (1 2 3 4 5 6 7), N = 7. 4 reds, e.g. cards 1, 3, 4 and 5. How many ways of picking four things from seven? (r = 4) Number of combinations = $N! / (r! \, (N - r)!) = 7! / (4! \, 3!) = 35$.
Combination Rule: P = 35 x 0.008 = 0.27.
Slide46: Binomial test
Binomial Test: Derives from multiplicative law, additive law & combination rule.
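As a rough check on the coin and card figures above, the three ingredients can be combined in a few lines of Python. This is only a sketch: the function name `prob_exactly` is made up for illustration, and the card example treats every card as an independent 50/50 trial, as the slide does.

```python
from math import comb


def prob_exactly(n_trials, n_hits, p_hit):
    """Probability of exactly n_hits 'hit' outcomes in n_trials independent trials."""
    q_miss = 1.0 - p_hit
    # Multiplicative law: probability of one particular ordering of hits and misses.
    one_ordering = p_hit ** n_hits * q_miss ** (n_trials - n_hits)
    # Combination rule: how many different orderings contain exactly n_hits hits.
    n_orderings = comb(n_trials, n_hits)
    # Additive law: the orderings are mutually exclusive and equally likely,
    # so summing them is the same as multiplying by their count.
    return n_orderings * one_ordering


coins = prob_exactly(5, 3, 0.5)  # 3 heads and 2 tails in 5 flips
cards = prob_exactly(7, 4, 0.5)  # 4 reds in 7 cards, each treated as 50/50

print(comb(5, 3), coins)            # number of orderings, probability
print(comb(7, 4), round(cards, 2))
```

Counting orderings with `comb` replaces the hand-written list of ten ways of throwing 3 heads and 2 tails.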
Use in psychology: Situations where each participant contributes information that can be coded as the result of a single Bernoulli trial, OR a single participant contributes multiple pieces of information that can each be coded as the result of a single Bernoulli trial.
Binomial Test...: …gives the probability of a specific number of P outcomes over N trials. For example: four-choice multi-choice test. What’s the probability of getting exactly four out of six correct by guessing? Bernoulli: N = 6, X = 4, p = 0.25, q = 0.75.
Binomial Test: Multiplicative Law: Probability of N events all happening is the product of the individual probabilities. Prob. = $p^X q^{(N-X)} = 0.25^4 \times 0.75^2 = 0.002$. But… don’t forget the Additive Law: Probability of any one of X outcomes occurring is the sum of the individual probabilities. (N = 6, X = 4, p = 0.25, q = 0.75)
Binomial Test: Including the additive law: Prob. = (No. of combinations) x $p^X q^{(N-X)}$. Using the Combination Rule, we get $P(X) = \frac{N!}{X! \, (N - X)!} \, p^X q^{(N-X)}$, which is the equation for the Binomial test. (N = 6, X = 4, p = 0.25, q = 0.75)
Binomial Test: Test example: N = 6, X = 4, p = 0.25, q = 0.75. $P(4) = \frac{6!}{4! \, 2!} \times 0.25^4 \times 0.75^2 = 15 \times 0.0022 = 0.033$. That’s it… more or less…
“Exactly” vs. “at least”: In many cases, we’re not interested in someone getting exactly 4 right. We’re interested in them getting at least 4 right. For example: chances of passing a test by guessing. In an ESP experiment, chances of being at least as good as the participant’s 30 out of 40 score. Nearly all psychological uses of the Binomial test are “at least”.
“Exactly” vs. “at least”: Simple procedure, e.g. for our multi-choice test: P(4) = 0.0330, P(5) = 0.0044, P(6) = 0.0002. P(Pass) = 0.0330 + 0.0044 + 0.0002 = 0.0376 (assuming pass mark is 4/6).
Slide54: Beyond the binomial
Reminder...:
BEDMAS. Binomial test. Probability theory: Multiplicative Law, Additive Law.
Beyond the Binomial Test: The binomial test gives the probability of occurrence by chance where each event has two possible outcomes. What if there are more than two outcomes? E.g. the Astronomer Royal’s assistant...
Chi-square: Chi-square test: a variant of the binomial test. Advantage: works for any number of groups. Disadvantage: it only works for largish N. Why? See handout.
N-by-1 chi-square: Derives from binomial test. Uses in psychology: Situations where each participant contributes information that can be coded as the result of a single multi-outcome trial, OR a single participant contributes multiple pieces of information that can each be coded as the result of a single multi-outcome trial.
Calculating Chi-square: $\chi^2 = \sum (O - E)^2 / E$, where O = Observed and E = Expected, both as COUNTS. Expected = expected by null hypothesis (e.g. equal frequency). Rule of thumb: as long as all expected values are greater than 5, the normality assumption is O.K.
Significance of $\chi^2$: Use a chi-square table. d.f. = N - 1 = 10 - 1 = 9. Chi-square is multi-tailed: there are many directions in which data could differ from the expecteds. Chi-square tables are appropriate for all common uses. $\chi^2$ = 41, d.f. = 9. Result is significant.
New type of question: Group differences, e.g. Wilcoxon RS: whether two groups differ in their average score. Relationship between two measures: “Does age affect play style?” Chi-square can be applied to questions of this type.
Expected values: If we knew the values expected when the two variables are independent (did not affect each other), we could apply the Chi-square test.
Probability theory gives us the answer.
Expected values: Chance of co-operative play is 15/48 = 0.3125. Chance of child being 3 years old is 18/48 = 0.375. Chance of a co-operative 3 year old = 0.3125 x 0.375 = 0.1172 if age and play-style are independent (Multiplicative law). No. of co-operative 3 year olds expected = 0.1172 x 48 = 5.625.
Chi-square contingency: Short-cut: E = (Row total x Column total) / Grand total.
Chi-square contingency: Use when you want to know whether two different variables are significantly related AND each participant contributes two pieces of information (one for each variable) that can be coded as the result of a multi-outcome trial, OR a single participant contributes, on multiple occasions, two pieces of information (one for each variable) that can be coded as the result of a single multi-outcome trial.
Process: Row totals 15 and 33; column totals 18 and 30; grand total 48.
Process: E = (15 x 18) / 48 = 5.625; E = (15 x 30) / 48 = 9.375; E = (33 x 18) / 48 = 12.375; E = (33 x 30) / 48 = 20.625.
Process: $\chi^2 = (12 - 5.625)^2 / 5.625 + (3 - 9.375)^2 / 9.375 + (6 - 12.375)^2 / 12.375 + (27 - 20.625)^2 / 20.625 = 7.225 + 4.335 + 3.284 + 1.970 = 16.8$.
Process: $\chi^2$ = 16.8. df = (Rows - 1) x (Columns - 1) = 1. Significant. Finger tapping!
Slide70: Normal distribution
Reminder: Sample & population. Variance. New: standard deviation, the square root of variance: Standard deviation = $\sqrt{\text{Variance}}$.
“Normal” distribution: Many types of data are normally distributed: height, astronomical observations, IQ. [Plot: Frequency against Score]
Normal distribution: Example application: A patient incurs brain damage following a car crash. His IQ on the WAIS is now 85. He was not tested prior to the crash. What’s the likelihood that his IQ has been adversely affected by the crash?
IQ scores (WAIS): normally distributed for the population of non-brain damaged people. μ = 100, σ = 15.
Slide74: [Plot: Frequency against IQ] We know the population is normal, mean 100, s.d. 15. For a mathematician, that’s enough to draw this frequency plot. From the frequency plot, you can work out the proportion of non-brain damaged people who have a score of 85 or lower.
Z-tests: That proportion is the probability that a person from the non-brain-damaged population would score 85 or lower. All this represents an awful lot of work. Fortunately, there’s a short-cut: Z-tests.
Our patient: $Z = (85 - 100) / 15 = -1$. P = 0.16. Doesn’t reach conventional levels of significance.
Z-test: In psychology, use a Z-test where: there is just one participant; that participant’s data is just one number; the population the participant comes from is known to be normally distributed; the standard deviation of the population is known. All fairly rare, but if true then few other tests would work.
Slide78: Exploratory data analysis
Exploratory Data Analysis: Discovering things about data by inspection. Test selection: many tests assume the population is normally distributed; EDA helps you determine whether these assumptions are O.K. for your data set. Detecting outliers (“weird” data points).
Normality of distribution: Scores out of 60 (whole numbers) on a behavioural problems index: 1, 6, 12, 16, 20, 13, 8, 3, 6, 14, 7, 15, 12, 9, 13.
1-5 ||
6-10 |||||
11-15 ||||||
16-20 ||
Create roughly N/4 equal sized “bins”. Make a mark for each number in the data set. It’ll never look great with small samples. This data set is roughly normal. Main things to look for: bimodality and asymmetry (skew). If N<10 then there’s not really enough data to do this.
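The tally above can also be produced mechanically. The sketch below is illustrative only: `tally_histogram` is a made-up helper name, and the starting point (1) and bin width (5) are read off the bin labels rather than computed.

```python
def tally_histogram(scores, first_bin_start, bin_width, n_bins):
    """Count how many whole-number scores fall into each equal-sized bin."""
    counts = [0] * n_bins
    for score in scores:
        index = (score - first_bin_start) // bin_width
        if 0 <= index < n_bins:  # ignore anything outside the bins
            counts[index] += 1
    rows = []
    for i, count in enumerate(counts):
        low = first_bin_start + i * bin_width
        high = low + bin_width - 1
        rows.append((f"{low}-{high}", count))
    return rows


scores = [1, 6, 12, 16, 20, 13, 8, 3, 6, 14, 7, 15, 12, 9, 13]
n_bins = round(len(scores) / 4)  # "roughly N/4" bins: 15 / 4 = 3.75, so 4

for label, count in tally_histogram(scores, 1, 5, n_bins):
    print(f"{label:<6}{'|' * count}")
```

Each printed row is a bin label followed by one mark per score, which makes bimodality or skew easy to see by eye.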
Bimodality: Two peaks:
1-5 |||
6-10 |||||||||
11-15 |||
16-20 ||||||||||
Skew: Severe lack of symmetry. First example:
1-5 ||||||||||
6-10 ||||||
11-15 |||
16-20 ||
Second example:
1-5 |
6-10 |||||
11-15 ||||||
16-20 ||||||||
Normality of distribution: Scores out of 60 on a behavioural problems index (ADHD children): 50, 49, 25, 27, 29, 45, 52, 51, 48, 26, 27, 30, 43, 51.
20 |||||
30 |
40 ||||
50 ||||
This data set does not look particularly normal: some evidence of bimodality and/or skew.
Outliers: Data collection is not perfect: calculation and recording errors; subject distracted by an outside event; accidental inclusion of a subject from a different population (e.g. non-native English speaker in a language experiment). This “rogue” data can often ruin an experiment by making all effects n.s.
Outlier detection: Box plot: Place data in rank order: 3, 5, 5, 7, 10, 10, 11, 11, 35. Work out the median (middle value when data is placed in rank order): Median = 10. [Box plot drawn on an axis from -5 to 35]
Outlier detection: Box plot: Work out the lower quartile (half way between the median and the start). Work out the upper quartile (half way between the median and the end).
Outlier detection: Box plot: Work out the inter-quartile range (11 - 5 = 6). Work out the length of the whiskers (1.5 x IQR) = 9.
Outlier detection: Box plot: Any point more than 2 whiskers beyond the nearest “hinge” is considered an outlier.
Detected: What now?: Ideally: go back to lab notes and investigate whether there is any reason for the outlier. Exclude if a problem is discovered. Pragmatically: sometimes not possible. Assume there was a problem and remove.
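The box-plot recipe can be sketched in Python as follows. Two things here are assumptions rather than part of the slides: the function name `box_plot_outliers`, and the reading of “half way between the median and the start/end” as the median of each half of the ranked data (with the overall median included in both halves when N is odd).

```python
from statistics import median


def box_plot_outliers(data, n_whiskers=2):
    """Return the points lying more than n_whiskers whisker-lengths beyond a hinge."""
    ordered = sorted(data)
    n = len(ordered)
    half = (n + 1) // 2  # size of each half; shares the median when n is odd
    lower_hinge = median(ordered[:half])      # lower quartile
    upper_hinge = median(ordered[n - half:])  # upper quartile
    whisker = 1.5 * (upper_hinge - lower_hinge)  # 1.5 x inter-quartile range
    low_limit = lower_hinge - n_whiskers * whisker
    high_limit = upper_hinge + n_whiskers * whisker
    return [x for x in ordered if x < low_limit or x > high_limit]


scores = [3, 5, 5, 7, 10, 10, 11, 11, 35]
print(box_plot_outliers(scores))  # prints [35]
```

For this data set the hinges are 5 and 11, the whisker length is 9, and the limits are therefore -13 and 29, so only 35 is flagged.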
Where to look for outliers: Between-subject tests: on each group of data separately. Within-subject tests: on the differences between the two groups. Tests not amenable to this procedure: Binomial, Chi-square. Not in the handbook or handout.
Slide91: Interim revision: Between-subjects and within-subjects tests. Sample and population. Wilcoxon matched-pairs test (within-Ss). Variance and standard deviation. Z-test: probability of a score at least as high as X coming from a normal distribution with known mean and standard deviation.
Related samples t-test: Does the same basic job as a Wilcoxon matched-pairs: within-subjects comparison of means. Is more powerful than Wilcoxon but makes some assumptions which can be hard to test. It’s based on the central limit theorem.
Central limit theorem: Part 1: If there’s no difference between the groups in the population as a whole (the null hypothesis), then the mean difference in a sample will, on average, be zero. Part 2: The standard deviation of these mean differences in the population can be estimated as $s / \sqrt{N}$, where s = standard deviation of the sample and N = sample size. Part 3: The distribution of these mean differences is near-normal if N is large (regardless of how the population itself is distributed).
Applying CLT: Reaction time to auditory and visual warning signals (within-subjects design). Part 1: If there’s no difference between the groups in the population as a whole (the null hypothesis), then the mean difference in a sample will, on average, be zero.
Applying CLT: Take differences as before. Calculate the mean (do not ignore sign). Mean difference = 31. Significant?
Applying CLT via Z-test: Part 1: If there’s no difference between the groups in the population as a whole (the null hypothesis), then the mean difference in a sample will, on average, be zero.
$Z = (X - \mu) / \sigma$, so $Z = (\bar{X} - 0) / \sigma$. Part 3: The distribution of these mean differences is near-normal if N is large (regardless of how the population itself is distributed). Assuming N is large enough, Part 3 allows use of the Z-test (which assumes normality).
Applying CLT via Z-test: Part 2: The standard deviation of these mean differences in the population can be estimated as $s / \sqrt{N}$, where s = standard deviation of the sample and N = sample size. $Z = (\bar{X} - 0) / \sigma$, so $Z = \bar{X} / (s / \sqrt{N})$.
Calculation: $Z = \bar{X} / (s / \sqrt{N}) = 31 / (58.9 / \sqrt{20}) = 2.35$. p < 0.05. Experiment worked.
Sample bias problem: Estimates of s.d. from a sample are slightly biased, in the sense that most of the time our estimate will be lower than the true (population) value. William Gosset created a modified Z table that corrects for this problem. t = 2.35, d.f. = N - 1, p < 0.05. Part 2: The standard deviation of these mean differences in the population can be estimated as $s / \sqrt{N}$, where s = standard deviation of the sample and N = sample size.
Slide100: Related samples t-test
Related samples t-test: 1. Calculate the difference between each pair. 2. Calculate the mean difference.
Slide102: 3. Calculate the standard deviation of the differences. 4. Calculate the standard error: standard error = $s / \sqrt{N} = 58.93 / \sqrt{20} = 13.2$.
Related samples t-test: 5. Divide the mean difference by the standard error: t = 31 / 13.2 = 2.35. 6. Calculate d.f. (= N-1). 7. If t exceeds the appropriate value in the table then the result is significant.
t-test or Wilcoxon?: Use a t-test if N is large (more than 30) (because then CLT part 3 will be correct) or you know the population is roughly normal (symmetrical with only one peak). Use EDA histogram to assess. N.B.: EDA histogram can’t be done for N<10. Otherwise, use a Wilcoxon.
Slide105: Unrelated-samples t-test
Reminder: Related samples t-test: 1.
Calculate the difference between each pair. 2. Calculate the mean difference.
Reminder: Related samples t-test: 3. Calculate the standard deviation of the differences. 4. Calculate the standard error: standard error = $s / \sqrt{N} = 58.93 / \sqrt{20} = 13.2$.
Reminder: Related samples t-test: 5. Divide the mean difference by the standard error: t = 31 / 13.2 = 2.35. 6. Calculate d.f. (= N-1). 7. If t exceeds the appropriate value in the table then the result is significant.
Unrelated samples t-test: Just like Wilcoxon, there are between-subjects and within-subjects versions of the t-test. Similar procedure to related t-test except there are now 2 means and 2 s.d. Still based on CLT, but more complicated to demonstrate, so won’t bother.
Unrelated samples t-test: 1. Calculate mean & variance for each group.
Unrelated samples t-test: 2. Calculate the difference between the means: 58.45 - 52.85 = 5.6. 3. Calculate the standard error (defined differently for unrelated t-test): $SE = \sqrt{(s_1^2 + s_2^2) / N}$, where N = sample size of each group.
Unrelated samples t-test: 4. Calculate t: $t = (\bar{X}_1 - \bar{X}_2) / SE$. 5. Calculate d.f.: d.f. = 2N - 2 = 14. Critical value is 2.131. No significant difference.
Unrelated samples t-test: 2. Calculate the difference between the means: 58.45 - 52.85 = 5.6. 3. Calculate the standard error (defined differently for unrelated t-test): $SE = \sqrt{(s_1^2 + s_2^2) / N}$, where N = sample size of each group. What if N for the two groups is different?
Pooled variance estimate: Where sample sizes are unequal, the larger sample should contribute proportionately more to the calculation of standard error. This is done via the pooled variance estimate. The appropriate equation is: $s_p^2 = ((N_1 - 1) s_1^2 + (N_2 - 1) s_2^2) / (N_1 + N_2 - 2)$.
Unequal N procedure: 1. Calculate mean and variance for each group. 2. Calculate difference between means. 3.
Calculate the pooled variance estimate: $s_p^2 = ((N_1 - 1) s_1^2 + (N_2 - 1) s_2^2) / (N_1 + N_2 - 2)$.
Unequal N procedure: 4. Use the pooled variance estimate to calculate the standard error, taking into account the differing sample sizes: $SE = \sqrt{s_p^2 \, (1 / N_1 + 1 / N_2)}$.
Unequal N procedure: 5. Calculate t. 6. Calculate d.f.: d.f. = $N_1 + N_2 - 2$.
Reminder: t-test or Wilcoxon?: Use a t-test if N is large (more than 30) (because then CLT part 3 will be correct) or you know the population is roughly normal (symmetrical with only one peak). Use EDA histogram to assess for each group. N.B.: EDA histogram can’t be done for N<10. Otherwise, use a Wilcoxon.
t-test or Wilcoxon: Additional: In an unrelated-samples t-test, use of t-tables is only valid if the two populations are of equal variance. A variance test is required to see whether this assumption is violated. If significant, the assumption is violated: use Wilcoxon rank-sum. If n.s., the assumption might be OK: use unrelated t-test if other conditions are met.
Reminder: Variance test: Divide larger variance by smaller. F = 466.88 / 116.48 = 4.01. If F exceeds the appropriate value in the F-table, the difference is significant.
Group difference vs. relationship: Group differences, e.g. unrelated t-test: whether two groups differ in their average score. Relationship between two measures: “Does age affect play style?” Chi-square. Correlation.
Reminder: variance: Lectures: 64, 60, 47, 76, 55 (mean 60.4). $(X - \bar{X})$: 3.6, -0.4, -13.4, 15.6, -5.4. $(X - \bar{X})^2$: 12.96, 0.16, 179.56, 243.36, 29.16. $\sum (X - \bar{X})^2 = 465.2$. $s^2 = 465.2 / 4 = 116.3$.
Correlation: Degree of relationship between two continuous variables. Quantified by r, a correlation co-efficient. r ranges from -1 to +1. [Scatter plots of % correct against Distance: r = 0.97, r = -0.95, r = 0.16]
Calculation of r: Calculation of r is based on the concept of co-variance.
Co-variance: The extent to which changes in one variable are reflected in changes of the other.
Co-variance: If X increases as Y increases, co-variance will be positive. If X increases as Y decreases, co-variance will be negative. If X and Y are independent, co-variance will be close to zero.
Co-variance: Worked e.g.: $cov_{XY} = 889.5 / 9 = 98.8$.
Pearson Product-Moment: Co-variance then needs to be scaled by the total amount of variability in the data. Pearson showed that doing this meant r always ranged from -1 to +1.
Significance & assumptions: Significance testing: use an r table. If you’re testing r ≠ 0, then it’s two-tailed. X and Y should be normally distributed (quite important unless N is large).
Uses: Use Pearson correlation when you want to assess the relationship between two variables, and both variables are continuously valued, and both variables are approximately normal. Otherwise, use contingency chi-square (if categorical) or Spearman’s (see later) if non-normal.
Method summary: $s_x = \sqrt{1961 / 9} = 14.76$; $s_y = \sqrt{424.5 / 9} = 6.87$; $cov_{xy} = 889.5 / 9 = 98.83$; $r = 98.83 / (14.76 \times 6.87) = 0.97$.
Slide131: Linear regression
Linear regression: The values calculated for Pearson can also be used to calculate the best-fitting straight line through the points.
Line of best fit: Equation for a straight line: y = bx + a. Assumption: y is normally distributed at each value of x, and all distributions have equal variance. Assumption is very hard to test, and is widely ignored as a result.
Spearman’s r: If only ranks are available, the same equations can be applied. The test is then called Spearman’s r or $r_s$. Where N > 9 the critical value of $r_s$ is numerically close to the values in a Pearson’s r table. Even if actual data are available, we sometimes use just the ranks.
The main advantage is that it avoids the assumption that X and Y are normally distributed.
Slide135: Test selection
Normality of distribution: Scores out of 60 (whole numbers) on a behavioural problems index: 1, 6, 12, 16, 20, 13, 8, 3, 6, 14, 7, 15, 12, 9, 13.
1-5 ||
6-10 |||||
11-15 ||||||
16-20 ||
Create roughly N/4 equal sized “bins”. Make a mark for each number in the data set. It’ll never look great with small samples. This data set is roughly normal. Main things to look for: bimodality and asymmetry (skew). If N<10 then there’s not really enough data to do this.
Outlier detection: Box plot: 3, 5, 5, 7, 10, 10, 11, 11, 35. Any point more than 2 whiskers beyond the nearest “hinge” is considered an outlier. [Box plot drawn on an axis from -5 to 35]
Test selection: All methods now covered (for this course). How do you select the correct test? Classify tests for usage. 8 questions that will help your learning. This information not available in exam.
Classification of tests: Group differences (means): Between-subjects: Wilcoxon rank-sum test (small N / non-normal / het. var.); unrelated samples t-test. Within-subjects: Wilcoxon matched-pairs test (small N / non-normal); related samples t-test. Group differences (variance): variance test.
Classification of tests: Relationships: Contingency chi-square (categorical); Pearson’s correlation (continuous & normal); Spearman’s correlation (ranks, or continuous & non-normal); linear regression (best fitting straight line).
Classification of tests: Other situations: Binomial (Bernoulli trial & small N); N-by-1 chi-square (multi-outcome trial & all E >= 5); Z-test (single data point; normal population with known s.d.).
Jargon: To use the questions, you need some jargon: dependent variables; independent variables; categorical vs.
quantitative; sufficiently normal; homogeneous variance.
Example 1: A neuroscientist hypothesises that the hippocampus (a small brain region) is the site of the mammalian ability to learn the spatial location of objects. To test this hypothesis, he puts rats (one at a time) in a paddling pool full of milk with a submerged platform. Rats seek out the platform because they do not like swimming. He then removes the rats and places them in a holding cage for 30 mins. He then times how long it takes them to find the platform a second time. He gives the same task to a different group of rats who have had their hippocampi surgically removed. Times taken to find the platform (to the nearest second) are as follows. Hippocampus intact: 15, 30, 11, 30, 12, 47. Hippocampus removed: 90, 120, 42, 382, 178, 87. Does the neuroscientist have any support for his hypothesis? Route through the questions: No -> No -> No -> Quantitative -> One -> No (N/4 = 1.5) -> Wilcoxon (rank-sum test because it’s between-subjects). Outliers?
Example 2: An elderly stroke patient is referred to a clinical psychologist for testing. In order to develop a programme of rehabilitation, the psychologist needs to know where the patient’s greatest difficulties lie. The psychologist decides to use significant deviation from normal performance as a “yard stick”. One well-used test of fluent vocabulary is to ask the patient to name as many things beginning with the letter “C” as they can in 30 seconds. Strokes frequently reduce performance on this task. They have never been known to improve it. The test has been administered to thousands of members of the general public (of a similar age), and it is known that scores are normally distributed, with a mean of 15.3 and a standard deviation of 6.6. The patient scores 4. Is the patient significantly impaired on the vocabulary fluency task? Route through the questions: No -> No -> Yes -> Z-test.
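One way to work through Example 2 without a Z table is sketched below. The helper names `normal_cdf` and `z_test_lower_tail` are made up for illustration, and the normal curve is evaluated with the error function from Python's standard library instead of a printed table. The test is one-tailed (low scores only) because strokes are said never to improve performance.

```python
import math


def normal_cdf(z):
    """Proportion of a standard normal distribution lying at or below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def z_test_lower_tail(score, population_mean, population_sd):
    """Z score, and the probability of a score this low or lower in the population."""
    z = (score - population_mean) / population_sd
    return z, normal_cdf(z)


z, p = z_test_lower_tail(4, 15.3, 6.6)  # stroke patient, fluency task
print(round(z, 2), round(p, 3))

z_iq, p_iq = z_test_lower_tail(85, 100, 15)  # the earlier WAIS example
print(round(z_iq, 2), round(p_iq, 2))
```

For the fluency score the probability falls between 0.04 and 0.05, so it just reaches the .05 convention on a one-tailed test; the WAIS example reproduces the P = 0.16 quoted earlier.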
Outlier test not needed.
Example 3: In clinical trials of a drug for hypertension, use of the drug is found to increase blood pressure in 2 patients but reduce it in 7. Previous studies suggest the drug is effective. Assuming blood pressure is equally likely to increase or drop from the first to the second readings in the absence of any medication, would you conclude that the drug significantly reduces blood pressure? Route through the questions: No -> No -> No -> Categorical (up or down) -> No (E = 9 / 2 = 4.5) -> Binomial test. Outlier test not needed.
Example 4: A cognitive psychologist believes that when people have to decide whether two objects are identical they can "mentally rotate" one of them until it is the same orientation as the other. He also believes that this is basically analogous to rotation in the real world, in that it takes more time to rotate something through a large number of degrees than a small number of degrees. To this end, he tests 11 different groups of people on tasks where they have to say whether two objects are identical and the required degree of rotation differs. Here are his mean data. Degrees: 0, 20, 40, 60, 80, 100, 120, 140, 160, 180, 200. RT (s): 1.1, 1.3, 1.5, 1.7, 1.9, 2.1, 2.3, 2.5, 2.4, 2.9, 3.2. Are reaction time and degree of rotation significantly correlated? Route through the questions: No -> No -> No -> Quantitative -> Yes -> No -> Spearman correlation. Outlier test not appropriate.
Practice session 2: Attempt problems F-1, F-2, F-3 and F-4. These are exam-level questions. Don’t forget about outliers. Don’t forget about checking for normality. Don’t forget to check homogeneity of variances where appropriate. This is the last stats session before the exam. Ask now or forever hold your peace (sort of…)
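Returning to Example 3 (7 of 9 patients showing a drop in blood pressure), the “at least” form of the binomial test can be sketched as follows. The function name `prob_at_least` is made up for illustration, and the 50/50 chance of a drop under the null hypothesis is taken from the wording of the example.

```python
from math import comb


def prob_at_least(n_trials, min_hits, p_hit):
    """Probability of min_hits or more 'hit' outcomes in n_trials independent trials."""
    q_miss = 1.0 - p_hit
    # Additive law: add the "exactly" probabilities for min_hits, min_hits + 1, ...
    return sum(
        comb(n_trials, hits) * p_hit ** hits * q_miss ** (n_trials - hits)
        for hits in range(min_hits, n_trials + 1)
    )


p_value = prob_at_least(9, 7, 0.5)  # 7, 8 or 9 drops out of 9 by chance
print(round(p_value, 3))
```

The sum is (36 + 9 + 1) / 512 = 46 / 512, roughly 0.09, which is above the .05 convention, so on these assumptions the result would not count as significant.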