Presentation Transcript
Chapter 14 : Chi Square - 2
Chi Square : Chi Square is a non-parametric statistic used to test the null hypothesis.
It is used for nominal data.
It plays the same role for nominal data that the F test played for single factor and factorial analysis.
… Chi Square : Nominal data puts each participant in a category. Categories should be mutually exclusive and exhaustive. This means that each and every participant fits in one and only one category.
Chi Square looks at frequencies in mutually exclusive and exhaustive categories into which participants are assigned after a single measurement.
Expected frequencies and the null hypothesis ... : Chi Square compares the expected frequencies in categories to the observed frequencies in categories.
“Expected frequencies” are the frequencies in each cell predicted by the null hypothesis.
… Expected frequencies and the null hypothesis ... : The null hypothesis:
H0: fo = fe
There is no difference between the observed frequency and the frequency predicted (expected) by the null.
The experimental hypothesis:
H1: fo ≠ fe
The observed frequency differs significantly from the frequency predicted (expected) by the null.
Calculating χ² : For each cell:
Calculate the deviation of the observed from the expected frequency.
Square the deviation.
Divide the squared deviation by the expected frequency.
Calculating χ² : Add ‘em up: χ² = Σ (fo - fe)² / fe.
Then, look up χ² in the Chi Square table.
df = k - 1 (one sample χ², where k is the number of categories)
OR df = (Columns-1) * (Rows-1)
(2 or more samples)
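The steps above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this sketch, and the counts in the usage line are hypothetical, not taken from the slides.

```python
def chi_square(observed, expected):
    """Sum (O - E)^2 / E over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need one value per cell")
    total = 0.0
    for o, e in zip(observed, expected):
        deviation = o - e              # deviation of observed from expected
        total += deviation ** 2 / e    # square it, divide by expected, add up
    return total


def df_one_sample(k):
    """Degrees of freedom for a one-sample test with k categories."""
    return k - 1


def df_multi_sample(rows, columns):
    """Degrees of freedom for a rows x columns table."""
    return (columns - 1) * (rows - 1)


# Hypothetical counts: two categories, 10 expected in each.
print(round(chi_square([12, 8], [10, 10]), 2), df_one_sample(2))
```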
Slide8 : df 1 2 3 4 5 6 7 8
.05 3.84 5.99 7.82 9.49 11.07 12.59 14.07 15.51
.01 6.63 9.21 11.34 13.28 15.09 16.81 18.48 20.09
df 9 10 11 12 13 14 15 16
.05 16.92 18.31 19.68 21.03 22.36 23.68 25.00 26.30
.01 21.67 23.21 24.72 26.22 27.69 29.14 30.58 32.00
df 17 18 19 20 21 22 23 24
.05 27.59 28.87 30.14 31.41 32.67 33.92 35.17 36.42
.01 33.41 34.81 36.19 37.57 38.93 40.29 41.64 42.98
df 25 26 27 28 29 30
.05 37.65 38.89 40.14 41.34 42.56 43.77
.01 44.31 45.64 46.96 48.28 49.59 50.89
Critical values of χ²
Slide9 : Critical values of χ². The rows labeled df give the degrees of freedom.
Slide10 : Critical values of χ². The rows labeled .05 give the critical values for α = .05.
Slide11 : Critical values of χ². The rows labeled .01 give the critical values for α = .01.
Example : If there were 5 degrees of freedom, how big would χ² have to be for significance at the .05 level?
Slide13 : Critical values of χ². With df = 5, χ² must reach 11.07 for significance at the .05 level.
Using the χ² table. : If there were 2 degrees of freedom, how big would χ² have to be for significance at the .05 level?
Note: Unlike most other tables you have seen, the critical
values for Chi Square get larger as df increase. This is
because you are summing over more cells, each of which
usually contributes to the total observed value of chi square.
Slide15 : Critical values of χ². With df = 2, χ² must reach 5.99 for significance at the .05 level.
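Looking up a critical value can be sketched as a small table lookup. The dictionary below copies only the df = 1 to 8 columns of the table of critical values; the names CRITICAL_VALUES, critical_value and is_significant are made up for this sketch.

```python
# Critical values of chi square for df = 1..8, keyed by alpha level.
# The df = 3 entry at .05 is the standard tabled value, 7.82.
CRITICAL_VALUES = {
    0.05: {1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49,
           5: 11.07, 6: 12.59, 7: 14.07, 8: 15.51},
    0.01: {1: 6.63, 2: 9.21, 3: 11.34, 4: 13.28,
           5: 15.09, 6: 16.81, 7: 18.48, 8: 20.09},
}


def critical_value(df, alpha=0.05):
    """Return the tabled critical value for the given df and alpha."""
    return CRITICAL_VALUES[alpha][df]


def is_significant(chi_sq, df, alpha=0.05):
    """True when the computed chi square reaches the critical value."""
    return chi_sq >= critical_value(df, alpha)


print(critical_value(5))   # question with 5 degrees of freedom
print(critical_value(2))   # question with 2 degrees of freedom
```

Note how the values in each row grow as df grows, which is the point made in the note above.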
One sample example : Party: 75% male, 25% female. There are 40 swimmers. Since 75% of the people at the party are male, 75% of the swimmers should be male. So the expected value for males is .750 x 40 = 30.00. For females it is .250 x 40 = 10.00.
          Observed  Expected  O-E   (O-E)²  (O-E)²/E
Male         20        30     -10    100      3.33
Female       20        10      10    100     10.00
df = k-1 = 2-1 = 1
Slide17 : χ² (1, n=40) = 13.33. This exceeds the critical value at α = .01 (6.63 for df = 1). Reject the null hypothesis. Gender does affect who goes swimming. Men go swimming less than expected. Women go swimming more than expected.
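As a check on the arithmetic, the swimmer example can be reproduced in Python. The variable names are chosen for this sketch only; the numbers are the ones given in the example.

```python
n = 40                                         # swimmers observed
null_proportions = {"Male": 0.75, "Female": 0.25}
observed = {"Male": 20, "Female": 20}

# Expected frequency = proportion predicted by the null x n
expected = {sex: p * n for sex, p in null_proportions.items()}

chi_sq = sum((observed[sex] - expected[sex]) ** 2 / expected[sex]
             for sex in observed)
df = len(observed) - 1                         # k - 1

print(expected)                                # 30 males, 10 females expected
print(round(chi_sq, 2), df)                    # 13.33 1
```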
2 sample example : Freshmen and sophomores who like horror movies.
                        Freshmen  Sophomores
Likes horror films         150        50
Dislikes horror films      100       200
Slide19 : There are 500 altogether. 200 (a proportion of .400) like horror movies; 300 (.600) dislike horror films. (Proportions appear in parentheses in the margins.) Multiplying the proportion in the “likes horror films” row by the number in the “Freshmen” column yields the expected frequency for the first cell: .400 x 250 = 100. The formula is:
Expected Frequency = (Prop_row x n_col).
(EF appears in parentheses in each cell.)
                        Freshmen    Sophomores   Total
Likes horror films      150 (100)    50 (100)    200 (.400)
Dislikes horror films   100 (150)   200 (150)    300 (.600)
Total                   250         250          500
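The expected-frequency formula can be sketched for any table of counts. The function name expected_frequencies is made up for this sketch; rows are Likes/Dislikes and columns are Freshmen/Sophomores, as in the example.

```python
def expected_frequencies(table):
    """Expected frequency per cell = row proportion x column total."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    n = sum(row_totals)
    return [[(row_total / n) * column_total for column_total in column_totals]
            for row_total in row_totals]


observed = [[150, 50],     # likes horror films: freshmen, sophomores
            [100, 200]]    # dislikes horror films: freshmen, sophomores

for row in expected_frequencies(observed):
    print([round(cell, 2) for cell in row])
```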
Computing χ² :
                 Observed  Expected
Fresh Likes        150       100
Fresh Dislikes     100       150
Soph Likes          50       100
Soph Dislikes      200       150
df = (C-1)(R-1) = (2-1)(2-1) = 1
Slide21 : χ² (1, n=500) = 83.33. This exceeds the critical value at α = .01 (6.63 for df = 1). Reject the null hypothesis. The Fresh/Soph dimension does affect liking for horror movies. Proportionally, more freshmen than sophomores like horror movies.
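The value 83.33 can be traced cell by cell. This sketch simply takes the observed and expected frequencies from the example; the dictionary layout is an assumption made for readability.

```python
# (observed, expected) for each cell of the 2 x 2 table
cells = {
    "Fresh Likes": (150, 100),
    "Fresh Dislikes": (100, 150),
    "Soph Likes": (50, 100),
    "Soph Dislikes": (200, 150),
}

# Each cell contributes (O - E)^2 / E to the total
contributions = {name: (o - e) ** 2 / e for name, (o, e) in cells.items()}
chi_sq = sum(contributions.values())

for name, value in contributions.items():
    print(name, round(value, 2))
print("chi square =", round(chi_sq, 2))        # 83.33
```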
The only (slightly) hard part is computing expected frequencies : In the one-sample case, multiply n by a hypothetical proportion based on the null hypothesis that frequencies will be random.
Simple Example - 100 teenagers listen to radio stations : H1: Some stations are more popular with teenagers than others.
H0: Radio stations do not differ in popularity with teenagers.
Expected frequencies are the frequencies predicted by the null hypothesis. In this case, the problem is simple because the null predicts that an equal proportion of teenagers will prefer each of the four radio stations.
Is the observed significantly different from the expected?
Slide24 : Differential popularity of radio stations among teenagers
            Observed  Expected  O-E   (O-E)²  (O-E)²/E
Station 1      40        25      15    225      9.00
Station 2      30        25       5     25      1.00
Station 3      20        25      -5     25      1.00
Station 4      10        25     -15    225      9.00
df = k-1 = (4-1) = 3
χ²(3, n=100) = 20.00, p<.01
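When the null predicts equal popularity, every expected frequency is just n divided by the number of categories. A short sketch of the radio station example, with variable names chosen only for this illustration:

```python
observed = [40, 30, 20, 10]          # Stations 1 to 4
n = sum(observed)                    # 100 teenagers
k = len(observed)                    # 4 stations

expected = [n / k] * k               # equal proportions under the null
deviations = [o - e for o, e in zip(observed, expected)]
chi_sq = sum(d ** 2 / e for d, e in zip(deviations, expected))
df = k - 1

print(deviations)                    # note the sign of the last deviation
print(chi_sq, df)                    # 20.0 3
```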
The only (slightly) hard part is computing expected frequencies : In the multi-sample case, multiply the proportion in each row by the number in each column to obtain the EF in each cell.
A 3 x 4 Chi Square : Women, stress, and seating preferences (perimeter vs. interior, front vs. back).
                              Front  Front  Back   Back
                              Perim  Inter  Perim  Inter  Total
Very Stressed Females           10     70      5     15    100
Moderately Stressed Females     15     50     10     25    100
Control Group Females           35     30     15     20    100
Total                           60    150     30     60    300
Expected frequencies : Women, stress, and perimeter versus interior seating preferences. Column 1 (Front Perim, n = 60): each row holds 100 of the 300 women, a proportion of .333, so the expected frequency is .333 x 60 = (20) in each row.
Column 2 : Women, stress, and perimeter versus interior seating preferences. Column 2 (Front Inter, n = 150): each row holds 100 of the 300 women, a proportion of .333, so the expected frequency is .333 x 150 = (50) in each row.
Column 3 : Women, stress, and perimeter versus interior seating preferences. Column 3 (Back Perim, n = 30): each row holds 100 of the 300 women, a proportion of .333, so the expected frequency is .333 x 30 = (10) in each row.
All the expected frequencies : Women, stress, and perimeter versus interior seating preferences. (EF appears in parentheses in each cell.)
                              Front    Front    Back     Back
                              Perim    Inter    Perim    Inter    Total
Very Stressed Females         10 (20)  70 (50)   5 (10)  15 (20)   100
Moderately Stressed Females   15 (20)  50 (50)  10 (10)  25 (20)   100
Control Group Females         35 (20)  30 (50)  15 (10)  20 (20)   100
Total                         60      150       30       60        300
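The column-by-column build-up of expected frequencies can be sketched from the marginal totals alone. The dictionary names and the short seat labels (FrontP, FrontI, BackP, BackI) are used here for illustration.

```python
row_totals = {"Very Stressed": 100, "Moderately Stressed": 100,
              "Control Group": 100}
column_totals = {"FrontP": 60, "FrontI": 150, "BackP": 30, "BackI": 60}
n = sum(row_totals.values())         # 300 women altogether

# Expected frequency = (row total / n) x column total, rounded for display
expected = {
    group: {seat: round((row_n / n) * column_n, 2)
            for seat, column_n in column_totals.items()}
    for group, row_n in row_totals.items()
}

# Print one column at a time, the way the slides fill in the table
for seat in column_totals:
    print(seat, [expected[group][seat] for group in row_totals])
```

Because all three groups have the same row total, every group gets the same expected frequency within a column.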
Slide31 : df = (C-1)(R-1) = (4-1)(3-1) = 6
Very Stressed
          Observed  Expected
FrontP       10        20
FrontI       70        50
BackP         5        10
BackI        15        20
Moderately Stressed
          Observed  Expected  O-E   (O-E)²  (O-E)²/E
FrontP       15        20      -5     25      1.25
FrontI       50        50       0      0      0.00
BackP        10        10       0      0      0.00
BackI        25        20       5     25      1.25
Control Group
          Observed  Expected  O-E   (O-E)²  (O-E)²/E
FrontP       35        20      15    225     11.25
FrontI       30        50     -20    400      8.00
BackP        15        10       5     25      2.50
BackI        20        20       0      0      0.00
Slide32 : χ² (6, n=300) = 41.00. This exceeds the critical value at α = .01 (16.81 for df = 6). Reject the null hypothesis. There is a relationship between women's stress level and seating position.
Slide33 : χ² = 41.00, df = (C-1)(R-1) = (4-1)(3-1) = 6
Very Stressed
          Observed  Expected  O-E   (O-E)²  (O-E)²/E
FrontP       10        20     -10    100      5.00
FrontI       70        50      20    400      8.00
BackP         5        10      -5     25      2.50
BackI        15        20      -5     25      1.25
Moderately Stressed
          Observed  Expected  O-E   (O-E)²  (O-E)²/E
FrontP       15        20      -5     25      1.25
FrontI       50        50       0      0      0.00
BackP        10        10       0      0      0.00
BackI        25        20       5     25      1.25
Control Group
          Observed  Expected  O-E   (O-E)²  (O-E)²/E
FrontP       35        20      15    225     11.25
FrontI       30        50     -20    400      8.00
BackP        15        10       5     25      2.50
BackI        20        20       0      0      0.00
Very stressed women avoid the perimeter and prefer the front interior. The control group prefers the perimeter and avoids the front interior.
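To see where the total of 41.00 comes from, the (O-E)²/E values can be summed within each group first. A sketch using the observed and expected frequencies from the example; the name group_sums is made up here.

```python
observed = {
    "Very Stressed": [10, 70, 5, 15],        # FrontP, FrontI, BackP, BackI
    "Moderately Stressed": [15, 50, 10, 25],
    "Control Group": [35, 30, 15, 20],
}
expected = [20, 50, 10, 20]                  # same for every group

# Sum of (O - E)^2 / E within each group
group_sums = {
    group: sum((o - e) ** 2 / e for o, e in zip(counts, expected))
    for group, counts in observed.items()
}
chi_sq = sum(group_sums.values())
df = (len(expected) - 1) * (len(observed) - 1)

print(group_sums)
print(chi_sq, df)                            # 41.0 6
```

The moderately stressed group adds little to the total; most of χ² comes from the very stressed and control groups, which is why the interpretation focuses on those two.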
Summary: Different Ways of Computing the Frequencies Predicted by the Null Hypothesis : One sample
Expect subjects to be distributed equally in each cell. OR
Expect subjects to be distributed proportionally in each cell. OR
Expect subjects to be distributed in each cell based on prior knowledge, such as previous research.
Multi-sample
Expect subjects in different conditions to be distributed similarly to each other. Find the proportion in each row and multiply by the number in each column to do so.
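The three one-sample options differ only in where the proportions come from. A minimal sketch, with made-up function names and, in the last call, hypothetical prior counts that do not come from the slides:

```python
def expected_equal(n, k):
    """Null predicts the same frequency in each of k cells."""
    return [n / k] * k


def expected_from_proportions(n, proportions):
    """Null predicts known proportions (they should sum to 1)."""
    return [p * n for p in proportions]


def expected_from_prior_counts(n, prior_counts):
    """Null predicts the same split as counts from earlier research."""
    total = sum(prior_counts)
    return [n * count / total for count in prior_counts]


print(expected_equal(100, 4))                        # radio station case
print(expected_from_proportions(40, [0.75, 0.25]))   # swimmer case
print(expected_from_prior_counts(50, [60, 30, 10]))  # hypothetical prior study
```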
Conclusion - Chi Square : Chi Square is a non-parametric statistic, used for nominal data.
It plays the same role for nominal data that the F test played for single factor and factorial analysis.
Chi Square compares the expected frequencies in categories to the observed frequencies in categories.
… Conclusion - Chi Square : The null hypothesis:
H0: fo = fe
There is no difference between the observed frequency and the frequency predicted by the null hypothesis.
The experimental hypothesis:
H1: fo ≠ fe
The observed frequency differs significantly from the frequency expected by the null hypothesis.
The end. Hope you found the slides helpful! RK
Copyright © 2002-2008 authorSTREAM. All rights reserved.