AP Stats Chapter 4 day 3 categorical

Presentation Transcript

Relations in Categorical Data:

Relations in Categorical Data 4.3

Categorical Data:

Some variables are inherently categorical: gender, race, occupation, …
Others are formed by grouping quantitative values: age groups, income ranges, …
To analyze categorical data, we use the counts or percents that fall into the various categories.

What do we use?:

Two-way table (both variables are categorical)
Row variable: education
Column variable: age
Marginal totals are found at the right and bottom of the table.
**If row and column totals are missing, the first step in studying a two-way table is to calculate them!

Example of the Day – Education and Age (in 1000s of persons) Page 241:

                                    Age group
Education                 25 – 34    35 – 54       55 +      Total
No high school              4,474      9,155     14,224     27,853
Completed HS               11,546     26,481     20,060     58,087
1 to 3 years college       10,700     22,618     11,127     44,445
4+ years of college        11,066     23,183     10,596     44,845
Total                      37,786     81,435     56,008    175,230

Your first task is to compute the row and column totals if they are not given.
**Check out the column totals! Hmm…
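The "Hmm…" about the column totals can be checked by adding up the cells directly. The sketch below is only an illustration, not part of the original slides; the variable names are made up for this example. It recomputes the totals from the cell counts and compares them with the printed ones. The small mismatches are most likely roundoff error, since each entry is rounded to the nearest thousand persons.

```python
# Counts in thousands of persons, copied from the table above.
# Columns are the age groups 25-34, 35-54, 55+.
counts = {
    "No high school":       [4474, 9155, 14224],
    "Completed HS":         [11546, 26481, 20060],
    "1 to 3 years college": [10700, 22618, 11127],
    "4+ years of college":  [11066, 23183, 10596],
}
age_groups = ["25-34", "35-54", "55+"]

# Totals exactly as printed in the table.
printed_column_totals = [37786, 81435, 56008]
printed_table_total = 175230

# Row totals: add across each education level.
row_totals = {level: sum(row) for level, row in counts.items()}

# Column totals: add down each age group.
column_totals = [sum(col) for col in zip(*counts.values())]

for age, computed, printed in zip(age_groups, column_totals, printed_column_totals):
    print(f"{age}: computed {computed}, printed {printed}, "
          f"difference {computed - printed}")

print("Sum of row totals:", sum(row_totals.values()))
print("Sum of printed column totals:", sum(printed_column_totals))
print("Printed table total:", printed_table_total)
```

The row totals match the table, but two of the column totals differ slightly from the sums of their cells, and the printed column totals do not add up to the printed table total.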

Percents are often easier to grasp than counts…:

To convert to an overall percent, simply divide any given row or column total by the table total.
TIP: If you are NOT finding an overall percent, ask "What group am I asked to find the percent of?" That group's count is the denominator!
What percent of 35 – 54 year olds have 4 or more years of college?
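The tip about choosing the denominator can be made concrete with the numbers from the Education and Age table. This is a minimal sketch with illustrative variable names; it contrasts an overall percent (denominator is the table total) with the percent asked for in the question (denominator is the 35 – 54 column total).

```python
# Values in thousands of persons, taken from the Education and Age table.
table_total = 175230          # everyone in the table
age_35_54_total = 81435       # column total for ages 35-54
college4_and_35_54 = 23183    # cell: 4+ years of college AND aged 35-54

# Overall percent: what share of ALL people are aged 35-54?
# The group is "everyone", so the denominator is the table total.
overall_pct = 100 * age_35_54_total / table_total

# The question on the slide: what percent OF 35-54 year olds have 4+ years
# of college? The group is "35-54 year olds", so that column total is the
# denominator.
conditional_pct = 100 * college4_and_35_54 / age_35_54_total

print(f"Aged 35-54, as a percent of everyone: {overall_pct:.1f}%")
print(f"4+ years of college, as a percent of 35-54 year olds: {conditional_pct:.1f}%")
```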

Tasks with our example:

Compute the marginal distribution for educational level.
Compute the marginal distribution for age.
Describe the relationship between educational level and age.
What percent of people aged 55+ have 4 or more years of college? What about people aged 35 – 54? 25 – 34?
Graph: page 243.
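One possible way to work through these tasks is sketched below. It is an illustration under stated assumptions rather than the textbook's solution: the names are invented for the example, and the printed row and column totals from the table are used as given.

```python
# Values in thousands of persons, taken from the Education and Age table.
education_levels = ["No high school", "Completed HS",
                    "1 to 3 years college", "4+ years of college"]
age_groups = ["25-34", "35-54", "55+"]

row_totals = [27853, 58087, 44445, 44845]   # printed totals per education level
column_totals = [37786, 81435, 56008]       # printed totals per age group
table_total = 175230
college4_by_age = [11066, 23183, 10596]     # the "4+ years of college" row


def percent(part, whole):
    """Return part as a percent of whole."""
    return 100 * part / whole


# Marginal distribution of education: each row total over the table total.
education_marginal = {
    level: percent(total, table_total)
    for level, total in zip(education_levels, row_totals)
}

# Marginal distribution of age: each column total over the table total.
age_marginal = {
    age: percent(total, table_total)
    for age, total in zip(age_groups, column_totals)
}

# Percent with 4+ years of college WITHIN each age group:
# the cell count over that age group's column total.
college4_given_age = {
    age: percent(cell, total)
    for age, cell, total in zip(age_groups, college4_by_age, column_totals)
}

for name, dist in [("Education (marginal)", education_marginal),
                   ("Age (marginal)", age_marginal),
                   ("4+ years of college, by age", college4_given_age)]:
    print(name)
    for category, pct in dist.items():
        print(f"  {category}: {pct:.1f}%")
```

Comparing the last set of percents across the three age groups is one way to describe the relationship between educational level and age: in this table, the share with 4 or more years of college is lowest in the 55+ group.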

Oh, the irony of it all: Simpson’s Paradox:

Remember lurking variables? They can be present in categorical data, too. Sometimes, lurking variables can alter or even REVERSE relationships between categorical variables. That reversal is called Simpson's Paradox. We see it happen when data from several groups are combined to form a single group.

Which Hospital would you choose?:

             Hospital A    Hospital B
Died                 63            16
Survived           2037           784
Total              2100           800

The True Picture:

Good Condition
             Hosp. A    Hosp. B
Died               6          8
Survived         594        592
Total            600        600

Poor Condition
             Hosp. A    Hosp. B
Died              57          8
Survived        1443        192
Total           1500        200

What Happened?:

Hospital A tends to attract more patients who are already in poor condition. Those patients are more likely to die. Hospital A had a higher overall death rate even though it performed better than Hospital B within each condition group (1.0% vs. 1.3% died among good-condition patients, 3.8% vs. 4.0% among poor-condition patients). The original two-way table hid that lurking variable: the patient's condition.
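The reversal can be verified directly from the two hospital tables. The sketch below is a small illustration with made-up variable names, not part of the original slides: it computes the death rate for each hospital within each condition group and then again after combining the groups.

```python
# (deaths, total patients) for each hospital and condition, from the tables above.
data = {
    "A": {"good": (6, 600), "poor": (57, 1500)},
    "B": {"good": (8, 600), "poor": (8, 200)},
}


def death_rate(deaths, total):
    """Return deaths as a percent of total patients."""
    return 100 * deaths / total


# Death rate within each condition group.
rates_by_condition = {
    hospital: {cond: death_rate(*cell) for cond, cell in groups.items()}
    for hospital, groups in data.items()
}

# Death rate after combining both condition groups into one table.
combined_rates = {}
for hospital, groups in data.items():
    deaths = sum(d for d, _ in groups.values())
    total = sum(t for _, t in groups.values())
    combined_rates[hospital] = death_rate(deaths, total)

for cond in ("good", "poor"):
    print(f"{cond} condition: "
          f"A {rates_by_condition['A'][cond]:.1f}%, "
          f"B {rates_by_condition['B'][cond]:.1f}%")
print(f"combined: A {combined_rates['A']:.1f}%, B {combined_rates['B']:.1f}%")
```

Within each condition group Hospital A has the lower death rate, yet in the combined table Hospital A has the higher one, because 1500 of its 2100 patients arrived in poor condition compared with 200 of Hospital B's 800.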
