# Make a picture! First, create a frequency table

Date: 03.05.2016

AP Statistics: Chapter 3 Categorical Data

MAKE A PICTURE!

First, create a frequency table

Example: number of students at CB South in each grade

Frequency = # of things (count)
Relative frequency = % of things
Proportion = decimal: .30, .05
Percent = %: 30%, 5%

Distribution (of a variable): shows the values of the variable and how often the sample takes each value

Examples: bar chart, pie chart, histogram, stemplot, etc.

Categorical Distributions:

1. Bar Chart

Notice the spaces in between bars.

[Figure: bar chart; the vertical axis shows frequency (#) or relative frequency (%)]

2. Pie Chart

Be sure to use labels and percents!

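Contingency tables (aka 2-Way tables)

|        | Frosh | Soph | Junior | Senior | Total |
|--------|-------|------|--------|--------|-------|
| Male   |       |      |        |        |       |
| Female |       |      |        |        |       |
| Total  |       |      |        |        |       |

The inner entries are the cells. The Total row and Total column are the margins. The rows of this table show gender.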

Identify:

• Row variable: gender

• Values of the variable: the different rows/columns

• Total (n): bottom right of chart

• # of cells: 8 (don’t count totals)

• Totals: margins

Example: Hospitals

|          | Hospital A | Hospital B | Total |
|----------|------------|------------|-------|
| Died     | 63         | 16         | 79    |
| Survived | 2037       | 784        | 2821  |
| Total    | 2100       | 800        | 2900  |
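The hospital table supports four kinds of questions: an overall percent, a conditional ("given" or "of") percent, an "and" percent, and an "or" percent. The sketch below works one of each from the counts in the table. The variable names and the `pct` helper are illustrative choices, not part of the notes.

```python
# Counts from the hospital two-way table (rows: outcome, columns: hospital).
died = {"A": 63, "B": 16}
survived = {"A": 2037, "B": 784}

total_died = sum(died.values())            # row margin: 79
total_survived = sum(survived.values())    # row margin: 2821
total_a = died["A"] + survived["A"]        # column margin: 2100
total_b = died["B"] + survived["B"]        # column margin: 800
n = total_died + total_survived            # grand total: 2900


def pct(part, whole):
    """Return part/whole as a percent, rounded to two decimals."""
    return round(100 * part / whole, 2)


# Overall percent: divide a margin by the grand total.
p_died = pct(total_died, n)
# "Given Hospital B": the denominator shrinks to Hospital B's total.
p_died_given_b = pct(died["B"], total_b)
# "Of those who died": the denominator is the total who died.
p_a_given_died = pct(died["A"], total_died)
# "And": a single cell over the grand total.
p_died_and_b = pct(died["B"], n)
# "Or": add both margins, then subtract the overlap so it is not counted twice.
p_survived_or_a = pct(total_survived + total_a - survived["A"], n)

print(p_died, p_died_given_b, p_a_given_died, p_died_and_b, p_survived_or_a)
```

The only thing that changes from question to question is the denominator: the grand total for overall, "and", and "or" questions, and a single margin for "given" or "of" questions.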

• What percent of people died?

Notation:

• Probability: P(event)
• Given/Of: |
• And: ∩ (overlap)
• Or: ∪
• Probability of A given B: P(A | B)

• Given that someone went to Hospital B, what is the chance that they died?
• Of those people who died, what percent went to Hospital A?
• What percent of people died and went to Hospital B?
• What percent of people survived or went to Hospital A?

2 types of Distributions for Categorical Variables

1. MARGINAL DISTRIBUTIONS

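• Example: Hair color vs. Gender

|        | Brown | Blonde | Black | Red | Total |
|--------|-------|--------|-------|-----|-------|
| MALE   | 26    | 24     | 10    | 3   | 63    |
| FEMALE | 20    | 35     | 12    | 6   | 73    |
| TOTALs | 46    | 59     | 22    | 9   | 136   |

The TOTALs row and Total column are the margins.

• Find the marginal distribution for the HAIR COLOR variable

Brown: ____ Blonde: ____ Black: ____ Red: ____

• Find the marginal distribution for the GENDER variable

Male: ____ Female: ____

• MAKE A PICTURE! BAR CHART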
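A marginal distribution uses only the totals in the margins, each divided by the grand total. As a minimal sketch, assuming the hair color counts are stored in a nested dictionary (a layout chosen here for illustration), the two marginal distributions can be computed like this:

```python
# Hair color vs. gender counts from the table (cells only, no totals).
table = {
    "Male":   {"Brown": 26, "Blonde": 24, "Black": 10, "Red": 3},
    "Female": {"Brown": 20, "Blonde": 35, "Black": 12, "Red": 6},
}
colors = ["Brown", "Blonde", "Black", "Red"]

# Grand total n: the bottom-right entry of the table.
n = sum(sum(row.values()) for row in table.values())

# Marginal distribution of gender: each row total over the grand total.
gender_marginal = {
    g: round(100 * sum(row.values()) / n, 1) for g, row in table.items()
}

# Marginal distribution of hair color: each column total over the grand total.
hair_marginal = {
    c: round(100 * sum(table[g][c] for g in table) / n, 1) for c in colors
}

print(gender_marginal)
print(hair_marginal)
```

Each marginal distribution should add up to 100%, apart from small rounding error, which makes a quick self-check before drawing the bar chart.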

2. CONDITIONAL DISTRIBUTIONS

• Look at … one variable

• Then look at … each value of the variable individually

• ALWAYS … in %

• Example: Hair Color vs. Gender

|        | Brown | Blonde | Black | Red | Total |
|--------|-------|--------|-------|-----|-------|
| MALE   | 26    | 24     | 10    | 3   | 63    |
| FEMALE | 20    | 35     | 12    | 6   | 73    |
| TOTALs | 46    | 59     | 22    | 9   | 136   |

• Find the conditional distribution for the HAIR COLOR variable

Brown: ____ Blonde: ____ Black: ____ Red: ____

• Find the conditional distribution for the GENDER variable

Male: ____ Female: ____

• Represented visually: SEGMENTED (or STACKED) BAR GRAPH

• Each bar = 100%

• Values of variable on the x-axis

• Bars are segmented into parts of each value
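A conditional distribution restricts attention to one row (or one column) and divides each cell by that row's (or column's) total instead of the grand total. The sketch below conditions on gender, giving the hair color distribution within each gender, and then prints a rough text stand-in for a segmented bar graph. The `conditional` helper and the one-letter symbols are hypothetical names introduced for illustration.

```python
# Hair color vs. gender counts from the table (cells only, no totals).
table = {
    "Male":   {"Brown": 26, "Blonde": 24, "Black": 10, "Red": 3},
    "Female": {"Brown": 20, "Blonde": 35, "Black": 12, "Red": 6},
}


def conditional(row):
    """Percent of the row total that falls in each category (one decimal)."""
    total = sum(row.values())
    return {k: round(100 * v / total, 1) for k, v in row.items()}


# One conditional distribution of hair color per gender.
hair_given_gender = {g: conditional(row) for g, row in table.items()}

for g, dist in hair_given_gender.items():
    print(g, dist)

# Text version of a segmented bar: one character is about 2%, so each
# bar is about 50 characters wide (about 100%).
symbols = {"Brown": "B", "Blonde": "L", "Black": "K", "Red": "R"}
for g, dist in hair_given_gender.items():
    bar = "".join(symbols[c] * round(p / 2) for c, p in dist.items())
    print(f"{g:<6} |{bar}|")
```

Conditioning on hair color instead works the same way on the columns: divide each cell by its column total, which gives one bar per hair color.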

Independence: when one variable does not affect the other variable

How do we tell independence? Independence exists when the conditional distributions look the same across all values of the variable (when the sections of the segmented bars look approximately the same). As a rule of thumb, there is generally less than a 5% difference between corresponding percentages. When categorical variables are dependent, they are said to be associated.
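The 5% guideline can be applied mechanically: compute the conditional distribution for each row, then look at the largest gap between rows in each category. The sketch below does this for the hair color table. The function name `looks_independent` and the default `threshold` are assumptions made for illustration, and this is the notes' rule of thumb, not a formal statistical test.

```python
# Hair color vs. gender counts from the notes (cells only, no totals).
table = {
    "Male":   {"Brown": 26, "Blonde": 24, "Black": 10, "Red": 3},
    "Female": {"Brown": 20, "Blonde": 35, "Black": 12, "Red": 6},
}


def row_percents(row):
    """Conditional distribution of one row, in percent (unrounded)."""
    total = sum(row.values())
    return {k: 100 * v / total for k, v in row.items()}


def looks_independent(table, threshold=5.0):
    """Rule-of-thumb check: True if every category's percentages differ by
    less than `threshold` percentage points across the rows."""
    dists = [row_percents(row) for row in table.values()]
    gaps = {
        c: max(d[c] for d in dists) - min(d[c] for d in dists)
        for c in dists[0]
    }
    return all(gap < threshold for gap in gaps.values()), gaps


independent, gaps = looks_independent(table)
print(independent)
print({c: round(g, 1) for c, g in gaps.items()})
```

For the hair color data, the Brown and Blonde percentages differ between males and females by well over 5 points, so under this guideline hair color and gender would be described as associated.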
[Figures: example segmented bar graphs labeled "Independent" and "Dependent"]

AP Stat- worksheet 3A- Categorical Variables practice

In a survey of adult Americans, people were asked to indicate their age and to categorize their political preference (liberal, moderate, conservative). The results are as follows:

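|          | Liberal | Moderate | Conservative | Total |
|----------|---------|----------|--------------|-------|
| under 30 | 83      | 140      | 73           | 296   |
| 30 - 50  | 119     | 280      | 161          | 560   |
| over 50  | 88      | 284      | 214          | 586   |
| total    | 290     | 704      | 448          | 1442  |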

1. What are the row and column variables?

2. What percent of Liberals are under 30?

3. Of those over 50, what percent are Liberals?

4. Of those that are moderates, what percent are 30-50?

5. What percent of respondents are moderate and under 30?

6. Calculate the marginal distribution for the AGE variable. Write these down. Then make a bar graph of the marginal distribution for age.

7. Calculate the marginal distribution for the PREFERENCE variable. Write these down. Then make a bar graph of this marginal distribution.

8. Calculate the conditional distribution of the AGE variable. Write these down. Then make a segmented bar graph of this conditional distribution.

9. Calculate the conditional distribution of the PREFERENCE variable. Write these down. Then make a segmented bar graph of this conditional distribution.

10. Are the two variables independent?

AP Stat- worksheet 3B- Categorical Variable practice

A 4-year study of men more than 70 years old, reported in The New York Times, analyzed blood cholesterol and noted how many men with different cholesterol levels suffered nonfatal or fatal heart attacks.

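|                        | Low cholesterol | Medium cholesterol | High cholesterol |
|------------------------|-----------------|--------------------|------------------|
| Nonfatal heart attacks | 29              | 17                 | 18               |
| Fatal heart attacks    | 19              | 20                 | 9                |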

1. Calculate the marginal distribution for cholesterol level and make a bar graph.

2. Calculate the marginal distribution for severity of heart attack and make a bar graph.

3. Calculate three conditional distributions for the three levels of cholesterol and make a stacked bar graph.

4. Calculate the conditional distributions for the type of heart attack and make a stacked bar graph.

5. Are the two variables independent?
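The worksheet 3B table lists only the cells, so the margins have to be computed before any marginal or conditional distribution can be found. A minimal sketch of that first step, with dictionary keys shortened from the table labels for illustration:

```python
# Cells from the worksheet 3B table (no totals are given in the worksheet).
cells = {
    "Nonfatal": {"Low": 29, "Medium": 17, "High": 18},
    "Fatal":    {"Low": 19, "Medium": 20, "High": 9},
}
levels = ["Low", "Medium", "High"]

# Row margins: total heart attacks of each type.
row_totals = {r: sum(row.values()) for r, row in cells.items()}

# Column margins: total heart attacks at each cholesterol level.
col_totals = {c: sum(cells[r][c] for r in cells) for c in levels}

# Grand total n: the sum of either set of margins.
n = sum(row_totals.values())

print(row_totals)
print(col_totals)
print(n)
```

The row margins and the column margins must each add up to the same grand total, which is a useful check before dividing to get percentages.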