# Stat 101 Agresti Homework 3 Solutions

STAT 101 - Agresti

Homework 3 Solutions

9/27/10
Chapter 4
4.27. (a) The sampling distribution of the sample proportion of heads for flipping a balanced coin once is

| p | 0 | 1 |
|---|---|---|
| Probability | 0.5 | 0.5 |

(b) The sampling distribution of the sample proportion of heads for flipping a balanced coin twice is

| p | 0 | 0.5 | 1 |
|---|---|---|---|
| Probability | 0.25 | 0.5 | 0.25 |

(c) The sampling distribution of the sample proportion of heads for flipping a balanced coin three times is

| p | 0 | 1/3 | 2/3 | 1 |
|---|---|---|---|---|
| Probability | 0.125 | 0.375 | 0.375 | 0.125 |

(d) The sampling distribution of the sample proportion of heads for flipping a balanced coin four times is

| p | 0 | 0.25 | 0.5 | 0.75 | 1 |
|---|---|---|---|---|---|
| Probability | 0.0625 | 0.25 | 0.375 | 0.25 | 0.0625 |

(e) As the number of flips increases, the sampling distribution of the sample proportion of heads seems to be getting more normal, with the probabilities concentrating more closely around 0.50.
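
The four distributions in parts (a) through (d) can be checked with a short script. This is a minimal sketch that assumes independent flips of a balanced coin, so the number of heads is binomial; the function name `sampling_distribution` is illustrative and not part of the textbook.

```python
from fractions import Fraction
from math import comb


def sampling_distribution(n, prob_heads=Fraction(1, 2)):
    """Return {sample proportion of heads: probability} for n independent flips."""
    return {
        Fraction(k, n): comb(n, k) * prob_heads**k * (1 - prob_heads) ** (n - k)
        for k in range(n + 1)
    }


# Print the distribution for one to four flips, as in parts (a) to (d).
for n in (1, 2, 3, 4):
    dist = sampling_distribution(n)
    print(n, {str(p): float(pr) for p, pr in dist.items()})
```

Exact fractions are used so that the probabilities sum to exactly 1 and can be compared against the tables without rounding issues.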
4.29. (a) . (b) If actually 50% of the population voted for DeWine, it would be surprising to obtain 44% in this exit poll: 44% is 6 percentage points below 50%, and the standard error of the sampling distribution is 1.04%, so the sample proportion of 0.44 is nearly 6 standard errors below 0.50. (c) Based on the information from the exit poll, I would be willing to predict that Sherrod Brown would win the Senatorial election.
4.33. (a) The probability that PDI is below 90 is $P(y < 90) = P(z < -0.67) = 0.2514$.

(b) The probability that the sample mean PDI is below 90 is

.

(c) An individual PDI of 90 is not surprising, since the probability is 0.2514 of that value or lower. However, a sample mean PDI of 90 would be surprising since this value would happen almost never. (d) The sketch of the sampling distribution should be less spread out and have a taller peak and thinner tails than the sketch of the population distribution.

4.36. (a) The population distribution is skewed to the right with mean 5.2 and standard deviation 3.0. (b) The sample data distribution is based on the sample of 36 families and is skewed to the right with mean 4.6 and standard deviation 3.2. (c) The sampling distribution of $\bar{y}$ is approximately normal with mean 5.2 and standard error $\sigma_{\bar{y}} = 3.0/\sqrt{36} = 0.5$. This distribution describes the theoretical distribution for the sample mean.
4.41 (b) Even though the population distribution is not normal (there are only two possible values), the sample proportions for the 1000 samples of size 100 each should have a histogram with an approximately bell shape.
4.42 (a) The population distribution is skewed, but the empirical distribution of sample means probably has a bell shape, reflecting the Central Limit Theorem.

(b) The Central Limit Theorem applies to relatively large random samples, but here n = 2 for each sample.

4.46. (a) The sample data distribution tends to resemble the population distribution more closely than the sampling distribution. A random sample of data from a population should be representative of the population, and its distribution should be similar to the population distribution. (b) The sample data distribution is the distribution of data that we actually observe. The sampling distribution of $\bar{y}$ is the probability distribution for the possible values of the sample statistic $\bar{y}$.
4.47. (a) A lower bound for the mean is

.

(b) Since we know the category of ideal number of children that falls at the 50% point, we can find the median. The median is 4 children.

4.50. When n = 100, $\sigma_{\hat{\pi}} = \sqrt{(0.5)(0.5)/100} = 0.05$. The sample proportion is almost certain to fall within 3 standard errors of 0.50, that is, in the interval 0.35 to 0.65. When n = 1000, $\sigma_{\hat{\pi}} = \sqrt{(0.5)(0.5)/1000} = 0.0158$, and the corresponding interval is 0.453 to 0.547. When n = 10,000, $\sigma_{\hat{\pi}} = \sqrt{(0.5)(0.5)/10{,}000} = 0.005$, and the corresponding interval is 0.485 to 0.515.
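
The three intervals in 4.50 can be reproduced numerically. The sketch below assumes a population proportion of 0.50 and takes "almost certain" to mean within 3 standard errors; both assumptions are inferred from the quoted endpoints. The helper name `almost_certain_interval` is hypothetical.

```python
from math import sqrt


def almost_certain_interval(n, pi=0.5, num_se=3):
    """Return (standard error, lower, upper) for the sample proportion.

    Assumes a population proportion pi and an interval of pi +/- num_se
    standard errors, with standard error sqrt(pi * (1 - pi) / n).
    """
    se = sqrt(pi * (1 - pi) / n)
    return se, pi - num_se * se, pi + num_se * se


for n in (100, 1000, 10_000):
    se, lower, upper = almost_certain_interval(n)
    print(f"n={n}: se={se:.4f}, interval {lower:.3f} to {upper:.3f}")
```

Each tenfold increase in n shrinks the interval by a factor of about $\sqrt{10} \approx 3.16$, not by a factor of 10.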
4.51. a, c, d
4.52. c
4.53. False. As the sample size increases, the standard error of the sampling distribution of $\bar{y}$ decreases, since $\sigma_{\bar{y}} = \sigma/\sqrt{n}$ decreases as n increases.

4.54. (a) Group A: $P(z < -1.0) = 0.1587$. Almost 16% of students from Group A are not admitted to Lake Wobegon Junior College. Group B: $P(z < -0.5) = 0.3085$. Almost 31% of students from Group B are not admitted to Lake Wobegon Junior College. (b) Of the students who are not admitted, 0.3085/(0.3085 + 0.1587) = 0.3085/0.4672 = 0.6603, or about 66%, are from Group B. (c) If the new policy is implemented, the proportion of students from Group A that are not admitted would be 0.0228, while the proportion of students from Group B that are not admitted would be 0.0668. In this case, 0.0668/(0.0668 + 0.0228), or about 75%, of the students who are not admitted would be from Group B. Relatively speaking, this policy would hurt students from Group B more than the current policy.
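
The tail probabilities and the Group B shares in 4.54 can be checked with the standard library. This sketch assumes the cutoffs sit at z = -1.0 and z = -0.5 under the current policy and at z = -2.0 and z = -1.5 under the new one (values inferred from the quoted probabilities), and that the two groups are the same size, as the solution's arithmetic implies. The function name `share_from_b` is illustrative.

```python
from statistics import NormalDist


def share_from_b(z_a, z_b):
    """Return (P(not admitted | A), P(not admitted | B), share of rejections from B).

    z_a and z_b are the admission cutoffs in standard-deviation units for
    each group. Equal group sizes are assumed.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    p_a = std_normal.cdf(z_a)
    p_b = std_normal.cdf(z_b)
    return p_a, p_b, p_b / (p_a + p_b)


for label, z_a, z_b in (("current", -1.0, -0.5), ("new", -2.0, -1.5)):
    p_a, p_b, share = share_from_b(z_a, z_b)
    print(f"{label}: A={p_a:.4f}, B={p_b:.4f}, share from B={share:.2f}")
```

Small differences in the fourth decimal place relative to the solution come from the solution using table values rounded to four digits.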
4.55. (a) . (b) ; . (c) The standard error for a sample proportion for a random sample of size n is $\sigma_{\hat{\pi}} = \sqrt{\pi(1-\pi)/n}$.
4.57. (a) The finite population correction is $\sqrt{(N-n)/(N-1)}$. (b) If n = N, the finite population correction is $\sqrt{(N-N)/(N-1)} = 0$, so $\sigma_{\bar{y}} = 0$. (c) When n = 1, the finite population correction is $\sqrt{(N-1)/(N-1)} = 1$, so $\sigma_{\bar{y}} = \sigma/\sqrt{1} = \sigma$. Thus, the sampling distribution of $\bar{y}$ and its standard error are identical to the population distribution and its standard deviation.
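
The two limiting cases in 4.57 can be verified numerically. The sketch assumes the usual textbook form of the finite population correction, $\sqrt{(N-n)/(N-1)}$, multiplying $\sigma/\sqrt{n}$; the population size 500 and standard deviation 3.0 below are made-up values for illustration only.

```python
from math import sqrt


def finite_population_correction(n, N):
    """Correction factor sqrt((N - n) / (N - 1)) for sampling n of N units."""
    return sqrt((N - n) / (N - 1))


def standard_error(sigma, n, N):
    """Standard error of the sample mean with the finite population correction."""
    return finite_population_correction(n, N) * sigma / sqrt(n)


N, sigma = 500, 3.0  # hypothetical population size and standard deviation
print(standard_error(sigma, N, N))  # n = N: the whole population is sampled
print(standard_error(sigma, 1, N))  # n = 1: a single observation
```

With n = N the standard error is 0 because the sample mean always equals the population mean; with n = 1 it equals the population standard deviation.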

Chapter 5

5.4. The estimated standard error is .
5.7. (a) The estimated standard error in 2004 is . (b) The margin of error is , or 3%. (c) The 95% confidence interval is 36% - 3% = 33% to 36% + 3% = 39%. We are 95% confident that the population proportion of people agreeing that it is much better for everyone involved if the man is the achiever outside the home and the woman takes care of the home and family falls in the interval 33% to 39%.
5.8. The 99% confidence interval is

.
5.12. (a) “Sample prop” = 1885/2815 = 0.6696. (b) Since we are 95% confident that the interval 65.2% to 68.7% contains the population proportion of American adults who are in favor of the death penalty and the entire interval exceeds 50%, it is reasonable to conclude that more than half of all American adults are in favor of the death penalty. (c) A 95% confidence interval for the proportion of American adults who opposed the death penalty is 31.3% to 34.8%.
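
The interval quoted in 5.12(b) can be reproduced from the counts in part (a). This is a minimal sketch of the large-sample interval $\hat{\pi} \pm 1.96\,se$ with $se = \sqrt{\hat{\pi}(1-\hat{\pi})/n}$; the function name `proportion_ci` is illustrative.

```python
from math import sqrt


def proportion_ci(successes, n, z=1.96):
    """Return (point estimate, lower, upper) for a large-sample proportion CI."""
    p = successes / n
    se = sqrt(p * (1 - p) / n)
    return p, p - z * se, p + z * se


p, lower, upper = proportion_ci(1885, 2815)
print(f"estimate={p:.4f}, 95% CI {lower:.3f} to {upper:.3f}")
```

Subtracting both endpoints from 1 gives the interval for the proportion opposed, as in part (c).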
5.13. (a) The proportion that said legal is 0.364; the proportion that said not legal is 0.636. (b) The 95% confidence interval is 0.331 to 0.397. We are 95% confident that this interval contains the population proportion that thinks marijuana should be made legal. Since this interval is entirely below 50%, we can conclude that a minority of Americans felt this way. (c) The proportion that said marijuana should be legal dropped until 1990 and then has increased each year since.
5.18. If the sample size had been one-fourth as large, the confidence interval would be twice as wide and would be 0.23 to 0.31.
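
The scaling in 5.18 follows from the margin of error being proportional to $1/\sqrt{n}$. The sketch below assumes the original interval was 0.25 to 0.29 (inferred by halving the quoted width around the same midpoint; it is not stated in the solution) and that the point estimate stays the same when the sample size changes. The name `rescale_interval` is hypothetical.

```python
from math import sqrt


def rescale_interval(lower, upper, size_factor):
    """Approximate the interval if the sample size were multiplied by size_factor.

    The midpoint is held fixed and the half-width is divided by
    sqrt(size_factor), since the margin of error scales as 1 / sqrt(n).
    """
    center = (lower + upper) / 2
    half_width = (upper - lower) / 2 / sqrt(size_factor)
    return center - half_width, center + half_width


# One-fourth the sample size doubles the width of the interval.
lower, upper = rescale_interval(0.25, 0.29, size_factor=0.25)
print(f"{lower:.2f} to {upper:.2f}")
```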
5.21. (a) The standard error is . (b) We are 95% confident that the interval 21.5 to 28.0 contains the population mean number of female partners males have had sex with since their eighteenth birthday. (c) The mean is quite high compared to the median and the mode, which means that there were a few male respondents with a very large number of female sex partners. In addition, the standard deviation is more than twice the mean, confirming the right skew of the distribution of the number of female sex partners. A confidence interval based on the mean does not seem to be the best idea.
5.22. (a) The point estimate is 3.02 children. (b) The standard error is . (c) We are 95% confident that the interval 2.9 to 3.2 contains the population mean ideal number of children for a family to have. (d) Since the confidence interval is entirely above 2.0, it does not seem plausible that the population mean equals 2.0 children.
5.24. (a)

; . (b) The standard error is . (c) The t-score in the df = 16 row and the $t_{.025}$ column is 2.120. (d) The 95% confidence interval is 3.6 to 11.0. We are 95% confident that the interval 3.6 to 11.0 pounds contains the population mean change in weight for this therapy.
5.25. A confidence interval is not about any one subject or about 95% of the subjects; it is an interval estimate for the population parameter. The correct interpretation is that we are 95% confident that the interval 2.60 to 2.93 hours contains the population mean number of hours of TV watched on the average day.
5.28. (a) The 95% confidence interval is 1.67 to 1.95. We are 95% confident that this interval contains the population mean number of days in the past 7 days that women have felt sad. (b) Since the standard deviation is larger than the mean, the variable is most likely skewed to the right. Since t procedures are robust against violations of normality and our sample size is large, our findings in part (a) are probably okay, unless there are extreme outliers.
5.32. (a) $\bar{y}$ = 1.5 days. (b) The 95% confidence interval is 1.4 to 1.6. We are 95% confident that the interval 1.4 to 1.6 contains the population mean number of days in the past 7 days that people have felt lonely.
5.34. (a) The confidence interval is 4.3 to 6.3. We are 95% confident that the interval 4.3 to 6.3 days contains the population mean length of stay for all inpatients in that hospital. (b) If the administrator wants the confidence interval to be half as wide, she needs to take a random sample of 400 records.
5.35. The necessary sample size is 157.

5.38. The sample size was about 1534.

5.39. The sample size was about 602.
5.41. We estimate the standard deviation to be (18 – 0)/6 = 3. The sample size calculation is $n = \sigma^2 (z/M)^2 = 3^2 (1.96/1)^2 = 34.6$, so a sample of size 35 is needed.
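
The sample size in 5.41 can be checked with the formula $n = \sigma^2 (z/M)^2$. This sketch assumes 95% confidence (z = 1.96) and a margin of error of M = 1; those two values are inferred from the answer of 35 and are not stated in the solution. The function name `sample_size_for_mean` is illustrative.

```python
from math import ceil


def sample_size_for_mean(sigma, margin, z=1.96):
    """Smallest n with margin of error at most `margin` when estimating a mean."""
    return ceil(sigma**2 * (z / margin) ** 2)


# Range-based guess for the standard deviation: range / 6.
sigma_guess = (18 - 0) / 6
print(sample_size_for_mean(sigma_guess, margin=1))
```

The result is rounded up because rounding down would leave the margin of error slightly above the target.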
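5.44. (a) $\hat{\pi} = 0/5 = 0$, $se = \sqrt{(0)(1)/5} = 0$. (b) Since the number in each category (0 like tofu and 5 do not like tofu) is less than 15, we cannot use the large-sample formula for a confidence interval. An appropriate confidence interval adds 2 outcomes of each type, using $\hat{\pi} = (0+2)/(5+4) = 2/9 = 0.222$, and the 95% confidence interval is $0.222 \pm 1.96\sqrt{(0.222)(0.778)/9} = 0.222 \pm 0.272$, with the negative lower bound reported as 0. We are 95% confident that the interval 0 to 0.49 contains the population proportion of students who like tofu.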
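
The interval 0 to 0.49 in 5.44 is consistent with the small-sample adjustment that adds 2 successes and 2 failures before applying the usual formula. The sketch below assumes that this is the adjustment used; the function name `adjusted_proportion_ci` is hypothetical.

```python
from math import sqrt


def adjusted_proportion_ci(successes, n, z=1.96):
    """Proportion CI after adding 2 successes and 2 failures.

    Returns (adjusted estimate, lower, upper), with the bounds clipped
    to the range [0, 1].
    """
    p_adj = (successes + 2) / (n + 4)
    se = sqrt(p_adj * (1 - p_adj) / (n + 4))
    lower = max(0.0, p_adj - z * se)
    upper = min(1.0, p_adj + z * se)
    return p_adj, lower, upper


p_adj, lower, upper = adjusted_proportion_ci(successes=0, n=5)
print(f"adjusted estimate={p_adj:.3f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Without the adjustment, the estimated standard error would be 0 and the interval would collapse to the single point 0, which is why the large-sample formula is unusable here.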

5.47. a. You would expect about 95 of them to contain the parameter value.

5.48. a. With 95% confidence intervals, if the method worked properly, only about 5 of the 100 CIs would fail to contain the true parameter value.
5.49. (a) SPSS gives us the following output:

| | | Statistic | Std. Error |
|---|---|---|---|
| Mean | | 7.267 | .8672 |
| 95% Confidence Interval for Mean | Lower Bound | 5.531 | |
| | Upper Bound | 9.002 | |

We are 95% confident that the mean weekly number of hours spent watching TV is between 5.5 and 9.0 hours.

5.54. A 95% confidence interval for the population mean is 0 to 30.4 (actually, the lower bound is –10.43, but we report this as 0). Outliers have a tremendous impact on confidence intervals for means, since they affect both the mean and the standard deviation.
5.62. The formula for sample size for a mean tells us that the needed sample size is proportional to the population variance. More diverse populations have greater standard deviations, and thus the sample size needed to estimate the population mean to within a particular margin of error is larger.
5.66. (a)
5.67. (a)
5.68. (b)
5.69. (b) and (e) are correct
5.70. (a) A confidence interval for the mean is about the population mean, not the sample mean. (b) A confidence interval for the mean is about the population mean, not individuals. (c) We can actually be 100% confident that the sample mean is in the interval we construct (it is the midpoint of the interval). (d) This statement implies that the population mean changes.
5.71. We are 95% confident that the interval 21.5 to 23.0 years contains the mean age at first marriage of women in a certain country.
5.73. Since $\bar{y} = \sum y_i / n$, $\sum y_i = n\bar{y}$. If we know $n - 1$ of the observations, we can add these up and subtract from $n\bar{y}$ to find the value of the remaining observation.
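
The argument in 5.73 amounts to one line of arithmetic: the total is n times the mean, so the unknown value is the total minus the known values. The sketch below uses made-up numbers purely for illustration, and the function name `missing_observation` is hypothetical.

```python
def missing_observation(sample_mean, n, known_values):
    """Recover the one unknown observation from the mean and the other n - 1 values."""
    if len(known_values) != n - 1:
        raise ValueError("exactly n - 1 observations must be known")
    return n * sample_mean - sum(known_values)


# Hypothetical data: four observations with mean 5, three of them known.
print(missing_observation(sample_mean=5, n=4, known_values=[3, 4, 6]))
```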
5.76. (a) If one flipped a coin twice, the possible responses would be (H,H), (H,T), (T,H), (T,T), each with probability 1/4. Thus, the possible outcomes (H,H) and (H,T) each have probability 1/4. Of the half who flip a tail the first time, the probability equals π of reporting heads for the second flip (since this is the probability of yes) and (1 – π) of reporting tails. Since this happens for half the population, the overall probabilities are π/2 and (1 – π)/2. (b) The expected proportion of heads for the second response equals 0.25 + π/2. If we set $p = 0.25 + \pi/2$ and solve for π, we get the estimate $\hat{\pi} = 2p - 0.50$. (c) (i) p = 50/200 = 0.25, so $\hat{\pi}$ = 2(0.25) – 0.50 = 0; (ii) p = 70/200 = 0.35, so $\hat{\pi}$ = 2(0.35) – 0.50 = 0.20; (iii) p = 100/200 = 0.50, so $\hat{\pi}$ = 2(0.50) – 0.50 = 0.50; (iv) p = 150/200 = 0.75, so $\hat{\pi}$ = 2(0.75) – 0.50 = 1.
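
The estimates in 5.76(c) come from inverting the relation between the reported proportion of heads and π, namely p = 0.25 + π/2, which gives an estimate of 2p – 0.50. A minimal sketch of that calculation, with the illustrative function name `estimate_pi`:

```python
def estimate_pi(num_heads_reported, n):
    """Randomized-response estimate of pi from the reported proportion of heads.

    Solves p = 0.25 + pi / 2 for pi, where p = num_heads_reported / n.
    """
    p = num_heads_reported / n
    return 2 * p - 0.5


for heads in (50, 70, 100, 150):
    print(heads, round(estimate_pi(heads, 200), 2))
```

Reported proportions below 0.25 or above 0.75 would give estimates outside the range 0 to 1, so in practice such estimates would need to be truncated.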
5.77. Given and n = 20, . Squaring both sides gives us . This equation simplifies to . The roots that solve this equation are and .