More from Chapters 14, 15 and 16

The z test assumes an SRS of size n from a Normal population with known population standard deviation σ. P-values can be obtained either with computations from the standard Normal distribution or by using technology (calculator or software).

The essential reasoning of a significance test is as follows. Suppose for the sake of argument that the null hypothesis is true. If we repeated our data production many times, would we often get data as inconsistent with H_{0} as the data we actually have? Data that would rarely occur if H_{0} were true provide evidence against H_{0}.

The P-value of a test is the probability, computed supposing H_{0} to be true, that the test statistic will take a value at least as extreme as that actually observed. Small P-values indicate strong evidence against H_{0}. To calculate a P-value we must know the sampling distribution of the test statistic when H_{0} is true.
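As a rough illustration of this definition for the z test, the sketch below turns a z statistic into a P-value using the standard Normal distribution. The function name `z_p_value` and the labels for the alternative are hypothetical choices made for this example, not notation from the chapters.

```python
from statistics import NormalDist

def z_p_value(z, alternative="two-sided"):
    """P-value for a z statistic, computed supposing H0 is true."""
    std_normal = NormalDist()  # standard Normal: mean 0, standard deviation 1
    if alternative == "greater":    # Ha: mu > mu0
        return 1 - std_normal.cdf(z)
    if alternative == "less":       # Ha: mu < mu0
        return std_normal.cdf(z)
    if alternative == "two-sided":  # Ha: mu != mu0
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

# Example: a two-sided test with z = 1.70
print(round(z_p_value(1.70, "two-sided"), 4))
```

"At least as extreme" depends on the alternative: one tail for a one-sided alternative, both tails for a two-sided one.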

A specific confidence interval or test is correct only under specific conditions. The most important conditions concern the method used to produce the data. Other factors such as the shape of the population distribution may also be important.

Whenever you use statistical inference, you are acting as if your data are a random sample or come from a randomized comparative experiment.

Always do data analysis before inference to detect outliers or other problems that would make inference untrustworthy.

Other things being equal, the margin of error of a confidence interval gets smaller as
− the confidence level C decreases,
− the population standard deviation σ decreases,
− the sample size n increases.
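These three effects can be checked numerically from the margin of error of the z interval, m = z*σ/√n. The sketch below is only an illustration: the baseline values (C = 0.95, σ = 15, n = 72) are assumptions chosen for the example, and `margin_of_error` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(C, sigma, n):
    """Margin of error z* * sigma / sqrt(n) of a level C z confidence interval."""
    z_star = NormalDist().inv_cdf((1 + C) / 2)  # area C between -z* and z*
    return z_star * sigma / sqrt(n)

baseline = margin_of_error(0.95, 15, 72)       # assumed baseline values
lower_conf = margin_of_error(0.90, 15, 72)     # lower confidence level C
smaller_sigma = margin_of_error(0.95, 10, 72)  # smaller sigma
larger_n = margin_of_error(0.95, 15, 288)      # four times the sample size

print(baseline, lower_conf, smaller_sigma, larger_n)
```

Because n enters through √n, quadrupling the sample size only halves the margin of error.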

The margin of error in a confidence interval accounts for only the chance variation due to random sampling. In practice, errors due to nonresponse or undercoverage are often more serious.

There is no universal rule for how small a P-value in a test of significance is convincing evidence against the null hypothesis. Beware of placing too much weight on traditional significance levels such as α = 0.05.

Very small effects can be highly significant (small P) when a test is based on a large sample. A statistically significant effect need not be practically important. Plot the data to display the effect you are seeking, and use confidence intervals to estimate the actual values of parameters.

On the other hand, lack of significance does not imply that H_{0} is true. Even a large effect can fail to be significant when a test is based on a small sample.
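The last two points can be seen directly in the z statistic, z = (x̄ − µ_{0})/(σ/√n), because n sits in the denominator. The numbers below are made up for illustration (µ_{0} = 100, σ = 10), and the helper name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(xbar, mu0, sigma, n):
    """Two-sided P-value of the z test for H0: mu = mu0."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Tiny effect (0.5 units) with a very large sample: z = 5.0
p_small_effect = two_sided_p(xbar=100.5, mu0=100, sigma=10, n=10_000)

# Large effect (8 units) with a very small sample: z = 1.6
p_large_effect = two_sided_p(xbar=108, mu0=100, sigma=10, n=4)

print(p_small_effect, p_large_effect)
```

In this sketch the tiny effect is highly significant while the large effect is not significant at the 5% level, which is why a confidence interval for the size of the effect is more informative than the P-value alone.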

If many tests are run, you will probably produce some significant results by chance alone, even if all the null hypotheses are true.
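To get a feel for how quickly this happens, the sketch below computes the chance of at least one significant result among k tests when every null hypothesis is true. It assumes the tests are independent, which is a simplifying assumption for this example; the function name is hypothetical.

```python
def prob_at_least_one_significant(k, alpha=0.05):
    """Chance that at least one of k independent tests is significant at
    level alpha when every null hypothesis is true."""
    return 1 - (1 - alpha) ** k

for k in (1, 5, 20, 100):
    print(k, round(prob_at_least_one_significant(k), 3))
```

With 20 independent tests at α = 0.05, the chance of at least one "significant" result is already about 0.64.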

When you plan a statistical study, plan the inference as well. In particular, ask what sample size you need for successful inference.

The z confidence interval for a Normal mean has specified margin of error m when the sample size is n = (z*σ/m)^{2}. Here z* is the critical value for the desired level of confidence. Always round up when you use this formula.
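A minimal sketch of this calculation, using n = (z*σ/m)^{2} rounded up to the next whole number. The inputs (σ = 15, desired margin m = 2, 95% confidence) are assumptions chosen for illustration, and `sample_size` is a hypothetical helper name.

```python
from math import ceil
from statistics import NormalDist

def sample_size(C, sigma, m):
    """Smallest n for which a level C z interval has margin of error at most m."""
    z_star = NormalDist().inv_cdf((1 + C) / 2)  # critical value z*
    return ceil((z_star * sigma / m) ** 2)      # always round up

print(sample_size(C=0.95, sigma=15, m=2))
```

Rounding down would give a margin of error slightly larger than the one requested, which is why the formula always rounds up.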
Student study time. A class survey in a large class for first-year college students asked, “About how many minutes do you study on a typical weeknight?” The mean response of the 269 students was x̄ = 137 minutes. Suppose that we know that the study time follows a Normal distribution with standard deviation σ = 65 minutes in the population of all first-year students at this university.
(a) Use the survey result to give a 99% confidence interval for the mean study time of all first-year students.
(b) What condition not yet mentioned must be met for your confidence interval to be valid?
Student study times. Exercise 14.34 describes a class survey in which students claimed to study an average of x̄ = 137 minutes on a typical weeknight. Regard these students as an SRS from the population of all first-year students at this university. Does the study give good evidence that students claim to study more than 2 hours per night on the average?
(a) State null and alternative hypotheses in terms of the mean study time in minutes for the population.
(b) What is the value of the test statistic z?
(c) What is the P-value of the test? Can you conclude that students do claim to study more than two hours per weeknight on the average?
I want more muscle. Young men in North America and Europe (but not in Asia) tend to think they need more muscle to be attractive. One study presented 200 young American men with 100 images of men with various levels of muscle.^{7} Researchers measure level of muscle in kilograms per square meter (kg/m^{2}) of fat-free body mass. Typical young men have about 20 kg/m^{2}. Each subject chose two images, one that represented his own level of body muscle and one that he thought represented “what women prefer.” The mean gap between self-image and “what women prefer” was 2.35 kg/m^{2}.
14.41 I want more muscle. If young men thought that their own level of muscle was about what women prefer, the mean “muscle gap” in the study described in Exercise 14.35 would be 0. We suspect (before seeing the data) that young men think women prefer more muscle than they themselves have.
Suppose that the “muscle gap” in the population of all young men has a Normal distribution with standard deviation 2.5 kg/m^{2}.
(a) State null and alternative hypotheses for testing this suspicion.
(b) What is the value of the test statistic z?
(c) You can tell just from the value of z that the evidence in favor of the alternative is very strong (that is, the P-value is very small). Explain why this is true.
ANSWERS (a) H_{0}: µ = 0; H_{a}: µ > 0. (b) z = 13.29. (c) This value of z is far outside the range we would expect from a N(0, 1) distribution (more than 3 or 5 standard deviations away from the mean).
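The value of z in answer (b) can be checked with a few lines of arithmetic. This sketch uses only the numbers given in the exercise (mean gap 2.35, σ = 2.5, n = 200); the variable names are chosen for this example.

```python
from math import sqrt

xbar = 2.35   # observed mean muscle gap, kg/m^2
mu0 = 0       # mean gap under the null hypothesis
sigma = 2.5   # population standard deviation, kg/m^2
n = 200       # number of subjects

standard_error = sigma / sqrt(n)   # standard deviation of x-bar
z = (xbar - mu0) / standard_error

print(round(z, 2))
```

The observed mean lies more than 13 standard errors from 0, far beyond anything a N(0, 1) variable would plausibly produce.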
Pulling wood apart. How heavy a load (pounds) is needed to pull apart pieces of Douglas fir 4 inches long and 1.5 inches square? Here are data from students doing a laboratory exercise:

(a) We are willing to regard the wood pieces prepared for the lab session as an SRS of all similar pieces of Douglas fir. Engineers also commonly assume that characteristics of materials vary Normally. Make a graph to show the shape of the distribution for these data. Does the Normality condition appear safe? Suppose that the strength of pieces of wood like these follows a Normal distribution with standard deviation 3000 pounds.
(b) Give a 90% confidence interval for the mean load required to pull the wood apart.
Pulling wood apart. Exercise 14.50 gives data on the pounds of load needed to pull apart pieces of Douglas fir. The data are a random sample from a Normal distribution with standard deviation 3000 pounds.
(a) Is there significant evidence at the α = 0.10 level against the hypothesis that the mean is 32,000 pounds for the two-sided alternative?
(b) Is there significant evidence at the α = 0.10 level against the hypothesis that the mean is 31,500 pounds for the two-sided alternative?
Pulling wood apart. You want to estimate the mean load needed to pull apart the pieces of wood in Exercise 14.50 (page 389) to within ±1000 pounds with 95% confidence. How large a sample is needed?
Tests from confidence intervals. A confidence interval for the population mean µ tells us which values of µ are plausible (those inside the interval) and which values are not plausible (those outside the interval) at the chosen level of confidence. You can use this idea to carry out a test of any null hypothesis H_{0}: µ = µ_{0} starting with a confidence interval: reject H_{0} if µ_{0} is outside the interval and fail to reject if µ_{0} is inside the interval.
The alternative hypothesis is always two-sided, H_{a}: µ ≠ µ_{0}, because the confidence interval extends in both directions from x̄. A 95% confidence interval leads to a test at the 5% significance level because the interval is wrong 5% of the time. In general, confidence level C leads to a test at significance level α = 1 − C.
(a) In Example 14.9, a medical director found mean blood pressure x̄ = 126.07 for an SRS of 72 executives. The standard deviation of the blood pressures of all executives is σ = 15. Give a 90% confidence interval for the mean blood pressure µ of all executives.
(b) The hypothesized value µ_{0} = 128 falls inside this confidence interval. Carry out the z test for H_{0}: µ = 128 against the two-sided alternative. Show that the test is not significant at the 10% level.
(c) The hypothesized value µ_{0} = 129 falls outside this confidence interval. Carry out the z test for H_{0}: µ = 129 against the two-sided alternative. Show that the test is significant at the 10% level.
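The link between the interval and the test can be sketched with the blood pressure numbers from this exercise (x̄ = 126.07, σ = 15, n = 72). The code is an illustration of the idea, not a prescribed solution method; the helper name `two_sided_p` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

xbar, sigma, n = 126.07, 15, 72
C = 0.90
std_normal = NormalDist()

standard_error = sigma / sqrt(n)
z_star = std_normal.inv_cdf((1 + C) / 2)   # critical value for level C
low = xbar - z_star * standard_error
high = xbar + z_star * standard_error

def two_sided_p(mu0):
    """Two-sided P-value of the z test for H0: mu = mu0."""
    z = (xbar - mu0) / standard_error
    return 2 * (1 - std_normal.cdf(abs(z)))

for mu0 in (128, 129):
    inside = low <= mu0 <= high
    significant = two_sided_p(mu0) < 1 - C   # test at level alpha = 1 - C
    print(mu0, "inside interval:", inside, "significant:", significant)
```

For both hypothesized values the two approaches agree: a value inside the 90% interval is not rejected at the 10% level, and a value outside it is rejected.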
A test goes wrong. Software can generate samples from (almost) exactly Normal distributions. Here is a random sample of size 5 from the Normal distribution with mean 10 and standard deviation 2:

6.47   7.51   10.10   13.63   9.91

These data match the conditions for a z test better than real data will: the population is very close to Normal and has known standard deviation σ = 2, and the population mean is μ = 10. Test the hypotheses
H_{0} : μ = 8
H_{a} : μ ≠ 8
(a) What are the z statistic and its P-value? Is the test significant at the 5% level?
(b) We know that the null hypothesis does not hold, but the test failed to give strong evidence against H_{0}. Explain why this is not surprising.

ANSWERS (a) z = 1.70, P = 0.0891. (b) The sample size is small, so the test has low power.
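The sketch below recomputes the z statistic from the five observations and also estimates the power of the 5% level test against the true mean µ = 10. The power calculation is an added illustration of answer (b), under the stated conditions (σ = 2, n = 5). Note that the code carries the unrounded z forward, so its P-value comes out slightly smaller than the 0.0891 obtained by first rounding z to 1.70.

```python
from math import sqrt
from statistics import NormalDist, mean

data = [6.47, 7.51, 10.10, 13.63, 9.91]
mu0, sigma, true_mu, alpha = 8, 2, 10, 0.05
std_normal = NormalDist()

standard_error = sigma / sqrt(len(data))
z = (mean(data) - mu0) / standard_error
p_value = 2 * (1 - std_normal.cdf(abs(z)))   # two-sided alternative

# Power: probability that |z| exceeds the critical value when mu = true_mu.
z_crit = std_normal.inv_cdf(1 - alpha / 2)
shift = (true_mu - mu0) / standard_error     # mean of z when mu = true_mu
power = (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

print(round(z, 2), p_value, power)
```

Under these assumptions the power is only about 0.6, so even though H_{0} is false, a test based on 5 observations misses it roughly 4 times in 10.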
Pulling wood apart. How heavy a load (pounds) is needed to pull apart pieces of Douglas fir 4 inches long and 1.5 inches square? Here are data from students doing a laboratory exercise:
(a) We are willing to regard the wood pieces prepared for the lab session as an SRS of all similar pieces of Douglas fir. Engineers also commonly assume that characteristics of materials vary Normally. Make a graph to show the shape of the distribution for these data. Does the Normality condition appear safe? Suppose that the strength of pieces of wood like these follows a Normal distribution with standard deviation 3000 pounds.
(b) Give a 90% confidence interval for the mean load required to pull the wood apart.
The data are a random sample from a Normal distribution. Suppose that the standard deviation is unknown, but we still want to find out whether the mean strength differs from 32,000 pounds. What do you think we should do?
Find 90%, 95%, and 99% confidence intervals for the mean load needed to pull apart the pieces of Douglas fir in this example.
Is the mean different from 32,000 pounds?
Draw an SRS of size n from a large population having unknown mean µ. A level C confidence interval for µ is
x̄ ± t* s/√n
where s is the sample standard deviation and t* is the critical value for the t(n − 1) density curve with area C between −t* and t*. This interval is exact when the population distribution is Normal and is approximately correct for large n in other cases. You can use the t table or the InvT function on your calculator (under the distribution menu) to find the t critical values.
t distribution with n degrees of freedom, t(n): P(a ≤ X ≤ b) = tcdf(a, b, n)
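For readers without a calculator at hand, the following is a rough standard-library stand-in for these two calculator functions. It is a sketch and not how any calculator actually computes them: it integrates the t density numerically with Simpson's rule and finds t* by bisection. The names `t_pdf`, `t_critical`, and `t_interval` are hypothetical, and the approach assumes moderate degrees of freedom (the gamma function overflows for very large values) and a critical value below 100.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def tcdf(a, b, df, steps=2000):
    """P(a <= X <= b) for X ~ t(df), by Simpson's rule (steps must be even)."""
    h = (b - a) / steps
    total = t_pdf(a, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(a + i * h, df)
    return total * h / 3

def t_critical(C, df):
    """Critical value t* with area C between -t* and t*, found by bisection."""
    lo, hi = 0.0, 100.0  # assumes t* is below 100
    for _ in range(60):
        mid = (lo + hi) / 2
        if tcdf(-mid, mid, df) < C:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_interval(xbar, s, n, C):
    """Level C interval xbar +/- t* s / sqrt(n), with t* from t(n - 1)."""
    margin = t_critical(C, n - 1) * s / sqrt(n)
    return xbar - margin, xbar + margin

print(round(t_critical(0.95, 19), 3))
```

For a sample of size 20 the degrees of freedom are 19, and the 95% critical value is a little larger than the z critical value 1.96, which reflects the extra uncertainty from estimating σ by s.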

CONDITIONS FOR INFERENCE ABOUT A MEAN

We can regard our data as a simple random sample (SRS) from the population. This condition is very important.

Observations from the population have a Normal distribution with mean µ and standard deviation σ. In practice, it is enough that the distribution be symmetric and single-peaked unless the sample is very small. Both µ and σ are unknown parameters.
USING THE t PROCEDURES

Except in the case of small samples, the condition that the data are an SRS from the population of interest is more important than the condition that the population distribution is Normal.

Sample size less than 15: Use t procedures if the data appear close to Normal (roughly symmetric, single peak, no outliers). If the data are clearly skewed or if outliers are present, do not use t.

Sample size at least 15: The t procedures can be used except in the presence of outliers or strong skewness.

Large samples: The t procedures can be used even for clearly skewed distributions when the sample is large, roughly n ≥ 40.
