Chapter 12 Section 1
Homework Set B
12.23 The importance of recreational sports to college satisfaction. The National Intramural-Recreational Sports Association (NIRSA) performed a survey to look at the value of recreational sports on college campuses.^{6} One of the questions asked each student to rate the importance of recreational sports to college satisfaction and success. Responses were on a 10-point scale, with 1 indicating total lack of importance and 10 indicating very high importance. The following table summarizes these results:
Class        n     xbar
Freshman     724   7.6
Sophomore    536   7.6
Junior       593   7.5
Senior       437   7.3

(a) To compare the mean scores across classes, what are the degrees of freedom for the ANOVA F statistic?
Numerator d.f. = 4 – 1 = 3
Denominator d.f. = 724 + 536 + 593 + 437 – 4 = 2286
(b) The MSG = 11.806. If S_{p} = 2.16, what is the F statistic?
F = MSG/MSE = 11.806/(2.16)^{2} = 2.53. Recall that S_{p}^{2} = MSE.
(c) Give the P-value by using Excel, FDIST(f, df num, df denom), or the F calculator found at http://www.stat.tamu.edu/~west/applets/fdemo.html . What do you conclude?
Fdist(2.53, 3, 2286)
P(F > 2.53) = 0.05556. This suggests that we should consider the possibility that at least one of the classes scored differently; it looks mainly as though the seniors have a different mean. Notice that the differences in means are tiny, and the pooled standard deviation suggests considerable overlap between the data values. But the sample sizes are very large, so even a small deviation from the null hypothesis (no difference in means) can be detected. However, I have a feeling something is wrong here. I calculated xbar (the mean of all the data) to be 7.51, and calculating MSG from it I get 9.810, which produces a P-value of 0.0978.
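The recomputation mentioned above can be checked from the summary table alone. What follows is a minimal Python sketch, not part of the original solution: the function name `anova_f_from_means` is made up for illustration, and the pooled standard deviation S_{p} = 2.16 is taken from part (b) as given.

```python
def anova_f_from_means(ns, means, s_pooled):
    """Return (grand mean, MSG, F, numerator df, denominator df)
    from group sizes, group means, and a given pooled standard deviation."""
    total_n = sum(ns)
    # Grand mean is the sample-size-weighted average of the group means.
    grand_mean = sum(n * m for n, m in zip(ns, means)) / total_n
    # SSG = sum over groups of n_i * (xbar_i - xbar)^2
    ssg = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    df_num = len(ns) - 1
    df_den = total_n - len(ns)
    msg = ssg / df_num
    f_stat = msg / s_pooled ** 2  # MSE = S_p^2
    return grand_mean, msg, f_stat, df_num, df_den

ns = [724, 536, 593, 437]          # Freshman, Sophomore, Junior, Senior
means = [7.6, 7.6, 7.5, 7.3]
grand_mean, msg, f_stat, df_num, df_den = anova_f_from_means(ns, means, 2.16)
print(round(msg, 2), round(f_stat, 2), df_num, df_den)
```

With the rounded means in the table, this gives an MSG of about 9.81 and an F of about 2.10 on 3 and 2286 degrees of freedom, which agrees with the recomputed MSG of 9.810 rather than the stated 11.806.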
12.45 How long should an infant be breastfed?
Recommendations regarding how long infants in developing countries should be breastfed are controversial. If the nutritional quality of the breast milk is inadequate because the mothers are malnourished, then there is risk of inadequate nutrition for the infant. On the other hand, the introduction of other foods carries the risk of infection from contamination.
Further complicating the situation is the fact that companies that produce infant formulas and other foods benefit when these foods are consumed by large numbers of customers. One question related to this controversy concerns the amount of energy intake for infants who have other foods introduced into the diet at different ages. Part of one study compared the energy intakes, measured in kilocalories per day (kcal/d), for infants who were breastfed exclusively for 4, 5, or 6 months.^{6} Here are the data:

BF4    BF5    BF6
499    490    585
620    395    647
469    402    477
485    177    445
660    475    485
588    617    703
675    616    528
517    587    465
649    528
209    518
404    370
738    431
628    518
609    639
617    368
704    538
558    519
653    506
548

(a) Make a table, using the data already typed into Excel, giving the sample size, mean, and standard deviation (=SQRT(variance)) for each group of infants. Is it reasonable to pool the variances? Write down the table or use the copy and paste feature of your computer.
SUMMARY
Groups   Count   Sum     Average   Variance      s
BF4      19      10830   570       15118.55556   122.9575
BF5      18      8694    483       12757.29412   112.9482
BF6      8       4335    541.875   8828.982143   93.96266

We do meet the rule that 2(93.96) > 122.96; that is, twice the smallest standard deviation exceeds the largest. Thus it is not unreasonable to assume equal standard deviations.
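The rule of thumb used here (largest standard deviation less than twice the smallest) is easy to check mechanically. This is a small illustrative Python sketch; the helper name `pooling_is_reasonable` is an assumption, not something from the course materials.

```python
def pooling_is_reasonable(sds):
    """Rule of thumb: pooling is reasonable when the largest
    standard deviation is less than twice the smallest."""
    return max(sds) < 2 * min(sds)

# Standard deviations for BF4, BF5, BF6 from the SUMMARY table
sds = [122.9575, 112.9482, 93.96266]
ratio = max(sds) / min(sds)
print(pooling_is_reasonable(sds), round(ratio, 2))
```

The ratio of largest to smallest standard deviation is about 1.31, comfortably under 2.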
(b) Make a Normal quantile plot (Use CrunchIt!, upload data using file 12_45 at spot) for the data in each of the three treatment groups. Summarize the information in the plots and draw a conclusion regarding the Normality of these data. Make a copy of the plots.
The plot of BF6 is closest to being perfectly straight, indicating that the population we are sampling from is close to a Normal distribution. The other two are also close to straight except for the endpoints on the left. Since there is no pattern before the dip occurs that suggests a serious move away from a straight line, we will say that there is no strong evidence that the distributions we are sampling from are not close to Normal.
(c) Make a dotplot of the data using Excel as done in class. Does the dotplot indicate that the means of all the groups are equal or does it indicate that at least one of the means is not equal?
There are two extreme points, one in BF4 (209) and one in BF5 (177). A researcher would look at those two data values more carefully and try to understand why they are so much lower than the rest of the data points. I see that the means are very close together and that the spreads of the groups are close to being equal. The amount of overlap in the dotplots, along with the proximity of the means and the relatively small sample sizes, indicates to me that the P-value will be high, leading to no evidence that the population means differ.
(d) Run the analysis of variance using Excel. Report the F statistic with its degrees of freedom and P-value. What do you conclude? Copy the table below.
ANOVA
Source of Variation   SS           df   MS         F             P-value    F crit
Between Groups        71288.325    2    35644.16   2.717910798   0.077625   3.219938
Within Groups         550810.875   42   13114.54
Total                 622099.2     44
The P-value of 0.0776 shows that the evidence against the null hypothesis is not strong. It would be interesting to see what would happen if I removed those two extreme points and ran the test again.
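As a cross-check on the Excel output, the sums of squares and the F statistic can be recomputed directly from the raw data. The following is a minimal Python sketch using only the standard library; the function name `one_way_anova` is illustrative, and the P-value is left to Excel's FDIST (or a similar F-distribution function), as elsewhere in this set.

```python
from statistics import mean

# Energy intake (kcal/d) for infants breastfed exclusively for 4, 5, or 6 months
bf4 = [499, 620, 469, 485, 660, 588, 675, 517, 649, 209,
       404, 738, 628, 609, 617, 704, 558, 653, 548]
bf5 = [490, 395, 402, 177, 475, 617, 616, 587, 528, 518,
       370, 431, 518, 639, 368, 538, 519, 506]
bf6 = [585, 647, 477, 445, 485, 703, 528, 465]

def one_way_anova(groups):
    """Return (SSG, SSE, df between, df within, F) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-groups sum of squares: n_i * (group mean - grand mean)^2
    ssg = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: squared deviations from each group's own mean
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ssg / df_between) / (sse / df_within)
    return ssg, sse, df_between, df_within, f_stat

ssg, sse, df_between, df_within, f_stat = one_way_anova([bf4, bf5, bf6])
print(round(ssg, 3), round(sse, 3), df_between, df_within, round(f_stat, 4))
```

The same function could be rerun with the two low values (209 and 177) removed to explore the question raised above; that follow-up is not carried out here.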
1. Do we experience emotions differently? Do people from different cultures experience emotions differently? One study designed to examine this question collected data from 416 college students from five different cultures.^{9} The participants were asked to record, on a 1 (never) to 7 (always) scale, how much of the time they typically felt eight specific emotions. These were averaged to produce the global emotion score for each participant. Here is a summary of this measure:

Culture             n     Mean   s
European American   16    4.39   1.03
Asian American      33    4.35   1.18
Japanese            91    4.72   1.13
Indian              160   4.34   1.26
Hispanic American   80    5.04   1.16

(a) Is it reasonable to use a pooled standard deviation for these data? Why or why not?
Yes, since 2(1.03) > 1.26.

(b) Draw a rough sketch denoting the location of the mean for each group, and use the value of the sample standard deviation of each group to indicate how spread out the data are.
I used roughly three standard deviations away from each sample mean.
(c) From the information given (and your sketch in (b) allowing you to visualize the information), do you think that we need to be concerned that a possible lack of Normality in the data will invalidate the conclusions that we might draw using ANOVA to analyze the data? Give reasons for your answer.
The data are not Normal because the measurements are discrete and bounded: each individual rating is an integer from 1 through 7, so the averaged score can take only a limited set of values between 1 and 7, similar to the chapter 8 situation when dealing with proportions. Also, the means hover around 4 and the standard deviations around 1, so if you go three standard deviations away from the mean to capture about 99.7% of the data, you reach the ends of the possible values of the measurements. Thus, you hope the sample sizes are large enough to overcome the measurement type. The sample sizes of 16 and 33 are the most worrisome of the five.
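The three-standard-deviation argument can be made concrete with a short check of mean ± 3s against the limits of the 1 to 7 scale. This is only an illustrative Python sketch; the helper name `three_sd_interval` is an assumption.

```python
SCALE_MIN, SCALE_MAX = 1, 7  # limits of the rating scale

# (mean, s) for each culture, from the summary table
summary = {
    "European American": (4.39, 1.03),
    "Asian American": (4.35, 1.18),
    "Japanese": (4.72, 1.13),
    "Indian": (4.34, 1.26),
    "Hispanic American": (5.04, 1.16),
}

def three_sd_interval(mean_value, sd):
    """Return the interval mean - 3s to mean + 3s."""
    return mean_value - 3 * sd, mean_value + 3 * sd

exceeds_scale = {}
for culture, (m, s) in summary.items():
    low, high = three_sd_interval(m, s)
    # True when the interval spills past either end of the scale
    exceeds_scale[culture] = low < SCALE_MIN or high > SCALE_MAX
    print(f"{culture}: {low:.2f} to {high:.2f}, exceeds scale: {exceeds_scale[culture]}")
```

For every group the interval runs past at least one end of the scale, which is the sense in which the bounded measurements cannot be exactly Normal.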
(d) Fill out the table given below. Sketch a picture of the F distribution (using CrunchIt!) that illustrates the P-value. What do you conclude? Show your work.

         d.f.   SS       MS     F      P
Group    4      30.25    7.56   5.31   0.000361
Error    375    534.12   1.42

How to calculate the sample mean of the entire data set, regardless of group:
xbar = [16(4.39) + 33(4.35) + 91(4.72) + 160(4.34) + 80(5.04)] / 380 = 4.58
SSG = 16(4.39 – 4.58)^{2} + 33(4.35 – 4.58)^{2} + 91(4.72 – 4.58)^{2} + 160(4.34 – 4.58)^{2} + 80(5.04 – 4.58)^{2} = 30.25
SSE = 15(1.03)^{2} + 32(1.18)^{2} + 90(1.13)^{2} + 159(1.26)^{2} + 79(1.16)^{2} = 534.12
N = 16 + 33 + 91 + 160 + 80 = 380, so the error d.f. = 380 – 5 = 375.
MSG = 30.25/4 = 7.56, MSE = 534.12/375 = 1.42, and F = MSG/MSE = 5.31.
P(F > 5.31) = 0.000361. This is strong evidence that at least one of the population means differs from the others.
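The table entries above can be reproduced from the summary statistics alone, using SSG = Σ n_i (xbar_i – xbar)^{2} and SSE = Σ (n_i – 1) s_i^{2}. The following Python sketch is a hedged illustration (the function name `anova_from_summary` is made up); it works from the rounded means and standard deviations in the table, so small rounding differences from a raw-data analysis are to be expected.

```python
def anova_from_summary(ns, means, sds):
    """Return (SSG, SSE, df group, df error, F) from group sizes,
    group means, and group standard deviations."""
    total_n = sum(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / total_n
    ssg = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    # Each group contributes (n_i - 1) * s_i^2 to the error sum of squares.
    sse = sum((n - 1) * s ** 2 for n, s in zip(ns, sds))
    df_group = len(ns) - 1
    df_error = total_n - len(ns)
    f_stat = (ssg / df_group) / (sse / df_error)
    return ssg, sse, df_group, df_error, f_stat

ns = [16, 33, 91, 160, 80]
means = [4.39, 4.35, 4.72, 4.34, 5.04]
sds = [1.03, 1.18, 1.13, 1.26, 1.16]
ssg, sse, df_group, df_error, f_stat = anova_from_summary(ns, means, sds)
print(round(ssg, 2), round(sse, 2), df_group, df_error, round(f_stat, 2))
```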
(e) Without doing any additional formal analysis, describe the pattern in the means that appears to be responsible for your conclusion in part (d). Are there pairs of means that are quite similar?
The Hispanic American group has the largest mean (5.04), and the Japanese group has the second largest (4.72). The other three means (4.39, 4.35, and 4.34) are quite similar to one another.
2. If a supermarket product is offered at a reduced price frequently, do customers expect the price of the product to be lower in the future? This question was examined by researchers in a study conducted on students enrolled in an introductory management course at a large Midwestern university. For 10 weeks, subjects received information about the products. The treatment conditions corresponded to the number of promotions (1, 2, 3, or 4) that were described during this 10-week period. Students were randomly assigned to four groups. Below are three possible outcomes of this study. Which one do you think produces the smallest P-value, and why?
Column   n    Mean
1        20   4.1405
2        20   4.027
3        20   3.828
4        20   3.583

Column   n    Mean
1        40   4.224
2        40   4.06275
3        40   3.759
4        40   3.54875

Column   n    Mean
1        7    4.257143
2        7    4.04
3        7    3.7042856
4        7    3.602857

The outcome with the largest sample size (n = 40 per group) should produce the smallest P-value. Why? Notice that the group means are about the same in all three situations, and if the standard deviation of each group (s) is also about the same in each situation, then MSE, which is built from those standard deviations, is about the same too. But SSG = Σ n_i (xbar_i – xbar)^{2} grows in proportion to the sample size, so MSG and F are largest when n = 40. The larger sample size also gives more denominator degrees of freedom, which makes the same F value more significant.
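The effect of sample size on F can be illustrated numerically. The tables above give no standard deviations, so the sketch below assumes a common standard deviation of 1.0 for every group in every outcome; that value is purely hypothetical, and only the ordering of the F statistics (which does not depend on the assumed value) should be read from it. The function name `f_statistic` is likewise illustrative.

```python
def f_statistic(ns, means, common_sd):
    """F = MSG / MSE, with MSE taken to be common_sd^2 (an assumption)."""
    total_n = sum(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / total_n
    ssg = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    msg = ssg / (len(ns) - 1)
    return msg / common_sd ** 2

ASSUMED_SD = 1.0  # hypothetical: the exercise does not report group SDs

# Group means for the three possible outcomes, keyed by per-group sample size
outcomes = {
    20: [4.1405, 4.027, 3.828, 3.583],
    40: [4.224, 4.06275, 3.759, 3.54875],
    7: [4.257143, 4.04, 3.7042856, 3.602857],
}

f_by_n = {n: f_statistic([n] * 4, means, ASSUMED_SD) for n, means in outcomes.items()}
for n in sorted(f_by_n):
    print(n, round(f_by_n[n], 2))
```

Under this assumption the F statistic is largest for n = 40 and smallest for n = 7, matching the reasoning that the largest samples give the smallest P-value.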
3. Is there a relationship between the amount of time a battery lasts measured in minutes and the battery manufacturer? Four different manufacturers of batteries were tested under the same conditions.
Manufacturer     n    Mean     s
Manufacturer 1   10   265.31   5.32
Manufacturer 2   10   277.2    4.18
Manufacturer 3   10   268.2    5.13
Manufacturer 4   10   275.03   4.26

(a) Is it reasonable to use a pooled standard deviation for these data? Why or why not?
(b) Fill out the table given below. Sketch a picture of the F distribution (using CrunchIt!) that illustrates the P-value. What do you conclude? Show your work.

         d.f.   SS   MS   F   P
Group
Error

How to calculate the sample mean of the entire data set, regardless of group:
xbar =
4. Is there a relationship between the amount of time a battery lasts measured in minutes and the battery manufacturer? Four different manufacturers of batteries were tested under the same conditions.
Manufacturer     n     Mean     s
Manufacturer 1   100   265.31   5.32
Manufacturer 2   100   277.2    4.18
Manufacturer 3   100   268.2    5.13
Manufacturer 4   100   275.03   4.26

(a) Is it reasonable to use a pooled standard deviation for these data? Why or why not?
(b) Fill out the table given below. Sketch a picture of the F distribution (using CrunchIt!) that illustrates the P-value. What do you conclude? Show your work.

         d.f.   SS   MS   F   P
Group
Error

How to calculate the sample mean of the entire data set, regardless of group:
xbar =
 