Paired Samples versus Independent Samples
Paired Design 1
With paired data, we are interested in comparing the responses within each pair. We will analyze the differences of the responses that form each pair.
Paired Data: Response = Annual Salary (in $1000s)
Paired Design 2
DEFINITION:
We have paired or matched samples when we know, in advance, that an observation in one data set is directly related to a specific observation in the other data set. It may be that the related sets of units are each measured once (Paired Design 1), or that the same unit is measured twice (Paired Design 2). In a paired design, the two sets of data must have the same number of observations.
Independent Samples Design
Independent Samples Data:
Response = Annual Salary (in $1000s)
In the two independent samples scenario, we will compare the responses of one treatment group as a whole to the responses of the other treatment group as a whole. We will calculate summary measures for the observations from one treatment group and compare them to similar summary measures calculated from the observations from the other treatment group.
DEFINITION:
We have two independent samples when two unrelated sets of units are measured, one sample from each population, as in Independent Samples Design 11.3. In a design with two independent samples, although the same sample size is often preferable, the sample sizes might be different.
Let’s Do It! Paired Samples versus Independent Samples

(a) Three hundred registered voters were selected at random, 30 from each of 10 midwestern counties, to participate in a study on attitudes about how well the president is performing his job. They were each asked to answer a short multiple-choice questionnaire, and then they watched a 20-minute video that presented information about the job description of the president. After watching the video, the same 300 selected voters were asked to answer a follow-up multiple-choice questionnaire. The investigator of this study will have two sets of data: the initial questionnaire scores and the follow-up questionnaire scores. Is this a paired or independent samples design?
Circle one: Paired Independent
Explain:
(b) Thirty dogs were selected at random from those residing at the humane society last month. The 30 dogs were split at random into two groups. The first group of 15 dogs was trained to perform a certain task using a reward method. The second group of 15 dogs was trained to perform the same task using a reward-punishment method. The investigator of this study will have two sets of data: the learning times for the dogs trained with the reward method and the learning times for the dogs trained with the reward-punishment method. Is this a paired or independent samples design?
Circle one: Paired Independent
Explain:
Let’s Do It! 2 Design a Study
For each of the following research questions, briefly describe how you might design a study to address the question (discuss whether paired or independent samples would be obtained):
(a) Do freshmen students use the library to study more often than senior students?
(b) Do books cost more on average at the local bookstore or through Amazon.com?
(c) Will taking summer school improve reading levels for Kindergarteners going into first grade?
Paired Samples
In a paired design, units in each pair are alike (in fact, they may be the same unit), whereas units in different pairs may be quite dissimilar.
Since we are interested in the difference for each pair, the differences are what we analyze in paired designs.
Example Weight Change
A study was conducted to estimate the mean weight change of a female adult who quits smoking. The weights of eight female adults before they stopped smoking and five weeks after they stopped smoking were recorded. The differences, computed as “after - before,” are given below.
Subject        1     2     3     4     5     6     7     8
After        154   181   151   120   131   130   121   128
Before       148   176   153   116   129   128   120   132
Difference     6     5    -2     4     2     2     1    -4

Here we have another example of a paired design.
(a) Compute the sample mean difference in weight.
(b) Compute the sample standard deviation of the differences.
Solution
(a) The sample mean difference is $\bar{d} = 1.75$ pounds. Note that the differences computed as “after - before” represent the weight gain for a subject. A positive value indicates a weight gain and a negative value indicates a weight loss.
(b) The sample standard deviation of the differences is $s_D = 3.412$ pounds.
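The two summary values in this solution can be checked with a few lines of code. The following is a minimal sketch in Python using only the standard library; the variable names (`after`, `before`, `diffs`, `d_bar`, `s_d`) are illustrative choices and do not come from the text.

```python
from statistics import mean, stdev

# Weights (in pounds) from the table, in subject order 1..8.
after = [154, 181, 151, 120, 131, 130, 121, 128]
before = [148, 176, 153, 116, 129, 128, 120, 132]

# One difference per subject, computed as after - before.
diffs = [a - b for a, b in zip(after, before)]

d_bar = mean(diffs)   # sample mean difference
s_d = stdev(diffs)    # sample standard deviation (divides by n - 1)

print("differences:", diffs)
print("mean difference:", d_bar)
print("SD of differences:", round(s_d, 3))
```

Note that `stdev` uses the n - 1 divisor, which is the sample standard deviation used throughout this section.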
The Paired t-Test
Paired t-Test
Hypotheses:
$H_0: \mu_D = 0$ versus $H_1: \mu_D \neq 0$ or
$H_0: \mu_D = 0$ versus $H_1: \mu_D > 0$ or
$H_0: \mu_D = 0$ versus $H_1: \mu_D < 0$.
Data: The sample of n differences, generically written as $d_1, d_2, \ldots, d_n$, from which the sample mean difference $\bar{d}$ and the sample standard deviation of the differences $s_D$ can be computed.
Observed Test Statistic: $t = \dfrac{\bar{d} - 0}{s_D / \sqrt{n}}$, and the null distribution for the T variable is a $t(n - 1)$ distribution.
p-value: We find the p-value for the test using the $t(n - 1)$ distribution.
The direction of extreme will depend on how the alternative hypothesis is expressed.
Decision: A p-value less than or equal to $\alpha$ leads to rejection of $H_0$.
Notes:

If we are interested in assessing if $\mu_D$ is equal to some hypothesized value that is not 0, we would replace 0 in the test statistic expression with this other null value.

The test statistic is the same no matter how the alternative hypothesis is expressed.
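As a reading aid for the “direction of extreme” remark, the p-value can be written out for each form of the alternative hypothesis. In this sketch, T denotes a variable with the t(n - 1) distribution and t is the observed test statistic; writing the alternative hypothesis as $H_1$ is a notational assumption.

```latex
\[
p\text{-value} =
\begin{cases}
P(T \ge t),       & \text{if } H_1: \mu_D > 0 \\
P(T \le t),       & \text{if } H_1: \mu_D < 0 \\
2\,P(T \ge |t|),  & \text{if } H_1: \mu_D \ne 0
\end{cases}
\]
```

The observed t is the same in all three cases; only the tail area that counts as “extreme” changes.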
Example Comparing Test Scores
A group of 10 randomly selected children of elementary school age, from among those in Mankato County who were recently diagnosed with asthma, was tested to see if a new children’s educational video is effective in increasing the children’s knowledge about asthma. A nurse gave the children an oral test containing questions about asthma before and after seeing the animated video. The test scores are given below:
Child:     1   2   3   4   5   6   7   8   9  10    Mean
Before:   61  60  52  74  64  75  42  63  53  56    60
After:    67  62  54  83  60  89  44  67  62  57    64.5

(a) Explain why we have paired data here and not two independent samples.
(b) We are interested in examining the differences in the scores for each child. Compute the differences and find the sample mean difference and the sample standard deviation of the differences.
(c) The researchers wish to assess if the data provide sufficient evidence to conclude that the mean score after viewing the educational video is significantly higher than the mean score before the viewing. The test will be conducted at the 5% level of significance. State the appropriate hypotheses to be tested in terms of the population mean difference in test scores, $\mu_D$.
(d) Compute the observed t-test statistic value.
(e) Find the corresponding p-value.
(f) State the decision and conclusion using a 5% significance level.
Solution
(a) Since we have two observations from the same child, we have paired data.
(b) The observed differences, computed here as “after - before,” are as follows:
Child:               1   2   3   4   5   6   7   8   9  10    Mean diff
d = After - Before   6   2   2   9  -4  14   2   4   9   1    4.5

The first observed difference is 6 and is represented by $d_1 = 6$, and the last difference is also positive and is represented by $d_{10} = 1$. The observed sample mean difference is $\bar{d} = 4.5$, which is our estimate of the unknown population mean difference, $\mu_D$. The observed sample standard deviation of the differences is $s_D = 5.126$, which is our estimate of the unknown population standard deviation $\sigma_D$.
(c) Since we defined our differences as diff = after - before, it is positive differences that would show some support that the video is effective in improving the mean test score. Thus the corresponding hypotheses to be tested are $H_0: \mu_D = 0$ versus $H_1: \mu_D > 0$.
(d) The observed t-test statistic is given by $t = \dfrac{4.5 - 0}{5.126 / \sqrt{10}} = 2.78$.
This means we observed a sample mean difference that was about 2.78 standard errors above the hypothesized mean difference of zero.
Is this large enough (that is, far enough above zero) to reject the null hypothesis?

(e) The p-value is the probability of getting a test statistic as large as or larger than the observed test statistic of 2.78, computed using a t-distribution with nine degrees of freedom.
With the TI, using tcdf: p-value = tcdf(2.78, E99, 9) = 0.0107.
Using the T-Test function under the STAT TESTS menu:
In the TESTS menu located under the STAT button, we select the 2:T-Test option. With the sample mean of 4.5, the sample standard deviation of 5.126, and the sample size of n = 10, we can use the Stats option of this test. The corresponding input and output screens are shown. Notice that the null or hypothesized value is zero.
p-value = 0.01077.
(f) Decision and Conclusion
Since our p-value is less than 0.05, at the 5% significance level we would reject $H_0$ and conclude that there is sufficient evidence to say that the mean score after viewing the educational video is significantly higher than the mean score before the viewing.
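For readers without a TI calculator, the steps of this solution can be reproduced in code. The following is a minimal sketch in Python using only the standard library. The helper functions `t_pdf` and `t_upper_tail` are hypothetical names introduced here; they approximate the upper-tail area of the t distribution by numerical integration (Simpson's rule), which is an assumption of this sketch and not the method used by the calculator.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return const * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) with Simpson's rule on [0, t]; steps must be even."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half of the area lies above 0; subtract the area between 0 and t.
    return 0.5 - total * h / 3


# Test scores from the table, in child order 1..10.
before = [61, 60, 52, 74, 64, 75, 42, 63, 53, 56]
after = [67, 62, 54, 83, 60, 89, 44, 67, 62, 57]

diffs = [a - b for a, b in zip(after, before)]   # after - before
n = len(diffs)
d_bar = mean(diffs)                              # sample mean difference
s_d = stdev(diffs)                               # sample SD of the differences
t_obs = (d_bar - 0) / (s_d / math.sqrt(n))       # hypothesized mean difference is 0
p_value = t_upper_tail(t_obs, n - 1)             # one-sided: alternative is mu_D > 0

print(f"mean diff = {d_bar}, s_D = {s_d:.3f}")
print(f"t = {t_obs:.2f}, p-value = {p_value:.4f}")
print("reject H0" if p_value <= 0.05 else "fail to reject H0")
```

Because the script carries the unrounded test statistic through to the tail area, its p-value should sit closer to the T-Test output than to the tcdf value computed from the rounded 2.78.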
Let’s Do It!
Two creams are available by prescription for treating moderate skin burns. A study to compare the effectiveness of the two creams is conducted using 15 patients with moderate burns on their arms. Two spots of the same size and degree of burn are marked on each patient’s arm. One of the two creams is selected at random and applied to the first spot, while the remaining spot is treated with the other cream. The number of days until the burn has healed is recorded for each spot. These data are provided with the difference in healing time (in days).
Consider the data and interval estimate for comparing the two burn cream treatments in
Patient Number   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Cream1 = C1     16   2  10   7   6  10   5   4  19   7  12   9  10  20  12
Cream2 = C2     14   4  10   4   5  12   5   6  23  10  12   7  11  24  10
Diff = C1 - C2   2  -2   0   3   1  -2   0  -2  -4  -3   0   2  -1  -4   2


We wish to test the claim that there is no difference between the two creams at the 5% significance level.
Homework page 344: 67, 68, 69, 72, 73, 76, 77 