The student achievement guarantee




Table 5. Number of SAGE Classrooms by Type, Grade, and School Year

              Regular      2-Teacher    Floating     Shared       Split        3-Teacher
                           Team         Teacher      Space        Day          Team
              96-97 97-98  96-97 97-98  96-97 97-98  96-97 97-98  96-97 97-98  96-97 97-98
Kindergarten     50    89     24    22      3     2      2     4      0     0      1     0
Grade 1          61    84     18    23      7     2      8     8      2     0      0     1
Grade 2          NA    82     NA    21     NA     3     NA     6     NA     0     NA     1



Data Collection Instruments

To provide information about the processes and product of the SAGE program for 1996-97 and 1997-98, a number of instruments were used as part of the evaluation.[1] A description of the test and non-test instruments used in 1996-97 and 1997-98 follows. The data collection instruments and the plan for their use throughout the evaluation are displayed in Tables 6 and 7.

1. Comprehensive Test of Basic Skills (CTBS). The CTBS Complete Battery, Terra Nova edition, Level 10, was administered to first-grade students in SAGE schools and comparison schools in October 1996 and May 1997. In 1997-98, Level 10 was administered in October and Level 11 in May to first-grade students, and Level 12 was administered to second-grade students. The purpose of the first-grade October administration of the CTBS was to obtain baseline measures of achievement for SAGE schools and comparison schools. The complete battery includes sub-tests in reading, language arts, and mathematics. The CTBS was chosen as an achievement measure because it is derived from an Item Response Theory (IRT) model that allows comparison of performance across time. Moreover, it is one of a few instruments that attempt to minimize items biased against minorities and educationally disadvantaged students. Kindergarten students were not tested because of (1) concerns over the reliability and validity of standardized test results for kindergarten-aged children and (2) the view expressed by many kindergarten teachers that standardized tests would have a traumatizing effect on their students. The effects of SAGE on kindergarten students will be determined when they are tested as first-grade students the following year.

[1] See the Evaluation Design Plan for the Student Achievement Guarantee in Education (SAGE) Program, August 13, 1996, for complete details.

Table 6. Cohort CTBS Testing by Grade Level, 1996-01

1996-97            1997-98            1998-99            1999-00      2000-01
K                  K                  K                  K            K
Cohort 1           Cohort 2           Cohort 3
1 (fall & spring)  1 (fall & spring)  1 (fall & spring)  1            1
                   2 (spring)         2 (spring)         2 (spring)   2
                                      3 (spring)         3 (spring)   3 (spring)

2. Student Profiles. This instrument, completed in October and May, provided demographic and other data on each SAGE school and comparison school student.

3. Classroom Organization Profile. Completed in October, this instrument was used to record how SAGE schools attained a 15:1 student-teacher ratio.

4. Principal Interviews. These end-of-year interviews elicited principals' descriptions and perceptions of the effects of their schools' rigorous curriculum, lighted-schoolhouse activities, and staff development program, as well as an overall evaluation of the SAGE program.

5. Teacher Questionnaire. Administered in May, this instrument obtained teachers' descriptions and judgments of the effects of SAGE on teaching, curriculum, family involvement, and professional development. It was also used to assess overall satisfaction with SAGE.

6. Teacher Activity Log. This instrument required teachers to record classroom events concerning time use, grouping, content, and student learning activities for a typical day three times during the year.

7. Student Participation Questionnaire. In both October and May, teachers used this instrument to assess each student's level of participation in classroom activities.

8. Classroom Observations. A group of first-grade and second-grade classrooms representing the various types of 15:1 student-teacher ratios and a range of geographic areas was selected for qualitative observations to provide descriptions of classroom events.

9. Teacher Interviews. Although in-depth teacher interviews were not part of the original SAGE evaluation design, they were added in 1997 because it became apparent that teachers had important stories to tell about their SAGE classroom experiences. The interviews dealt with teachers' perceptions of the effects of SAGE on their teaching and on student learning.

Table 7. SAGE Non-Test Data Collection by Grade Level, 1996-01

Instrument (administration)              1996-97       1997-98       1998-99       1999-2000     2000-2001
Student Participation Questionnaire      K, 1          K, 1, 2       K, 1, 2, 3    K, 1, 2, 3    K, 1, 2, 3
  (Fall, Spring)
Teacher Questionnaire (Spring)           K, 1          K, 1, 2       K, 1, 2, 3    K, 1, 2, 3    K, 1, 2, 3
Teacher Log (Fall, Winter, Spring)       K, 1          K, 1, 2
Classroom Observation (Fall, Spring)     1             1, 2
                                         (Selected)    (Selected)
Teacher Interview (Spring)               1             1, 2
                                         (Selected)    (Selected)
Principal Interview (Spring)             K, 1          K, 1, 2
School Case Study (Continuous)                                       1, 2, 3       1, 2, 3       1, 2, 3
                                                                     (Selected)    (Selected)    (Selected)
Principal Questionnaire (Spring)                                     K, 1, 2, 3    K, 1, 2, 3    K, 1, 2, 3

ANALYSES OF STUDENT ACHIEVEMENT OUTCOMES 1997-98



Methods Introduction

Statistics Utilized

The 1997-98 SAGE evaluation design utilizes descriptive statistics and multivariate inferential statistics, including linear regression and hierarchical linear modeling. Descriptive statistics, including means and standard deviations, are incorporated into this report to provide a less complicated, general analysis that the non-technical reader can use as a basis for interpreting the findings. Regression analyses at the individual level, specifically ordinary least squares regression models, are employed frequently in this 1997-98 report. Regression models enable “control” variables to be entered in blocks, with the variable of interest (the “SAGE/Comparison” variable) entered last, thus isolating its effect from those of the other variables. Finally, hierarchical linear modeling is pertinent to the SAGE evaluation because this technique focuses on the class-level effects of SAGE; that is, these analyses specifically assess classroom effects rather than effects on individuals within the classroom. The classroom effects examined by this approach are of primary importance to the SAGE evaluation.

The 1996-97 Report

In its 1996-97 evaluation, the SAGE evaluation team also utilized descriptive statistics and multivariate analyses, including linear regression and hierarchical linear modeling. However, there are two essential differences between the 1997-98 quantitative evaluation and the 1996-97 quantitative evaluation. First, the 1996-97 report included national percentile scores as well as normal curve equivalent scores. National percentile scores are not reported in the 1997-98 summary because their use in regression analysis is potentially misleading: percentiles do not form an equal-interval scale. Instead, normal curve equivalents are included in the descriptive sections of the current report to help clarify the analytical results. Normal curve equivalents are not reported among the inferential analyses because the results of such analyses would be redundant with those utilizing the scale scores. Second, sections of the 1996-97 report presented analyses that excluded the top-scoring quartile because the post-test given to 1996-97 first graders proved to be too easy, which in essence created a test ceiling effect for top-scoring students at this grade level. This problem was corrected in the 1997-98 testing by using an appropriate post-test level; because there was no ceiling effect, these analyses are no longer necessary.

General Findings 1996-97

Some general findings from the 1996-97 quantitative analysis show that first-grade students in SAGE schools scored higher on the CTBS Complete Battery, Terra Nova Level 10 than first-grade students in comparison schools. As a group, when adjusted for pre-test scores, SAGE students scored significantly higher on the post-test in reading, language arts, and mathematics, as well as in total score. At the individual level of analysis, after controlling for pre-test score, SES, attendance, and race, SAGE first-grade students scored statistically significantly higher than comparison school students on the CTBS post-test in language arts and mathematics, as well as in total score. At the class level of analysis, SAGE classrooms scored significantly higher in language arts, mathematics, and reading, as well as in total score, after adjusting for individual pre-test results, SES, and attendance.

Score Metrics 1997-98

A brief discussion of the metrics reported in the 1997-98 SAGE evaluation is warranted. The SAGE report presents the findings using two metrics, scale scores and normal curve equivalents. A scale score provides a common yardstick by which scores on a specific task or trait may be compared reasonably, subject to subject or group to group. The primary reason scale scores are used in the SAGE quantitative analysis is to anchor the scores from test level to test level (Level 10, 11, etc.) so that year-to-year results can be compared.

When comparing scores to those of other individuals (or groups) to obtain meaning, we make a norm-referenced interpretation; that is, we compare a person’s score with those of some relevant group of people. Normal curve equivalents are useful here. The normal curve equivalent scale ranges from 1 to 99 and thus provides a comparative index of the performance of an individual or group relative to the reference group. In this case, the reference group is the Terra Nova norm reference group (for norm-referencing population data, see CTB/McGraw-Hill, 1991). Normal curve equivalents are generally not good indicators of longitudinal progress, however. With these scores, the group average could remain at, for example, 50 across pre-test and post-test, and the reader might erroneously conclude that no gain was made. In fact, the group in this example gained as much as the reference group did, and thus its score remained constant.
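
The relationship between percentile ranks and normal curve equivalents can be illustrated with a short sketch. It assumes the conventional definition of the NCE scale (mean 50, standard deviation 21.06, so that NCEs of 1, 50, and 99 coincide with the same percentile ranks); the report itself does not give the formula, and the function name is illustrative only.

```python
from statistics import NormalDist

NCE_SD = 21.06  # conventional NCE scaling constant (assumed; not stated in the report)


def percentile_to_nce(percentile: float) -> float:
    """Convert a national percentile rank (strictly between 0 and 100) to an NCE."""
    z = NormalDist().inv_cdf(percentile / 100.0)  # z-score under the normal curve
    return 50.0 + NCE_SD * z


if __name__ == "__main__":
    # Equal percentile steps are not equal NCE steps: ten percentile points near
    # the middle of the distribution cover less of the scale than ten near the top.
    for p in (1, 10, 25, 50, 60, 75, 89, 99):
        print(f"percentile {p:>2} -> NCE {percentile_to_nce(p):5.1f}")
```

The same conversion shows why percentile ranks are not an equal-interval scale, whereas NCEs are spaced evenly in standard deviation units.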

Structure of 1997-98 Report

The descriptive analyses utilize both scale scores and normal curve equivalents. The inferential analyses (regressions and hierarchical linear models) utilize only scale scores. For the inferential tests, a significance level of .05 was used, and significant results are denoted by an asterisk (*). SAGE versus comparison analyses are divided into two major sections: (1) First-Grade Results and (2) Second-Grade Results. Each of these sections presents (1) descriptive statistics (pre-test and post-test), (2) ordinary least squares regressions, (3) analyses of the scores of African-American students, and (4) hierarchical linear modeling.

In addition, the quantitative section includes “within SAGE” analyses for first-grade students. SAGE student achievement is examined in relation to teacher experience, student participation, proximity to curriculum, and class organization.

SAGE School/Classroom vs. Comparison School/Classroom Analyses

First-Grade Results 1997-98

Descriptive Statistics

Valid Test Scores. The number of first-grade students for whom valid test scores are available is substantially less than the total number of students. There are four main explanations for this. First, the evaluation team gave schools the option of allowing EEN and ESL students to take the test, even though the test may be inappropriate for these students. These scores were invalidated based on a “Nonvalid/Missing Test Report” developed by the evaluation team and completed for all first-grade classes. Second, given withdrawals and enrollments during the school year, a number of students had valid pre-test scores but no post-test scores, and vice versa. Third, some students took the reading and language arts components of the CTBS, or the mathematics component, but not both. Consequently, total scores are unavailable for these students. Finally, some students did not complete the pre-test, the post-test, or both. The numbers of valid test scores for the 1997-98 school year are presented in Table 8.


Table 8. Number of 1997-98 First-Grade Students with Valid Test Scores

                 Fall 1997 Pre-Test           Spring 1998 Post-Test
                 Total  SAGE  Comparison      Total  SAGE  Comparison
Reading          2246   1383  863             2162   1318  844
Language Arts    2245   1383  862             2163   1319  844
Mathematics      2239   1382  857             2175   1334  841
Total            2211   1367  844             2140   1310  829

Pre-Test (Baseline) Results. Table 9 provides descriptive statistics from the pre-test (baseline) results. Both scale scores and normal curve equivalents are presented. Given the longitudinal nature of the SAGE evaluation, scale scores serve as the primary measure of student achievement.

Table 9. Combined SAGE and Comparison Population Descriptive Statistics on CTBS PRE-TEST Results for 1997-98 First-Grade Students

                 SCALE SCORES           NORMAL CURVE EQUIVALENT
                 MEAN    STANDARD       MEAN    STANDARD
                         DEVIATION              DEVIATION
Reading          533.99  36.31          44.47   19.86
Language Arts    529.84  43.62          43.73   21.34
Mathematics      492.58  41.04          43.28   19.11
Total            519.20  34.59          43.31   19.11

Difference of Means Test. The results of difference of means tests between SAGE and comparison student scale scores on the Fall 1997 CTBS Level 10 pre-test are reported in Tables 10-13. Comparison school students scored slightly higher than SAGE school students on the reading sub-test, mathematics sub-test, and total scale, and slightly lower on the language arts sub-test. However, none of these differences is statistically significant at the .05 level. We fail to reject the null hypothesis that there is no difference between SAGE and comparison school students on the pre-test. Because SAGE and comparison students were essentially equal in achievement at the beginning of the SAGE program, any post-test differences favoring SAGE students may be attributed with more confidence to the 15:1 student-teacher ratio in SAGE classrooms.

Table 10. Difference of Means Test on Language CTBS FALL PRE-TEST for 1997-98 First-Grade Students

                    SCALE SCORES                 NORMAL CURVE EQUIVALENTS
                    N     MEAN    STANDARD       MEAN    STANDARD
                                  DEVIATION              DEVIATION
Comparison Schools  862   528.97  43.39          43.25   21.13
SAGE Schools        1383  530.50  43.78          44.08   21.48
*Significant at .05 level

Table 11. Difference of Means Test on Reading CTBS FALL PRE-TEST for 1997-98 First-Grade Students

                    SCALE SCORES                 NORMAL CURVE EQUIVALENTS
                    N     MEAN    STANDARD       MEAN    STANDARD
                                  DEVIATION              DEVIATION
Comparison Schools  863   535.06  36.18          45.21   19.10
SAGE Schools        1383  533.35  36.43          44.02   20.33
*Significant at .05 level

Table 12. Difference of Means Test on Mathematics CTBS FALL PRE-TEST for 1997-98 First-Grade Students

                    SCALE SCORES                 NORMAL CURVE EQUIVALENTS
                    N     MEAN    STANDARD       MEAN    STANDARD
                                  DEVIATION              DEVIATION
Comparison Schools  857   493.02  38.38          43.36   18.15
SAGE Schools        1382  492.34  42.51          43.25   19.66
*Significant at .05 level

Table 13. Difference of Means Test on Total CTBS FALL PRE-TEST for 1997-98 First-Grade Students

                    SCALE SCORES                 NORMAL CURVE EQUIVALENTS
                    N     MEAN    STANDARD       MEAN    STANDARD
                                  DEVIATION              DEVIATION
Comparison Schools  844   519.51  33.35          43.47   18.34
SAGE Schools        1367  519.06  35.34          43.25   19.56
*Significant at .05 level

As noted above, student populations varied in SAGE and comparison schools due to withdrawals and within-year enrollments. The post-test results are based only on those first-grade students who remained in their schools for the entire 1997-98 school year. The CTBS allows for measurement of performance over time, and therefore pre-test and post-test scores are comparable from a measurement standpoint. The CTBS Complete Battery, Terra Nova Level 10 was administered to first-grade students in the fall, and the CTBS Complete Battery, Terra Nova Level 11 was administered to first graders in the spring.

Results of the difference of means tests between SAGE and comparison schools on the CTBS Level 11 post-test are presented in Tables 14-17. Unlike the difference of means tests for the CTBS Level 10 pre-test, which showed no statistically significant differences between SAGE and comparison students, the post-test shows statistically significant differences in favor of SAGE students on each sub-test and on total scale scores.



Table 14. Difference of Means Test on Language CTBS SPRING POST-TEST for 1997-98 First-Grade Students

                    SCALE SCORES                 NORMAL CURVE EQUIVALENTS
                    N     MEAN*   STANDARD       MEAN    STANDARD
                                  DEVIATION              DEVIATION
Comparison Schools  844   573.98  46.84          50.07   21.53
SAGE Schools        1319  586.02  45.33          55.78   21.17
*Significant at .05 level

Table 15. Difference of Means Test on Reading CTBS SPRING POST-TEST for 1997-98 First-Grade Students

                    SCALE SCORES                 NORMAL CURVE EQUIVALENTS
                    N     MEAN*   STANDARD       MEAN    STANDARD
                                  DEVIATION              DEVIATION
Comparison Schools  844   570.80  45.52          47.81   21.87
SAGE Schools        1318  580.33  41.33          52.50   20.77
*Significant at .05 level

Table 16. Difference of Means Test on Mathematics CTBS SPRING POST-TEST for 1997-98 First-Grade Students

                    SCALE SCORES                 NORMAL CURVE EQUIVALENTS
                    N     MEAN*   STANDARD       MEAN    STANDARD
                                  DEVIATION              DEVIATION
Comparison Schools  841   525.14  42.53          45.21   19.90
SAGE Schools        1334  538.63  40.09          51.72   19.24
*Significant at .05 level

Table 17. Difference of Means Test on Total CTBS SPRING POST-TEST for 1997-98 First-Grade Students

                    SCALE SCORES                 NORMAL CURVE EQUIVALENTS
                    N     MEAN*   STANDARD       MEAN    STANDARD
                                  DEVIATION              DEVIATION
Comparison Schools  829   556.87  38.83          47.54   21.01
SAGE Schools        1310  568.63  36.66          53.91   20.17
*Significant at .05 level
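
The contrast between the pre-test and post-test comparisons can be checked approximately from the summary statistics in Tables 10-17. The sketch below computes an unequal-variance (Welch) t statistic from the reported N, mean, and standard deviation of the scale scores. The report does not state which form of the difference of means test was used, so the Welch form, the large-sample critical value of 1.96, and all function and variable names are assumptions made for illustration.

```python
import math


def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Welch t statistic for (group 1 minus group 2) from summary statistics."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# (N, mean, SD) of scale scores: SAGE schools first, then comparison schools.
PRE_TEST = {   # Tables 10-13
    "Language Arts": ((1383, 530.50, 43.78), (862, 528.97, 43.39)),
    "Reading":       ((1383, 533.35, 36.43), (863, 535.06, 36.18)),
    "Mathematics":   ((1382, 492.34, 42.51), (857, 493.02, 38.38)),
    "Total":         ((1367, 519.06, 35.34), (844, 519.51, 33.35)),
}
POST_TEST = {  # Tables 14-17
    "Language Arts": ((1319, 586.02, 45.33), (844, 573.98, 46.84)),
    "Reading":       ((1318, 580.33, 41.33), (844, 570.80, 45.52)),
    "Mathematics":   ((1334, 538.63, 40.09), (841, 525.14, 42.53)),
    "Total":         ((1310, 568.63, 36.66), (829, 556.87, 38.83)),
}

CRITICAL_VALUE = 1.96  # two-sided .05 level, large-sample approximation (assumed)


def summarize(tables):
    """Return {sub-test: t statistic}; positive values favor SAGE schools."""
    return {name: welch_t(*sage, *comparison)
            for name, (sage, comparison) in tables.items()}


if __name__ == "__main__":
    for label, tables in (("pre-test", PRE_TEST), ("post-test", POST_TEST)):
        for name, t in summarize(tables).items():
            flag = "*" if abs(t) > CRITICAL_VALUE else ""
            print(f"{label:9} {name:13} t = {t:6.2f}{flag}")
```

Under these assumptions, none of the four pre-test statistics exceeds the critical value, while all four post-test statistics do, in line with the asterisks in the tables.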

The largest scale-score gain for SAGE students from pre-test to post-test, relative to comparison school students, was on the mathematics sub-test, as shown in Table 18. The smallest relative gain for SAGE students from pre-test to post-test was on the language arts sub-test.



Table 18. Change in Mean Score from PRE-TEST to POST-TEST for 1997-98 First-Grade Students

                 Scale Scores                        Normal Curve Equivalents
                 SAGE    Comparison  Gain            SAGE    Comparison  Gain
                 Gain    Gain        Difference      Gain    Gain        Difference
Language Arts    52.69   44.11       8.57*           10.33   6.40        3.93
Reading          45.32   34.99       10.33*          7.54    2.04        5.51
Mathematics      43.64   32.44       11.20*          7.30    1.91        5.39
Total            47.26   37.73       9.53*           9.36    4.11        5.25
*significant at .05 level
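
As a quick arithmetic check on Table 18, the gain difference in each row should equal the SAGE gain minus the comparison gain. The sketch below recomputes it from the rounded values in the table; the names are illustrative, and discrepancies of .01 are to be expected because the published gains are themselves rounded.

```python
# (SAGE gain, comparison gain, reported gain difference), copied from Table 18.
SCALE_SCORE_GAINS = {
    "Language Arts": (52.69, 44.11, 8.57),
    "Reading":       (45.32, 34.99, 10.33),
    "Mathematics":   (43.64, 32.44, 11.20),
    "Total":         (47.26, 37.73, 9.53),
}
NCE_GAINS = {
    "Language Arts": (10.33, 6.40, 3.93),
    "Reading":       (7.54, 2.04, 5.51),
    "Mathematics":   (7.30, 1.91, 5.39),
    "Total":         (9.36, 4.11, 5.25),
}


def gain_differences(gains):
    """Recompute SAGE gain minus comparison gain for each row."""
    return {name: round(sage - comparison, 2)
            for name, (sage, comparison, _reported) in gains.items()}


def largest_and_smallest(gains):
    """Sub-tests (excluding Total) with the largest and smallest reported difference."""
    subtests = {name: row for name, row in gains.items() if name != "Total"}
    largest = max(subtests, key=lambda name: subtests[name][2])
    smallest = min(subtests, key=lambda name: subtests[name][2])
    return largest, smallest


if __name__ == "__main__":
    for label, gains in (("scale scores", SCALE_SCORE_GAINS), ("NCE", NCE_GAINS)):
        print(label, gain_differences(gains), largest_and_smallest(gains))
```

On the scale-score metric this identifies mathematics as the largest and language arts as the smallest relative gain; on the normal curve equivalent metric the largest difference falls on reading.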

Regression Analysis

Regression Models. The effect of the SAGE program on student achievement, controlling for other factors, was tested through a series of ordinary least squares regression models for each sub-test and for total scale scores. Control variables were entered into the models in blocks, with the SAGE/comparison student variable entered last.

The first block of control variables included the student's pre-test score and school attendance, measured as the number of days absent as reported by teachers in Spring 1998. The second block included dummy variables for race/ethnicity, coded 1 if a student was of a given race/ethnicity and 0 if not. Dummy variables were included for African Americans and whites; the residual category, “other,” is absorbed into the constant term of the regression equations. Eligibility for subsidized lunch, as an indicator of family income, is also included in the second block. This variable is coded 0 if the student is ineligible, 1 if the student is eligible for reduced-price lunch, and 2 if the student is eligible for free lunch (the variable is assumed to be interval level). In the third and final block, a dummy variable for SAGE or comparison school membership was entered. This variable is coded 0 if a student is from a comparison school and 1 if a student is from a SAGE school.
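
The block-entry procedure described above can be sketched as follows. Because the evaluation data are not reproduced here, the sketch runs on synthetic students whose scores and effect sizes are invented purely for illustration; the variable names, the simulated coefficients, and the small least-squares solver are assumptions, not the evaluation team's code. It follows the coding given in the text: race/ethnicity dummies with “other” as the omitted category, lunch eligibility coded 0/1/2, and the SAGE dummy entered in the last block.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares fit with a constant term; returns (coefficients, R squared)."""
    design = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - sum(b * v for b, v in zip(beta, d))) ** 2
                   for d, yi in zip(design, y))
    return beta, 1.0 - ss_resid / ss_total


def simulate(n=600, sage_effect=10.0, seed=0):
    """Generate synthetic students; every number here is made up for illustration."""
    rng = random.Random(seed)
    students = []
    for _ in range(n):
        race = rng.choice(["african_american", "white", "other"])
        s = {
            "pretest": rng.gauss(520, 35),
            "days_absent": rng.randint(0, 20),
            "african_american": 1 if race == "african_american" else 0,
            "white": 1 if race == "white" else 0,  # "other" is the omitted category
            "lunch": rng.choice([0, 1, 2]),         # 0 ineligible, 1 reduced, 2 free
            "sage": rng.choice([0, 1]),             # 0 comparison, 1 SAGE
        }
        s["posttest"] = (60 + 0.95 * s["pretest"] - 0.5 * s["days_absent"]
                         - 3.0 * s["lunch"] + sage_effect * s["sage"]
                         + rng.gauss(0, 10))
        students.append(s)
    return students


BLOCKS = [
    ["pretest", "days_absent"],              # block 1
    ["african_american", "white", "lunch"],  # block 2
    ["sage"],                                # block 3, entered last
]


def fit_blocks(students):
    """Fit nested models, adding one block at a time; returns [(R squared, coefficients)]."""
    y = [s["posttest"] for s in students]
    results, names = [], []
    for block in BLOCKS:
        names = names + block
        rows = [[s[name] for name in names] for s in students]
        beta, r_squared = ols(rows, y)
        results.append((r_squared, dict(zip(["constant"] + names, beta))))
    return results


if __name__ == "__main__":
    for step, (r_squared, coefficients) in enumerate(fit_blocks(simulate()), start=1):
        print(f"after block {step}: R squared = {r_squared:.3f}")
    print("b for SAGE:", round(coefficients["sage"], 2))
```

The change in R squared when the final block is added indicates how much additional variance the SAGE variable accounts for once the control variables are already in the model.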

Regression Results. Results of the regression analyses are presented in Tables 19-22. For all analyses, membership in a SAGE school emerges as a significant predictor of student achievement on the post-test, controlling for pre-test scores, family income, school attendance, and race/ethnicity. The magnitude of the effect of SAGE on student achievement, as denoted by the “b” coefficient, varies depending on the CTBS sub-test.

The largest effects of SAGE are found on the language sub-test, while the smallest effects are found on the reading sub-test. When all cases are analyzed, the goodness-of-fit of the models (as denoted by the adjusted R-square statistic) ranges from .270 (reading sub-scale score) to .550 (total scale score). This means that the variables included in the models explain 27% of the variance in the reading score and 55% of the variance in the total score. Most of the explained variance in the post-test scores is, of course, attributable to the pre-test scores.
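
For reference, the adjusted R-square statistic is conventionally computed from the ordinary R square, the number of cases n, and the number of predictors p as 1 - (1 - R²)(n - 1)/(n - p - 1). The sketch below applies that standard formula; the function name is illustrative, and the sample values are hypothetical rather than taken from Tables 19-22.

```python
def adjusted_r_squared(r_squared: float, n_cases: int, n_predictors: int) -> float:
    """Adjust R squared for the number of predictors in the model."""
    degrees_of_freedom = n_cases - n_predictors - 1
    if degrees_of_freedom <= 0:
        raise ValueError("need more cases than predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n_cases - 1) / degrees_of_freedom


if __name__ == "__main__":
    # Hypothetical model: 6 predictors, 2,000 cases, unadjusted R squared of .55.
    print(round(adjusted_r_squared(0.55, 2000, 6), 4))
```

With samples of the size used in this evaluation, the adjustment is small, so the adjusted statistic can be read as approximately the proportion of variance explained.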


Explained Variance in Achievement Scores. Attendance (as represented by “days absent”) emerges as a consistent and statistically significant predictor of performance on all sub-tests and total scale score. “Family Income” and “Race” show some relatively large effects (as denoted by the b coefficients), but the effects are highly variable and only sometimes statistically significant (race is discussed further below). Membership in SAGE schools has a consistently positive, statistically significant effect on achievement on the CTBS.
