Results of the regression analyses are presented in Tables 24 through 31. For all analyses, for each of the CTBS sub-tests and total score, membership in a SAGE school emerges as a significant predictor of student achievement on the post-test, while controlling for pre-test scores, family income, school attendance, and race/ethnicity. The magnitude of the effect of SAGE on student achievement, as denoted by the "b" coefficient, varies depending on the CTBS sub-test, and whether all students are analyzed, or the top scoring quartile on the pre-test is excluded.
Consistent with the difference of means tests, the largest effects of SAGE are found on the mathematics sub-test, while the smallest effects of SAGE are found on the reading sub-test.
However, unlike the difference of means tests, regression results show the effect of SAGE is consistently larger when the top scoring quartile on the pre-test is excluded. For example, Table 28 shows the effect of SAGE on mathematics. The model predicts that a SAGE student will score 3.876 scale points higher than a comparison school student, after controlling for pre-test scores, family income, school attendance, and race/ethnicity. In Table 29, where the top scoring quartile
on the mathematics pre-test is excluded, the model predicts that a SAGE student will score 4.63 scale points higher than a comparison school student, after controlling for pre-test scores, family income, school attendance, and race/ethnicity.
When all cases are analyzed, the goodness-of-fit of the models (as denoted by the adjusted R2 statistic) ranges from .24 in reading to .48 for total scale score. That is, in predicting total scale score on the post-test, the variables included in the model explain 48 percent of the variance in total scale scores. Most of the variance in the post-test scores is, of course, explained by the pre-test scores. Family income and absenteeism emerge as consistent and statistically significant
predictors of performance on all sub-tests and total scale score. Race and ethnicity show some relatively large effects (as denoted by the b coefficients), but the effects are highly variable and are generally statistically insignificant (race is discussed further below)5.
When the top scoring quartiles on the pre-test are excluded from analyses, the magnitude of the SAGE effect (b) increases for all sub-tests and for total scale score. In the case of total scale score, for example, the estimated effect of SAGE membership on post-test performance is +4.60, as opposed to an estimated effect of +3.30 when all cases are analyzed. The relationship between SAGE and post-test scores is more variable, however, when the top quartile is excluded (as denoted by the lower values of t). Indeed, the goodness-of-fit of the models (adjusted R2) is lower when the top quartile is excluded.
Whether all cases are analyzed, or the top scoring quartiles are excluded, membership in SAGE schools has a consistently positive, statistically significant effect on achievement on the CTBS.
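The specification behind these estimates can be sketched in a few lines: regress the post-test score on the pre-test, a SAGE membership dummy, and the control variables, then read the SAGE effect off the dummy's coefficient. The sketch below uses simulated data with an assumed SAGE effect of +4 scale points; the variable names and all numbers are illustrative, not the study's actual data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated illustrative data (not the actual SAGE/comparison records).
pretest = rng.normal(600, 40, n)      # CTBS pre-test scale score
sage = rng.integers(0, 2, n)          # 1 = SAGE school, 0 = comparison school
low_income = rng.integers(0, 2, n)    # eligibility for subsidized lunch
absent = rng.poisson(8, n)            # days absent

# Generate post-test scores with a known SAGE effect of +4 scale points.
posttest = (50 + 0.9 * pretest + 4.0 * sage
            - 3.0 * low_income - 0.2 * absent
            + rng.normal(0, 20, n))

# OLS: post-test on pre-test, SAGE membership, family income, attendance.
X = np.column_stack([np.ones(n), pretest, sage, low_income, absent])
b, *_ = np.linalg.lstsq(X, posttest, rcond=None)
print(round(b[2], 2))  # coefficient on the SAGE dummy, near the assumed +4
```

The coefficient on the SAGE dummy plays the role of the "b" coefficients reported in Tables 24 through 31: the predicted scale-point advantage of a SAGE student over a comparison school student, holding the controls constant.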
African American Students
A precursor to the SAGE program is the Tennessee STAR experiment in reduced class size, a statewide initiative involving 7,000 students over four years in kindergarten through grade 3. One of the conclusions reached in the Tennessee experiment in reduced class size is that "the advantage of being in a small class is greater for minority students than for whites" (Finn and Achilles, 1990: 567). Note that no distinction is made among minority sub-groups. For reasons discussed earlier, analyses of SAGE effects by race and ethnicity are problematic, particularly for Asians, Hispanics, and Native Americans. Still, the "achievement gap" between white and minority students on standardized measures of achievement remains a source of considerable interest, among both scholars and policy makers.
Among minority students in SAGE and comparison schools, African Americans clearly comprise the largest sub-group -- roughly 25 percent of all SAGE students, and 33 percent of all comparison school students. The African American student population does not present the analytical problems of interpretation raised by other minority groups. In the analyses to follow, African American students are first compared across SAGE and comparison schools on CTBS sub-test and total scale scores. Second, African American students are compared to white students across SAGE and comparison schools on CTBS total scale scores.
Table 32 provides comparisons of means on the CTBS post-test, and change scores from pre-test to post-test. African American SAGE students scored higher than comparison school students on every sub-test, and on total scale score. The differences are, in the main, not statistically significant. The change scores, however, consistently favor SAGE students and are statistically significant. In other words, African American SAGE students scored lower on the CTBS pre-test than African American comparison school students, but made significantly larger gains than comparison school students from pre- to post-test, and surpassed African American comparison school students on the post-test.
Concern over the minority achievement gap on standardized tests has occasionally been focused on African American male students. Table 33 further distinguishes African American SAGE and comparison school students by gender. A clear pattern emerged during the first year of the SAGE program. African American male SAGE students attained comparable or higher change scores from pre- to post-test than African American female SAGE students. At year's end African American male and female students scored virtually the same on the CTBS post-test. This result is quite unlike the scenario in comparison schools, where change scores for females exceeded change scores for males on every sub-test, and on total scores. Thus at year's end comparison school females scored substantially higher than males on the CTBS post-test.
African American and White Student Achievement on the CTBS
African American students, as a group, scored significantly lower than white students, as a group, on the CTBS pre-test total scale score, as shown in Table 34. This result holds for both SAGE and comparison schools, though the gap between African Americans and whites is larger in SAGE schools. When all cases are analyzed, African American SAGE students achieved greater gains on the CTBS total scale score than white SAGE students from pre- to post-test, narrowing the achievement gap (though the gap remains statistically significant). In contrast, African Americans in comparison schools achieved smaller gains, and the achievement gap in those schools widened.
Given the ceiling effect discussed earlier, the analysis was repeated excluding the top scoring quartile on the pre-test total scale score. Regarding pre-test comparisons, the achievement gap is narrower for both SAGE and comparison school students, though still statistically significant. Change scores, however, vary considerably between African American SAGE and comparison school students. In SAGE schools, African American students who performed at, or below, the 75th percentile on the pre-test achieved the same change score as white students who performed at, or below, the 75th percentile on the pre-test. An achievement gap remained, but grew no larger over the course of the first year of SAGE. In comparison schools, the achievement gap widened, as was observed when all cases were analyzed.6
Finally, the analysis was repeated excluding the top two scoring quartiles on the pre-test total scale score. The results are almost identical to those found when only the top quartile was excluded. Thus, even among the lowest scoring 50 percent of students on the pre-test, the achievement gap between African American and white students widened in comparison schools, but remained essentially unchanged in SAGE schools.
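The quartile-exclusion procedure used in these comparisons amounts to a cutoff at the 75th percentile of the pre-test, followed by a comparison of mean change scores by group. The sketch below uses a handful of hypothetical scores, not the evaluation data; the column names and values are invented for illustration.

```python
import pandas as pd

# Hypothetical student records (not the actual evaluation data).
df = pd.DataFrame({
    "pre":   [590, 610, 640, 700, 580, 620, 660, 710],
    "post":  [608, 625, 652, 702, 588, 627, 665, 714],
    "group": ["SAGE", "SAGE", "SAGE", "SAGE",
              "comp", "comp", "comp", "comp"],
})
df["change"] = df["post"] - df["pre"]

# Exclude the top-scoring quartile on the pre-test, then compare
# mean change scores for SAGE versus comparison students.
cutoff = df["pre"].quantile(0.75)
trimmed = df[df["pre"] <= cutoff]
print(trimmed.groupby("group")["change"].mean())
```

In these invented data, as in the report's findings, the SAGE group's mean change score among the lower three quartiles exceeds the comparison group's.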
Hierarchical Linear Models
Many social science research endeavors involve hierarchical data structures, in which individual units are nested within larger units, the latter being the unit of interest. The SAGE data are a prime example: students are nested within classrooms, and it is the classroom effect that is of particular interest to the SAGE project. Hierarchical data structures pose special analytical challenges in that data analysis at the individual level may give a biased impression of the effect of the nesting unit (in the SAGE case, the classroom). At the root of this problem is the fact that different classrooms often contain different numbers of students; classrooms that contain more students thus have greater influence over the results of analyses done at the individual level. In general, if the effects of the nesting unit, the class, are of interest, this is not a desirable outcome. An analytical approach known as
hierarchical linear modeling (Bryk & Raudenbush, 1992) is designed specifically to accommodate these types of data structures. Essentially, hierarchical linear modeling (HLM) estimates individual effects by analyzing data within each class and then provides a weighted average of these effects. The effects of the class are then estimated as if all classes contained the same number of students. HLM was used with the SAGE data to provide an alternative and less biased accounting of the initial effects of the SAGE experience on test scores. In these models variables associated with individual students are referred to as level-1 variables and those associated with the class are referred to as level-2 variables.
Analyses were conducted for each of the relevant criterion post-test scores: reading, mathematics, language arts, and total. For all analyses, the level-1 variables were pre-test score, socioeconomic status (SES) measured as eligibility for subsidized lunch, and attendance measured as number of days absent. The post-test scores were adjusted for these three variables at the individual level, therefore the effects may be thought of as being statistically independent of the effects of these variables. A number of different level-2 models, each containing different level-2 variables, were specified for each variable of interest.
Model A. Class Size
These models examined the effect of class size on the adjusted criterion score.
Model B. SAGE
These models examined the effect of SAGE participation on the adjusted criterion score.
Model C. Class Size, SAGE
These models examined the effect of SAGE participation on the adjusted criterion score after the classrooms were class size adjusted, viewed as the effect of SAGE participation beyond the class size effect.
Model D. Class SES, Class Size
These models examined the effect of class size on the adjusted criterion score after the classrooms were SES adjusted, viewed as the effect of class size once the effects of the classroom SES are removed.
Model E. Class SES, SAGE
These models examined the effect of SAGE participation on the adjusted criterion score after the classrooms were SES adjusted, viewed as the effect of SAGE participation once the effects of the classroom SES are removed.
Model F. Class SES, Class Size, SAGE
These models examined the effect of SAGE participation on the adjusted criterion score after the classrooms were class size and SES adjusted, viewed as the effect of SAGE participation beyond the class size and SES effects.
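The pooling idea sketched above, analyze within each class and then combine the within-class estimates, can be illustrated with a deliberately simplified example. This is not the full Bryk & Raudenbush estimator (which also models between-class variance and the level-2 predictors listed in Models A through F); it simply shows a precision-weighted average of within-class pre-test slopes on simulated data with an assumed common slope of 0.8.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate classes of unequal size sharing a within-class slope of 0.8.
class_sizes = [12, 15, 24, 30, 9]
slopes, weights = [], []
for n in class_sizes:
    pre = rng.normal(600, 30, n)
    post = 100 + 0.8 * pre + rng.normal(0, 10, n)

    # Within-class OLS of post-test on pre-test.
    X = np.column_stack([np.ones(n), pre])
    b, *_ = np.linalg.lstsq(X, post, rcond=None)

    # Weight each class by the precision (inverse sampling variance)
    # of its slope estimate, rather than by raw head count.
    resid = post - X @ b
    s2 = resid @ resid / (n - 2)
    var_slope = s2 / ((n - 1) * pre.var(ddof=1))
    slopes.append(b[1])
    weights.append(1.0 / var_slope)

# Level-1 effect: weighted average of the within-class slopes.
pooled = np.average(slopes, weights=weights)
print(round(pooled, 2))  # near the assumed true slope of 0.8
```

The level-2 step, regressing the class-level estimates on classroom variables such as class size, SAGE participation, and class SES, builds on the same within-class estimates.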
It is important to note that the "class size" variable used in these analyses measures the number of students in each class, and not the student-teacher ratio. As discussed earlier, some SAGE classes contain a relatively large number of students (e.g., 30), and some comparison school classes contain a relatively small number of students (e.g., 16). Table 35 provides a summary of the effects of each of the level-1 and level-2 variables for each of these analyses. Level-1 effects can be interpreted as the weighted average of the within-classroom effects of the level-1 variables. Level-2 effects can be interpreted as the classroom effects of the level-2 variables of interest. Level-1 coefficients may be thought of as the average effect of the modeling variable on the criterion score at the individual level. For example, for the total scale score, each day absent resulted in a .205 point drop on the post-test (-.205 coefficient) for the individual. These effects vary from classroom to classroom, however. Results for all three sub-tests and the total score are fairly consistent. On average, more days absent resulted in lower post-test scores, lower SES resulted in lower scores, and higher pre-test scores resulted in higher post-test scores. No dramatic differences in these coefficients were observed across sub-test scores.
The coefficients associated with the level-2 variables can be thought of as classroom effects. For example, in Model A for the total score, an increase of one student in class size resulted in a drop of .879 points in the class average (-.879 coefficient). Likewise, SAGE participation resulted in a 6.397 point gain in the class average. A discussion of each model follows.
Model A. Depending on the test, an increase in class size of one student can be expected to produce a .5 to 1.2 point loss in average post-test performance. The results for all scores show this effect to be significant.
Model B. Participation in SAGE shows significant class average increases in post-test performance for the total score (6.4 points) and the math subscore (8.0 points). Results for the reading and language arts scores were somewhat below this and were not statistically significant.
Model C. Combining class size and SAGE participation in a single analysis isolates the effects that SAGE might have beyond those produced by lower class sizes. Again, with the exception of the language test, class size has a significant effect on class average post test performance. Once class size has been accounted for, SAGE has no significant effect on class average performance.
Model D. Since socio-economic status (SES) is known to have an influence on academic test scores, a surrogate for this variable was used as both a level-1 and level-2 predictor. The level-2 variable was the average SES for the class; it estimates the effect of the overall class SES level beyond that associated with the individual, which is accounted for in the level-1 model. This model combines class SES and class size, and the results indicate that both have a significant effect on class average post-test performance. A 1 point gain in class average SES equates to between 16 and 21 points on the average post-test score, depending on the test (keep in mind that SES was measured on a three-point scale, so a one point difference on average would be quite pronounced). The effect of class size in this context is not much different than when entered alone (see Model A).
Model E. When class SES and SAGE participation are entered in the same level-2 model, both variables have a significant effect on class average post-test performance. In this context, SES has a slightly greater effect than in Model D, possibly indicating that SAGE participation and SES are less highly correlated than class size and SES. The effect of SAGE participation on class average post-test scores beyond those produced by SES differences ranges from about 7 points to about 12 points, depending upon the sub-test. In general, these effects are larger than when SAGE is the sole variable in the model (see Model B). The likely explanation is that, in general, SAGE classrooms have a lower SES than comparison classrooms, and once this is accounted for, the benefit of SAGE participation is amplified.
Model F. This model combines SES, SAGE participation, and class size in a single analysis. For all sub-tests, SES once again has significant effects on the class average post-test score. For the total score, both class size and SAGE participation have a significant effect on class average performance. Class size has a significant effect on class average performance for the mathematics sub-test. Neither variable made a significant contribution to the average for the language arts and reading subscores once SES was accounted for. This is likely due to the fact that these two variables (and SES) are relatively highly correlated.
Analysis for Truncated Group
As noted earlier, the use of the CTBS Level 10 for the post-test resulted in a significant proportion of ceiling effects among project participants. These effects had an influence of undetermined magnitude on the results displayed in Table 35. As with the individual level analyses, the HLM models were also applied to the data after removing the top quartile of scorers on the pre-test. This procedure eliminated most of the cases that exhibited a ceiling effect, and therefore the results are expected to be free from any bias introduced by these effects. In addition, for purposes of the HLM analyses, classrooms with fewer than 5 students after the elimination procedure were dropped from the analysis. This was done to avoid having a very few individuals in a classroom determine the effects for that classroom.
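The two-step trimming rule described above, drop the top pre-test quartile, then drop classrooms left with fewer than 5 students, can be expressed as a short filter. The data frame and column names below are hypothetical, chosen only to make the rule concrete.

```python
import pandas as pd

# Hypothetical student records: classroom A scores low on the pre-test,
# classroom B scores high (not the actual evaluation data).
df = pd.DataFrame({
    "classroom": ["A"] * 8 + ["B"] * 8,
    "pre": [500, 510, 520, 530, 540, 550, 560, 570,
            640, 700, 705, 710, 715, 720, 725, 730],
})

# Step 1: drop the top-scoring quartile on the pre-test.
cutoff = df["pre"].quantile(0.75)
trimmed = df[df["pre"] <= cutoff]

# Step 2: drop classrooms left with fewer than 5 students, so that a
# handful of cases cannot determine a classroom-level estimate.
kept = trimmed.groupby("classroom").filter(lambda g: len(g) >= 5)
print(sorted(kept["classroom"].unique()))
```

In this invented example, classroom B loses most of its students to the quartile cut and is therefore excluded from the classroom-level analysis entirely.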
It should be noted that the regression coefficients generated from this truncated sample may be biased. Assuming linearity, it can be shown (e.g., Linn, 1982) that the regression coefficient on the variable used for truncation (in this case the pre-test) will be unaffected by this procedure. However, the coefficients associated with other variables, which were subject to incidental selection (to the extent that they correlate with the pre-test), can be expected to be attenuated. In addition, in all cases the standard errors of the coefficients can be expected to be higher; therefore, statistical significance is more difficult to attain. In all cases, then, the results from the truncated sample can be thought of as conservative estimates.
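Two of these consequences of truncation, the slope on the truncation variable remaining roughly unbiased while its standard error rises, can be checked in a small simulation. The sketch below is illustrative only: the sample sizes, slope of 2.0, and noise level are assumptions, not quantities from the study.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000

# Simulate a pre-test/post-test relationship with an assumed slope of 2.0.
pre = rng.normal(0, 1, n)
post = 1.0 + 2.0 * pre + rng.normal(0, 1, n)

def slope_and_se(x, y):
    """OLS slope of y on x with its conventional standard error."""
    X = np.column_stack([np.ones(len(x)), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    s2 = resid @ resid / (len(x) - 2)
    se = np.sqrt(s2 / ((len(x) - 1) * x.var(ddof=1)))
    return b[1], se

b_full, se_full = slope_and_se(pre, post)

# Truncate on the pre-test: drop the top-scoring quartile.
keep = pre <= np.quantile(pre, 0.75)
b_trunc, se_trunc = slope_and_se(pre[keep], post[keep])

# The slope on the truncation variable stays near 2.0 in both fits,
# but its standard error rises: fewer cases and less spread in x.
print(se_trunc > se_full)  # True
```

The inflated standard error is what makes statistical significance harder to attain in the truncated sample, consistent with reading those results as conservative.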
Table 36 shows the HLM modeling results for the 75 percent sample. In the majority of cases, the regression coefficients in Table 36 are attenuated with respect to the corresponding values in Table 35, as expected. There are several instances where the values actually increase, however. These differences are most likely due to sampling error. Even though the standard errors are expected to rise in the truncated situation, the pattern of significant coefficients is quite similar across the models in Table 35 and Table 36. Consequently, the interpretation of the results changes little from the full sample.
III. LIFE IN SAGE CLASSROOMS:
THE REDUCED STUDENT-TEACHER RATIO
To accurately comprehend the SAGE program it is important to understand how SAGE schools structure classrooms and implement features of the SAGE initiative (i.e., 15:1 student-teacher ratio, rigorous curriculum, lighted-schoolhouse, and staff development). The focus of this section is on process, i.e., what went on in SAGE schools and classrooms, rather than on the effect SAGE had on student achievement. In particular, this section is centered on the reduced student-teacher ratio implemented by SAGE schools.
Contained in this section of the report is a description of teaching and learning in SAGE kindergarten and first-grade classrooms. Data collected from teacher interviews, classroom observations, teacher activity logs, and teacher questionnaires are reported.
Thirty first-grade teachers from 13 schools in 8 districts were interviewed during May 1997. This sample consisted of 18 individual interviews, and 6 interviews of teacher teams who taught in 30:2 student-teacher ratio classrooms. The teachers selected to be interviewed were those who served as the observation sample of the SAGE evaluation effort, except for two teachers whose schedules did not permit interviews. Of the represented classrooms, 14 were 15:1 Regular classrooms, including 3 classrooms that contained both first grade and kindergarten students, 4 were 15:1 Shared Space classrooms, 5 were 30:2 Team Taught classrooms, and 1 was a 30:2 Floating Teacher classroom.
The interviews, which were tape recorded and transcribed, were 20 to 45 minutes in length and focused on three main questions. Teachers were asked to describe 1) the extent to which their teaching changed as a result of having fewer students, 2) the extent to which they believed their students’ achievement improved as a result of being in a class with fewer students, and 3) changes they anticipated in their teaching for the 1997-98 school year. Findings regarding each of the questions follow.
All of the interviewed teachers, except two teacher teams, indicated that their teaching had changed as a result of having a reduced class size. These two teams stated that their basic teaching style had not been altered, but they described many adjustments that they had made in teaching, which were consistent with the changes described by the other 22 teachers. The changes that the teachers described related to discipline, instruction, and personal enthusiasm.
Discipline. Although one teacher felt that the amount of time devoted to discipline had not changed from previous years when she taught a larger class, all of the other interviewed teachers said that they spent much less time in dealing with student misbehavior. Some teachers stated that misbehavior had nearly vanished from their classrooms.
Several explanations were given for the reduction in student misbehavior. With only 15 students they can get the attention of the class more easily, teachers indicated. They can see what every student is doing. They can have direct eye contact with students and can be physically close to students. This leads to identifying problems early and dealing with them instantly, teachers said. Further, because the class is small a family atmosphere develops in the classroom. A different relationship emerges as students come to respect each other. In addition, teachers who team taught in 30:2 Team Taught classrooms remarked that during those portions of classroom time when all 30 students were being taught as a group by one teacher, the other teacher was able to focus exclusively on student behavior and take action if needed.
Well, it’s wonderful not having to stop instruction to do discipline. I mean that’s probably one of the biggest plusses, that learning still goes on while another adult deals with the problem. Behavior is probably not very much of a problem any longer .... It’s basically because you got a small number and you’re on top of them all the time. You’re monitoring them all the time. So, they know how to behave now.

Instruction. A result of the greatly reduced need to discipline students was substantially more time devoted to instruction, teachers indicated. Every SAGE teacher interviewed remarked that he or she was able to devote more time to instruction this year. A few suggested that less “paper work” associated with small class size also contributed to increased instructional time. More instructional time, teachers stated, permitted them to be less rushed in their teaching. They could spend more time interacting with students, reteaching when necessary, and providing more and varied learning activities. The main consequence of increased instructional time, however, was an increase in individualized instruction.
There is definitely more time on instruction. Just having fewer bodies in the classroom, there are fewer, ah, fewer problems arise and so there can be more time devoted to instruction. It definitely changed, you know. I do have more time that we’re spending, you know, specifically doing instruction. Now I feel as if I have time to really facilitate as well as interact with kids.

When teachers talked about how having a student-teacher ratio of 15:1 affected their teaching, the topic of individualization was mentioned most often and generated the most emotion. All of the teachers agreed that they now could turn to the needs of individual students.
A class with fewer students enabled teachers to diagnose the learning needs of individual students and to diagnose them earlier. The teachers remarked that they knew their students’ abilities better and that they came to know each student as a person. In addition to diagnosis, having fewer students also permitted teachers to teach students on an individual basis.
They were able to get around to work with each student, and they could do so frequently. Students were not required to wait idly for the teacher’s attention. Those students who understood the lesson were given accelerated tasks, while those who had difficulties or problems received immediate help.
Besides this type of tutoring individualization, small class size resulted in individualization in another sense, teachers indicated. With fewer students each student gets more turns, to share ideas, to answer teacher questions, to ask questions of others, and to read aloud. Increased participation of this sort permits teachers to see individuals’ present level of understanding and to take needed action, and it permits students to clarify their thinking on the basis of the feedback they receive.
It was much easier to pinpoint what students need.

Oh my gosh. I get to return so many more times. I mean it could be the same lesson, but I come back to them more than once to see how they’re doing. So, I might work with them and have the time to work with them one-on-one.

Most of the time everybody gets to have something they’re really interested in brought out. I mean, even if we’re just having a discussion on a topic everybody will get to say something about it because there’s time for that, because there are only 14 kids.

Well, with comparing this year to last year, I think that this year I was able to get around to more kids and see the mistakes right away and address them right away instead of waiting until I pick the papers up.

In addition to individualization, another area that most teachers believed had changed was content emphasis. All but one teacher said that because they had smaller classes they were able to teach more content and teach it in greater depth. Several mentioned that they had moved into the second-grade reading curriculum and books. A few also mentioned that they were able to introduce thematic teaching.
This year we finished up with grade one and we went through the second book.
I think the kids are getting so much further than I’ve seen first graders at this point in the year.

Another instructional change mentioned by some of the interviewed teachers was increased use of student-centered activities; however, far fewer teachers mentioned this area than individualization. These teachers believed that smaller class size enabled them to provide interest centers, more hands-on activities and the use of manipulatives, give students choices in tasks, provide more opportunities to solve problems, and engage in more activities that require students to work together.
I can do a lot of, like I said, hands-on and that type of thing, things I wouldn’t
dare attempt with a large class. Now I have kids in cooperative groups ... learning from each other, working together, sharing each other’s materials. And, manipulatives, I really really like them to work on manipulatives.
Personal Enthusiasm. An area related to instruction about which teachers had strong feelings was teacher enthusiasm as a consequence of having small classes. Teachers indicated they had a much more positive attitude toward teaching and had much more energy and motivation regarding teaching because they were able to develop personal relationships with students and they could see substantial educational growth in their students. Some teachers also mentioned that they experienced less stress because they had fewer students to whom they had to attend. This resulted in fewer papers to correct and less work to be done at home in the evening.
This year has been much more positive. Part of that is because of the success of the children because that is the goal. When they are successful, then that makes you want to teach. That success is an upper in itself, and that makes the whole experience more enjoyable. I think that it gives you less stress because when you’re teaching and trying to do a good job, you’re worried about the students. You’re worried about them and trying to help them. It’s a lot easier to give your attention and help to 15 kids than it is 30 kids, and that has to bring down the stress level.