The CIERA School Change Project: Supporting Schools as They Implement Home-Grown Reading Reform

Barbara M. Taylor, University of Minnesota
P. David Pearson, University of California-Berkeley
Debra Peterson, University of Minnesota
Michael C. Rodriguez, University of Minnesota

We know a great deal about what schools and teachers can do to promote reading success in the elementary grades. We also possess a great deal of knowledge about school change and the importance of professional development. However, we are challenged by our apparent inability to put our knowledge to work. Even though we continue to learn more about effective schools, effective instruction, and effective change efforts, we seem hard-pressed to integrate and apply this knowledge in ways that positively impact the thousands of schools that are struggling to teach all children to read.

Research on Effective Schools

In the past, numerous studies of high-performing high-poverty schools have pointed to important building-level factors that must be in place in order for all children to achieve at high levels in reading. Emphasizing outcomes in reading achievement, Hoffman (1991) summarized the research on effective schools from the 1970s and early 1980s (e.g., Venezky & Winfield, 1979; Weber, 1971; Wilder, 1977). He discussed eight recurring attributes of effective schools:

  1. a clear school mission;
  2. effective instructional leadership and practices;
  3. high expectations;
  4. a safe, orderly, and positive environment;
  5. ongoing curriculum improvement;
  6. maximum use of instructional time;
  7. frequent monitoring of student progress; and
  8. positive home-school relationships.

In recent years, we have seen a revival of effective schools research, most likely due to widespread national concerns about student reading achievement. Taylor, Pressley, and Pearson (2002) summarized findings from five large-scale research studies on effective, high-poverty elementary schools, which were published between 1997 and 1999 (Charles A. Dana Center, 1999; Designs for Change, 1998; Lein, Johnson, & Ragland, 1997; Puma, Karweit, Price, Ricciuti, Thompson, & Vaden-Kiernan, 1997; Taylor, Pearson, Clark, & Walpole, 2000). The six recurring themes that emerge from these five studies both support and extend the earlier research on effective schools.

Putting the students first to improve student learning. In four of these studies (Charles A. Dana Center, 1999; Designs for Change, 1998; Lein et al., 1997; Taylor et al., 2000), improved student learning was cited as the schools' overriding priority. Also, schools reported a collective sense of responsibility for school improvement. Teachers, parents, the principal, and other school staff members worked as a team to achieve their goal of substantially improved student learning and achievement.

Strong building leadership. Three of the studies (Designs for Change, 1998; Lein et al., 1997; Puma et al., 1997) documented the importance of strong building leadership. The principal may have worked to redirect people's time and energy, to develop a collective sense of responsibility for school improvement, to secure resources and training, to provide opportunities for collaboration, to create additional time for instruction, and to help the school staff persist in spite of difficulties.

Strong teacher collaboration. In addition to, or perhaps because of, strong leadership, strong staff collaboration was highlighted in four of the studies (Charles A. Dana Center, 1999; Designs for Change, 1998; Lein et al., 1997; Taylor et al., 2000). Teachers planned and taught together, with a focus on how to best meet students' needs. They reported a strong sense of building communication, talking and working across, as well as within, grades, which contributed to better understanding of one another's curricula and expectations.

Focus on professional development and innovation. Four of the studies (Charles A. Dana Center, 1999; Designs for Change, 1998; Lein et al., 1997; Taylor et al., 2000) stressed ongoing professional development and the implementation of new research-based practices. Many of the successful schools in these studies emphasized a type of sustained professional development in which teachers learned together within a building and collaborated to improve their instruction.

Consistent use of student performance data to improve learning. Four of the studies (Charles A. Dana Center, 1999; Designs for Change, 1998; Lein et al., 1997; Taylor et al., 2000) found that teachers in effective schools systematically shared student assessment data, usually on curriculum-embedded measures, as a part of the process of making instructional decisions to improve pupil performance. Teachers also worked together to carefully align instruction to standards and state or district assessments.

Strong links to parents. All five studies (Charles A. Dana Center, 1999; Designs for Change, 1998; Lein et al., 1997; Puma et al., 1997; Taylor et al., 2000) reported strong efforts within schools to reach out to parents. Schools worked to win the confidence of parents and then built effective partnerships with them in order to support student achievement. Parents were treated as valued members of the school community. Schools also reported a positive school climate, good relations with the community, and high levels of parental support.

Research on effective school reform and professional development. Research on effective school reform and teacher professional development is consistent with the research on effective schools in general, in that it stresses the importance of teachers learning and changing together over an extended period of time, as they reflect on their practice and implement new teaching strategies (Fullan, 2000; Fullan & Hargreaves, 1996; Louis & Kruse, 1995; Richardson & Placier, in press). In successful schools, which typically operate as strong professional learning communities, teachers systematically study student assessment data, relate the data to their instruction, and work with others to refine their teaching practices (Fullan, 2000). Reflective dialogue, deprivatization of practice, and collaborative efforts all enhance shared understandings and strengthen relationships within a school (Louis & Kruse, 1995).

Research on Effective Teachers of Reading

The knowledge base for effective teaching, especially teaching reading in the elementary grades, is equally strong. In a recent NEA research report, Taylor, Pressley, and Pearson (2002) summarize this research, noting several distinct historical waves of work. From the process-product research of the 1960s and 1970s (Brophy, 1973; Dunkin & Biddle, 1974; Flanders, 1970; Soar & Soar, 1979; Stallings & Kaskowitz, 1974) we learned that more effective teachers maintained an academic focus, kept a high incidence of pupils on task, and provided direct instruction. Effective direct instruction included making learning goals clear, asking students questions as part of monitoring their understanding of what was being covered, and providing feedback to students about their academic progress. Effective classrooms were found to be warm, democratic, and cooperative, with more teacher instruction devoted to weaker students, who were also given more time to complete tasks.

A second wave of research on teaching reading, which began with the work of Duffy and Roehler in the 1980s, taught us about the cognitive processes used by outstanding teachers. More effective teachers engaged in modeling and explanation to teach students strategies for decoding words and understanding texts. Knapp and associates (Knapp, 1995) found that effective teachers stressed higher-level thinking skills more than lower-level skills. Continuing in this tradition, Taylor et al. (2000) found that accomplished primary grade teachers provided more small-group than whole-group instruction, had high pupil engagement, had a preferred teaching style of coaching as opposed to telling, and engaged students in more higher-level thinking related to reading than other teachers.

In the most recent wave of research, Pressley and his colleagues (Pressley, Wharton-McDonald, Allington, Block, Morrow, Tracey, Baker, Brooks, Cronin, Nelson, & Woo, 2001) have focused our attention on the characteristics of teachers nominated as exemplary in practice by their peers and supervisors. These researchers found that effective primary grade teachers did provide a balanced literacy program: they taught skills and got their students actively engaged in a great deal of actual reading and writing. They also encouraged students to self-regulate their use of strategies. Interestingly, the National Reading Panel Report (2000) implicated balanced literacy instruction in its conclusion that instructional attention to systematic phonics, phonemic awareness, fluency, and comprehension strategies was important to a complete reading program (pp. 2-89).

In short, we have learned different, but complementary, lessons about the teaching practices of excellent elementary literacy teachers from the last four decades of research on effective teaching. The overall picture is consistent with the earlier process-product research to some extent, especially with regard to engagement, but goes beyond it in ways consistent with Duffy, Roehler, et al.'s (1987) direct explanation approach and Knapp and associates' (1995) emphasis on higher-order literacy instruction (i.e., instruction which emphasizes comprehension and communication). Excellent elementary literacy teachers balance skills instruction with more holistic teaching (Pressley, 1998). In the best classrooms, students are engaged much of the time in reading and writing, with the teacher monitoring student progress, encouraging continuous improvement and growth, and providing scaffolded instruction to help students improve their use of various strategies.

Objectives of the Current Project

Amidst pressure for schools to adopt off-the-shelf reform programs as a way of improving student achievement (Herman, 1999), it is interesting to note that, by and large, the schools in the studies summarized by Taylor, Pressley, and Pearson (2002) did not necessarily view packaged reforms as the key ingredient for improving student achievement (Charles A. Dana Center, 1999; Designs for Change, 1998; Taylor et al., 2000).1 The common denominators seem to be commitment and hard work focused on research-based practices at both the classroom level and the school level.

The overall objective of this project is to test the efficacy of a school reform framework which was designed to be used by elementary schools, in order to develop local reading programs that would improve students' reading achievement. The study is guided by two fundamental questions:

  1. Will a research-based, action-oriented, Internet-supported change framework designed to promote a grass-roots reading program reform produce robust changes in: a) school-wide approaches to delivering reading instruction; b) classroom teaching practices; and c) student learning and reading achievement?
  2. Across schools, what classroom practices and schoolwide efforts are most effective in improving the teaching of reading and increasing students' reading achievement?

In our attempt to answer these questions, we did not, nor do we think that we can or should, use a classic experimental paradigm. We did not, for example, randomly assign programs or even particular programmatic components to schools and teachers; to do so would have violated what we have learned from the last 20 years of research on school change--that school staffs must be involved in creating the programs for which they will be held accountable. However, it is neither necessary nor desirable to invite each and every school to rediscover the wheel. Therefore, what we did was to offer school staffs a framework for making their own decisions about how they might redesign their reading program. The framework consists of a set of six components derived from research-based knowledge about how to build an effective reading program. These components include classroom reading instruction, school reading programs, reading interventions, school-home-community relations, school change processes, and professional development. Each component is made available to a school via an Internet-based multimedia program offering research summaries, readings, video clips of effective practice, and learning activities to guide local action. Support for implementation of the framework is provided to schools through assessment tools and the data obtained with those tools, an external facilitator, an internal leadership team, schoolwide efforts, and study groups that focus on implementing effective practices.

Method

Participants

Five schools served as project schools and used the CIERA School Change Framework in 1999-2000. Three of these schools continued with the project in 2000-2001, and six additional schools joined the project at this time. All were high-poverty schools, with 70-95% of the students qualifying for subsidized lunch. Across schools, 2-68% of the students were non-native speakers of English, and 67-91% of the students were members of minority groups. The 11 project schools were from eight different school districts spread across a rural area in the southeastern U.S.; an eastern city; two small towns in the Midwest; a large city in the Midwest; and a large city in the southwestern U.S. In order to become a project school, at least 75% of the teachers in a building had to agree to participate in the project. In all schools, two teachers per grade were randomly selected and invited to participate in the classroom observations, interviews, and completion of instructional logs. If a teacher declined, children from her classroom remained in the school-level analyses. Because the grade levels within buildings differed, children in 7 schools came from grades K-5, in 3 schools from K-6, and in 1 school from grades K-3. Within the designated classrooms, teachers were asked to divide their classes into thirds (high, average, and low) in terms of perceived reading ability; children were then randomly selected from each third. In the fall, 9 children were randomly selected as target students: 3 each from the high, middle, and low thirds of the classroom continuum. In the spring of 1999-2000, due to resource limitations, 2 high, 2 average, and 2 low children were randomly selected from each classroom for post-testing. In the spring of 2000-2001 as many as possible of the 9 children per class who remained at the school were tested: the average was 8 children per class.
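The stratified selection of target students described above can be sketched in a few lines. This is only an illustration: the roster, the student labels, and the function name `select_target_students` are hypothetical, and the sketch assumes the teacher has already sorted the class into high, average, and low thirds.

```python
import random


def select_target_students(high, average, low, per_third=3, seed=None):
    """Randomly pick `per_third` students from each teacher-defined third."""
    rng = random.Random(seed)
    thirds = {"high": high, "average": average, "low": low}
    # Sample without replacement; small thirds contribute everyone they have.
    return {
        level: rng.sample(students, min(per_third, len(students)))
        for level, students in thirds.items()
    }


# Hypothetical class roster, already divided into thirds by the teacher.
roster = {
    "high": ["H1", "H2", "H3", "H4", "H5", "H6", "H7"],
    "average": ["A1", "A2", "A3", "A4", "A5", "A6", "A7"],
    "low": ["L1", "L2", "L3", "L4", "L5", "L6", "L7"],
}

targets = select_target_students(
    roster["high"], roster["average"], roster["low"], per_third=3, seed=1
)
```

Calling the same function with `per_third=2` would correspond to the reduced spring 1999-2000 post-test sample of 2 children per third.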

Student Assessments

The children who were randomly selected for participation were assessed in the fall and spring on a number of literacy measures, which varied by grade level. Assessments included a standardized reading comprehension test (grades 1-6) as well as tests of letter-name knowledge (K-1), rhyme (K-1), phonemic awareness (K-1), word dictation (K-1), concepts of print (K-1), fluency (grades 1-6; words correct per minute; Deno, 1985), and writing (grades 1-6; responding to a common prompt). See Table 1 for details.

Table 1. Assessments Used in Years 1 and 2

| Assessment Tool | Description | Fall | Spring |
| --- | --- | --- | --- |
| Letter Names | Letter Name subtest from the Emergent Literacy Survey (Pikulski, 1996). Students identified both lower- and upper-case letters. | K, 1 | K |
| Phonemic Awareness | Rhyming subtest from Emergent Literacy Survey (Pikulski, 1996). Students were given a word and asked to say another word that rhymed with it. Nonsense words were acceptable. Total of 8 points. Classroom Segmentation and Blending Test (Taylor, 1991). Children were given six words to blend (e.g., "What is /c/ /a/ /t/?"). Then they were given six more words and asked to identify the first, middle, and last sound they heard in each word (e.g., "What sound do you hear first in sad? What sound do you hear in the middle of sad? What sound do you hear at the end of sad?"). | K (rhyming); 1 (segmentation and blending) | K |
| Concepts of Print | Concepts of print subtest from Emergent Literacy Survey (Pikulski, 1996). Children were asked to identify a letter, word, and sentence, as well as demonstrate knowledge of tracking during reading. Total of 8 points. | K | K |
| Word-Level Dictation | Graded lists from Right Start Project (Colt, 1997). Students were asked to write 15 dictated words. If they could write at least seven words correctly from the first list, they went on to a second list of 15 words. (Administered in a group.) | 1 | K |
| Writing Prompt | Michigan Writing Assessment (administered in a group). Students were asked to write to a prompt (e.g., "Tell about a favorite place."). The same prompt was used in the fall and spring. A scoring rubric was used to score papers from high (4) to low (1). | 2-6 | 1-6 |
| Instructional Reading Level | Basic Reading Inventory (Johns, 1997). Students read graded passages until they reached frustration level. Instructional reading level was determined to be the highest level at which they could read with better than 90% accuracy in word recognition. | | 1 |
| Fluency | Words correct per minute on passage from Johns (1997). In fall, students read for 1 minute from a passage that was one level below grade level. In spring they read from a passage that was at grade level. | 2-6 | 1-6 |
| Retelling | Retelling of Johns (1997) BRI passage read in winter. | | |
| Houghton Mifflin Baseline Test | Houghton Mifflin Baseline Test (narrative only, administered in a group). After reading a three-page story, students answered five short answer questions and five multiple-choice questions. Possible score is 20 (0, 1, or 2 points for each short answer question, and 2 points for each multiple choice question correct). A score of 0-10 is considered low, 11-15 average, and 16-20 high. | 2-6 | 1-6 |
| Standardized Reading Comprehension Test | Gates-MacGinitie Reading Test (MacGinitie, MacGinitie, Maria, & Dreyer, 2000). Only the passage comprehension subtest was administered. Students read short passages and answered multiple-choice questions. (Administered in a group.) | 2-6 | 1-6 |

In the fall, kindergarten children were individually assessed (Pikulski, 1996) on letter-name knowledge (students were asked to give the names of the upper- and lowercase letters); concepts of print (students responded to 8 items dealing with concepts related to words, letters, sentences, tracking, etc.); and rhyming ability (students responded to eight items, each giving a word that rhymed with a prompt). In the spring, kindergarten students were individually assessed on letter-name knowledge, concepts of print, and rhyme. They also completed an individually-administered, 12-item phonemic segmentation and blending test, in which they segmented words into phonemes and blended phonemes into words (Taylor, 1991), and a group-administered word dictation test in which they wrote 15 pre-primer and 15 primer words (Colt, 1997).

In the fall, children in grade 1 were individually assessed on letter-name knowledge, and phonemic segmentation and blending, and children were assessed in small groups on word dictation. In the spring all students were individually assessed on reading fluency (in which students read aloud for 1 minute to obtain a score for the number of words read correctly in 1 minute; Deno, 1985) based on a grade-level passage from the Basic Reading Inventory (BRI) (Johns, 1997). In a group setting students took the reading comprehension subtest of the Gates-MacGinitie Reading Test (MacGinitie, MacGinitie, Maria, & Dreyer, 2000) and responded to a writing prompt in which papers were scored according to a 4-point scoring rubric (Michigan Literacy Progress Profile, 1998).

In the fall, children in grades 2-6 were individually assessed on fluency (words correct per minute) based on their reading of a BRI passage (Johns, 1997) that was one grade level below their grade placement. As a group they took the comprehension subtest of the Gates-MacGinitie Reading Test (MacGinitie, MacGinitie, Maria, & Dreyer, 2000) and a writing prompt. In the spring, all children were assessed on fluency using a passage at grade level (Johns, 1997), on reading comprehension (Gates), and on writing (using the same prompt as was used in the fall).

Each response to the writing prompt was scored by one person from a team of trained scorers according to a rubric. Twenty-five percent of the writing samples at each grade level were scored by a second scorer, with 83% agreement between the 2 scorers.
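As a rough illustration of how an agreement figure like the one above can be computed, the sketch below calculates simple percent agreement between two scorers. It assumes that "agreement" means an exact match on the 4-point rubric, which the text does not state, and the scores shown are invented for the example.

```python
def percent_agreement(scores_a, scores_b):
    """Percentage of papers on which two scorers gave the same rubric score."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("Both scorers must rate the same non-empty set of papers.")
    matches = sum(a == b for a, b in zip(scores_a, scores_b))
    return 100.0 * matches / len(scores_a)


# Hypothetical rubric scores (1 = low, 4 = high) for six double-scored papers.
first_scorer = [4, 3, 2, 3, 1, 4]
second_scorer = [4, 3, 3, 3, 1, 4]

agreement = percent_agreement(first_scorer, second_scorer)
```

Exact-match agreement is the strictest common definition; a study could also count adjacent scores as agreeing, which would raise the figure.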

Use of the School Change Framework

Teachers voted by secret ballot on whether to participate in the school change project. Schools were eligible to join the project if 75% of the teachers in the building voted to participate. Staff agreed to meet for a minimum of 1 hour a month as a large group to work on the school change effort, and 1 hour a week, on average, in smaller and more focused study groups. A school leadership team made up of teachers, the principal, and an external facilitator (who was to spend a minimum of 8 hours a week in the building) was responsible for guiding the staff through the school change activities. Large-group activities were to include discussion and action on the schoolwide reading program, early reading interventions, and parent partnerships, as well as on issues related to school change and professional development. Reports were also expected from the study groups. Small-group activities were to include within-grade and across-grade study groups which focused on particular aspects of classroom reading instruction and student work (e.g., comprehension instruction, phonemic awareness instruction). Groups were encouraged to review the research on the CIERA School Change website; to download, read, and discuss articles on research-based practices related to their focus area; to view and discuss video clips of effective practice on the site; and to share video clips of their own practice. Members of study groups also raised issues, solved problems, and developed action plans related to their focus area to make changes in their classroom reading instruction.

In addition to these components, schools agreed to several other practices and commitments in this multi-year project: cross-grade collaboration; the development of a plan to involve parents as partners in the delivery of their school reading program; and the commitment to continue with the project for at least 2 years, addressing, across that time span, all six major categories of the school change framework.

Use of data emanating from the project was also an important component. At the beginning of Year 1, facilitators received a summary of the Beating the Odds research (Taylor et al., 2000) on characteristics of effective schools and teachers, which they were asked to share with the teachers in their schools. Teachers also completed a checklist asking about various topics they felt they should cover during the year on characteristics of effective schools and teachers; these topics were covered on the Internet site. The purpose of both of these activities was to help schools set priorities for study groups and large-group meetings.

At the beginning of the second year, returning schools received a summary of the Beating the Odds research and a personalized school report that focused on their performance on school and classroom variables, as compared to the mean of other schools in the study. Schools new to the project received a generic version of the school report that included the cross-school means for school and classroom variables. The report included correlations identifying the school and classroom factors which are related to growth in students' reading and writing ability. Finally, the teachers also completed a questionnaire about their perceptions regarding the presence of various school and classroom characteristics, and their opinions about where their school should focus its reform efforts during the upcoming year. This questionnaire--like its predecessor, the previous year's checklist--was tied to topics covered on the Internet site, and was designed to help schools set priorities for their study groups and overall reform efforts.

Documenting Program Characteristics and Classroom Practices

Teachers were interviewed in the fall, winter, and spring; principals were interviewed in the fall and spring. The interview data were used primarily to document program features and participant beliefs. Each interview lasted about 30 minutes.

Teachers meeting in study groups were asked to complete a common study group meeting form after each session and develop an action plan. The external facilitators were asked to keep brief monthly notes summarizing the activities pertaining to the school change project that had transpired at their school. They were also asked to write an end-of-year report. The data from the notes, action plans, and end-of-year report were used to document the change process at the school level.

On three occasions (fall, winter, spring), each teacher who agreed to be in the data collection sample was observed for an hour during reading instruction time, to document their classroom practices in the teaching of reading. All observations were scheduled. The observers were trained to use the CIERA Classroom Observation Scheme (Taylor et al., 2000), and were expected to demonstrate at least 80% agreement with a "standard" coding at each of the seven levels of the coding scheme (Taylor et al., 2000), prior to conducting classroom observations.

The observation system (influenced by the work of Scanlon & Gelzheiser, 1992; Greenwood et al., 1995; and Ysseldyke & Christenson, 1993-96) combined qualitative note-taking with a more quantitatively-oriented coding process. The observer took field notes for a 5-minute period, recording a narrative account of what was happening in the classroom, including, where possible and appropriate, what the teacher and children were saying. At the end of the note-taking period the observer recorded the proportion of children in the classroom who appeared to be on task (i.e., doing what they were supposed to be doing). The observer then coded the three or four most salient literacy events (Category 4 codes) that occurred during that 5-minute episode. For every Category 4 event, the observer also coded who was providing the instruction (Category 1), the grouping pattern in use for that event (Category 2), the major literacy activity (Category 3), the materials being used (Category 5), the teacher interaction styles observed (Category 6), and the expected responses of the students (Category 7). An example of a 5-minute observational segment is provided in Table 2. (See Table 3 for a list of the codes for all the categories.) In Table 2, the codes "c/s/r" refer to categories 1-3, and codes "r/t/a/r", "wr/t/c/or-i", and "v/t/r/or" each refer to categories 4-7.
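To make the layout of a coded segment concrete, the following sketch splits the slash-delimited codes from the sample segment in Table 2 into their category fields and converts the on-task tally into a proportion. The class and function names are hypothetical, and the sketch only separates the fields; it does not look codes up against Table 3.

```python
from dataclasses import dataclass


@dataclass
class LiteracyEvent:
    specific_focus: str   # Category 4
    material: str         # Category 5
    interaction: str      # Category 6
    pupil_response: str   # Category 7


@dataclass
class Segment:
    who: str              # Category 1
    grouping: str         # Category 2
    general_focus: str    # Category 3
    events: list          # list of LiteracyEvent
    on_task: float        # proportion of children on task


def parse_segment(context_code, event_codes, on_task_tally):
    """Split one 5-minute segment's codes into named category fields."""
    who, grouping, general_focus = context_code.lower().split("/")
    events = []
    for code in event_codes:
        parts = code.lower().split("/")
        if len(parts) != 4:
            raise ValueError(f"Expected four fields (categories 4-7), got: {code}")
        events.append(LiteracyEvent(*parts))
    on_task_count, total = (int(n) for n in on_task_tally.split("/"))
    return Segment(who, grouping, general_focus, events, on_task_count / total)


# Codes and on-task tally as they appear in the sample segment.
segment = parse_segment("c/s/r", ["r/t/a/r", "wr/t/c/or-i", "v/t/r/or"], "11/12")
```

Keeping the segment-level fields (categories 1-3) separate from the per-event fields (categories 4-7) mirrors how the sample segment is written, with one context code followed by several event codes.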

Table 2. Sample of observational notes

9:38--Small group continues. T is taking running record of child's reading. Others reading familiar books. Next, T coaches boy on sounding out "discovered." Covers up word parts as he says remaining parts. T: "Does that make sense?" T: "What is another way to say this part ['cov' with short 'o']?" T passes out new book: My creature. T has students share what the word "creature" means. Ss: animals, monsters, dinosaurs, Dr. Frankenstein.

On task (OT): 11/12
Codes: c/s/r; r/t/a/r; wr/t/c/or-i (indv); v/t/r/or

Table 3. Codes for Classroom Observation

| Level 1--Who | Code |
| --- | --- |
| Classroom teacher | c |
| Reading specialist | r |
| Special education | se |
| Other specialist | sp |
| Student teacher | st |
| Aide | a |
| Volunteer | v |
| No one | n |
| Other | 0 |
| Not applicable | 9 |

| Level 2--Grouping | Code |
| --- | --- |
| Whole class/large group | w |
| Small group | s |
| Pairs | p |
| Individual | i |
| Other | 0 |
| Not applicable | 9 |

| Level 3--General Focus | Code |
| --- | --- |
| Reading | r |
| Composition/writing | w |
| Spelling | s |
| Handwriting | h |
| Language | l |
| Other | 0 |
| Not applicable | 9 |

| Level 4--Specific Focus | Code |
| --- | --- |
| Reading connected text | r |
| Listening to text | l |
| Vocabulary | v |
| Meaning of text, lower (m1 for talk, m2 for writing) | m1, m2 |
| Meaning of text, higher (m3 for talk, m4 for writing) | m3, m4 |
| Comprehension skill | c |
| Comprehension strategy | cs |
| Writing | w |
| Exchanging ideas/oral production | e/o |
| Word ID | wi |
| Sight words | sw |
| Phonics (p1 = letter sound, p2 = letter by letter, p3 = onset/rime, p4 = multisyllabic) | p1, p2, p3, p4 |
| Word recognition strategies | wr |
| Phonemic awareness | pa |
| Letter ID | li |
| Spelling | s |
| Other | o |
| Not applicable | 9 |

| Level 5--Material | Code |
| --- | --- |
| Textbook, narrative | tn |
| Textbook, informational | ti |
| Narrative trade book | n |
| Informational trade book | i |
| Student writing | w |
| Board/chart | b |
| Worksheet | s |
| Oral presentation | o |
| Pictures | p |
| Video/film | v |
| Computer | c |
| Other/not applicable | o/9 |

| Level 6--Teacher Interaction | Code |
| --- | --- |
| Tell/give info | t |
| Modeling | m |
| Recitation | r |
| Discussion | d |
| Coaching/scaffolding | c |
| Listening/watching | l |
| Reading aloud | ra |
| Check work | cw |
| Assessment | a |
| Other | o |
| Not applicable | 9 |

| Level 7--Expected Pupil Response | Code |
| --- | --- |
| Reading | r |
| Reading turn-taking | r-tt |
| Orally responding | or |
| Oral turn-taking | or-tt |
| Listening | l |
| Writing | w |
| Manipulating | m |
| Other/not applicable | o/9 |

Number of students on task/total number of students in room

At the end of the observation, the observer wrote a summary addressing seven key features of the classroom ecology: (a) the general instructional approach used in the classroom, instructional sequences observed, approaches to word recognition, vocabulary, and comprehension instruction; (b) curriculum materials used; (c) teacher's style of interacting with the children; (d) teacher's grouping practices, and activities of children not with the teacher; (e) student engagement; (f) classroom management; and (g) classroom climate.

The observations were used as a source of data for individual teachers. In February of Year 1 teachers were invited by letter to receive, upon request, copies of their first two classroom observations and an explanation of the codes used in the observations. At one school, 75% of the teachers asked for copies of their observations; at three more schools, requests for feedback ranged from 42 to 50% of teachers; and at the fifth school, only one teacher out of 14 asked for feedback. In Year 2, based on requests from teachers for regular feedback related to their observations, teachers received a copy of each observation, a description of the codes, a brief summary of research related to the major coding categories being analyzed for the project (e.g., incidence of whole group instruction, and incidence of higher-level questioning; see page 17), and a table summarizing the codes from their observations (e.g., the incidence of whole-group instruction, the incidence of higher-level questioning, etc.) and comparing them to the means at their grade level across all schools. External facilitators received training in how to interpret observations so that they, in turn, could help teachers understand the information contained in these observations without making the interpretations for them. Teachers were encouraged to go to the facilitators with questions.

In addition to the observations, in 2000-2001 teachers also completed six logs, one each for a high-, average-, and low-ability student, for an entire week in the winter, and once again for the same three students in the spring. Teachers used the log to record time spent on various literacy activities, types of texts and materials used, and grouping practices.

Data Analyses

Data from the interviews were used to document the characteristics of various school-level factors at each school site, as well as the extent of the reform effort at that school. The meeting notes and action plans completed by the study groups, along with the monthly notes and end-of-year report completed by the facilitators, were used to further describe the extent of implementation of the change process at the project schools.

We applied a coding rubric to the interviews in order to evaluate teachers' perceptions of the degree to which the following factors, previously found to be important in effective schools (see pp. 1-3), existed in their schools: (a) building collaboration in the delivery of reading instruction; (b) links to parents; (c) reflection and change pertaining to instruction; (d) collaborative professional development; and (e) strong building leadership (and the extent to which this leadership was invested in the teachers, as well as the principal).2

Summary Data from the Teacher Interviews and Descriptions of Categories Analyzed

| Teacher Perceptions | Mean rating for all schools (based on 4-point rubric, in which 0 = low and 3 = high) |
|---|---|
| Links to Parents | 1.89 (.11) |
| Collaboration | 1.85 (.36) |
| Professional Development | 1.84 (.32) |
| Reflection on Teaching | 1.86 (.27) |
| Building Leadership | 1.83 (.23) |
| Schoolwide Assessment System | 1.99 (.03) |
| Total | 11.3 (.9) |

Each teacher's set of three interviews was used to rate school-level factors (collaborative leadership, building collaboration in the teaching of reading, reflection on teaching, collaborative professional development, and links to parents) on a 4-point scale designed to capture the degree to which each factor was perceived to be present in that school (0 = very low perception, 1 = low, 2 = moderate, and 3 = high). This coding rubric is presented in Table 4. All of the interviews were coded by one member of the research team. A second team member independently coded the interviews from a random sample of 25% of the teachers; the mean agreement on overall rubric scores was 87% across the five variables.

The five ratings were summed to generate a school effectiveness score for each school in the study. The 11 schools from Year 2 and the three schools from Year 1 had a mean school effectiveness score of 8.3 (SD = 1.7), with a range from 5.4 to 10.2 (out of a total possible score of 15).

Although schools had agreed, in principle, to the conditions for the study, they exhibited considerable variability in their degree of adherence to the reform framework. Factors considered important to the reform included the following: (a) meeting for 1 hour a week in study groups; (b) meeting in cross-grade groups; (c) reflecting on teaching in study groups; (d) considering research-based "best practices" in study groups; (e) completing action plans in study groups; (f) selecting substantive topics for study; (g) maintaining topics over time; (h) meeting as a whole faculty once a month to discuss reform efforts; (i) working on parent partnerships and making effective use of the external facilitator; and (j) making effective use of the internal leadership team. Using the comments of each teacher across the three interviews, the study group meeting notes, study group action plans, facilitator logs, and the end-of-year reports, we built a scale indicating the degree to which a school was perceived to be implementing the various components of the school change framework (see Table 5). We then calculated a mean reform effort score for each school. The mean score was 4.2 (SD = 1.6) out of a possible 10 points. One member of the research team rated each school on each of the 10 dimensions of implementing the reform. A second member of the research team also read through the artifacts and rated each school. The Pearson correlation coefficient across the two scorers' ratings was .92.

Table 5. Reform Implementation Rubric

  1. Meeting 1 hour per week in study groups (at least 80% of the time).
  2. Meeting in cross-grade study groups (at least 80% of the time).
  3. Reflecting on instruction and student work (demonstrated at least 80% of the time).
  4. Considering research-based practices (demonstrated at least 80% of the time).
  5. Being guided by action plans (yes or no).
  6. Sticking with substantive topics for 3-4 months or more (yes or no).
  7. Meeting once a month as a whole faculty to share and set goals (at least 80% of the time).
  8. Working on a plan to involve parents as partners (yes or no).
  9. Effective use of an external facilitator (yes or no).
  10. Effective use of an internal leadership team (yes or no).

Note: One point was awarded for each of the reform components if the criteria in parentheses for a particular component were judged to be met.
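As an illustration only, the scoring rule in the note above can be sketched in a few lines of Python. The component names, the example rates, and the example judgments are hypothetical labels introduced here; only the 80% threshold, the yes/no criteria, and the one-point-per-component rule come from the rubric.

```python
from typing import Mapping

# From the rubric: rate-based components earn a point when met at least 80% of the time.
RATE_THRESHOLD = 0.80

# Hypothetical short names for the ten rubric components.
RATE_COMPONENTS = (
    "weekly_study_groups", "cross_grade_groups", "reflection",
    "research_based_practices", "monthly_whole_faculty",
)
YES_NO_COMPONENTS = (
    "action_plans", "sustained_topics", "parent_plan",
    "external_facilitator", "internal_leadership_team",
)


def reform_effort_score(rates: Mapping[str, float],
                        judgments: Mapping[str, bool]) -> int:
    """Return a 0-10 score: one point per rubric component judged to be met."""
    score = sum(1 for name in RATE_COMPONENTS if rates[name] >= RATE_THRESHOLD)
    score += sum(1 for name in YES_NO_COMPONENTS if judgments[name])
    return score


# Invented example school (not data from the study).
example_rates = {
    "weekly_study_groups": 0.90, "cross_grade_groups": 0.20, "reflection": 0.85,
    "research_based_practices": 0.50, "monthly_whole_faculty": 0.80,
}
example_judgments = {
    "action_plans": True, "sustained_topics": False, "parent_plan": False,
    "external_facilitator": False, "internal_leadership_team": False,
}
example_score = reform_effort_score(example_rates, example_judgments)
print(example_score)
```

In this invented case three rate-based components and one yes/no component meet their criteria, so the school would receive 4 of 10 possible points.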

Coding the observations. As the first author of this paper visited research sites, she joined each observer in an observation, in order to establish inter-rater reliability data on the observation coding scheme. Across these 12 observations, agreement with the senior author ranged from 77% to 98%: 98% at Level 2 (grouping), 87% at Level 3 (major literacy focus), 80% at Level 4 (specific literacy activity), 90% at Level 5 (material), 80% at Level 6 (teacher response), and 77% at Level 7 (student response).

An expert observer who had done many classroom observations using this scheme and helped to refine it read through all of the observations to further assess the degree to which observers were using the codes in a similar manner. For example, although decision rules had been established in order to help an observer distinguish between similar codes, one observer may have coded a teacher's reference to the main idea of a story as a comprehension skill, while another observer might have coded a very similar exchange as a higher-level question about the story. The expert observer did not code the observations "blind." Instead, she recorded a different code only if she could not agree with the observer's code after reading the narrative description of a particular 5-minute segment. For a random sample of 10% of the observations, the agreements between the observers and expert observer at each of the levels of coding were measured as follows: 97% at Level 2 (grouping), 96% at Level 3 (major literacy focus), 80% at Level 4 (literacy activity), 86% at Level 5 (material), 77% at Level 6 (teacher response), and 83% at Level 7 (student response). Since there was variability between the observers and the expert, especially at Levels 4, 6, and 7, a decision was made to use the expert's codes for those instances in which the observer and expert disagreed, in order to ensure maximum consistency across the many observers.

A second expert reviewer, a member of the research team, read through the same random sample of 10% of the observations. The agreement between the first and second expert at each of the levels of coding was very high: 98% at Level 2 (grouping), 97% at Level 3 (major literacy focus), 93% at Level 4 (literacy activity), 97% at Level 5 (material), 91% at Level 6 (teacher response), and 93% at Level 7 (student response).
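The agreement figures reported above are percentages of matching codes. The paper does not give the exact formula, so the sketch below assumes simple percent agreement: the share of 5-minute segments on which two coders assigned the same code at a given level. The code labels are invented for illustration.

```python
from typing import Sequence


def percent_agreement(codes_a: Sequence[str], codes_b: Sequence[str]) -> float:
    """Percentage of segments on which two coders recorded the same code."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both coders must have coded the same segments.")
    if not codes_a:
        raise ValueError("At least one coded segment is required.")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


# Hypothetical Level 2 (grouping) codes for five segments: w = whole group, s = small group.
observer_codes = ["w", "s", "s", "w", "s"]
expert_codes = ["w", "s", "w", "w", "s"]
agreement = percent_agreement(observer_codes, expert_codes)
print(agreement)
```

Simple percent agreement does not correct for agreement expected by chance; a chance-corrected index would be a different technique from the one sketched here.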

Certain aspects of the data from classroom observations (i.e., those found to be important in previous research--cf. pp. 2-3) were analyzed in order to investigate the relationship between various classroom instructional practices and students' reading and writing ability. The classroom practices which were analyzed included the following (see Table 6 for descriptions of the categories):

  1. Whole-class/large-group--the percentage of segments in which whole-class or large-group was coded.
  2. Small-group--the percentage of segments in which small-group was coded.
  3. Phonemic awareness skill instruction--the percentage of segments in which phonemic awareness instruction was observed, divided by the number of segments in which the Level 3 code was designated as "reading."
  4. Phonics skill instruction--the percentage of segments in which an aggregate of phonics skill instruction was observed, divided by the number of segments in which the Level 3 code was designated as "reading." The aggregate variable was formed by summing across the following practices: letter-sound instruction, letter-by-letter decoding instruction, instruction in decoding by onset and rime, instruction in decoding multisyllabic words.
  5. Coaching in word recognition strategies--the percentage of segments in which coaching in word recognition strategies during reading was observed, divided by the number of segments in which the Level 3 code was designated as "reading."
  6. Comprehension skills--the percentage of segments in which comprehension skill instruction was observed, divided by the number of Level 3 reading segments coded.
  7. Comprehension strategies--the percentage of segments in which comprehension strategy instruction was observed, divided by the number of Level 3 reading segments coded.
  8. Lower-level questioning or writing about text--the percentage of segments in which lower-level talking or writing about text was observed, divided by the number of Level 3 reading segments coded. An aggregate variable was formed by summing the data from lower-level oral questions about text and lower-level written questions about text.
  9. Higher-level questioning or writing about text--the percentage of segments in which higher-level talking or writing about text was observed, divided by the number of Level 3 reading segments coded. An aggregate variable was formed by summing the data from higher-level oral questions about text and higher-level written questions about text. 3
  10. Teacher-directed stance towards teaching--the percentage of responses in which teachers were coded as engaged in telling or recitation, out of the total number of the following responses coded: telling, recitation, modeling, coaching, and listening/giving feedback.
  11. Student-support stance towards teaching--the percentage of responses in which teachers were coded as engaged in modeling, coaching, listening/giving feedback, out of the total number of the following responses coded: telling, recitation, modeling, coaching, and listening/giving feedback.
  12. Students actively responding--the percentage of responses in which children were coded as engaged in reading, writing, or manipulating, out of the total number of Level 7 responses coded, including reading, writing, manipulating, reading-turn-taking, oral turn-taking, and listening. 4
  13. Students passively responding--the percentage of responses in which children were coded as engaged in reading turn-taking, oral turn-taking, or listening to the teacher, out of the total number of Level 7 responses coded, including reading, writing, manipulating, reading turn-taking, oral turn-taking, and listening.
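The variables listed above are all proportions of coded segments or coded responses. The following is a minimal sketch of two of them, assuming each 5-minute segment is stored as a dictionary. The field names and the sample data are hypothetical; the activity codes m3 and m4 (higher-level talk and writing about text) are the ones named in Table 6. The sketch also assumes that a segment counts once toward an aggregate variable even if more than one of its component codes was recorded.

```python
from collections import Counter


def pct(numerator, denominator):
    """Percentage, returning 0.0 when nothing was coded."""
    return 100.0 * numerator / denominator if denominator else 0.0


def reading_segment_variable(segments, activity_codes):
    """Percentage of Level 3 'reading' segments in which any of the given codes appears."""
    reading = [s for s in segments if s["focus"] == "reading"]
    hits = sum(1 for s in reading if activity_codes & s["activities"])
    return pct(hits, len(reading))


def stance_percentages(segments):
    """Teacher-directed and student-support stances as percentages of coded teacher responses."""
    counts = Counter(r for s in segments for r in s["teacher_responses"])
    directed = counts["telling"] + counts["recitation"]
    support = counts["modeling"] + counts["coaching"] + counts["feedback"]
    total = directed + support
    return pct(directed, total), pct(support, total)


# Invented observation: four 5-minute segments.
segments = [
    {"focus": "reading", "activities": {"m1"}, "teacher_responses": ["telling", "recitation"]},
    {"focus": "reading", "activities": {"m3"}, "teacher_responses": ["coaching"]},
    {"focus": "reading", "activities": {"m1", "m4"}, "teacher_responses": ["modeling", "telling"]},
    {"focus": "writing", "activities": set(), "teacher_responses": ["feedback"]},
]

higher_level = reading_segment_variable(segments, {"m3", "m4"})
directed, support = stance_percentages(segments)
print(round(higher_level, 1), directed, support)
```

Here two of the three reading segments include higher-level talk or writing about text, and the six coded teacher responses split evenly between the two stances.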

For the 10% sample of observations described above, the second expert reviewer agreed with the first about the codes which made up the variables used in the data analyses: 100% whole-group, 99% small-group, 95% vocabulary instruction, 91% phonemic awareness instruction, 91% phonics instruction, 94% coaching in word-level strategies, 96% asking lower-level questions, 82% asking higher-level questions, 100% comprehension skill instruction, 88% comprehension strategies instruction, 94% teacher-directed stance, 92% student-support stance, 95% active responding, and 97% passive responding.

Table 6. Description of Classroom Observation Categories Used in Data Analysis

| Variable | Description |
|---|---|
| Whole Class or Large Group* | All of the children in the class (except for one or two individuals working with someone else), or a group of more than 10 children. If there are 10 or less in the room, code this as a small group. |
| Small Group* | Children are working in two or more groups. If there are more than 10 children in a group, call this whole group. |
| Phonemic Awareness Instruction | |
| Phonics Instruction** | Students are focusing on symbol/sound correspondences (p1) or letter-by-letter decoding (p2) or decoding by onset and rime or analogy (p3), but this is not tied to decoding of words while reading. If students are decoding multisyllabic words, code as p4. The total number of phonics activities out of the total number of times that reading was coded at Level 3 was calculated. |
| Word Recognition Strategies** | Students are focusing on use of one or more strategies to figure out words while reading, typically prompted by the teacher. |
| Lower Level Text Comprehension (talk or writing about text)** | Students are engaged in talk (m1) or writing (m2) about the meaning of text that is engaging them in lower-level thinking. The writing may be a journal entry about the text, or may be a fill-in-the blank worksheet that focuses on the text's meaning (rather than on a comprehension skill or vocabulary words). The total number of "low level text comprehension" activities at Level 4 out of the total number of times reading was coded at Level 3 was calculated. |
| Higher Level Text Comprehension (talk or writing about text)** | Students are involved in talk (m3) or writing (m4) about the meaning of text that is engaging them in higher-level thinking. This is talk or writing about text that is challenging to the children, and which is at either a high level of text interpretation or goes beyond the text: generalization, application, evaluation, aesthetic response. Needless to say, a child must go beyond a yes or no answer (e.g., in the case of an opinion or aesthetic response). The total number of "high level text comprehension" activities at Level 4 out of the total number of times reading (as the major focus) at Level 3 was coded. |
| Comprehension Skill Instruction** | Students are engaged in a comprehension activity (other than a comprehension strategy) which is at a lower level of thinking (e.g., traditional skill work such as identifying main idea, cause-effect, fact-opinion). |
| Comprehension Strategy Instruction** | Students are engaged in use of a comprehension strategy that will transfer to other reading and in which this notion of transfer is mentioned (e.g., reciprocal teaching or predicting. If predicting were done, but transfer was not mentioned, this would be coded as "c"). |
| Vocabulary Instruction** | Students are engaged in discussing/working on word meaning(s). |
| Reading Text** | Students are coded as reading (not reading turn-taking) at Level 7. |
| Narrative Text* | The number of segments in which a narrative textbook (tn) or narrative trade book (n) was coded, out of the total number of coded segments. |
| Informational Text* | The number of segments during which an informational textbook (ti) or information trade book (i) was coded as being used, out of the total number of segments coded. |
| Teacher-Directed Stance*** | The teacher is coded as telling or giving children information or engaging them in recitation. The total number of times in which telling or recitation was coded was divided by the total number of responses that were coded at Level 6. |
| Student-Support Stance*** | The teacher is coded as coaching, modeling, or watching and giving feedback. The total number of times in which coaching, modeling, and watching/giving feedback were coded was divided by the total number of responses that were coded at Level 6. |
| Active Responding**** | Children are engaged in one or more of the following Level 7 responses: reading, writing, oral responding, and manipulating. The total number of "active responding" codes out of the total number of Level 7 responding codes was calculated. |
| Passive Responding**** | Children are engaged in one or more of the following Level 7 responses: reading turn-taking, oral responding turn-taking, listening. The total number of "passive responding" codes out of the total number of Level 7 responding codes was calculated. |
| Time on Task | At the end of the 5-minute note-taking segment, the observer took a count of the number of children in the room who appeared to be engaged in the assigned task out of all the children in the room. If a child was quiet but was staring out of the window or rolling a pencil on their desk, this was not counted as being on task. |

*Percent of time (5-minute segments) coded.

**Percent of all reading segments coded.

***Percent of all codes for teacher interaction.

****Percent of all codes for student responding.

Statistical analysis. Hierarchical linear modeling (HLM; Raudenbush, Bryk, & Congdon, 2000) was used to investigate the impact of school-level and classroom-level characteristics on students' reading growth. Descriptive analyses were also conducted to elaborate on the quantitative findings.

HLM is a method of completing regression at multiple levels. The analyses in this study employed a two-level HLM model in which students were nested within classrooms or schools. Schools and classrooms were analyzed separately because there were different numbers of students at the school level than at the classroom level, since students whose teachers had declined to participate in the classroom observations were still included in the assessments for the school-level analysis. The number of schools was also insufficient to obtain stable results from a three-level model, in which students are nested within classrooms and classrooms are nested in schools.

HLM essentially estimates a regression within each school or classroom and combines these to see if they point to a common regression across schools or classrooms. When regressions (either the intercepts or slopes) vary across schools, then we can examine the school-level or classroom-level characteristics that may explain such variation. This is a common method for evaluating school-level and classroom-level factors and their effects on student outcomes. A simple regression would be inappropriate in these situations, since it would violate the independence assumption.

HLM also partitions variance components across levels, providing an estimate of variance in student performance within and between classrooms or schools. An unconditional HLM is one without an explanatory variable; it allows us to answer the question: how much variance in student outcome can be attributed to systematic differences between classrooms and schools on specific factors? This analysis is equivalent to a random-effects analysis of variance. Estimations using HLM rest on assumptions similar to those of multivariate multiple regression.
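For readers who prefer notation, a two-level model of the kind described above can be written as follows. This is a sketch in conventional HLM notation and is not reproduced from the study: the symbols, the centering of the fall score, and the use of a single school-level predictor W_j (for example, a collaborative leadership score) are assumptions made for illustration.

```latex
% Level 1: student i within school (or classroom) j
Y_{ij} = \beta_{0j} + \beta_{1j}\,(\mathrm{Fall}_{ij} - \overline{\mathrm{Fall}}) + r_{ij},
\qquad r_{ij} \sim N(0, \sigma^{2})

% Level 2: school (or classroom) j
\beta_{0j} = \gamma_{00} + \gamma_{01} W_{j} + u_{0j}, \qquad u_{0j} \sim N(0, \tau_{00})
\beta_{1j} = \gamma_{10} + u_{1j}

% Fully unconditional model (no predictors) and the share of variance lying between units
Y_{ij} = \gamma_{00} + u_{0j} + r_{ij},
\qquad \rho = \frac{\tau_{00}}{\tau_{00} + \sigma^{2}}
```

In this notation, the between-unit variance is \tau_{00}, the within-unit (student residual) variance is \sigma^{2}, and \rho is the proportion of total variance attributable to differences between schools or classrooms.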

Because of the improved estimation enabled by HLM, including the use of maximum likelihood and empirical Bayes estimates, interpretation of statistical results can be broadened to include a larger p-value associated with statistical tests. Furthermore, statistical results with p-values at or near 0.10 should be included in interpretation and explored in further studies with smaller numbers of cases (e.g., with fewer teachers or schools) because such results indicate that there are relationships which merit further exploration. For a more complete description of estimation of HLM, see Bryk and Raudenbush (1992, pp. 32-56). HLM is recognized as a standard program for estimating multi-level models (Bryk & Raudenbush, 1992; Kreft & De Leeuw, 1998).

Results

Variations Across Schools in School-Level Characteristics

School effectiveness. Students' fall and spring scores are presented in Table 7. We began by running a two-level HLM analysis investigating the relationship between students' spring fluency scores and school effectiveness factors (see Table 8). This analysis was based on data from 877 students across all 11 schools (including the students from the three schools that were in the study during both Years 1 and 2). Our HLM analysis revealed that the school-level factor of collaborative leadership accounted for 29% of the variance in second- through sixth-graders' spring fluency scores (number of words read correctly in one minute), after accounting for fall scores. This means that of the total variation in students' scores, 71% could be attributed to differences between students after accounting for fall scores irrespective of school, and 29% could be attributed to between-school differences. Differences in collaborative leadership scores in turn accounted for 24% of between-school variance.

Across all schools, the mean school fluency score was 104.5 words correct per minute. One way of gauging the influence of collaborative leadership is to note that for every additional point scored on the collaborative leadership scale, a school's mean fluency score showed an increase of 27.4 words correct per minute. If we note that students increased their scores by an average of 20 words correct per minute per year (see Table 7) and that school scores on the collaborative leadership scale ranged from 1.1 to 1.9 with a mean of 1.7 (out of a high score of 3), then we can surmise that, at least in principle, a school gaining one additional point on the collaborative leadership scale could make up a year's worth of fluency performance.
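The arithmetic behind this comparison can be made explicit. The sketch below uses the reported coefficients (104.54 and 27.39) and the reported range of leadership scores (1.1 to 1.9). It assumes that the leadership score L_j enters the model centered on its mean, which is suggested by the intercept being labeled a grand mean but is not stated outright, and that fall scores are held constant.

```latex
% Predicted school mean fluency (words correct per minute), fall score held constant
\widehat{\mathrm{Fluency}}_{j} = 104.54 + 27.39\,(L_{j} - \bar{L})

% Difference implied by the observed range of leadership scores
27.39 \times (1.9 - 1.1) = 27.39 \times 0.8 \approx 21.9
```

Under these assumptions, the gap between the lowest- and highest-rated schools corresponds to roughly 22 words correct per minute, which is close to the average yearly gain of about 20.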

The HLM analysis of spring writing scores among second- through sixth-graders (see Table 9) revealed that the school-level factor of collaborative leadership accounted for 25% of the variance in students' spring writing after accounting for fall scores. Collaborative leadership scores accounted for 40% of the between-school variance.

Across all schools, the mean school writing score was 1.96 on a four-point scale. One way of gauging the importance of the 40% of between-school variance accounted for by collaborative leadership is to note that for every additional point on the collaborative leadership scale, a school's mean writing score showed an increase of .85. After including fall scores, HLM analyses investigating the relationship between school effectiveness scores or subscores (e.g., collaborative leadership) and students' spring reading comprehension were not significant.
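The variance percentages reported for fluency and writing can be checked against the variance components listed in Tables 8 and 9. The sketch below assumes that the percentage of variance between schools is the school-means component divided by the listed total (which includes the small fall-score slope component), and that the percentage accounted for by the model is the proportional drop in the school-means component. These formulas are inferred because they reproduce the reported figures; the paper does not state them.

```python
def percent_between(school_var, slope_var, residual_var):
    """School-means variance as a percentage of the total of the listed components."""
    total = school_var + slope_var + residual_var
    return 100.0 * school_var / total


def percent_explained(initial_school_var, final_school_var):
    """Proportional reduction in between-school variance after adding the predictor."""
    return 100.0 * (initial_school_var - final_school_var) / initial_school_var


# Variance components as reported in Table 8 (fluency) and Table 9 (writing).
fluency_between = percent_between(226.11, 0.01, 560.07)
fluency_explained = percent_explained(226.11, 172.46)
writing_between = percent_between(0.145, 0.029, 0.416)
writing_explained = percent_explained(0.145, 0.087)

print(round(fluency_between), round(fluency_explained))
print(round(writing_between), round(writing_explained))
```

Rounded to whole percentages, these calculations give 29 and 24 for fluency and 25 and 40 for writing, matching the values reported in the text and tables.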

Table 7. Means and Standard Deviations for Student Scores K-6*

Values are means, with standard deviations in parentheses.

| Assessment Tool | Grade | N | Fall | Spring |
|---|---|---|---|---|
| Letter Names | K | 191 | 26.88 (18.06) | 47.64 (7.81) |
| | 1 | 199 | 49.06 (6.02) | |
| Rhyme | K | 191 | 3.09 (3.22) | 5.39 (3.00) |
| | 1 | 200 | 5.23 (3.01) | |
| Phonemic Awareness | K | 191 | | 5.80 (4.58) |
| | 1 | 200 | 6.59 (4.59) | |
| Concepts of Print | K | 191 | 4.25 (2.58) | 6.84 (1.64) |
| | 1 | 200 | 7.29 (1.70) | |
| Word-Level Dictation | K | 195 | | 10.39 (8.82) |
| | 1 | 213 | 14.72 (9.58) | |
| Fluency | 1 | 188 | | 47.89 (32.00) |
| | 2 | 200 | 52.22 (31.74) | 81.30 (33.94) |
| | 3 | 189 | 76.51 (30.69) | 89.84 (31.55) |
| | 4 | 192 | 102.73 (36.59) | 120.13 (38.07) |
| | 5 | 191 | 123.23 (42.50) | 138.42 (43.15) |
| | 6 | 68 | 130.44 (37.08) | 129.40 (44.86) |
| Gates Comprehension (NCE) | 1 | 176 | | 48.16 (17.82) |
| | 2 | 214 | 40.05 (19.20) | 41.91 (19.41) |
| | 3 | 219 | 37.40 (14.03) | 37.86 (15.78) |
| | 4 | 190 | 35.68 (17.68) | 35.83 (17.13) |
| | 5 | 191 | 36.11 (18.83) | 38.75 (17.57) |
| | 6 | 68 | 41.98 (18.76) | 47.22 (16.22) |
| Writing | 1 | | | |
| | 2 | 211 | 1.61 (.60) | 1.89 (.72) |
| | 3 | 205 | 1.76 (1.55) | 1.92 (.73) |
| | 4 | 154 | 1.70 (.67) | 2.01 (.80) |
| | 5 | 176 | 1.96 (.70) | 2.18 (.79) |
| | 6 | 50 | 1.92 (.72) | 2.18 (.92) |

*Only three of 11 schools had sixth-grade students. However, one of these schools had been with the project for 2 years, so 4 cohorts of grade 6 students were included in the data analyses.

Table 8. Grades 2-6 Reading Fluency with Collaborative Leadership

| Initial random effects | Variance component | % Variance between |
|---|---|---|
| School Means | 226.11 | 29 |
| Fall Score Slope | .01 | |
| Student Residual | 560.07 | |
| Total | 786.19 | |

| Final random effects | Variance component | % Variance accounted for by model |
|---|---|---|
| School Means | 172.46 | 24 |
| Fall Score Slope | .009 | |
| Student Residual | 560.05 | |

| Final fixed effects | Coefficient | t-ratio | df | p-value |
|---|---|---|---|---|
| Intercept (Grand Mean) | 104.54 | 28.88 | 12 | .000 |
| Collaborative Leadership | 27.39 | 2.22 | 12 | .046 |
| Fall Score | .82 | 25.31 | 13 | .000 |

As reported earlier, all teachers in each building had completed a two-part self-study questionnaire during Year 2. The first part of the questionnaire dealt with teachers' perceptions of various building- and classroom-level factors within their school; the second part dealt with their opinions of the school's needs with regard to its change efforts. In the first part of the questionnaire, teachers rated each of 38 items on a scale from 1 (strongly disagree) to 5 (strongly agree). These items dealt with school change, climate, and leadership; professional development; schoolwide decisions about reading instruction; classroom reading instruction; reading interventions for struggling readers; and school-home-community connections.

Table 9. Grades 2-6 Writing with Collaborative Leadership

| Initial random effects | Variance component | % Variance between |
|---|---|---|
| School Means | .145 | 25 |
| Fall Score Slope | .029 | |
| Student Residual | .416 | |
| Total | .590 | |

| Final random effects | Variance component | % Variance accounted for by model |
|---|---|---|
| School Means | .087 | 40 |
| Fall Score Slope | .030 | |
| Student Residual | .416 | |

| Final fixed effects | Coefficient | t-ratio | df | p-value |
|---|---|---|---|---|
| Intercept (Grand Mean) | 1.96 | 23.67 | 12 | .000 |
| Collaborative Leadership | .85 | 3.02 | 12 | .011 |
| Fall Score | .35 | | 13 | .000 |

The school leadership ratings from our interviews were positively related to 12 of the 38 items from this first part of the questionnaire. The following is a list of the attributes which teachers saw as salient in cases where they perceived their schools to have strong leadership. The questionnaire items which were positively related to the interviews' school leadership ratings were:

Reform effort. We ran two-level HLM analyses investigating the relationship between students' spring comprehension, fluency, and writing scores and the school reform effort score for the total sample of 877 students in the 11 project schools (with data from three schools included separately across Years 1 and 2). HLM analyses investigating the relationship between the reform effort score and students' spring reading comprehension, fluency, or writing scores (after including the relevant fall scores) were not significant.

Using our coding of the 10 components of reform implementation, we were able to take a look at which schools were successfully implementing various factors. On the whole, we found that schools were having an easier time holding weekly study group meetings than they were holding monthly large-group meetings to share information across study groups and deal with schoolwide reform issues. Generally, schools were having an easier time meeting in grade-level groups than in cross-grade groups. Finally, schools were having an easier time reflecting on instruction and student work in study groups than they were focusing on research-based topics for periods of 3 months or longer. Most schools had not yet turned to the reform component of working with parents as partners (see Table 10).

Table 10. Reform Effort Ratings

| Reform effort variable | Percent of schools demonstrating this reform variable |
|---|---|
| Meeting 1 hour per week in study groups | 90% |
| Meeting in cross-grade study groups | 10% |
| Reflecting on instruction and student work | 90% |
| Considering research-based practices | 20% |
| Being guided by action plans | 30% |
| Sticking with substantive topics for 3-4 months or more | 20% |
| Meeting once a month as a whole faculty to share, etc. | 60% |
| Working on a plan to involve parents as partners | 40% |
| Effective use of external facilitator | 40% |

</