Decodable Texts for Beginning Reading Instruction: The Year 2000 Basals

James V. Hoffman
Misty Sailors
Elizabeth U. Patterson
The University of Texas--Austin

Over the past decade, basal textbooks have become a virtual lightning rod in the "reading wars" (Pikulski, 1997; Strickland, 1995): Should beginning reading instruction be literature-based or skill-based? Should the language in texts be highly literary or highly decodable? Both sides in the debate have used state textbook adoption policies as a leverage point for change (Hoffman, in press). Educators and politicians in Texas and California in particular have played significant roles in pushing early reading instruction from one extreme position to another through shifts in textbook adoption requirements (Farr, Tulley, and Powell, 1987). The textbook policy actions taken in Texas and California are more than isolated cases and more than a reflection of national trends. These actions are shaping a national curriculum for reading. Basal publishers target their product development toward these states, and the programs that are marketed successfully in Texas and California are the ones most likely to thrive, with minimal changes, in the highly competitive national marketplace.

We have been engaged in a study of the nature and effects of changes in the texts used for beginning reading instruction (Hoffman, McCarthey, Abbott, Christian, Corman, Dressman, et al., 1994; Hoffman, McCarthey, Elliott, Bayles, Price, Ferree, et al., 1998; McCarthey, Hoffman, Elliott, Bayles, Price, Ferree, et al., 1994), and have documented changes in basal reading programs that resulted from the state mandates in Texas for more literature-based teaching practices and materials. Further, we have described some of the ways in which these changes in the textbooks have influenced instructional practices.

We are continuing to explore the most recent changes in basal texts associated with Year 2000 requirements for reading textbooks in Texas. These changes constitute a dramatic reversal in perspective and priorities from adoptions during the previous two decades. Literature-based teaching principles and practices and the valuing of quality literature have been pushed to the background. In their place, we find a growing emphasis on vocabulary control that is tied to more explicit skills instruction. These events are clearly driven by state policy initiatives. Just as the Texas adoptions of the early 1990s required publishers to use authentic children's literature as the texts for beginning reading instruction, the Year 2000 Texas mandates imposed severe restrictions on vocabulary and required explicit skills teaching. This report focuses specifically on the Texas state basal reading adoption for the year 2000 and the impact of these new mandates on program features.

Historical Background and Current Trends in Basal Texts for Beginning Readers

To fully appreciate the magnitude of the changes that have taken place over the past two decades, it is necessary to briefly review the history of basals and their use in the United States over the past century. While the "one reading book per grade level" principle can be traced back to the mid-nineteenth century and McGuffey's (1866) readers, the term "basal" was not used to describe commercial programs until the early twentieth century (Hoffman, in press). In its early use, the term "basal" served less to identify an "approach" than to describe a commercial program that employed different readers for each grade level. Many of these early series used the term "progressive" in their titles, not to imply a "new approach" to teaching reading, but as a description of the leveled nature of the books in the program. It was the growing consensus surrounding the "look-say" method, popularized in basals in the mid-1950s, that led to the association of basals with a particular approach or philosophy. This consensus was epitomized in Scott Foresman's "Sally, Dick and Jane" readers (Smith, 1965). Repeated practice with the same small set of words was seen as the key to promoting decoding abilities. Vocabulary control was the primary factor used in the leveling of these texts, from the pre-primers through the primers and into the first readers.

In classrooms, basals were dominant through the 1950s and 1960s (Austin & Morrison, 1963). They were not revered in all quarters, however. In fact, traditional "look-say" basals came under severe attack in both the public press (Flesch, 1955) and the scholarly press (Chall, 1967). Most of these criticisms focused on the lack of attention to systematic phonics instruction.

Basals changed in the 1970s and 1980s. Helen Popp (1975) described the changes during this period in terms of an increase in vocabulary (a loosening of control) and in the number of skills taught. However, she lamented the mismatch between the skills taught and the words read. Rudolf Flesch (1981) was less generous, describing the changes as superficial and as leading the new basals even further off track than their predecessors had been in the 1950s. In his view, the "dirty dozen" (i.e., the dominant and most popular basals) continued to avoid explicit skills instruction and relied too heavily on sight word teaching. Flesch argued for the "fabulous four" (i.e., phonic linguistic programs, such as Lippincott's) that provided for explicit skills instruction, with practice in materials that required the reader to apply the skills taught.

By the mid-1980s, no group seemed willing to defend the status quo in basals; basal "bashing" was on the rise (e.g., Shannon, 1987). Basals were attacked from the "code emphasis" side as being unsystematic (Beck, 1981), and from the "meaning emphasis" side as trivial and boring (e.g., Goodman & Shannon, 1988). Adding fuel to the fire of criticism, national assessments continued to point out the failure of schools to meet the literacy needs of all learners--in particular, the failure of schools to meet the needs of minority children (Mullis, Campbell & Farstrup, 1993). Advocates for a literature-based approach to beginning reading instruction argued for expanded criteria (i.e., beyond vocabulary control and skills match) for judging the adequacy of texts to be used for beginning reading instruction (Galda, Cullinan, & Strickland, 1993; McGee, 1992; Wepner & Feeley, 1993). These expanded criteria included consideration of the quality of the literature, the predictability of the text structures, and the quality of the design. While some attempts, such as Rhode's (1979) criteria for predictable texts, were made to quantify these values into specific standards, more often the call for quality took the form of a call for more "authentic" literature. Operationally, "authentic" was interpreted by policymakers and program developers to mean that the literature used in basals must have first appeared as a published tradebook. Stories written "in-house" by basal authors and editors were discredited. The California and Texas adoptions in the early 1990s required basal publishers to attend to the quality of literature.

Our comparison of the literature-based basals (1993 editions) targeted for the Texas market to the skills-based basals (1987 editions) confirmed that the policy mandate for more quality literature in the texts for beginning reading had been successful (Hoffman, et al., 1994). Ratings on the engaging qualities of text, which focused on content, language, and design features, were found to be significantly higher for the literature-based basals. Further, our analysis revealed that predictability was being used far more often than in the past as a support for students reading challenging texts. However, lost in the enthusiasm for authentic literature was any systematic attention to the decoding demands of the texts. In fact, decoding demands increased dramatically with the new programs, and vocabulary control all but disappeared.

It became clear as we studied the implementation of these programs in Texas classrooms in the mid-1990s that many readers struggled with the challenge level of the materials (Hoffman, et al., 1998). This problem was particularly severe in schools serving large populations of "at-risk" students, especially at the start of the first grade year. In their 1996 adoption, the California legislature demanded that publishers attend to more explicit teaching of skills, but offered no specific requirements for the decodability of the text. Basal publishers responded. At the same time, there was an influx of "little books" specifically designed to support the development of decoding. Early on, these little books were imported directly from New Zealand, where they were used in association with the Reading Recovery program. Basal publishers in the United States began to produce similar materials to support decoding.

Menon and Hiebert (1999) analyzed the basal anthologies and little books (Martin & Hiebert, 1999) published during this period, using a computer-based text analysis program (Martin, 1999) to estimate the degree of decodability of the words presented (Figure 1). Following their procedures for estimating decodability, each word appearing in the text is classified on a scale from 1 (representing the easiest decoding demands--e.g., words with consonant/vowel/consonant patterns) to 8 (representing the most complex decoding demands--e.g., multisyllabic words and words with irregular phonic patterns). The average decodability rating for a text is the average decodability of the words presented. Despite the concerns expressed in this California adoption, Menon and Hiebert found little evidence of any systematic attention to decodability.

CIERA text analysis values for levels of decodability

Level | Pattern | Excludes | Example
----- | ------- | -------- | -------
1 | A, I; C-V | | A, I; Me, we, be, he, my, by, so, go, no
2 | C-V-C; V-C | No words ending in R or L | Man, cat, hot; Am, an, as, at, ax, if, in, is, it, of, on, ox, up, us
3 | C-C-V; V-C-C-[C]; C-C-[C]-V-C; C-V-C-C-[C]; C-C-[C]-V-C-C-[C] | No words ending in R or L; r-C or l-C (e.g., fort, mild) or V-gh (e.g., sigh) | She, the, who, why, cry, dry; Ash, itch; That, chat, brat, scrap; Back, mash, catch; Crash, track, scratch
4 | [C]-[C]-[C]-V-C-e | | Bake, ride, mile, plate, strike, ate
5 | C-[C]-V-V-[C]-[C]; V-V-C-[C] | No words ending in -gh (e.g., laugh, through, though) | Beat, tree, say, paid; Eat, each
6 | C-[C]-V-r; [C]-[C]-V-r-C; [C]-[C]-V-ll; C-[C]-V-l-C; C-[C]-V-V-l-C | | Car, scar, fir; Farm, start, art, arm; All, ball, shall, tell, will; Told, child; Could, should, field, build
7 | Diphthongs | | Boy, oil, draw, cloud
8 | Everything else | |
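
The averaging step described above can be sketched in a few lines. This is only an illustration, not the CIERA program itself: the lookup table is a hypothetical stand-in built from a handful of the example words listed in the figure, whereas the actual program classifies words by their letter patterns. Treating any word missing from the lookup as level 8 ("everything else") is an assumption of this sketch.

```python
# Hypothetical lookup: a few example words from the figure, keyed to their levels.
EXAMPLE_LEVELS = {
    "me": 1, "go": 1,
    "cat": 2, "in": 2,
    "the": 3, "that": 3, "back": 3,
    "bake": 4, "ate": 4,
    "tree": 5, "eat": 5,
    "car": 6, "ball": 6,
    "boy": 7, "cloud": 7,
}


def average_decodability(words, levels=EXAMPLE_LEVELS, default=8):
    """Mean level (1 = easiest, 8 = hardest) over every word token in a text."""
    if not words:
        raise ValueError("text has no words")
    # Assumption: words not in the lookup fall into level 8.
    ratings = [levels.get(word.lower(), default) for word in words]
    return sum(ratings) / len(ratings)


text = "The boy ate in that car".split()
print(average_decodability(text))
```

Because every token counts, a frequently repeated easy word pulls the average down more than a rare one does.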

 

 

Leveled Texts in Beginning Reading Instruction: A Theoretical Perspective

The historical trends in basals which led up to the Year 2000 adoption in Texas are only a single reference point for the current study. It is just as important to offer a theoretical reference point for understanding "leveled" texts in beginning reading and the role they play in the development of decoding skills. We use the term "leveled" to refer to texts that are graduated in difficulty or challenge level. The term "leveled text" is inclusive of both the traditional pupil texts found in basal reader programs and the many "little books" that are currently being marketed separately or in conjunction with basal reader programs (Roser, Hoffman, & Sailors, in press). The current study is grounded in a theoretical framework that draws attention to a set of key text factors promoting the acquisition of decoding skills (Hoffman, in press). This theoretical framework posits three major factors as important in the leveled texts used in beginning reading: instructional design, accessibility, and engaging qualities.

Instructional Design

The instructional design factor addresses the question of how the words in leveled texts' various selections reflect an underlying instructional strategy for building decoding skills. Certainly, Beck's (1981, 1997) writings, as well as the recent mandates for decodable text in the State of Texas, reflect a concern tied to instructional design. This valuing of instructional consistency and alignment of skills taught and words read is not the only perspective one might adopt when considering instructional design. A sight word or memorization perspective, for example, might emphasize repetition and frequency over alignment of skills. Hiebert (1998) has argued for the importance of text in providing for practice with words and within-word patterns as a critical force in the development of decoding abilities. Frequent "instantiations" of patterns in a variety of contexts support the development of automaticity and independence in decoding. These instantiations may be in the form of repeated high frequency words, or of repeated common rimes (e.g., -og, -ip). Text with a strong instructional design for beginning readers provides for repeated exposure to these patterns, starting with the simplest, most common, and most regular words, and then builds toward the less common, less regular, and more complex words. Hiebert and her colleagues have developed a software program--the CIERA TexT Analysis Program (Martin, 1999)--that assesses these qualities. The program produces a text analysis that identifies, for example, the number of different rimes and instantiations for each rime, and the repetition rate of high-frequency words. In addition, the program analyzes the proportion of unique words to total words, referred to as the "density" of the text--that is, the average number of words a reader would encounter before meeting a unique (i.e., new) word. Text that supports the development of decoding must attend to all of these factors. 
The key to evaluating the instructional design of a series of leveled texts rests on an examination of the underlying principles for the development of the program, as they interface with the words which students are expected to read in texts.
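
To make these measures concrete, the following minimal sketch computes a density figure and counts rime instantiations for a short passage. It is not the CIERA program: the tokenizer, the function names, and the rule that a word instantiates a rime when a vowel-free onset is followed by that rime are all assumptions made for illustration. Because the description of density can be read either as a ratio of unique to total words or as the number of running words per unique word, the sketch returns both.

```python
from collections import Counter

VOWELS = set("aeiou")


def tokenize(text):
    """Lowercase the text and keep only alphabetic word tokens."""
    cleaned = "".join(ch.lower() if ch.isalpha() else " " for ch in text)
    return cleaned.split()


def density(words):
    """Return (unique/total ratio, average running words per unique word)."""
    if not words:
        raise ValueError("text has no words")
    total, unique = len(words), len(set(words))
    return unique / total, total / unique


def rime_instantiations(words, rimes):
    """Count tokens made of a vowel-free onset (possibly empty) plus a target rime."""
    counts = Counter()
    for word in words:
        for rime in rimes:
            onset = word[: -len(rime)]
            if word.endswith(rime) and not (set(onset) & VOWELS):
                counts[rime] += 1
    return counts


words = tokenize("The frog sat on a log. The dog can hop and skip. Hop, dog, hop!")
print(density(words))
print(rime_instantiations(words, ["og", "ip"]))
```

Counting tokens rather than distinct words means that a repeated word such as "dog" adds to the tally each time it appears, which matches the idea of repeated exposure to a pattern.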

Accessibility

As evidenced in this historical review, the leveling of texts to provide for "small steps" in growth has been a primary focal point of debate. Traditional readability formulas--a quantitative estimate of text difficulty--have proven a less than satisfactory tool for differentiating texts at the early grade levels (Klare, 1984). Readability formulae are simply too atheoretical and quantitative to capture many important dimensions of decoding and fluency development. Accessibility, in contrast, considers both the degree of decoding demands placed on the reader to recognize words in the text and the "extra" supports surrounding the words, which assist the reader with identification, fluency, and, ultimately, comprehension. For the analysis reported in this study, accessibility in text is tied to two factors: decodability and predictability. Decodability is focused on the word level, and reflects the use of high-frequency words, as well as words that are phonically regular. Predictability refers to the surrounding linguistic and design support for the identification of difficult words (e.g., rhyme, picture clues, repeated phrases). Decodability and predictability can work in concert to affect the accessibility of the text. Like engaging qualities, decodability and predictability are challenging constructs to measure. However, we have again found that holistic scales, rubrics, and anchor texts lead to reliable leveling. Further, we have found that these scales have validity in relation to student performance (Hoffman, Roser, Salas, Patterson & Pennington, 2001).

Engaging Qualities

No theory of text, even one focused on the development of decoding abilities, can ignore issues of content and motivation. The construct of "engaging qualities" draws on a conception of reading that emphasizes its psychological and social aspects (Guthrie and Alvermann, 1998). Engaging text is interesting, relevant, and exciting to the reader. Three factors in the engagingness of text are represented here: content, language, and design. Content refers to what the author has to say. Are the ideas important? Are they personally, socially, or culturally relevant? Is there development of an idea, character, or theme? Does the text stimulate thinking and feeling? Language refers to the author's way of presenting the content. Is the language rich in literary quality? Is the vocabulary appropriate but challenging? Is the writing clear? Is the text easy and fun to read aloud? Does it lend itself to oral interpretation? Design refers to the visual presentation of the text. Do the illustrations enrich and extend the text? Is the use of design creative and attractive? Is there creative use of print? Of course, all of these factors are discussed with reference to an assumed audience of beginning readers. Higher levels of engaging qualities are associated with greater effectiveness in supporting the development of decoding. The measurement of these qualities is a formidable, but not impossible, task: we have achieved high levels of reliability in their coding by using a combination of rubrics, anchor texts, and training (Hoffman, et al., 1995). We have also validated these measures in relation to student preferences for text and found support for their salience (McCarthey, et al., 1994).

Whereas the presence of engaging qualities is viewed as a positive attribute for all leveled texts, the scaling for accessibility features and instructional design varies as an implied function of reader development. At the earliest levels, the optimal mix for accessibility may place fewer decoding demands on the reader while providing more support through predictable features. At higher levels, the decoding demands may increase while the amount of support offered through predictable features decreases. In the leveling of text, accessibility and instructional design must work together. Text that is highly accessible but does not push the reader to new discoveries is not useful in promoting automaticity (the instructional design factor). In contrast, text that pushes the reader into more complex patterns too quickly or haphazardly, without regard for accessibility, is of little help in promoting independence in decoding.

Two cautions are important before closing this discussion of leveled texts and decoding. First, our identification of the text factors that support decoding is not meant to devalue the role of the teacher. The three factors--instructional design, accessibility, and engaging qualities--are reference points for leveled text only. The text is a tool which helps the teacher and reader reach the goal of early reading development. The success of this effort depends directly on careful and responsive teaching. Second, these text factors for leveled text may not be useful for characterizing the optimal structure of other kinds of texts important to the classroom literacy environment (e.g., trade books, reference materials, content area textbooks). Instructional design, accessibility, and engaging qualities are factors that apply to leveled texts aimed specifically at the development of decoding skills and strategies. Leveled texts must work in concert with other texts and instructional experiences to promote independent reading.

The Year 2000 Texas Basal Adoption

While our research does not specifically focus on the policy formation activities surrounding the recent basal adoption in Texas, some background information is useful. Five publishers submitted complete K-3 basal programs in response to the Texas Textbook Proclamation of 1998 for the Year 2000 adoption. Stringent requirements were imposed on these programs for compliance with the state curriculum, and for the "decodability" of the words included in the pupil texts at the first-grade level. The decodable text construct called for in the Texas proclamation was significantly different from the construct represented in the holistic scales used in our previous research (Hoffman, et al., 1994) as well as in the research of Menon and Hiebert (1999). The construct applied in Texas was more closely aligned with the work of Beck (1981) and Stein, Johnson and Gutlohn (1999). This conception of decodability rests not so much on specific word (phonic) features as it does on the relationship between what is taught in the curriculum (i.e., the skills and the strategies presented) and the characteristics of the words read. Rather than ranging on a continuum from low to high decoding demands or complexity, the Texas definition yields a yes/no decision on the decodability of each word. Following this model, the word "cat" is decodable only if the initial "c," the medial short "a," and the final "t" letter/sound associations have been taught explicitly within the program skill sequence. A word like "together" might be defined as decodable if all of the "rules" needed to decode it had been explicitly taught prior to students' encountering it in the text. A word that is not decodable at one point in time may become decodable after new skills are taught.

Decodability, as defined by the Texas Education Agency, refers to the percent of words introduced that can be read accurately (i.e., pronunciation approximated) through the application of phonics rules that were explicitly taught in the program design prior to the student encountering the word in connected text. We will refer to this as the "instructional consistency" perspective. Within this perspective, the decodability of a word is determined by the instruction that has preceded the appearance of the word in a selection.

Originally, the standard applied in the Texas review process was that an average of 51% of the words in each selection should be decodable in those selections which the publisher had designated as decodable. This standard was drawn literally from the Texas Essential Knowledge and Skills (TEKS) requirement that a "majority" of words be decodable. Later, the state board of education raised the standard to 80% of the words for each selection deemed decodable by the publisher. The Board did not cite any research evidence in support of the 80% level of decodability; however, some have suggested that Beck's (1997) estimate of 80% decodable as a minimum was the basis for this prescription. Eventually, all five of the publishers met the 80% standard and their products were approved for use in the state (S. Dickson, personal communication, December 3, 1999).
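
The instructional consistency perspective can be illustrated with a small sketch. It is not the procedure used by the Texas Education Agency: as a simplifying assumption, each letter stands in for one letter/sound association, so a word counts as decodable when every one of its letters has already been taught. The function names and the sample selection are hypothetical.

```python
def is_decodable(word, taught):
    """Yes/no decision: every letter in the word has already been taught."""
    return all(letter in taught for letter in word.lower())


def percent_decodable(words, taught):
    """Percentage of word tokens in a selection that are decodable."""
    if not words:
        raise ValueError("selection has no words")
    hits = sum(is_decodable(word, taught) for word in words)
    return 100.0 * hits / len(words)


def meets_standard(words, taught, threshold=80.0):
    """Compare a selection against a percentage standard (80% by default)."""
    return percent_decodable(words, taught) >= threshold


# Hypothetical skill sequence: c, a, t, m, s taught first; d, o, g added later.
taught_early = set("catms")
taught_later = taught_early | set("dog")

selection = "cat sat mat cat dog".split()
print(percent_decodable(selection, taught_early))
print(is_decodable("dog", taught_early), is_decodable("dog", taught_later))
```

The same word yields a different answer depending on where it falls in the skill sequence, which is the feature that separates this definition from a scale based on word features alone.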

Research Questions

The research questions we address in this study are directly related to the requirements of the 2000 adoption in Texas. These questions also build on our previous work in this area.

Methodology

Many of the procedures followed in this study replicated those used in Hoffman et al.'s (1994) study, which compared the features of the 1987 basals (characterized as skill-based) with those of the 1993 basals (characterized as literature-based). For the current study, all of the texts from the first grade programs (2000 adoption) were entered into text files and analyzed for word-level features and vocabulary repetition patterns. Predictability, decodability, and engaging qualities were assessed by trained raters, who applied holistic scoring procedures and scales to the actual pupil text materials. This replicated the procedures followed in the study of the 1987 and 1993 adoption materials (Hoffman, et al., 1994).

In addition to our analysis of the 2000 basals, we also reanalyzed some of the data from the 1987 and 1993 basals to allow for comparisons across the three adoption periods. We limited our historical trends analysis using these comparative data to the three programs that have been part of all three of the most recent Texas adoption cycles (1987, 1993 & 2000).

Texas State-Approved Basal Programs for the Year 2000

The five basal programs are identified in this report through a letter identification system. This system keeps the focus on research variables, rather than program comparisons. The data are summarized in Table 1. Five factors should be kept in mind as these program descriptions are considered.

  1. Materials were included in this analysis if they were included in the "bid" materials. In other words, these are the materials that would be obtained directly through the state's plan for purchasing materials. Publishers may have provided additional materials either "free" of charge to school districts who adopted their program or as additional purchases, but these were not included in our analysis.
  2. Publishers had the option of designating which of the selections in their programs would be considered decodable, as a way of fulfilling the state criteria for decodability. These were the only selections analyzed by the Texas Education Agency. We analyzed all of the selections using holistic scales (Hoffman, et al., 1994) and the CIERA Text Analysis framework (Martin, 1999), as beginning readers will encounter many more texts in the basals than the subset analyzed by TEA.
  3. There were program changes made by the publishers after our cutoff date of December 1, 1999. These changes were made as TEA and publishers negotiated to achieve the 80% standard set by the state board. In some cases, additional materials were added to the programs in order to meet the state's criteria. We analyzed the materials that were distributed to school districts for adoption consideration, but which did not include these revisions.
  4. We analyzed all of the selections that were designated by the publisher for the student to read. If there was an indication that the teacher was to read the text to the students, then it was not included in our analysis.
  5. Most of the programs are divided into five levels; one consists of six levels. In an effort to increase comparability, we combined the fifth and sixth levels of Program C into a single Level 5.

We use the general term "anthologies" to describe the selections included in the student readers, and the term "little books" to describe the selections that appeared in ancillary reading materials. The format for the little books varied from program to program. In some programs, little books were bound books; in other programs, little books were to be constructed by the teacher from black-line masters.

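Basal Texts Analyzed

Program | Number of Selections | Number of Selections in "Anthologies" | Number of "Little Books" | % of Selections Identified by Publisher as Decodable | Comparison Data Available 1987 & 1993
------- | -------------------- | ------------------------------------- | ------------------------ | ---------------------------------------------------- | -------------------------------------
A | 101 | 51 | 50 | 95 | yes
B | 160 | 85 | 75 | 44 | no
C | 154 | 81 | 73 | 43 | no
D | 102 | 72 | 30 | 49 | yes
E | 100 | 100 | 0 | 49 | yes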

Data Analysis

We conducted three types of analyses, using the three major theoretical factors:

  1. All of the procedures for holistic analysis of the texts from the 1994 study were replicated. This analysis focused on the following five-point scales: Decodability (1 = low decoding demands to 5 = high decoding demands); Predictability (1 = high levels of support to 5 = low levels of support); and Text Engaging Qualities (with separate analytical scales for content, language, and design features). Raters on these scales were trained following the procedures in the 1994 study. Each selection in each of the five basals was rated independently by at least two members of the research team. Ratings that differed by only one point were averaged. Ratings that differed by more than one point were negotiated with a third rater. Inter-rater reliability on these scales was checked after training and after scoring of the texts. The agreement levels remained above 80 percent.
  2. In addition, we analyzed all of these same text files using the CIERA Text Analysis Program (Martin, 1999). This program yields data on average decodability of words, assigning a value of 1 to 8 (low to high complexity) to each of the words in the text. The program also yields information on word repetition, rime pattern frequency, and the frequency of rime instantiations in the text. A partial listing and description of the CIERA variables is presented in Appendix A.
  3. We have also included the results of the analyses conducted by the Texas Education Agency during their official review of the materials. The analysis of decodability rested on a comparison of the skills taught in the program with the phonic structure of each word in the text. Words were judged as either decodable or not decodable based on whether the skills which had been taught up to that time would yield a close approximation of the pronunciation of the word. Their analysis also yielded a "potential for accuracy" score on each selection. This score represents the sum of decodable words plus the words explicitly taught as sight words, divided by the total number of words. The description of the procedures followed by the Texas Education Agency is included in Appendix B.
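
Two of the scoring rules described in this list can be expressed as a short sketch. The function names are hypothetical, and the code only restates the rules as given: ratings within one point of each other are averaged, larger gaps require a negotiated score, and the "potential for accuracy" score adds decodable words to explicitly taught sight words and divides by the total number of words.

```python
def reconcile(rating_a, rating_b, negotiated=None):
    """Combine two raters' scores on a five-point holistic scale."""
    if abs(rating_a - rating_b) <= 1:
        # Identical ratings, or ratings one point apart, are averaged.
        return (rating_a + rating_b) / 2
    if negotiated is None:
        raise ValueError("ratings differ by more than one point; a negotiated score is required")
    # Larger gaps are settled by negotiation with a third rater.
    return negotiated


def potential_for_accuracy(decodable, sight_taught, total):
    """(decodable words + words taught as sight words) / total words."""
    if total <= 0:
        raise ValueError("total number of words must be positive")
    return (decodable + sight_taught) / total


print(reconcile(2, 3))
print(reconcile(1, 4, negotiated=3))
print(potential_for_accuracy(decodable=40, sight_taught=8, total=50))
```

Raising an error when a negotiated score is missing is a design choice of the sketch; it keeps a large disagreement from being averaged silently.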

Findings and Discussion

The data from this study reflect an analysis of over 100,000 words and over 600 selections from the 2000 basals, combined with a re-analysis of data from two previous adoptions. There are over 25 different variables derived from the holistic scales, the TEA analysis, and the CIERA Text analysis. The reporting of the data is guided by our two primary research questions. We will focus initially on describing the three major features of the texts for the Year 2000 basals as they relate to the designated "decodable" standards set by the state of Texas. We will then present the findings of an analysis comparing data from the 2000 basals to data from the previous two adoption cycles (1987 & 1993).

The Year 2000 Basals

Our analysis of the data for the Year 2000 basals focused on the three major factors which we had identified as theoretically important: instructional design, accessibility (decodability and predictability), and engaging qualities.

Instructional Design

This factor describes the importance of text that provides repeated practice with words and within-word patterns--features which are critical to the development of decoding abilities. Table 1 shows the range of the number of selections across the five levels for the five programs, from a low of 100 to a high of 160. The data reflect the breakdown of program selections that were designated as either decodable or non-decodable by their publishers. About half of the total number of selections across programs were labeled as decodable by the publishers (ranging from 30% in Program E to 96% in Program A). The total number of words found in the programs ranged widely, from 13,793 to 25,928. Total number of unique words ranged from 1,740 to 3,287. "Unique words" refers to the number of different words, and this was calculated within each program. Both the average number of words per selection, F (4,592) = 38.53, p < .001 (Table 2), and the average number of unique words per selection, F (4,592) = 62.64, p < .001 (Table 3), showed a statistically significant main effect related to program level. Both the average number of unique words and the average number of words per selection increase across levels. This finding suggests some attention on the part of the publishers to the instructional design factor, in the sense of providing for more practice with fewer words at the earlier levels. These averages are lower than those found in Menon and Hiebert's (1999) analysis of the basal anthologies submitted for the California adoption in the mid-1990s. They found averages of 170 words per selection and 75 unique words per selection. This difference could be explained by the influence of California's emphasis on more decodable text on the 2000 text adoption, or it could be explained by the fact that Menon and Hiebert's data include only the words appearing in anthologies--not little books or decodable books. When we look at our data for only the anthologies in our data set, the averages are 165 words per selection and 72 unique words

Basal Texts Analyzed

Program

Number of Selections

Number of Selections in "Anthologies"

Number of "Little Books"

% of selections Identified by Publisher as Decodable

Comparison Data Available 1987 & 1993

A

101

51

50

95

yes

B

160

85

75

44

no

C

154

81

73

43

no

D

102

72

30

49

yes

E

100

100

0

49

yes

Total Words (average per selection)

 

level 1

level 2

level 3

level 4

level 5

Combined

Program A

27.0

(16.6)

n = 24

86.1

(41.0)

n = 24

154.3

(58.4)

n = 21

245.5

(168.3)

n = 17

228.9

(142.2)

n = 15

134.3

(123.9)

n = 101

Program B

67.5

(57.9)

n = 38

144.7

(128.3)

n = 29

184.9

(134.8)

n = 29

223.7

(223.8)

n = 22

215.1

(201.4)

n = 42

163.0

(165.8)

n = 160

Program C

58.2

(52.8)

n = 29

86.6

(98.4)

n = 28

143.4

(92.7)

n = 24

246.7

(186.4)

n = 23

236.6

(175.0)

n = 50

162.7

(156.1)

n = 154

Program D

44.9

(21.6)

n = 22

109.4

(102.1)

n = 21

116.0

(106.1)

n = 11

228.2

(183.1)

n = 16

255.9

(216.8)

n = 32

160.8

(173.1)

n = 102

Program E

71.4

(28.2)

n = 19

78.8

(59.8)

n = 19

166.6

(123.0)

n = 21

210.0

(159.3)

n = 20

216.8

(173.3)

n = 21

151.1

(136.7)

n = 100

All

54.9

(45.0)

n = 132

103.2

(96.2)

n = 121

158.7

(108.5)

n = 106

230.8

(183.7)

n = 98

231.5

(186.7)

n = 160

155.9

(153.8)

n = 617

Total Unique Words (average per selection)

 

level 1

level 2

level 3

level 4

level 5

Combined

Program A

13.7

(8.5)

n = 24

44.3

(17.9)

n = 24

79.5

(22.2)

n = 21

108.9

(55.8)

n = 17

103.8

(51.5)

n = 15

64.0

(48.9)

n = 101

Program B

31.4

(28.5)

n = 38

61.0

(50.0)

n = 29

86.1

(61.0)

n = 29

96.9

(59.7)

n = 22

91.8

(58.4)

n = 42

71.5

(57.2)

n = 160

Program C

24.8

(15.6)

n = 29

33.1

(28.6)

n = 28

53.3(25.8)

n = 24

87.1

(55.5)

n = 23

97.0

(56.3)

n = 50

63.5

(51.7)

n = 154

Program D

23.7

(7.7)

n = 22

42.2

(26.0)

n = 21

57.0

(32.0)

n = 11

93.7

(58.6)

n = 16

99.6

(68.8)

n = 32

65.9

(56.6)

n = 102

Program E

37.4

(13.4)

n = 19

35.4

(13.5)

n = 19

67.1

(33.5)

n = 21

103.8

(59.2)

n = 20

98.7

(54.0)

n = 21

69.4

(48.9)

n = 100

All

26.3

(19.6)

n = 132

43.9

(32.8)

n = 121

70.6

(41.4)

n = 106

97.6

(57.1)

n = 98

97.0

(58.3)

n = 160

67.0

(53.1)

n = 617

per selection. These averages are still lower than the Menon and Hiebert findings, suggesting a modest drop in both numbers independent of the format issue.
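The per-selection word counts discussed above can be illustrated with a minimal sketch. The tokenizer (lowercased alphabetic runs with optional internal apostrophes), the function names, and the two sample selections are assumptions made for illustration; the counting rules actually used in the study are not specified here.

```python
import re

# Assumed tokenizer: lowercase alphabetic runs, allowing internal apostrophes.
WORD_RE = re.compile(r"[a-z]+(?:'[a-z]+)*")

def tokenize(text):
    """Split a selection into lowercase word tokens."""
    return WORD_RE.findall(text.lower())

def word_counts(text):
    """Return (total words, unique words) for one selection."""
    words = tokenize(text)
    return len(words), len(set(words))

def program_summary(selections):
    """Average the per-selection counts and pool unique words across a program."""
    counts = [word_counts(s) for s in selections]
    n = len(counts)
    pooled = {w for s in selections for w in tokenize(s)}
    return {
        "mean_total_words": sum(t for t, _ in counts) / n,
        "mean_unique_words": sum(u for _, u in counts) / n,
        "program_unique_words": len(pooled),
    }

# Hypothetical selections, not taken from any of the programs analyzed.
sample = ["The cat sat. The cat ran.", "A fat cat sat on a mat."]
print(word_counts(sample[0]))   # (6, 4)
print(program_summary(sample))
```

Note that the program-level count of unique words pools words across selections, so it is not the sum of the per-selection unique counts.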

Several other factors associated with the construct of instructional design showed a similar pattern across program levels. The percent of words following the CVC pattern differed significantly across program levels, F(4,612) = 50.35, p < .001, declining from a high of 68.7% at Level 1 to 47.8% at Level 5. The percent of unique rimes also differed significantly across program levels, F(4,612) = 70.08, p < .001, rising from 16.6% at Level 1 to 52.5% at Level 5. Finally, the average total instantiation of rimes differed significantly across program levels, F(4,612) = 9.394, p < .001, declining from 78.9 at Level 1 to 72.6 at Level 5. All three of these analyses suggest that the text is leveled in a way that reflects attention to the instructional design features that support decoding. There are fewer rimes, more common patterns, and more instantiations of these patterns at the earlier levels. Further analyses of these data reveal that the selections designated as decodable by the publishers reflect these patterns more than do the selections designated as non-decodable. The average percentage of CVC words for the designated decodable text was 64.5%, and for the designated non-decodable text was 50.3%, F(1,615) = 124.37, p < .001. The average percentage of unique rimes for the designated decodable text was 41.3%, and for the designated non-decodable text was 35.6%, F(1,615) = 7.121, p < .001. The average instantiation of rimes for the designated decodable text was 82.7, and for the designated non-decodable text was 70.0, F(1,615) = 182.26, p < .001. This pattern of differences between the designated decodable and designated non-decodable texts suggests that the decodable requirement may have increased the within-word regularity patterns in the text.
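For readers less familiar with the F statistics reported above, the following sketch shows how a one-way analysis of variance across levels produces an F value and its two degrees of freedom. The function name and the toy scores are hypothetical; the authors' actual software and model specification are not described here. With five levels and 617 selections, the degrees of freedom work out to 4 and 612, which matches the F(4,612) values in the text.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Toy data: three hypothetical "levels" with three selections each.
f_stat, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_stat, 2), df_b, df_w)  # 13.0 2 6

# Degrees of freedom for five levels using the level sizes reported in the tables.
level_sizes = [132, 121, 106, 98, 160]
print(len(level_sizes) - 1, sum(level_sizes) - len(level_sizes))  # 4 612
```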

Accessibility

This factor refers to the difficulty of the decoding demands placed on the reader to recognize words in the text, balanced by any "extra" support (e.g., surrounding words) that may assist the reader in successful word identification. The next set of tables offers data related to the decodability ratings generated by the CIERA Text analysis program (Table 4) and the Hoffman et al. (1994) holistic scale for decodability (Table 5). The scores on the CIERA measure of decodability can range from an average of 1 (simple/common/regular words) to 8 (less common/less regular/more complex words). The patterns for each level are described in Figure 1. The CIERA analysis's concept of decodability is focused on the within-word level only. The data in Table 4 reflect the patterns as distributed by program, program level, and by decodability vs. non-decodability as designated by the publisher. Average decodability across all of the five programs was 4.0 (with a range from 3.7 to 4.4). There was a statistically significant main effect for program level, F(4,592) = 39.83, p < .001; across the five programs the average decodability score increased from 3.5 at Level 1 to 4.5 at Level 5. The average across all programs for texts designated by publishers as decodable was 3.7, and for the text designated as non-decodable was 4.4. There was a statistically significant effect for designated decodable and non-decodable texts, F(4,5) = 43.87, p < .001. The difference in decodability was greatest at Level 1 (2.8 vs. 4.1) and smallest at Level 5 (4.3 vs. 4.6). These findings would suggest that the decodable text requirement had the desired impact in terms of the targeted text. By the CIERA measures, the texts are more decodable at the early levels, and the designated non-decodable text was indeed less decodable.
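As a quick consistency check on the averages above, the overall mean can be recovered from the two group means weighted by their selection counts. This is only an arithmetic sketch: the function name is hypothetical, and the counts (313 designated decodable, 304 designated non-decodable; 63 and 69 at Level 1) are taken from the "All" rows of the CIERA decodability table.

```python
def pooled_mean(groups):
    """Weighted mean of (mean, n) pairs."""
    total_n = sum(n for _, n in groups)
    return sum(mean * n for mean, n in groups) / total_n

# All levels: designated decodable vs. designated non-decodable.
print(round(pooled_mean([(3.7, 313), (4.4, 304)]), 1))  # 4.0

# Level 1 only.
print(round(pooled_mean([(2.8, 63), (4.1, 69)]), 1))    # 3.5
```

Both results agree with the reported overall average of 4.0 and the Level 1 average of 3.5.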

The data reported in Table 5 reflect our analysis of the Year 2000 basals, using the holistic decodability scale adapted from the Hoffman, et al. (1994) study (which ranged from a score of 1 for high frequency/phonically regular words to 5 for more difficult/phonically irregular words). Decodability at the program level ranged from 1.9 to 2.8, with an average of 2.4 across all five programs. Decodability averages increased across program levels, from 1.8 at Level 1 to 2.7 at Level 5. There was a statistically significant main effect for program level, F(4,607) = 30.17, p < .001. The average decodability across programs for those texts designated as decodable by their publishers was 1.8, and for the text designated as non-decodable was 2.8. There was a statistically significant difference between the designated decodable and non-decodable texts, F(1,615) = 176.22, p < .001. The largest discrepancies were at the earliest levels of the programs.

Table 4. Average CIERA Decodability by Publisher Designation

| Program | Designated Decodable or Non-Decodable | Level 1 | Level 2 | Level 3 | Level 4 | Level 5 | Averages | Combined (Dec + Non-Dec) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Program A | Dec | 3.0 (.6), n = 21 | 3.5 (.4), n = 23 | 4.0 (.4), n = 21 | 4.3 (.5), n = 17 | 4.3 (.5), n = 17 | 3.8 (.7), n = 97 | 3.8 (.7), n = 101 |
| Program A | Non-Dec | 3.3 (1.8), n = 3 | 3.8 (0), n = 1 | | | | 3.5 (1.5), n = 4 | |
| Program B | Dec | 2.5 (.6), n = 16 | 3.6 (.6), n = 25 | 4.2 (.5), n = 24 | 5.0 (.0), n = 2 | 4.6 (.1), n = 3 | 3.6 (.9), n = 70 | 4.2 (.9), n = 160 |
| Program B | Non-Dec | 4.3 (.8), n = 22 | 4.4 (.4), n = 4 | 4.7 (.4), n = 5 | 4.9 (.5), n = 20 | 4.7 (.5), n = 39 | 4.6 (.6), n = 90 | |
| Program C | Dec | 2.8 (.6), n = 5 | 3.0 (.8), n = 15 | 2.9 (.4), n = 12 | 3.6 (.2), n = 12 | 4.2 (.3), n = 22 | 3.5 (.7), n = 66 | 4.0 (1.0), n = 154 |
| Program C | Non-Dec | 4.1 (1.5), n = 24 | 4.2 (.8), n = 13 | 4.4 (.8), n = 12 | 4.8 (.6), n = 11 | 4.8 (.6), n = 28 | 4.5 (1.0), n = 88 | |
| Program D | Dec | 2.7 (.5), n = 15 | 3.2 (.2), n = 11 | 3.6 (.5), n = 7 | 3.7 (.3), n = 9 | 4.2 (3.), n = 8 | 3.4 (.7), n = 50 | 3.7 (.8), n = 102 |
| Program D | Non-Dec | 3.3 (.7), n = 7 | 3.4 (1.1), n = 10 | 3.9 (1.3), n = 4 | 3.8 (.5), n = 7 | 4.4 (.5), n = 24 | 4.0 (.9), n = 52 | |
| Program E | Dec | 3.3 (.6), n = 6 | 3.8 (.7), n = 6 | 4.2 (.2), n = 6 | 4.4 (.2), n = 6 | 4.7 (.4), n = 6 | 4.1 (.6), n = 30 | 4.4 (.8), n = 100 |
| Program E | Non-Dec | 4.3 (1.), n = 13 | 4.6 (1.0), n = 13 | 4.6 (.7), n = 15 | 4.7 (.6), n = 14 | 4.6 (.4), n = 15 | 4.6 (.8), n = 70 | |
| All | Dec | 2.8 (.6), n = 63 | 3.4 (.6), n = 80 | 3.9 (.6), n = 70 | 4.1 (.5), n = 46 | 4.3 (.4), n = 54 | 3.7 (.8), n = 313 | |
| All | Non-Dec | 4.1 (1.2), n = 69 | 4.2 (1.0), n = 41 | 4.4 (.8), n = 36 | 4.7 (.6), n = 52 | 4.6 (.5), n = 106 | 4.4 (.9), n = 304 | |
| All | All | 3.5 (1.1), n = 132 | 3.7 (.8), n = 121 | 4.1 (.7), n = 106 | 4.4 (.7), n = 98 | 4.5 (.5), n = 160 | 4.0 (.9), n = 617 | |

Interestingly, the CIERA and Hoffman scales both take into account all selections submitted by the publishers, and uncover a trend toward decreasing decodability requirements across levels, suggesting that beginning readers are being asked to make bigger leaps earlier in their movement toward reading independence. The TEA index for decodability reveals no such trend.

Table 5. Decodability Ratings from Hoffman, et al. (1994) by Publisher Designation

| Program | Designated Decodable or Non-Decodable | Level 1 | Level 2 | Level 3 | Level 4 | Level 5 | TEA Deco | Combined (Dec + Non-Dec) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Program A | Dec | 1.3 (.5), n = 21 | 2.1 (.7), n = 23 | 2.6 (.7), n = 21 | 2.3 (.8), n = 17 | 2.3 (.9), n = 15 | 2.1 (.8), n = 97 | 2.2 (.9), n = 102 |
| Program A | Non-Dec | 3.0 (1.0), n = 3 | 3.3 (2.5), n = 2 | | | | 3.1 (1.4), n = 5 | |
| Program B | Dec | 1.0 (.1), n = 16 | 1.8 (.7), n = 25 | 2.3 (.5), n = 24 | 2.0 (.0), n = 2 | 2.2 (.3), n = 3 | 1.8 (.7), n = 70 | 2.5 (1.0), n = 160 |
| Program B | Non-Dec | 2.6 (.9), n = 22 | 3.6 (.3), n = 4 | 3.1 (.4), n = 5 | 3.2 (1.0), n = 20 | 3.3 (.7), n = 39 | 3.1 (.9), n = 90 | |
| Program C | Dec | 1.6 (.2), n = 5 | 1.6 (.3), n = 15 | 1.5 (.0), n = 12 | 2.1 (.2), n = 12 | 2.1 (.2), n = 22 | 1.8 (.3), n = 66 | 2.4 (1.0), n = 154 |
| Program C | Non-Dec | 2.1 (1.2), n = 24 | 3.3 (1.1), n = 13 | 3.8 (1.1), n = 12 | 3.6 (.5), n = 11 | 2.5 (.4), n = 28 | 2.8 (1.1), n = 88 | |
| Program D | Dec | 1.1 (.3), n = 15 | 1.2 (.3), n = 11 | 2.0 (.3), n = 7 | 2.2 (.3), n = 9 | 2.1 (.2), n = 8 | 1.6 (.5), n = 50 | 1.9 (.7), n = 102 |
| Program D | Non-Dec | 1.6 (1.0), n = 7 | 1.9 (.9), n = 10 | 1.8 (.3), n = 4 | 2.3 (.3), n = 7 | 2.7 (.6), n = 24 | 2.3 (.8), n = 52 | |
| Program E | Dec | 1.3 (.4), n = 6 | 2.3 (.8), n = 6 | 2.7 (.5), n = 6 | 2.9 (.4), n = 6 | 3.1 (.3), n = 6 | 2.5 (.8), n = 30 | 2.8 (.9), n = 100 |
| Program E | Non-Dec | 2.5 (1.2), n = 13 | 3.0 (1.1), n = 13 | 2.9 (.7), n = 15 | 2.9 (.4), n = 13 | 3.4 (.3), n = 15 | 3.0 (.9), n = 69 | |
| All | Dec | 1.2 (.4), n = 63 | 1.8 (.7), n = 80 | 2.2 (.6), n = 70 | 2.3 (.6), n = 46 | 2.3 (.6), n = 54 | 1.8 (.6), n = 228 | |
| All | Non-Dec | 2.3 (1.1), n = 69 | 2.9 (1.2), n = 42 | 3.1 (1.0), n = 36 | 3.1 (.8), n = 51 | 2.9 (.7), n = 106 | 2.8 (.9), n = 389 | |
| All | All | 1.8 (1.0), n = 132 | 2.2 (1.0), n = 122 | 2.5 (.9), n = 106 | 2.7 (.8), n = 97 | 2.7 (.7), n = 160 | 2.4 (1.0), n = 617 | |

We computed a correlation matrix in order to compare these three measures of decodability. Our analysis included only data from those texts that were identified as decodable by the TEA analysis, since this was the only text from which a score was derived following the "have the skills been taught to decode the word" model. A substantial positive correlation was detected between the CIERA decodability measure and the holistic scale from the 1994 study (r = .64). This high correlation is not surprising, given that the two measures share a construct of decodability as a within-word feature of difficulty. However, there was essentially no correlation between the TEA measure and either the CIERA assessment (r = -.07) or the holistic scale (r = -.08). This lack of correlation suggests important differences in focus between a decodability measure tied directly to word features and a decodability measure tied to instructional consistency.
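The correlations reported above are Pearson coefficients computed over selections. A minimal sketch of that computation follows; the function name and the score lists are hypothetical and do not reproduce the study's data or its reported values.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around each mean.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical per-selection scores on two within-word decodability scales.
ciera_scores = [2.5, 3.0, 3.4, 4.1, 4.6, 3.8]
holistic_scores = [1.0, 1.5, 2.0, 2.4, 3.1, 2.0]
print(round(pearson_r(ciera_scores, holistic_scores), 2))

# A perfectly linear relationship gives r = 1.
print(round(pearson_r([1, 2, 3], [2, 4, 6]), 2))  # 1.0
```

Two scales that rank selections similarly produce a value near 1, while a value near 0, as found for the TEA measure, indicates that the scales order the selections in unrelated ways.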

The ratings for average predictability are presented in Table 6. Scores of holistic predictability (Hoffman, et al., 1994) range from 1 (most supportive) to 5 (least supportive). We found ratings ranging from 3.4 to 3.9 across programs, with an average of 3.7. There were no clear trends in predictability across program levels. The average score at Level 1 was 3.6, and at Level 5 was 3.8. Similarly, there was no clear difference between the predictability of designated decodable texts (average rating 3.6) and that of texts designated as non-decodable (average rating 3.8).

Table 6. Predictability Ratings from Holistic Scales by TEA Decodable

 

Designated Decodable
or Non-Decodable

level 1

level 2

level 3

level 4

level 5

TEA dec

Combined

Program A

Dec

3.8

(1.1)

n = 21

3.8

(.6)

n = 23

3.9

(.8)

n = 21

3.7

(.6)

n = 17

3.6

(.9)

n = 15

3.8

(.8)

n = 97

3.8

(.9)

n = 102

Non-Dec

4.3

(.6)

n = 3

4.3

(1.1)

n = 2

 

 

 

4.3

(.7)

n = 5

Program B

Dec

4.0

(.3)

n = 16

4.2

(.6)

n = 25

4.1

(.5)

n = 24

3.0

(.7)

n = 2

4.0

(.5)

n = 3

4.1

(.5)

n = 70

3.9

(.8)

n = 160

Non-Dec

3.3

(1.0)

n = 22

3.8

(1.2)

n = 4

3.4

(.8)

n = 5

3.8

(.9)

n = 20

3.9

(.7)

n = 39

3.7

(.9)

n = 90

Program C

Dec

3.2

(.7)

n = 5

2.7

(.5)

n = 15

3.0

(.3)

n = 12

2.7

(.4)

n = 12

3.0

(.7)

n = 22

2.9

(.5)

n = 66

3.4

(1.0)

n = 154

Non-Dec

3.6

(1.0)

n = 24

3.8

(1.2)

n = 13

4.0

(1.3)

n = 12

4.4

(1.0)

n = 11

3.6

(.8)

n = 28

3.8

(1.0)

n = 88

Program D

Dec

3.8

(.9)

n = 15

3.5

(.7)

n = 11

3.5

(.9)

n = 7

3.9

(.4)

n = 9

3.8

(.6)

n = 8

3.7

(.7)

n = 50

3.7

(.7)

n = 102

Non-Dec

3.7

(.6)

n = 7

3.6

(.4)

n = 10

2.9

(1.0)

n = 4

2.9

(.5)

n = 7

4.2

(.7)

n = 24

3.7

(.8)

n = 52

Program E

Dec

2.8

(1.3)

n = 6

3.3

(.8)

n = 6

3.5

(.5)

n = 6

4.3

(.3)

n = 6