“Decodable texts” are books written so that the words that appear in them conform to the phonics rules children are taught. So if children have been taught the rules for the correspondence between the letter n and the phoneme /n/, m and /m/, c and /k/, t and /t/, p and /p/, and s and /s/, plus the high frequency words “the” and “on,” they would read something like this (from Hiebert & Fisher, 2016):
Cam sat on the mat.
A man sat on the mat.
Nat sat on the mat.
Pam sat on the mat.
A tan cat sat on the mat.
Spat!
Riveting stuff, I know.
Decodable texts are very popular with advocates of intensive, systematic phonics instruction. The texts are supposed to provide “practice” to young readers in the phonics rules they’ve been taught.
Researcher Tim Shanahan gave them a (qualified) endorsement on his blog recently. Reading Rockets, a pro-phonics group, thinks they’re great. And they pop up in journalist Emily Hanford’s reporting on teaching reading. They are part of the “consensus” view supported by what on Twitter is known as the #ScienceofReading. States like California and Texas have spent millions and millions of dollars on decodable texts over the years.
What does the research say on the effectiveness of decodable readers? Do they improve reading comprehension, the ultimate goal of all reading instruction? Or do they just help kids become better decoders, which in and of itself does not lead to better comprehension?
Shanahan’s blog wasn’t much help. Most of the studies he cites are descriptive or correlational. He mentions one study (Connor, Morrison, & Katch, 2004) on the effects of phonics instruction, but that study doesn’t focus on decodable texts, and in any case has no reading comprehension measure. The only on-point experimental study cited was Jenkins, Peyton, Sanders, and Vadasy (2004), about which more below.
I found two reviews of the evidence on decodable texts, plus a few studies that came out after those reviews were published.
Take off your hat and sit on a mat, fans of Nan. It isn’t good news.
Mesmer (2000)
Mesmer (2000) provided one of the earliest research reviews of decodable texts. She referred to three experimental or quasi-experimental studies in support of their use.
The first and oldest study, Juel and Roper/Schneider (1985), found that at the end of first grade, the students using decodable texts did no better than a control group on tests of reading comprehension and vocabulary (Iowa Test of Basic Skills; phonics group: 78.5th percentile, basal group: 78.8th percentile). Even worse, the decodable group in Juel and Roper/Schneider did no better on a test of decoding ability by the end of the study (Bryant Test of Decoding Ability), even though better decoding is one of the primary reasons for using such texts.
The second study Mesmer discusses is Felton (1993), which is mostly just a summary of a previous study, Brown and Felton (1990). Brown and Felton compared the effects of two different reading programs on a group of first and second grade children.
One program had a “code” emphasis (Lippincott Basic Reading Program), and the other used what they termed a “context” emphasis curriculum (Houghton Mifflin).* The children had all been identified in kindergarten as being “at risk” for dyslexia, and all of them received some phonics instruction.
There is no specific mention of decodable texts in Brown and Felton (1990), but Mesmer (2000) claims the two reading programs “varied by approach to phonics instruction (explicit or implicit) and by text decodability” (p. 132).
After two years, the researchers found no differences on the standardized measure of reading comprehension, the Metropolitan Achievement Test (MAT). The code group did better on tests of word identification and decoding, as we would expect.**
The third study, Hoffman, Roser, Salas, Patterson, and Pennington (2001), is not an instructional intervention study at all. It merely attempted to see if measures of decodability and predictability in a text were related to reading accuracy and fluency under different conditions, such as “modeled reading” (teacher reads the story first) and “preview reading” (similar to “guided reading”). Reading comprehension was not measured.
The results in Hoffman et al. were mixed: the decodability of the text was positively correlated with accuracy (r = .21), but negatively correlated with fluency (r = -.21).
Cheatham & Allor (2012)
Cheatham and Allor (2012) did a database search to identify all the studies on decodable texts and reading. They could find only three (!) experimental studies that isolated the benefits of decodable texts from other variables. What did those studies find?
The first study was Juel and Roper/Schneider (1985), which we already covered. The second, and highest quality study of the lot, was Jenkins, Peyton, Sanders, and Vadasy (2004). All the subjects in Jenkins et al. were reading below the 25th percentile on a skills-focused standardized test (Wide Range Achievement Test), and scored on average at the 9th percentile on a word reading test. This was not a case of kids who had “moved beyond” initial stages of decoding. They therefore should have, according to the theory, benefited from more “practice.”
The study found no differences on a standardized decoding test (TOWRE Phonemic Decoding) between a tutored group of first graders using more decodable texts (85% decodable) and those using less decodable ones (11% decodable).
Jenkins and colleagues also included a “passage comprehension” test from the Woodcock-Johnson, another standardized test popular with pro-skill-building researchers. We now know that this test is in fact more a test of decoding ability than comprehension. But even on this test, the kids who read the more decodable texts did no better than their classmates with less decodable ones (Cohen’s d = 0.10).
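(For those keeping score at home: Cohen’s d is just the difference between two group means expressed in standard deviation units, so a d of 0.10 means the groups ended up about a tenth of a standard deviation apart, conventionally read as a negligible difference. A minimal sketch of the standard formula, assuming the usual pooled standard deviation:)

```latex
% Cohen's d: the standardized difference between two group means
% d = (mean_1 - mean_2) / pooled standard deviation
d = \frac{\bar{x}_1 - \bar{x}_2}{s_p},
\qquad
s_p = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}}
```

By Cohen’s own rough benchmarks, 0.2 counts as small, 0.5 as medium, and 0.8 as large, which is worth keeping in mind for the effect sizes reported below.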
The final study (Mesmer, 2005) found that kids with the decodable texts did better than kids without them, but mostly on tests of accuracy in reading texts aloud. No real reading comprehension test was given.
Cheatham and Allor conclude that “results were inconsistent across the studies,” which I think is rather an understatement.
More Recent Studies
I was able to find two more studies of decodable texts since Cheatham and Allor’s 2012 paper. Neither provides strong support for their use, at least not for the kind of decodable texts used in many commercial programs.
Hiebert and Fisher (2016), in a report available online, compared two different types of decodable texts with a group of English language acquirers.
One type was “Phonetic Regularity with Phonemes” (PRP), which is the kind of decodable text mandated by California and Texas. These follow the “Nan can fan” pattern, and are described by the researchers as “setting the standard” in beginning reading instruction. The study used PRP texts from the popular but scientifically dubious Open Court reading program.
The second type was also decodable text, but a kinder, gentler version called Phonetic Regularity with Rimes (PRR). PRR texts still have “consistent” letter-sound correspondences, but according to Hiebert (2017), are based “first and foremost on the principle of meaningfulness” (p. 122).
Hiebert and Fisher found that the more “meaningful” PRR texts did better than the widely-used PRP texts. Both the PRR and PRP groups did better than a control group on tests of reading fluency (number of words read correctly per minute). No separate test of reading comprehension was given. As is usually the case in these intervention studies, we have little idea what kind of instruction the control group students received.
The most recent study, published just last month, is Price-Mohr and Price (2019). They compared texts of “high decodability,” with sentences such as “Zon can see a man in a hat,” to texts of “low decodability,” with text such as “Zon thinks the scarecrow is a monster.” Kids were given tests of decoding ability as well as the York Assessment of Reading for Comprehension, a passage comprehension test.
The results weren’t even close. The kids who read the more interesting “low decodability” books did much better on the reading comprehension test, with a large effect size (d = .96). The low decodability group also did better on the decoding tests, again with a respectable effect size (d = .74) (taken from Table 3).
Overall, then, the evidence for decodable texts is weak to nonexistent. There’s no data to support their use in improving reading comprehension, and little to support their use even in improving decoding.
So much for the #ScienceofReading.***
P.S. After I posted this on Twitter, the Right to Read Project graciously conceded that well, no, there aren’t any studies that show decodable texts improve reading comprehension.
We’re not aware of any studies (or actually anyone) who claim decodable texts improve reading comprehension. Decodable texts support the development of reading fluency, a necessary precondition for strong reading comprehension.
— Right to Read Project (@right2readproj) November 21, 2019
I pointed out that four of the seven studies discussed above found no impact on decoding/fluency either, so there’s that. Now I’m awaiting word on the evidence that fluency training improves reading comprehension.
* This “context” group is sometimes referred to as being a “whole language” treatment. But Coles (2001) points out that the Houghton Mifflin series used had been previously criticized by whole language advocates as not reflecting whole language principles.
** Brown and Felton claim that the 0.3 mean grade equivalent difference, although not statistically significant, showed a “trend.” Since full results were not provided on the MAT, we have no way of determining effect sizes.
***A more recent review of text difficulty and reading by Amendum, Conradi, and Hiebert (2018) mentioned two additional experimental studies that used decodable texts. Vadasy and Sanders (2009) compared tutored students using decodable texts to whole-class instruction. It isn’t clear what texts were used in the control group. Since we know that tutoring has an independent effect on early reading achievement (Camilli, Wolfe, & Smith, 2006), there is an obvious confound in this study. In any case, the results of the two different comprehension measures used in Vadasy and Sanders were, as Amendum and colleagues note, mixed.
The second study cited by Amendum et al. was Cheatham, Allor, and Roberts (2014), but that study contained no reading comprehension measures. No significant differences were found on a decoding test (TOWRE) or on a reading fluency measure, with “small to negligible” effect sizes (p. 9).
References
Amendum, S., Conradi, K., & Hiebert, E. (2018). Does text complexity matter in the elementary grades? A research synthesis of text difficulty and elementary students’ reading fluency and comprehension. Educational Psychology Review, 30(1), 121-151.
Brown, I. S., & Felton, R. H. (1990). Effects of instruction on beginning reading skills in children at risk for reading disability. Reading and Writing, 2(3), 223-241.
Camilli, G., Wolfe, P., & Smith, M. (2006). Meta-analysis and reading policy: Perspectives on teaching children to read. The Elementary School Journal, 107(1), 27-36.
Cheatham, J. P., & Allor, J. H. (2012). The influence of decodability in early reading text on reading achievement: A review of the evidence. Reading and Writing, 25(9), 2223-2246.
Cheatham, J. P., Allor, J. H., & Roberts, J. (2014). How does independent practice of multiple-criteria text influence the reading performance and development of second graders? Learning Disability Quarterly, 37(1), 3-14.
Connor, C. M., Morrison, F. J., & Katch, L. E. (2004). Beyond the reading wars: Exploring the effect of child-instruction interactions on growth in early reading. Scientific Studies of Reading, 8(4), 305-336.
Felton, R. H. (1993). Effects of instruction on the decoding skills of children with phonological-processing problems. Journal of Learning Disabilities, 26(9), 583-589.
Hiebert, E. (2017). The texts of literacy instruction: Obstacles to or opportunities for educational equity? Literacy Research: Theory, Method, and Practice, 66, 117-134.
Hiebert, E., & Fisher, C. (2016). A comparison of the effects of two phonetically regular text types on young English learners’ literacy. TextProject Reading Research Report, 16-01.
Hoffman, J. V., Roser, N. L., Salas, R., Patterson, E., & Pennington, J. (2001). Text leveling and “little books” in first-grade reading. Journal of Literacy Research, 33(3), 507-528.
Jenkins, J. R., Peyton, J. A., Sanders, E. A., & Vadasy, P. F. (2004). Effects of reading decodable texts in supplemental first-grade tutoring. Scientific Studies of Reading, 8(1), 53-85.
Juel, C., & Roper/Schneider, D. (1985). The influence of basal readers on first grade reading. Reading Research Quarterly, 134-152.
Mesmer, H. A. E. (2000). Decodable text: A review of what we know. Literacy Research and Instruction, 40(2), 121-141.
Mesmer, H. A. E. (2005). Text decodability and the first-grade reader. Reading & Writing Quarterly, 21(1), 61-86.
Price-Mohr, R., & Price, C. (2019). A comparison of children aged 4–5 years learning to read through instructional texts containing either a high or a low proportion of phonically-decodable words. Early Childhood Education Journal, 1-9.
Vadasy, P. F., & Sanders, E. A. (2009). Supplemental fluency intervention and determinants of reading outcomes. Scientific Studies of Reading, 13(5), 383-425.