Frequency and Cognate Effects in Vocabulary Acquisition


Reynolds, B. (2016). The effects of target word properties on the incidental acquisition of vocabulary through reading. TESL-EJ, 20(3), 1-31.

Reading is the most powerful tool available to language acquirers for expanding vocabulary knowledge. Studies have found that while you have a low probability (around 5-15%) of picking up the meaning of an unknown word encountered a single time in a text, if you read enough, this small “pick-up” rate will, a bit like compound interest, yield large gains over time.

For the vast majority of words whose meanings are not acquired on a single encounter, a small part of the meaning is still typically picked up on each pass. The more times a word is seen, the higher the likelihood of acquiring it: this is the now well-established “frequency effect” for vocabulary acquisition.
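As a rough illustration of the frequency effect, suppose each encounter gives the same independent chance p of acquiring a word. This is a simplifying assumption for the sketch, not a model taken from the studies discussed here (which suggest that partial knowledge builds up across encounters), and the function name is hypothetical. Under that assumption, the chance of knowing a word after n encounters is 1 - (1 - p)^n:

```python
def prob_acquired(p: float, n: int) -> float:
    """Chance a word is acquired within n encounters, assuming each
    encounter is an independent trial with success probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    return 1.0 - (1.0 - p) ** n


# Compare the low (5%) and high (15%) single-encounter pickup rates.
for n in (1, 3, 10, 20):
    low, high = prob_acquired(0.05, n), prob_acquired(0.15, n)
    print(f"{n:>2} encounters: {low:.0%} to {high:.0%}")
```

Real acquisition is unlikely to follow this curve exactly, but the sketch shows why even a small per-encounter rate adds up with repetition.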

Reynolds (2016) looked at the effects of frequency, “cognateness,” and “patternedness” on the acquisition of unknown words in English while reading. He asked a group of native English speakers (N=20) and advanced ESL readers (N=32) to read a short novel, The BFG by Roald Dahl, over a period of two weeks.

Like the novels A Clockwork Orange and Things Fall Apart, which have been used in previous incidental vocabulary acquisition studies, The BFG contains a number of non-English words that are not explicitly defined in the text. In the case of The BFG, they are words invented by Dahl and used by the main character (the “Big Friendly Giant”) and his friends, called “giant words.”

Reynolds reported that there are 299 giant word types in The BFG. However, because he was interested in whether the “patternedness” of words would affect acquisition (that is, whether words that tend to occur with the same group of surrounding words would be acquired more readily), he did not include in his post-test any of the giant words that occurred only once. The subjects were tested on all 43 word types that occurred at least three times, as well as on six words randomly selected from among the word types that occurred twice (we are not told how many two-occurrence word types there were in total).

Reynolds gave his subjects a list of the 49 invented giant words in alphabetical order and asked the readers to recall as much as they could about their meanings. Only answers that included a correct definition and/or translation were counted. This sort of meaning recall measure is much more challenging than a meaning recognition test, where the reader is asked to select the correct meaning from a set of options (such as in a multiple-choice test).

The researcher found (as expected) that frequency of word occurrence was positively related to acquisition among native English readers (r = .33) and, even more strongly, among ESL students (r = .46). Even after controlling for the effects of frequency, “cognateness” was also positively related to acquisition of the giant words for both reader groups: Dahl’s made-up words that looked or sounded like real English words were acquired more easily than those that did not. “Patternedness” was not a significant predictor of acquisition after controlling for the effects of frequency.

Unlike previous studies that have reported on frequency and incidental acquisition, Reynolds did not provide a breakdown of pickup rate by frequency in his report, although he did include the raw data in an appendix (pp. 30-31). I have calculated the pickup rates from his data (Table 1), allowing us to compare his results with previous studies.

Table 1: Word Recall for Unknown Words by ESL Readers

Frequency of Occurrence    Number of Words    Percentage Acquired
> 20                       3                  60%
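For readers who want to work out pickup rates by frequency band from per-word data of this kind, here is a minimal sketch. The record layout, the band boundaries, and every number in it are invented for illustration; they are not Reynolds' data, and this is not necessarily how the figures in Table 1 were computed. The rate for a band is taken as correct recalls divided by possible recalls across all words in that band.

```python
from collections import defaultdict

# Hypothetical records: (word, occurrences in text, readers who recalled
# the meaning, readers tested). Invented values, NOT Reynolds' data.
records = [
    ("word_a", 2, 1, 10),
    ("word_b", 3, 2, 10),
    ("word_c", 8, 3, 10),
    ("word_d", 25, 6, 10),
]


def band(occurrences: int) -> str:
    """Assign a word to an illustrative (assumed) frequency band."""
    if occurrences <= 2:
        return "<= 2"
    if occurrences <= 5:
        return "3-5"
    if occurrences <= 20:
        return "6-20"
    return "> 20"


def pickup_by_band(rows):
    """Return {band: (number of words, proportion of correct recalls)}."""
    words = defaultdict(int)
    correct = defaultdict(int)
    possible = defaultdict(int)
    for _word, occurrences, recalled, tested in rows:
        b = band(occurrences)
        words[b] += 1
        correct[b] += recalled
        possible[b] += tested
    return {b: (words[b], correct[b] / possible[b]) for b in words}


for b, (n_words, rate) in pickup_by_band(records).items():
    print(f"{b:<6}{n_words:>3}{rate:>8.0%}")
```

Pooling responses within a band weights every reader-word pair equally; averaging per-word rates instead would give the same result only when every word is tested on the same number of readers.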

Reynolds’ subjects appear at first to have done better overall on meaning recall (21%) than Pellicer-Sánchez and Schmitt’s (2010) subjects did using a similar study design (recall in their study was 14%). Pellicer-Sánchez and Schmitt’s analysis, however, included single-occurring words, which would lower the overall estimate. Pellicer-Sánchez and Schmitt’s subjects also read a longer text (67,000 tokens versus 37,600) and were given four weeks to read their book rather than two. Both factors may have caused the differences in measured rates of vocabulary pickup, since the time between exposure to the unknown words and testing would have been greater in Pellicer-Sánchez and Schmitt’s study than in the present one.

Notice that while the general trend of “more occurrences = more acquisition” holds in Reynolds’ data, the relationship is not perfectly linear here or in other studies of this sort. Other factors (including “cognateness”) also influence acquisition of individual words. Given that the measure of acquisition used by Reynolds was rather conservative, however, these results are still impressive.

Excluding single-occurring words from the analysis was understandable given the goals of the study, but, as I noted above, even a low pickup rate for such words can lead to substantial, cumulative gains in vocabulary. That’s because a large percentage of unknown words in any given text occur only once (see McQuillan, 2016, footnote 4 on this point), so a small rate applied to that many words still yields a large number of words acquired. Reynolds’ data thus underestimate the impact of incidental vocabulary acquisition from reading.
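The arithmetic behind this point can be sketched in a few lines. The function name and the figure of 200 single-occurrence unknown words per book are hypothetical, chosen only to make the calculation concrete; the sketch also assumes a fixed pickup rate and no overlap in unknown words from one book to the next.

```python
def expected_words_gained(unknown_once_per_text: int,
                          pickup_rate: float,
                          texts_read: int = 1) -> float:
    """Expected number of meanings picked up from unknown words that occur
    only once, assuming a fixed pickup rate and no overlap in the unknown
    words across texts."""
    if not 0.0 <= pickup_rate <= 1.0:
        raise ValueError("pickup_rate must be between 0 and 1")
    return unknown_once_per_text * pickup_rate * texts_read


# Hypothetical reader: 200 single-occurrence unknown words per book.
for rate in (0.05, 0.15):
    for books in (1, 10):
        gained = expected_words_gained(200, rate, books)
        print(f"rate {rate:.0%}, {books:>2} book(s): about {gained:.0f} words")
```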

