Rethinking Tests of Vocabulary Size

Vocabulary size tests assume that people learn the most frequent vocabulary words first—but is frequency the only factor that matters?


How do you measure how many words a person knows? This might seem like a simple task at first glance. Yet it’s impossible to test every single word—by one measure, the average adult native English speaker knows about 20,000 word families, to say nothing of the individual words those families contain.1

Designers of vocabulary size tests generally assume that learners are more likely to know words that appear more frequently in the language. This makes logical sense, but is the correlation between word knowledge and frequency strong enough for tests to generalize a learner’s vocabulary size based on frequency alone? This is the question that BYU linguist Dr. Brett Hashimoto asks in his article “Is Frequency Enough?: The Frequency Model in Vocabulary Size Testing.”

The “Frequency Model”

Leading vocabulary tests, such as the Vocabulary Size Test (VST), work by dividing words into 1,000-word bands—one for the first 1,000 words, another for the next 1,000, and so on—based on which words appear most frequently in a corpus of language data. Test developers handpick a fixed number of words from each band to include on the test. The number of words a learner answers correctly on the test is then multiplied by a constant to extrapolate the total number of words the learner knows. This is called the Frequency Model.
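The extrapolation step described above amounts to simple arithmetic. The sketch below is a hypothetical illustration of that logic (the band size, items per band, and scores are assumptions for the example, not the VST’s actual scoring code):

```python
# Hypothetical illustration of Frequency Model scoring (not the actual VST).
# Assume the test samples 10 words from each 1,000-word frequency band.
WORDS_PER_BAND = 1000
ITEMS_PER_BAND = 10
MULTIPLIER = WORDS_PER_BAND // ITEMS_PER_BAND  # each item "stands for" 100 words

def estimate_vocabulary_size(correct_answers: int) -> int:
    """Extrapolate total vocabulary size from the number of correct items."""
    return correct_answers * MULTIPLIER

# A learner who answers 74 items correctly is credited with knowing 7,400 words.
print(estimate_vocabulary_size(74))  # 7400
```

Note what the multiplication quietly assumes: every sampled word is treated as a fair stand-in for the hundred words it represents, which only holds if words within a band are roughly equal in difficulty.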

It is intuitive that how often a word is used relates to whether a learner is likely to know it—people encounter words like “boring” and “house” more often than “quotidian” and “abode”—but is usage frequency the only important factor? If it is, then there should be a very strong correlation between the frequency of a word and how many people get it right on a vocabulary test. If the correlation is not strong, then there must be other factors besides frequency involved.

The Study

To test this, Dr. Hashimoto gave a vocabulary test to 403 adult learners of English who spoke 35 different native languages. Rather than handpicking the test items from frequency bands, he systematically included every tenth word from the first 5,000 most frequent words in the Corpus of Contemporary American English (COCA). He mixed these words with pseudowords not found in English, and participants had to answer whether a word was a real word or a pseudoword.
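The sampling design he used, taking every tenth word from a frequency-ranked list, can be sketched as follows (the word list here is a stand-in for COCA’s actual top 5,000, used only to show the mechanics):

```python
# Sketch of a 10% systematic sample from a frequency-ranked word list.
# The ranked list below is a placeholder, not real COCA data.
def systematic_sample(ranked_words, step=10):
    """Take every `step`-th word from a frequency-ranked list."""
    return ranked_words[::step]

ranked = [f"word_{rank}" for rank in range(1, 5001)]  # stand-in for COCA's top 5,000
items = systematic_sample(ranked)
print(len(items))  # 500 items, evenly spread across the frequency range
```

Unlike handpicking a few “representative” words per band, this design covers the whole frequency range at fixed intervals, which is what lets the study measure how difficulty actually varies with frequency.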

Dr. Hashimoto found that a word’s frequency does not strongly predict its difficulty. The difficulty of individual test items had significant variation that was not explained by frequency alone, suggesting a need to consider other variables.

He also identified other problems with the Frequency Model. First, sampling of words from each of the bands is arbitrary, and test developers bring their own subjective biases in deciding which words are “most representative” of words at that frequency level. Second, separating vocabulary into 1,000-word bands (or bands of any size) is just as arbitrary—by freely selecting a set number of words from each band, developers assume that all words within that band are equally difficult.2 The test data revealed significant variation in word difficulty within bands.

The Implications

Dr. Hashimoto’s aim with this study is not to construct an entirely new test that abandons frequency altogether, but to point to the need to consider variables other than frequency when designing vocabulary size tests. The list of possible variables is lengthy. For example:

  • Words that are long or difficult to pronounce may be learned later, even if they are more frequent.
  • Words that are similar to those in a learner’s first language and words that refer to concrete things that the learner interacts with often may be learned earlier, even if they are less frequent overall.
  • Demographic variables may also influence the vocabulary that learners acquire, and people learning a language for a specific purpose may know a large number of infrequent words in a specific field. For example, many Latter-day Saint missionaries develop massive vocabularies of abstract religious terms but may still struggle giving directions to the bakery. In such instances, frequency-based vocabulary tests make little sense.

“In short,” Dr. Hashimoto concludes, “modern vocabulary size tests have some potentially major issues with their current design, and efforts should be made in the future to improve them” (182). And the first step of these efforts, he argues, would be to involve variables other than frequency. Only then will researchers be able to accurately estimate vocabulary size—unless, of course, you have a dictionary, a highlighter, and a lot of time on your hands. 20,000 word families’ worth. ∎

Have thoughts or comments? Let us know. You can find the full article online at the citation below; make sure to be logged in to your library account (BYU students can log in through the university library). You can also find the article by searching your library’s online collection.

Portrait of Brett Hashimoto
Dr. Brett Hashimoto, Assistant Professor of Linguistics at Brigham Young University

Read the original article

Brett J. Hashimoto. "Is Frequency Enough?: The Frequency Model in Vocabulary Size Testing." Language Assessment Quarterly 18, no. 2 (January 2021): 171–187.

Abstract: Modern vocabulary size tests are generally based on the notion that the more frequent a word is in a language, the more likely a learner will know that word. However, this assumption has been seldom questioned in the literature concerning vocabulary size tests. Using the Vocabulary of American-English Size Test (VAST) based on the Corpus of Contemporary American English (COCA), 403 English language learners were tested on a 10% systematic random sample of the first 5,000 most frequent words from that corpus. Pearson correlation between Rasch item difficulty (the probability that test-takers will know a word) and frequency was only r = 0.50 (r2 = 0.25). This moderate correlation indicates that the frequency of a word can predict which words are known with only a limited degree of accuracy and that other factors are also affecting the order of acquisition of vocabulary. Additionally, using vocabulary levels/bands of 1,000 words as part of the structure of vocabulary size tests is shown to be questionable as well. These findings call into question the construct validity of modern vocabulary size tests. However, future confirmatory research is necessary to comprehensively determine the degree to which frequency of words and vocabulary size of learners are related.

Further Reading: Jeffrey Stewart, Joseph P. Vitta, Christopher Nicklin, Stuart McLean, Geoffrey G. Pinchbeck, and Brandon Kramer. "The Relationship between Word Difficulty and Frequency: A Response to Hashimoto (2021)." Language Assessment Quarterly 19, no. 1 (October 2021): 90–101.


1 Robin Goulden, Paul Nation, and John Read. “How Large Can a Receptive Vocabulary Be?” Applied Linguistics 11, no. 4 (1990): 341–363. Accessed May 23, 2022.

2 Hashimoto (2021) states, “Assuming that word 1 in a frequency list is as difficult as word 999 seems logically problematic. In the same vein, there is nothing intrinsic about word 999 in a word list that makes it automatically, necessarily, and significantly easier than word 1001” (182).