Assessing vocabulary

Post on 20-May-2015




How to assess vocabulary? This paper attempts to address some of the limitations found in vocabulary assessment in the EFL context in Indonesia.


Assessing Vocabulary

A paper assignment for

Language Testing and Evaluation

ENGL 6201

Ihsan Ibadurrahman (G1025429)


I. Introduction

Vocabulary is an essential part of learning a language, without which communication would suffer. A

message could still be conveyed somewhat without a correct usage of grammatical structure, but

without vocabulary nothing is conveyed. Thus, one may expect a communication breakdown when the

exact word is missing, even though the syntax is correct, as in “Would

it perhaps be possible that you could lend me your ___”. By contrast, with little grammar but

sufficient vocabulary, it would suffice to say “Correction pen?” (Thornbury, 2002). In the

context of ESL teaching, it would make sense for the teaching of vocabulary to take priority over

the teaching of grammar, especially given the growing use of the communicative approach, where limited

vocabulary is the primary cause of students’ inability to express what they intend to say in

communicative activities (Chastain, 1988, as cited in Coombe, 2011). Since vocabulary is the basic

building block of language, most second language learners begin their ambitious venture by

memorizing lists of words. For ESL learners, learning a language is essentially a matter of learning new

words (Read, 2000). Vocabulary is also closely tied to comprehension; it is generally believed that

vocabulary assessment may be used to predict reading comprehension performance (DeVries, 2012;

Pearson et al., 2007; Read, 2000; Thornbury, 2002). This implies that a sufficient vocabulary is

needed to comprehend a text fully (Nemati, 2010). Vocabulary is thus important both for communication and

comprehension, especially in reading.

Given the importance of vocabulary in language teaching, it makes sense to

assess it. Such measurement may inform teachers of how much vocabulary learning has taken

place in the class and whether the teaching has indeed been effective. Thornbury (2002) contends that

vocabulary tests also give teachers two added advantages, in that they provide beneficial backwash

and an opportunity for recycling vocabulary. Provided that students are informed in advance that

vocabulary is part of the assessment, students may review and learn vocabulary in earnest, thus creating

a beneficial backwash effect. The test also gives students a chance to recycle and use their previously

learned vocabulary in a new way (Coombe, 2011). However, despite its many benefits for

language teaching, vocabulary assessment does not receive the attention it deserves. Pearson et al.

(2007: 282) argue that vocabulary assessment is “grossly undernourished” and that, instead of striving for a

high standard of measurement, test designers have put far more effort into practicality, designing

tests for reasons of economy and convenience. This phenomenon exists in Indonesian

EFL contexts, where vocabulary assessment is merely a part of the reading test in the standardized

national examinations rather than a test in its own right, and takes the form of convenient multiple-choice

items (Aziez, 2011).

This paper aims to provide an overview of current practices in vocabulary assessment. The

paper also attempts to outline some recommendations on testing vocabulary and to show how they may be

relevant to EFL vocabulary teaching in Indonesian contexts. To achieve these aims, recent journal

articles dated 2005 onwards were gathered and studied. Books and other journal articles that are of

relevance to the study were also used to support the current journal articles. The paper begins by

describing the many facets of vocabulary assessment that teachers first need to take into account. It

then goes on to elaborate on some common types of vocabulary assessment. Where relevant, the

advantages and disadvantages of each test technique will also be discussed. The paper then

briefly reports what researchers have done on vocabulary assessment, including the new directions that

they have taken. Recommendations on how to assess vocabulary will then be presented and related to

the EFL contexts in Indonesia. Finally, the paper closes with a conclusion which summarizes the content

of the paper.


II. The dichotomies in vocabulary assessment

Before going into the details of the common types of vocabulary assessment, it is useful

to present the many facets of vocabulary assessment. The first thing that test designers need to do

is to decide which aspects of vocabulary they want to test. This is especially important in vocabulary

assessment because vocabulary is multi-faceted: words can be examined in many different ways,

not just for their meanings (Samad, 2010). These aspects are often viewed as binary oppositions, or

dichotomies. Thus, vocabulary can be assessed informally or formally, as part of a

larger test or in a vocabulary test of its own, and with a focus on learners’ passive vocabulary

or its active counterpart. Some of the many facets of vocabulary assessment found in the literature are

discussed as follows.

a. Informal versus formal

Formal vocabulary assessment refers to tests that are standardized and have been designed in

such a way that reliability and validity are ensured (DeVries, 2012). Tests of vocabulary can

sometimes be part of placement and proficiency tests that measure the extent of a learner’s vocabulary

knowledge. In proficiency tests such as the TOEFL (Test of English as a Foreign Language),

vocabulary is usually tested as part of a larger construct such as reading, where candidates are

tested on their vocabulary knowledge based on the context of a reading passage. Formal assessment

also includes achievement tests, which are typically administered at the end of a course and are

designed to measure whether words taught within the duration of a specific course have been

successfully learned.

Informal assessments, on the other hand, are not usually standardized, and are typically done as a

formative test, or a progress check to see if students have made progress in learning the specific

words that we want them to learn. Learning words is not something that can be done overnight.

Especially in second language learning, where there is less exposure to the words, learners need to

recycle the vocabulary from time to time through some kind of vocabulary revision activity. Such

activities are a form of informal vocabulary assessment, intended primarily to check whether learners have

progressed with their vocabulary learning (Thornbury, 2002). DeVries (2012) proposes

teacher observation during ongoing classroom activities as one of the most useful informal vocabulary

assessments. Observations may give teachers a first indication of whether or not words

have been grasped by learners, from which follow-up activities may ensue.

In sum, the distinction between informal and formal assessment relates to the nature of the test itself,

particularly the demands of the testing and the standard of the test. The next three

distinctions are proposed by Read (2000), who calls them the three dimensions of

vocabulary assessment.

b. Discrete versus embedded

Read’s (2000) first dimension of vocabulary assessment concerns the construct of the test

itself: whether it is independent of or dependent on other constructs. Where vocabulary is measured in

a test in its own right, the test is called discrete. However, when a test of vocabulary forms part of a

larger construct, it is called embedded. Using this first dimension, we can say that the progress check tests

found at the end of a unit in most course books fall into the former category, whereas

the TOEFL test mentioned previously clearly falls into the latter category.


c. Selective versus comprehensive

The second dimension deals with the specification of the vocabulary that is included in the test.

A vocabulary test is said to be selective when certain words are selected as the basis of vocabulary

measurement. Its comprehensive counterpart, on the other hand, examines all the words that are

spoken or written by the candidate. A selective vocabulary measure is typically found in most

conventional vocabulary tests, where the test designer selects the words to be tested in the

assessment, such as those found in TOEFL reading comprehension questions. A comprehensive

vocabulary measure is typically found in a speaking or writing test, where raters judge the overall

quality of the vocabulary rather than looking specifically at particular words.

d. Context-independent versus context-dependent

The last dimension concerns the use of context in a vocabulary test. If words in the test

are presented in isolation, without a context, the test is called context-independent, but when

test-takers must make use of the context in order to give the appropriate answer, it is called

context-dependent. In the former case, learners are typically asked to indicate whether they know

specific words or not. For example, the yes/no vocabulary checklist asks learners to tick the

words in the list that they know. In the latter case, learners must engage with the context in

order to come up with the right response in the test. For example, in a TOEFL reading passage, in

order to know which option is the closest synonym of the word, learners must refer to the text and

use the available context there.

e. Receptive versus productive

Another distinction to make in vocabulary assessment is whether we want to test learners’

receptive (passive) vocabulary or their productive (active) vocabulary. Receptive vocabulary is the vocabulary

needed to comprehend listening or reading texts, while productive vocabulary is the vocabulary that

learners use in writing or speaking. It is understood that learners have more receptive

vocabulary than productive vocabulary at their disposal. Knowing this distinction is crucial because

we certainly do not need to test whether learners can use every word; there are

words which we simply want the learners to be able to comprehend.

f. Size versus depth

The last distinction is one that has gained currency in research on vocabulary assessment

(Cervatiuc, 2007; Read, 2007). Size (or breadth) of vocabulary refers to the amount of vocabulary a

learner has, while depth refers to the quality of that knowledge. It is generally understood that knowing a

word does not simply entail knowing its meaning, but other aspects as well, such as its

pronunciation, part of speech, collocations, register, and morphological changes. This word knowledge

deepens through a gradual process of learning (Cervatiuc, 2007). A vocabulary depth test is thus

used to assess learners’ knowledge of some of these aspects of words. As Read (1999) puts it,

vocabulary size measures how many words learners know, whereas vocabulary depth deals with how

well they know these words.

The binary distinctions listed thus far suggest that we must be clear about our reasons for testing before

vocabulary tests are constructed. Nation (2008, as cited in Samad, 2010) proposes a matrix which

lists different reasons for vocabulary assessment along with their corresponding test formats. An

adapted version of the matrix is shown in the table below:


Reason for testing                | Useful formats and existing tests

To encourage learning             | Teacher labeling, matching, completion,

For placement                     | Vocabulary Levels test, Dictation level test, Yes/No, Matching, Multiple choice

For diagnosis                     | Vocabulary Levels test, Dictation level test, EVST-

To award a grade                  | Translation, Matching, Multiple-choice

To evaluate learning              | Form recognition, Multiple-choice, Translation,

To measure learners’ proficiency  | Lexical Frequency Profile, Vocabulary size test,

Table 1: Reasons for assessing vocabulary and their corresponding test formats (Samad, 2010: 78)

Some of the examples of the formats presented in the table will be given in the next section.

III. How vocabulary is assessed

This section briefly outlines some commonly used test formats in vocabulary assessment. The list

below roughly follows the chronological order in which they appeared in the testing of vocabulary. The first

four formats listed below were the earliest measures of vocabulary, which primarily ask learners to

demonstrate their vocabulary knowledge by labeling, giving definitions, and translating. Traditionally,

such assessment was done orally via an individual interview (Pearson et al., 2007). However, due to the

mass testing triggered by World War I, more reliable and practical scoring was needed. This gave birth to the

next two test techniques in the list: the Yes/No list and Multiple Choice Questions (MCQs). Research on

Second Language Acquisition (SLA) and reading soon changed the view of how words are learned. It

became a widespread belief that words are learned best when they are presented in context

(Thornbury, 2002). This view motivated more contextualized vocabulary assessments such as the

cloze-test. Next in the list is the assessment of the four skills (writing, speaking, listening, and reading), where

vocabulary is sometimes a part of the construct, and where the context is used to demonstrate

learners’ ability to use the words (active vocabulary).

a. Labeling

One of the most commonly used test techniques in vocabulary assessment is labeling, where learners

are typically asked to write down the word for a given picture.

Alternatively, a single picture can be used, and learners are asked to label parts of it. Although it

may be relatively easy to find a picture, especially with the growing mass of picture content

available on the net, this format is limited to what pictures can show, and thereby to testing concrete nouns

(Hughes, 2003).

b. Definitions

In definitions, learners are asked to write the word that corresponds to the given definition, as

illustrated below.


From Redman, S, Vocabulary in Use: Pre-intermediate & intermediate, p. 13, CUP. (2003)

A ____________ is a person who teaches in your class.

______________ is a loud noise that you hear after lightning, in a storm.

______________ is the first day of the week.

The definition format allows a wider range of vocabulary to be tested, unlike the labeling format, which is restricted

to concrete nouns. However, Hughes (2003) points out one issue with this kind of test: not all

words can be uniquely defined. To address this limitation, dictionary definitions may provide a

shortcut and spare us the headache of finding a clear-cut, unambiguous definition.

c. Translation

There are many different ways in which vocabulary can be measured using translation. Learners can

choose the correct translation in an MCQ, or simply be asked to provide the translation for each word

given in a list as follows:

Teacher __________ Taxi driver __________

Student __________ Librarian __________

Actor __________ Shop keeper __________

President __________ Professor __________

Note that the above example may also be reversed, asking learners to provide the English words from

the L1. One pitfall in using translation is that a word may have more than one meaning, and

therefore there may be more than one correct answer, which is an issue of reliability. However, the

use of context may help address this limitation. This can be done by adding sentences in which the

word to be translated is underlined. Another issue in translation is the assumption that the teacher

has knowledge of the students’ mother tongue (Coombe, 2011). It may also be noted that the

use of translation is regarded as somewhat controversial in the current trend in language education,

where the use of the mother tongue is discouraged (Read, 2000); learners should instead be given a


healthy dose of L2 exposure in the classroom (Harmer, 2007). However, a recent study by

Hayati and Mohammadi (2009) suggests that translation leads to longer retention of words

than another vocabulary learning technique, the ‘task-based’ approach, whereby learners are

asked to remember the definition, part of speech, collocations, and other aspects of a word

(referred to earlier as vocabulary depth). Their findings imply that translation may still have

its place in vocabulary assessment.

d. Matching

Another common vocabulary test is one where learners are presented with two columns of information,

and are asked to match a word in one column to an item in the other. Items in the left-hand

column are referred to as premises, and items in the other column are called options. The words can be

matched based on related meaning, synonymy, antonymy, or collocation, as exemplified in

the excerpt below:

Ur (1991) cautions against the use of matching, since learners can use the process of elimination

to answer even when they do not know the words in question. She thus recommends the use of more

options in matching.


From Redman, S, Vocabulary in Use: Pre-intermediate & intermediate, p. 13, CUP. (2003)

e. Yes/No list

The Yes/No format is particularly useful when we wish to test a large sample of items in a relatively

short time. This is achievable because in this format learners are only asked to mark the words

whose meaning they know. For this practical reason, the Yes/No format is typically used to

measure learners’ vocabulary size, as a large sample of items is needed in measuring size.

Put a tick (√) if you know what the word means.

__ Mayonnaise

__ Catastrophe

__ Belligerent

__ Distinctive

f. Multiple choice question

MCQs are among the most common test techniques in vocabulary assessment, especially in formal

tests (Coombe, 2011). MCQs consist of a stem and response options. What the learners do is simply

to find the one correct answer among the options. In a vocabulary test, MCQs can be used to assess

learners’ knowledge of synonyms, antonyms, meanings in context, or a range of English

expressions as shown in the excerpt below:


McCarthy and O’Dell, Academic Vocabulary in Use, p. 41, CUP. (2008)

Although MCQs are often criticized for the sheer difficulty of designing good items, the limited

number of distractors available, and the element of guessing, they nevertheless remain one of

the most popular vocabulary tests simply because of their practicality, versatility,

familiarity, and high reliability.

g. Cloze-test

The cloze-test, also known as sentence completion or gap-fill, is yet another common vocabulary

test, in which learners are asked to fill in the missing words in a text. It is relatively more demanding

than the previous test formats since learners must demonstrate their ability to use the word based

on the context provided in the text. The cloze-test comes in many forms. The first is the fixed-ratio cloze

test, in which every n-th word is deleted from a reading passage. The second form is

called the selective-deletion or rational cloze test, where instead of deleting words at a fixed ratio, the

test designers purposefully delete particular words from a reading passage. Another form of cloze-test

uses multiple-choice questions for the items. Thus, instead of

having to write down the answer, learners choose one correct response for every deleted word.

To eliminate the possibility of having more than one correct answer, it is desirable to provide the

first letter of each deleted word as illustrated in the following excerpt:

McCarthy and O’Dell, Academic Vocabulary in Use, p. 43, CUP. (2008)

In the excerpt above, the first letter serves as a clue as to what the answer should be. A more

extreme version of this is the C-test, where instead of giving the first letter of each deleted word,

the first half of the word is revealed. One main advantage of the cloze-test is that it is easy to write.

However, Read (2000) casts doubt on the use of the cloze-test as a true vocabulary measure. Since there

are many aspects to consider in answering a cloze-test item, the score may not reflect only the

learners’ lexical knowledge; it may instead be seen as gauging their overall proficiency in the

language, including reading ability.
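
The fixed-ratio deletion described above, together with the optional first-letter clue, can be illustrated with a short Python sketch. This is only a minimal illustration under stated assumptions: the function name make_cloze, the choice of n = 5, the four-underscore gap, and the sample passage are all hypothetical and are not taken from any published test.

```python
import re

def make_cloze(text, n=5, first_letter_hint=False):
    """Delete every n-th word of `text`; return (gapped_text, answer_key)."""
    gapped, answers = [], []
    word_count = 0
    for token in text.split():
        # Separate surrounding punctuation from the word itself.
        lead, word, trail = re.match(r"^(\W*)(.*?)(\W*)$", token).groups()
        if word:
            word_count += 1
        if word and word_count % n == 0:
            answers.append(word)
            # Optionally keep the first letter as a clue (hypothetical option).
            hint = word[0] if first_letter_hint else ""
            gapped.append(f"{lead}{hint}____{trail}")
        else:
            gapped.append(token)
    return " ".join(gapped), answers

# Sample passage chosen only for illustration.
passage = ("Humans have an innate ability to recognize the taste of salt "
           "because it provides us with sodium, an element which is essential to life.")
gapped, key = make_cloze(passage, n=5, first_letter_hint=True)
print(gapped)
print(key)  # ['ability', 'of', 'us', 'which']
```

Note that a fixed ratio deletes function words such as "of" and "us" as readily as content words, which is one way of seeing why the selective-deletion form lets the test designer target vocabulary more deliberately.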

h. Embedded test

As previously mentioned, an embedded vocabulary test is not a vocabulary test on its own but part

of a larger construct, such as that found in the testing of the four language skills (reading, listening, writing

and speaking). In such assessment, the rater judges the overall quality of learners’ vocabulary in a

given skill. In reading, the learner is mainly asked to work out the meaning of a word from the context of a

reading passage. In listening, vocabulary can be one part of a larger writing component in which

students’ knowledge of word spelling is assessed. Since writing and speaking are both productive

skills, vocabulary is given somewhat more weight. IELTS writing and speaking, for instance, include

‘lexical resource’ as one of the four marking criteria. Lexical resource refers to the quality of

learners’ vocabulary: whether, for example, word usage is appropriate, varied, and natural, or

incorrect, limited, and stilted.


IV. Research on vocabulary assessment

Read (2000) provides one of the most comprehensive historical accounts of research into vocabulary

assessment. He states that vocabulary assessment is a field of study to which not much attention has

been paid, particularly by researchers in language testing themselves. In the 1990s, most of the studies

were conducted by Second Language Acquisition (SLA) researchers, who might not have had an adequate

understanding of testing and measurement but needed vocabulary tests as research instruments in

order to validate their own findings. Basically, these SLA researchers were interested in using tests to examine

whether specific lexical strategies are fruitful in terms of vocabulary acquisition. The

four recurring topics that SLA researchers contributed to the field were systematic vocabulary learning,

incidental vocabulary learning, inferring the meaning of words from context, and communication

strategies. Systematic vocabulary learning looks at how one acquires vocabulary in a systematic, orderly

way. Incidental vocabulary learning concerns the extent to which learners absorb new words

incidentally over time. The third topic investigates learners’ use of context in working out the meaning of

unknown words. The last of these SLA topics deals with the range of communication strategies

used in situations where learners lack the vocabulary to express what they wish to say. Other key

contributors to earlier studies of vocabulary assessment are first language reading researchers. This

is primarily due to the consistent findings of a positive correlation between vocabulary knowledge

and reading.

Language testing researchers themselves, on the other hand, take an interest in the construction of vocabulary

tests and how they may cater for different testing purposes such as diagnosis, achievement and

proficiency, rather than in the processes of vocabulary learning and the effectiveness of each of these

processes as shown by some sort of vocabulary measurement. The twentieth century was

the period in which researchers in the field of language testing began to take an interest in

vocabulary assessment. The first research area that gained currency at that time was objective testing,

a kind of testing that does not require judgment in its scoring. Read (2000) also contends that the

most frequently used technique in objective testing is the multiple-choice question, which is favored

particularly in vocabulary assessment because: (a) synonyms, translations, and short

defining phrases lend themselves readily to the construction of distractors, (b) the source of

vocabulary to test is available thanks to the development of lists of the most frequent words in English, and

(c) objective vocabulary measurement can also be used to indicate overall language proficiency. The use

of MCQs means that vocabulary testing throughout the twentieth century was typically

selective, discrete, and context-independent. However, with the growing interest in using context

in vocabulary assessment, more and more tests, such as the cloze-test, use context in their construct.

As Read (2000) acknowledges, though, the dilemma of a context-dependent vocabulary measure is

that it becomes quite difficult to separate pure vocabulary knowledge from other skills

such as reading ability in the scoring.

Read (2007) continues the documentation of research in vocabulary assessment from his 2000

publication. He states that a growing trend in current research on vocabulary assessment is the

measurement of vocabulary size (or breadth) and depth. These two distinct vocabulary measures

will be briefly discussed in turn.

Vocabulary size is an area that has gained currency in second language vocabulary assessment.

But what is it that makes it worth studying? As pointed out by Nation and Beglar (2007), measuring

learners’ vocabulary size is important for several reasons. First, it may inform teachers of how well their

learners can cope with real-life, authentic tasks such as reading newspapers and novels, watching movies, or

speaking in English. Data on the vocabulary size needed to perform each task are available, so by testing

learners’ current vocabulary size, teachers can estimate how close their learners are to performing

these tasks. Secondly, such measurement is needed to track the progress of learners’ vocabulary. And

lastly, vocabulary size can also be used to compare non-native speakers with native speakers, giving an

indication of how close learners are to achieving the size of native speakers’ vocabulary.

In measuring vocabulary size, researchers have used word lists as a source to assess the number of words

a learner knows. This is more feasible now than ever before due to the growing use of computer corpora,

which can provide good-quality word lists. The first step researchers must take in measuring

vocabulary size is thus to choose an available word list. After that, words are selected from the list,

and finally a suitable test technique is chosen. One commonly used test for assessing vocabulary size is

the Vocabulary Levels Test (VLT), designed by Paul Nation, which has been used as a second language

diagnostic test (Cervatiuc, 2007). The test is available online, and it has

both receptive and productive versions, which can be used to measure learners’ passive and active

vocabulary respectively.
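
The three steps just described (choosing a word list, selecting words from it, and choosing a test technique) can be sketched in Python. This is an illustration only and rests on several assumptions: the ten-word list is a made-up stand-in for a corpus-based list, the Yes/No format is used as the test technique, and the estimate simply scales the proportion of ticked words up to the size of the list. That scaling is an assumption of this sketch, not the scoring procedure of the Vocabulary Levels Test.

```python
import random

def sample_items(word_list, k, seed=0):
    """Step 2: draw k test words at random from the chosen word list."""
    rng = random.Random(seed)
    return rng.sample(word_list, k)

def estimate_size(items, known, list_size):
    """Scale the share of sampled words marked as known up to the whole list."""
    ticked = sum(1 for word in items if word in known)
    return round(ticked / len(items) * list_size)

# Step 1: a hypothetical word list (a real one would hold thousands of entries).
word_list = ["teacher", "student", "thunder", "mayonnaise", "catastrophe",
             "belligerent", "distinctive", "consume", "innate", "sodium"]

# Step 3: Yes/No format, where the learner ticks the words he or she knows.
items = sample_items(word_list, k=5)
known = {"teacher", "student", "thunder", "mayonnaise", "consume"}  # hypothetical answers
print(estimate_size(items, known, list_size=len(word_list)))
```

For example, a learner who ticks two of four sampled words drawn from a 4000-word list would be estimated, under this simple scaling, to know about 2000 words from that list.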

In contrast to vocabulary size, relatively little progress has been made in research on

the depth of vocabulary. This is due to the lack of an agreed definition of ‘depth’ and of a construct for

developing such a test. It is generally understood that knowing a word does not simply mean knowing its

definition. The fact that there are many aspects of a word a learner must know, such as its pronunciation,

spelling, morphological forms, part of speech, and collocations, means that there are quite a lot of things

to measure, and there is little agreement on which ones should constitute a learner’s depth of vocabulary

(Read, 2007). As such, it is relatively more difficult to construct this kind of test. However, one widely

acknowledged vocabulary depth test is the Word Associate Test (WAT), designed by John Read in 1993

(Cervatiuc, 2007). As its name suggests, learners’ vocabulary is measured using word associations

such as synonyms, collocations, and related meanings. Typically, the WAT measures how well learners know

words by asking them to tick the four out of eight possible options that have these associations, as exemplified below.

Read (1998), Word Associate Test, taken from <>


V. Assessing vocabulary in Indonesian EFL context

Drawing on my personal experience as a student learning English in Indonesia as a compulsory subject

for six years, and as a teacher of English in a senior high school in

Bandung for seven years, vocabulary assessment in Indonesia seems to me to leave a lot to be desired. There

are two main issues that contributed to my painful experience of being assessed and of assessing

vocabulary, namely the nature of teaching and learning in school, and the tough national

examination. These two issues will be elaborated in turn.

From the time I was a student learning English in junior and senior high school until I became an English

teacher myself, vocabulary learning has remained unchanged. Typically, learners are asked to read a passage

from a book, followed by a comprehension exercise which may include some vocabulary exercises.

Usually, words from these exercises are recycled only once, in the review unit. After that, the words

that learners have learned are never repeated; they seem to vanish into thin air. As Thornbury (2002)

recommends, learners need to encounter a word at least eight times for it to ‘stick’ in their mental

lexicon. This suggests that teachers should incorporate informal vocabulary assessment, as

mentioned previously in this paper, in their teaching so that these words get recycled and used

meaningfully in different ways, and are eventually stored in long-term memory.

Another sad fact about vocabulary assessment in the Indonesian context is that the words

learned in class do not even appear in the national examination, an achievement test taken at the end of

the school year as a requirement for graduation. These words seem to be used as a kind of reading

exercise to get students used to one component of reading skill, which is to guess meanings from

context. Therefore, vocabulary testing is largely used as a means to an end, rather than as an end in

itself. Using Read’s (2000) dimensions of vocabulary testing, then, the national tests in Indonesia have

mainly been embedded and selective. Below is an example of an item taken from the 2010 English

national examination.

Taken from Ujian Nasional 2009/2010, Kementrian Pendidikan Nasional

The above sample typifies the test as a context-dependent vocabulary measure, whereby the

word inhabitant is not presented in isolation but used in a sentence taken from a text. As Read (2000)

points out, this might come from the assumption that words never occur by themselves but form

an integrated part of a whole text. However, a closer look at the item reveals that it might not be

context-dependent in the true sense. Test-takers who attempt to answer this question might respond C.

Citizens without necessarily having to read the whole text in order to come up with that answer. It must

be highlighted that the key element of a context-dependent question is that learners must engage

with the context in order to give an appropriate response. As a comparison, below is a context-dependent

item as illustrated by Read (2000):

Humans have an innate ability to recognize the taste of salt because it provides us with sodium, an

element which is essential to life. Although too much salt in our diet may be unhealthy, we must consume

a certain amount of it to maintain our wellbeing.

What is the meaning of consume in this text?

A. Use up completely

B. Eat or drink

C. Spend wastefully

D. Destroy

In contrast to the previous sample, all four options in the above item are possible variant

meanings of consume, which puts learners in a position where they must read the

text and use the available context to select the correct response, which is B. Eat or drink.

The English National Examination (ENE) has 50 multiple-choice items altogether: 15 listening

comprehension questions and 35 reading comprehension questions. However, not all of these 35

reading questions pertain to vocabulary assessment; in the 2010 ENE, only 5 questions, or 10% of all

items, assess vocabulary knowledge. The large proportion of reading in the test seems to indicate an

emphasis on reading skills. This might be one of the ways the government invests in developing a

culture of reading, as stipulated in article three of the National Educational Law of July 2003 (UNESCO,

2011). Such emphasis is also desirable because, when students enroll in university, they are expected to

read English textbooks, which make up 80% of the required reading (Nurweni & Read, 1999). However,

these reading texts are deemed to be too difficult for the

students to comprehend. In relation to the vocabulary size mentioned previously, learners must reach at

least the 4000-word level in order to attain 95% coverage of a text, on the assumption that the remaining

5% is the maximum proportion of unfamiliar words a reader can tolerate (Laufer, 1989 as cited in

Aziez, 2011; Nurweni & Read, 1999).


Taken from Read (2000: 12), Assessing Vocabulary.

With 95% coverage, learners may still be able to comprehend a

200-word reading passage with 10 unknown words present. A recent study suggests that, at the

4000-word level, learners can cover 95.96% of the words occurring in the 2010 senior high school ENE

and, surprisingly, a marginally lower 95.80% in its junior high school counterpart (Aziez, 2011).

This means that reading texts in both junior high and senior high school national exams belong to the

same word level, which further suggests that test designers have not fully considered the vocabulary

load of these two different levels of high school education. Even more surprisingly perhaps, Nurweni

and Read (1999) reveal that most senior high school students entering university do not even

come close to the required 4000-5000 word level, meaning that dealing with these reading passages is

arduous work. A difficult reading passage is also said to affect test reliability, which

refers to the degree of consistency and accuracy of a test. As pointed out by Samad (2010), a test that

contains difficult reading passages might introduce errors, which in turn affect the accuracy of

one’s true score.
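
The coverage figures discussed above rest on simple arithmetic, which can be illustrated with a minimal sketch in Python. The function names and the set known_words are hypothetical and are not taken from any of the cited studies. The sketch also matches words by surface form only, whereas the studies cited count words against word-level lists, so it should be read as a rough illustration rather than a replication of their method.

```python
import re


def lexical_coverage(text, known_words):
    """Return the share of running words in `text` that appear in `known_words`.

    `known_words` is a hypothetical set of lowercase word forms standing in
    for a learner's vocabulary (for example, a 4000-word-level list).
    """
    # Split the text into lowercase word tokens, keeping simple contractions.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in known_words)
    return known / len(tokens)


def unknown_word_allowance(passage_length, coverage=0.95):
    """Return how many running words may be unknown at a given coverage level."""
    return round(passage_length * (1 - coverage))


# A 200-word passage read at 95% coverage leaves room for 10 unknown words.
allowance = unknown_word_allowance(200)

# Toy example: five of the six running words are in the known-word set.
sample_coverage = lexical_coverage(
    "The cat sat on the mat", {"the", "cat", "sat", "on"}
)
```

Under these assumptions, a teacher could run an exam passage against a word list to estimate whether it falls below the 95% threshold before using it in class.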

To sum up, vocabulary measurement in senior high schools in Indonesia largely employs

embedded, selective, and context-dependent tests in the form of MCQs. It is embedded in the sense that

vocabulary measurement forms part of a larger test of reading skill, and thus measures only learners’

receptive (or passive) vocabulary. It is selective since the words to be tested are chosen by the test

designers, and context-dependent since the word is not presented in isolation. However, the test is

not context-dependent in the full sense, since learners do not need to engage with the context in

order to come up with the correct response (Read, 2000). To overcome this and some other problems

mentioned previously, the following suggestions may be helpful:


1. The vocabulary construct should be revamped so as to reflect a true context-dependent

vocabulary measure, in which all the options are possible variant meanings of the word,

forcing the learner to make use of the context in order to come up with the correct response.

2. Teachers need to make use of informal vocabulary assessment so that words get recycled

at least eight times. This can be done in various ways, such as running a 10-minute review

game at the beginning of every class to recycle and use these words in different ways.

Although these same words may not come up in the national exam, teachers can still use them to

train students in the reading skill of inferring the meaning of unknown words in a reading

passage, or to familiarize students with the types of text that will appear in the exam.

3. In order to improve test reliability, test designers must carefully consider the level of text

difficulty in national exams. Reading texts that are too difficult will considerably affect the

accuracy of the score (Samad, 2010).

4. If the educational goal is to enable students to deal with English textbooks later when they

enroll in university, then vocabulary teaching geared towards mastering at least the 4000-word

level is desirable (Nurweni & Read, 1999). Teachers may consider using the Academic

Word List (AWL) devised by Coxhead, since most of these words are related to academic

registers. This list and many other word lists are widely available for download on the

internet. This implies that the testing of vocabulary should also be directed to measuring

vocabulary size, so that teachers may track the number of words students know from time

to time. Thankfully there are many websites that can do this automatically for them, such as (


VI. Conclusion

Vocabulary is an essential building block of language learning. As such, it makes sense to be able to

measure it accurately. Important as it may be, it is sobering to know that there is a paucity of research

into vocabulary assessment. Even more saddening is the fact that most of the contributors to the field

are SLA and first-language reading researchers who might not have an adequate understanding of

testing but need vocabulary measurement to validate their own findings. It was not until the late

twentieth century that researchers in the field of language testing began to pay more attention to

vocabulary assessment. The current trend in vocabulary assessment is towards measuring learners’

vocabulary size and vocabulary depth, that is, how many words they

know versus how well they know these words. Other vocabulary distinctions that researchers might use

in assessing vocabulary are receptive versus productive, informal versus formal, discrete versus

embedded, selective versus comprehensive, and context-dependent versus context-independent. These

vocabulary distinctions may come in various test formats such as labeling, definition, translation, MCQs,

Yes/No checklist, matching, cloze test, and embedded test.

This paper has also described the current practice of vocabulary assessment in Indonesian EFL

contexts, particularly in senior high schools in Bandung. The common practice is the

receptive, embedded, context-dependent, selective use of vocabulary measurement in MCQs. The

paper also mentioned problems in this assessment, including the tough reading passages, which

account for 70% of the total questions in the English National Examination. The difficult reading passages

severely hamper learners’ comprehension, which in turn threatens test reliability. This paper thus

calls on test designers to reconsider the difficulty of the reading texts, and on

teachers to adopt direct teaching of high-frequency words, which further confirms the need to measure

learners’ vocabulary size and is in line with the new direction in vocabulary assessment.



References

Aziez, F. (2011). Examining the Vocabulary Levels of Indonesia’s English National Examination Texts.

Asian EFL Journal, Vol. 51, pp. 16-29.

Cervatiuc, A. (2007). Assessing Second Language Vocabulary Knowledge, International Forum of

Teaching and Studies, Vol. 3(3), pp. 40-78.

Coombe, C. (2011). Assessing vocabulary in the classroom. Retrieved April 2th, from

DeVriez, B. (2012). Vocabulary assessment as predictor of literacy skills, New England Reading

Association Journal, Vol. 47(2), pp. 4-9.

Harmer, J. (2007). The Practice of English Language Teaching (4th ed.), Essex: Pearson Longman.

Hayati, A., Mohammadi, M. (2009). Task-based instruction vs. translation method in teaching vocabulary:

The case of Iranian-secondary school students, Iranian Journal of Language Studies, Vol. 3(2),

pp. 153-176. Retrieved 14th April, from:

Hughes, A. (2003). Testing for Language Teachers (2nd ed.), Cambridge: Cambridge University Press.

McCarthy, M., O’Dell, F. (2008). Academic Vocabulary in Use, Cambridge: Cambridge University Press.

Nation, P., Beglar, D. (2007). A vocabulary size test, The Language Teacher, Vol. 31(7), pp. 9-12.

Nemati, A. (2010). Proficiency and Size of Receptive Vocabulary: Comparing EFL and ESL Environments.

International Journal of Education Research and Technology, Vol. 1(1), June 2010, pp. 46-53.

Nurweni, A., Read, J. (1999). The English Vocabulary Knowledge of Indonesian University Students,

English for Specific Purposes, Vol. 18(2), pp. 161-175.

Pearson, P., Hiebert, E., Kamil, M. (2007). Vocabulary assessment: What we know and what we need to

learn. Reading Research Quarterly, Vol. 42(2), pp. 282-296.

Read, J. (1998). Word Associate Test. Retrieved 15th April, from:

Read, J. (2000). Assessing Vocabulary. Cambridge: Cambridge University Press.

Read, J. (2007). Second Language Vocabulary Assessment: Current Practices and New Directions,

International Journal of English Studies, Vol. 7(2), pp. 105-125.

Redman, S. (2003). Vocabulary in Use: Pre-intermediate & intermediate, Cambridge: Cambridge

University Press.

Samad, A. (2010). Essentials of Language Testing for Malaysian Teachers. Selangor: Universiti Putra

Malaysia Press.

Thornbury, S. (2002). How to Teach Vocabulary. Essex: Pearson ESL.

UNESCO [United Nations Educational, Scientific, and Cultural Organization] (2011). Indonesia. World

Data on Education (7th ed.). Retrieved 15th April, from:

Ur, P. (1991). A Course in Language Teaching. Cambridge: Cambridge University Press.