Lexical Frequency & ESP(Wk5)


Engineering English: A lexical frequency instructional model

Olga Mudraya

Department of Linguistics and Modern English Language, Lancaster University, Lancaster LA1 4YT, UK 

Abstract

This paper argues for the integration of the lexical approach with a data-driven corpus-based methodology in English teaching for technical students, particularly students of Engineering. It presents the findings of the author’s computer-aided research, which aimed to establish a frequency-based corpus of student engineering lexis. The Student Engineering English Corpus (SEEC), reported here, contains nearly 2,000,000 running words reduced to 1200 word families or 9000 word-types encountered in engineering textbooks that are compulsory for all engineering students, regardless of their fields of specialization.

The most immediate implication arising from this research is that sub-technical vocabulary as well as Academic English should be given more attention in the ESP classroom. The paper illustrates some sample data-driven instructional activities consistent with the lexical approach, in order to help students acquire the so-called ‘language prefabs’, or formulaic multi-word units/collocations, for technical and non-technical uses. The integration of the lexical approach with a corpus linguistic methodology can enrich the learners’ language experience and raise their language awareness, bringing out the researcher in them.

© 2005 The American University. Published by Elsevier Ltd. All rights reserved.

English for Specific Purposes 25 (2006) 235–256
www.elsevier.com/locate/esp
doi:10.1016/j.esp.2005.05.002
0889-4906/$30.00
E-mail address: [email protected].

1. Introduction

In recent years, corpus linguistics has come together with language teaching by recognizing the importance of language corpora as a basis for acquiring facts about the language to be learned and sharing a larger, “chunkier” view of language (Johns, 1991; McEnery & Wilson, 1997; Murison-Bowie, 1996). The availability of language corpora to language learners and teachers offers promising opportunities in learning a language, allowing learners to set up and carry out their own language analyses with the help of computer concordancing programs that are aimed at identifying collocations, or word partnerships, in which certain words co-occur in natural text with greater than random frequency.

The lexical approach to language teaching and learning (Lewis, 1993; Nattinger & DeCarrico, 1992; Willis, 1990; overview in Moudraia, 2001) is similarly directed at teaching collocations. It makes a particular distinction between ‘vocabulary’, traditionally understood as a stock of individual words with fixed meanings, and ‘lexis’, which takes into account not only single words but also word combinations that we store in our mental lexicons ready for use.

This paper aims to show how the integration of the lexical approach with a corpus-based methodology in teaching English for Specific Purposes (ESP), especially Engineering English, can improve the way ESP is taught. My particular point here is to demonstrate how a technical student can benefit from the data-driven lexical approach. The examples will be taken from my Student Engineering English Corpus (SEEC) of nearly 2,000,000 running words (Moudraia, 2003, 2004), which was built with the purpose of establishing a representative corpus of Student Engineering English that reflects the lexis encountered in compulsory textbooks for engineering students, regardless of their fields of specialization.

2. Corpus linguistics and ESP language learning

The lexical approach argues that language consists of ‘chunks’ which, when combined, produce continuous coherent text, and that only a minority of spoken sentences are entirely novel creations. The existence and importance of formulaic multi-word units have been pointed out by many linguists. Bolinger (1976) called them “the prefabs of language” before Sinclair (1987, 1991) put forward the notion of the idiom principle as a clear methodological grounding for viewing collocation, arguing that words do not occur at random in a text. On the contrary, “a language user has available to him or her a large number of semi-preconstructed phrases that constitute single choices, even though they might appear to be analysable into segments” (Sinclair, 1991, p. 110).

Corpus linguistics is a methodology which can be described as the study of natural language through examples of ‘real life’ language use via a ‘corpus’ (McEnery & Wilson, 2001), defined as a body of text that is representative of a particular variety of language and is stored on a computer. The availability of language corpora to language learners and teachers adds a fresh dimension to the criteria for success in learning a language. With ‘data-driven learning’ (Johns, 1991), the data are primary, the teacher has a new role as coordinator of research, and learners become research workers in control of their learning process.

In particular, concordancing programs for computerized text analysis can be used very productively in and outside the ESP classroom. A concordance is “a collection of the occurrences of a word-form, each in its textual environment” (Sinclair, 1991, p. 32). Language teachers can use concordancers to produce vocabulary exercises to help their students understand word partnerships. The concordance data can make language facts more explicit by isolating common patterns in authentic language samples, the point of a concordance being to present abundant examples of a word in its usual contexts. By seeing the contexts and collocates, the learners can get a much better idea of the use of the word than they would achieve by merely looking it up in the dictionary. Furthermore, by drawing students’ attention to collocates of the keyword, concordance-based study has considerable potential for expanding student vocabulary. Essentially, keywords are the words which are most unusually (or ‘outstandingly’, in Scott’s (1997) terms) frequent in a given body of text compared with their frequency in a reference corpus.
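As a rough illustration of what a concordancer does, the following Python sketch lists every occurrence of a word-form together with a fixed window of context on either side (a key-word-in-context display). The tokenisation rule, the function names `tokenize` and `kwic`, and the two sample sentences are assumptions made for illustration only; the sketch does not reproduce the behaviour of WordSmith Tools or of any other concordancing program.

```python
import re


def tokenize(text):
    """Split text into lower-case word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def kwic(tokens, node, window=4):
    """Return (left context, node, right context) for each occurrence of node."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines


# Invented sample text: one general and one chemical use of "solution".
sample = (
    "The solution of the problem is obtained by integration. "
    "The salt was dissolved to form an aqueous solution at room temperature."
)

concordance = kwic(tokenize(sample), "solution", window=3)
for left, node, right in concordance:
    # Right-align the left context so that the node words line up.
    print(f"{left:>30}  {node}  {right}")
```

Aligning the node word in a single column is what lets a learner scan the collocates to its immediate left and right, which is the use of concordance data described above.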

McEnery and Wilson (2001, p. 121) identify ESP as a particular domain-specific area of language teaching and learning, where “corpora can be used to provide many kinds of domain-specific material for language learning, including quantitative accounts of vocabulary and usage which address the specific needs of students in a particular domain more directly than those taken from more general language corpora”. In professional domains, various corpora are being built. Most of them are of finite size, with the exception of so-called ‘monitor’ corpora – open-ended collections of texts, to which new texts are being constantly added until the corpora “will get too large for any practicable handling, and will be effectively discarded” (Sinclair, 1991, p. 25).

The largest current professional corpus is to be the Corpus of Professional English (http://www.perc21.org/cpe_project/index.html). It is being developed in collaboration between the Professional English Research Consortium (PERC), Japan, and Lancaster University, UK. When finished, it will consist of a 100-million-word database of English used by professionals in science, engineering, technology and other fields. Also, a monitor engineering corpus of several million words, representing the English used by engineers in over 355 professional engineering organizations, has been steadily growing at the University of Aizu in Japan (Orr & Takahashi, 2002).

However, for language learning and teaching, smaller corpora can be more useful¹ as they are designed to represent the specific part of the language under investigation and are tailored to address the aspects of the language relevant to the needs of the learner. Furthermore, smaller corpora are more manageable, allowing easier and faster access to language data. Some examples of smaller technical corpora designed for language learners are Peter Roe’s Corpus of Scientific English comprising 280,000 running words, i.e., ‘tokens’ (cited in Yang, 1986), the 400,000-token Guangzhou Petroleum English Corpus or GPEC (Qi-bo, 1989), the Jiaotong Daxue English of Science and Technology (JDEST) Corpus (Yang, 1985a, 1985b, 1986) and the Hong Kong University of Science and Technology (HKUST) Computer Science Corpus (James, Davidson, Heung-yeung, & Deerwester, 1994) of 1,000,000 tokens each, as well as my Student Engineering English Corpus (SEEC) of nearly 2,000,000 tokens (Moudraia, 1999, 2003, 2004).

¹ See Ghadessy, Henry, and Roseberry (2001) for applications of smaller corpora to language teaching.

All these corpora are largely based on textbook selections although they are quite different in design and have different objectives. For example, the JDEST was created mainly to monitor language teaching materials in order to learn “how well the materials which have been developed for the learners of English are representing the authentic materials they are going to read in the future” and also possibly to provide some knowledge on the productivity of different multi-word term patterns (Yang, 1986, p. 103). The authors also hoped that the JDEST might be used for syntactic and discourse study of EST (Yang, 1985b, p. 95). Peter Roe’s Corpus of Scientific English was used for the automatic identification of scientific/technical terms (Yang, 1986, p. 97).

The purpose of building the GPEC was threefold: firstly, to get to know more about the features of Petroleum English; secondly, to provide teachers and learners with a series of vocabulary lists; and finally, to gain some empirical knowledge in developing a model for processing a medium-sized corpus on a microcomputer (Qi-bo, 1989, p. 28). The HKUST had two principal objectives: (i) an empirical determination of the nature of the comprehension problems of Chinese-speaking undergraduate students in listening and reading in English for academic purposes; and (ii) the development of materials to enhance listening and reading skills, informed by the findings of empirical enquiries (James et al., 1994, p. 3). The SEEC had three primary aims: (a) to establish a representative corpus of Student Engineering lexis; (b) to provide teachers and learners with a word list that could serve as the lexical syllabus foundation of English for Engineering; and (c) to explore the syntactical, morphological, lexical, and discursive features of Engineering English (Moudraia, 2004, p. 142).

Despite their different purposes, all these corpora have led to the production of vocabulary lists and lexical syllabuses for ESP/EST courses at tertiary level. Exploitation of the findings using concordancing software was another outcome of some of these projects. For example, an automatic monitoring and collecting system of scientific/technical terms, in which new terms collected could be supplied with concordances, was envisioned by the JDEST developers (Yang, 1986, pp. 102–103).

I believe that concordancing is an indispensable tool in course design. In this paper, I will be exploring the issue of technical/sub-technical/non-technical vocabulary with examples from the SEEC. In Section 5 of this article, I have included some data-driven instructional activities based on concordance samples from the SEEC that are aimed at helping students acquire language prefabs for technical and non-technical uses in the specialist context.

3. Technical/sub-technical/non-technical vocabulary

The division between technical and non-technical vocabulary is far from distinct. Strictly technical words are characterized by the absence of exact synonyms, resistance to semantic change, and a very narrow range; e.g., words such as urethane or vulcanise. Some researchers (Baker, 1988; Cowan, 1974; Flowerdew, 1993; Trimble, 1985), however, distinguish a third category – so-called ‘sub-technical’ vocabulary, a class of words that stand between technical and non-technical words. These are lexical items with technical as well as non-technical senses, e.g. iron, force, stress, current, tension, strength, etc., which have the same meaning in several technical disciplines. As Baker (1988, p. 91) noted, the term sub-technical covers “a whole range of items that are neither highly technical and specific to a certain field of knowledge nor obviously general in the sense of being everyday words which are not used in a distinctive way in specialised texts”.

In addition, according to Yang (1986), sub-technical words are identified by their frequency and distribution as well as their collocational behaviour. Yang’s statistical analysis has shown that sub-technical words have very high distribution across all specialized fields; however, their frequency of occurrence is lower than that of function words. Both function words and sub-technical words are characterized by fairly low peakratio (i.e., the maximum frequency of occurrence divided by the average frequency) and rangeratio (i.e., the maximum frequency divided by the minimum frequency). On the other hand, technical terms have very low distribution but very high peakratio and rangeratio (Yang, 1986, p. 98). Even so, a sub-technical word might also be a term in a specific field if it suddenly shows a peak frequency in that field. In view of this, I will also be examining whether the most frequent words in the SEEC are indeed technical or non-technical/sub-technical.
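The two ratios can be made concrete with a small Python sketch. It takes one frequency per specialized field for a given word and applies the definitions quoted above. The per-field figures are invented, and two points are assumptions rather than details reported from Yang (1986): the frequencies are taken to be already comparable across fields (for example, normalised per 10,000 tokens), and a word that is absent from some field is given an infinite rangeratio.

```python
def peak_ratio(freqs):
    """Maximum frequency divided by the average frequency across fields."""
    return max(freqs) / (sum(freqs) / len(freqs))


def range_ratio(freqs):
    """Maximum frequency divided by the minimum frequency across fields."""
    lowest = min(freqs)
    # Assumption: a word missing from a field gets an infinite ratio.
    return float("inf") if lowest == 0 else max(freqs) / lowest


# Invented per-field frequencies, one value per specialized field.
evenly_spread = [12, 10, 8, 10]   # behaves like a sub-technical word
field_specific = [40, 1, 1, 2]    # behaves like a technical term

for label, freqs in (("evenly spread", evenly_spread),
                     ("field specific", field_specific)):
    print(label, round(peak_ratio(freqs), 2), round(range_ratio(freqs), 2))
```

With these figures the evenly spread word has a peakratio of 1.2 and a rangeratio of 1.5, whereas the field-specific word has a peakratio of about 3.64 and a rangeratio of 40, which is the contrast between sub-technical words and technical terms described in the paragraph above.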

4. The Student Engineering English Corpus

4.1. Rationale

The goal of the project was to develop a reliable lexical syllabus for engineering students in order to meet the objectives of English teaching for Engineering at Walailak University in Thailand,² where I had worked for nearly seven years. We were in a situation quite common in Southeast Asia: lectures in most subjects were delivered in a local language (Thai, in this case) whilst textbooks were in English. That is why, in order to build a representative corpus of Student Engineering English, I selected English-language textbooks, 13 in total, used in basic engineering disciplines (BED). By BED, I mean those disciplines which are compulsory for all engineering students regardless of their fields of specialization. At Walailak University, these were Engineering Mechanics, Engineering Materials, Mechanics of Materials, Mechanics of Fluids, Thermodynamics, Electrical Engineering, Engineering Drawing, Manufacturing Process and Computer Programming. The main criterion for selection was that the textbooks were recommended for engineering students, who had to read them in English.

² The project was supported by a small Grant # 970112 from the Walailak University Research Council.

4.2. Procedures

The main stages in the project included gathering a text corpus, putting it into machine-readable form, conducting a computational analysis of the material, and building a word list.³ Whole texts were used in the SEEC, as opposed to text extracts, which is the case with most other smaller technical corpora designed for language learners (e.g., GPEC, JDEST and HKUST). In corpus construction, whole texts are preferable to text extracts wherever possible, as this frees the researcher from concerns about the validity of sampling techniques; moreover, a corpus made up of whole documents is open to a wider range of linguistic studies than a collection of short samples (Sinclair, 1991, p. 19). The SEEC is composed of thirteen text files, details of which are presented in Table 1. The collected material formed a corpus of about 2 million tokens and over 18,000 word-types, analysed with the help of the WordSmith Tools software (Scott, 1996).

4.3. Word list organization

The entries in the resulting word list were organized by word families. The lemmatisation process reduced the number of entries to about 7700 that were treated according to the cumulative frequency of occurrence of the members of the word families, and the most frequent word families (those with a sum total of at least 100 occurrences, or 0.005% of the corpus) were selected. As a result, the 1260 most frequent word families comprising 8850 words were included in the Student Engineering Word List which can serve as the foundation for an Engineering English lexical syllabus.

³ This step required permission from the publishers for the electronic use of their texts. My acknowledgements go to McGraw-Hill Australia (permission dated October 12, 1998), McGraw-Hill Companies, Inc. (permission dated December 1, 1998), Brooks/Cole Publishing Company (Grant No. G-09857, November 17, 1998) and Addison Wesley Longman Limited (ref. AP/2743, November 25, 1998) for their permission to store their texts in an electronic format in order to create a word list.

Table 1
The structure of the SEEC

N   Text file          Bytes       Tokens     Types   Type/token ratio  Standardised type/token ratio
-   Overall            11,694,812  1,986,595  18,203  0.92              9.85
1   Manufact.txt       1,764,178   290,782    10,082  3.47              13.72
2   Material.txt       1,444,793   232,743    7056    3.03              10.49
3   Fluidmech.txt      1,307,973   220,666    5333    2.42              9.15
4   Mechmat.txt        1,177,429   202,513    4125    2.04              7.79
5   Elec.txt           983,672     167,394    5626    3.36              10.27
6   Intofluidmech.txt  860,281     147,028    4666    3.17              9.54
7   Dynamics.txt       795,910     142,446    3205    2.25              7.07
8   Statics2.txt       710,854     127,623    4129    3.24              9.57
9   Statics1.txt       668,896     121,696    2919    2.40              6.94
10  Chemi.txt          653,622     110,812    4299    3.88              9.60
11  Graph.txt          486,152     80,804     5034    6.23              11.55
12  Pascal.txt         466,756     77,242     3124    4.04              8.54
13  Draw.txt           374,296     64,846     3030    4.67              9.05
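The type/token ratios in Table 1 are consistent with the number of types divided by the number of tokens, expressed as a percentage: for the whole corpus, 18,203 / 1,986,595 gives about 0.92. The standardised ratio is understood here as the mean of the ratios computed over consecutive stretches of text of equal length, which makes files of different sizes comparable. The Python sketch below illustrates both measures; the chunk size of 1000 tokens, the function names and the toy token list are assumptions, since the settings actually used in WordSmith Tools are not reported in the paper.

```python
def type_token_ratio(tokens):
    """Number of types per 100 tokens."""
    return 100 * len(set(tokens)) / len(tokens)


def standardised_ttr(tokens, chunk_size=1000):
    """Mean type/token ratio over consecutive full chunks of chunk_size tokens."""
    ratios = [
        type_token_ratio(tokens[i:i + chunk_size])
        for i in range(0, len(tokens) - chunk_size + 1, chunk_size)
    ]
    if not ratios:
        raise ValueError("text is shorter than one chunk")
    return sum(ratios) / len(ratios)


# Check of the "Overall" row of Table 1, using the published counts only.
overall_ttr = round(100 * 18_203 / 1_986_595, 2)
print(overall_ttr)

# Toy token list with a chunk size of 3 to keep the arithmetic visible.
toy = ["a", "b", "a", "b", "c", "c"]
print(type_token_ratio(toy), standardised_ttr(toy, chunk_size=3))
```

The plain ratio falls as a text gets longer, because new types appear more and more slowly; this is why the overall figure (0.92) is far below that of any single file, while the standardised figures remain in a comparable range.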

The “word family” here is interpreted in the broadest sense, incorporating not only derived and inflected forms but compound words as well, according to Level 7 of Bauer and Nation’s (1993) scale. Table 2 gives an example of the word family under the headword use, which is the most frequent word family in the Student Engineering Word List. Also, Appendix A presents the one hundred most frequent entries listed by headwords (i.e., base word or the most frequent word in the family).
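The cumulative frequency of a word family is simply the sum of the frequencies of its members. As a check, the Python sketch below adds up the member counts listed in Table 2 for the headword use and expresses the total as a percentage of the 1,986,595 tokens reported in Table 1. The counts are copied from Table 2; the variable names are illustrative, and the form users is entered twice because Table 2 lists it twice.

```python
CORPUS_TOKENS = 1_986_595  # overall token count reported in Table 1

# (word form, frequency) pairs as listed in Table 2 for the headword "use".
use_family = [
    ("use", 2784), ("uses", 262), ("using", 2100), ("used", 4538),
    ("useful", 341), ("usefully", 1), ("usefulness", 7),
    ("useless", 6),
    ("usable", 22), ("useable", 2),
    ("user", 149), ("users", 24), ("users", 2),
    ("usage", 39), ("reuse", 4), ("re-use", 3), ("reused", 5), ("reusable", 7),
    ("unused", 5), ("unusable", 5),
    ("misuse", 1), ("misusing", 1), ("misused", 1),
    ("abuse", 2),
    ("multiuse", 1), ("multi-user", 1),
]

family_frequency = sum(count for _, count in use_family)
family_percent = round(100 * family_frequency / CORPUS_TOKENS, 2)

# The selection threshold of 100 occurrences as a share of the corpus.
threshold_percent = round(100 * 100 / CORPUS_TOKENS, 3)

print(family_frequency, family_percent, threshold_percent)
```

The sum reproduces the family frequency of 10,313 (0.52%) given in Table 2, and 100 occurrences correspond to about 0.005% of the corpus, the cut-off used for the Student Engineering Word List.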

4.4. Word frequency analysis – findings

Word frequency analysis of the SEEC was carried out in comparison with the COBUILD Bank of English Corpus and the British National Corpus (BNC). The COBUILD Bank of English Corpus is the biggest monitor corpus of the English language, steadily growing at Birmingham University, UK. Currently, it contains about 450,000,000 tokens; this analysis, however, is based on the 323,302,789 tokens that COBUILD had in 2000. The BNC, developed at Lancaster University, UK in the 1990s, is the biggest finite corpus of the English language to date, containing around 100,000,000 tokens. For the analysis of the most frequent word forms, I used the Written part of the BNC of 89,800,000 tokens.

Table 2
Use – the most frequent word family in the Student Engineering Word List

N (ABC order): 1186
N (frequency order): 1
Headword: use
Frequency: 10,313
%: 0.52
Words joined:
use (2784: n 961, v 1823), uses (262: n 48, v 214), using (2100), used (4538);
useful (341), usefully (1), usefulness (7);
useless (6);
usable (22), useable (2);
user (149), users (24), users (2);
usage (39); reuse (4: n 3, v 1), re-use (3: n 1, v 2), reused (5), reusable (7);
unused (5 adj), unusable (5);
misuse (1 n), misusing (1), misused (1);
abuse (2: v 1, attrib 1);
multiuse (1 attrib), multi-user (1 attrib)

The word frequency analysis was concerned with the most frequent word forms in all three corpora, including the most frequent closed-class (grammatical) and open-class (content) word forms. It has revealed, firstly, that the most frequent word forms in all three corpora, being mainly function words, concur (Appendix B). The correlation between the fifty most frequent closed-class word forms in the SEEC, the COBUILD Bank of English and the BNC Written proved to be statistically significant at the .01 level. The Spearman’s rank order correlation between the fifty most frequent closed-class word forms in the SEEC and the COBUILD Bank of English is .778 while between the SEEC and the BNC Written it is .802.

Secondly, a comparison of the fifty most frequent open-class (content) word forms has indicated that the content word forms in the SEEC are predominantly from the scientific register, while the most frequent content word forms in COBUILD and the BNC Written are of a general nature (Appendix C). Furthermore, the most frequent content word forms in the SEEC are rather infrequent in COBUILD and the BNC Written (Appendix D). This finding supports Salager’s (1983, p. 54) observation about “those context-dependent words” which occur with high frequency across different scientific disciplines but tend to be used infrequently in general word-frequency counts.
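For readers unfamiliar with the statistic, the rank order correlation reported for the closed-class word forms can be sketched in Python as follows. The function implements the textbook formula for Spearman’s rho when there are no tied ranks; the five pairs of ranks are invented, and the paper does not describe how word forms missing from one corpus’s top fifty were ranked, so the sketch does not attempt to reproduce the reported values of .778 and .802.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman's rank correlation for two equal-length rank lists without ties."""
    n = len(ranks_a)
    if n != len(ranks_b) or n < 2:
        raise ValueError("need two rank lists of the same length (at least 2)")
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Invented ranks of five word forms in two corpora (1 = most frequent).
ranks_in_specialist_corpus = [1, 2, 3, 4, 5]
ranks_in_reference_corpus = [1, 3, 2, 5, 4]

rho = spearman_rho(ranks_in_specialist_corpus, ranks_in_reference_corpus)
print(round(rho, 3))
```

A value close to 1 means that the two corpora order the word forms in nearly the same way; identical orderings give exactly 1 and fully reversed orderings give -1.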

Similarly, the most frequently encountered words in the SEEC appear to be sub-technical, i.e., words with non-technical as well as technical senses, common in most kinds of technical writing, which are identified by their frequency and distribution as well as their collocational behaviour (Yang, 1986). The SEEC word frequency analysis has additionally revealed that the non-technical sense of a sub-technical lexical item is used more frequently than its technical sense. For example, the word solution is more commonly used in the SEEC in the non-technical sense than in the chemical sense (Table 3), even in a Chemical Engineering Thermodynamics textbook⁴ (Table 4).

Finally, keyword analysis of the Student Engineering lexis, carried out with the help of the WordSmith Tools software, has provided further support for my hypothesis that the most frequent words in a specialist corpus are in fact sub-technical and non-technical. Basically, keywords are the words which are most unusually frequent in a given body of text against a reference corpus, while the so-called key-keywords (Scott, 1997) are the most frequent keywords over a number of files in the database, ensuring that these words are characteristic of the whole corpus.
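The two notions can be illustrated with a short Python sketch. The paper does not state which keyness statistic was used, so the sketch assumes the log-likelihood statistic, one common choice, with a cut-off of 15.13 (the critical value for p < 0.0001 with one degree of freedom). The function names, the toy frequency lists and the `min_files` parameter are likewise assumptions made for illustration.

```python
import math
from collections import Counter


def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Log-likelihood keyness of one word: study text vs. reference corpus."""
    total = freq_study + freq_ref
    expected_study = size_study * total / (size_study + size_ref)
    expected_ref = size_ref * total / (size_study + size_ref)
    ll = 0.0
    if freq_study:
        ll += freq_study * math.log(freq_study / expected_study)
    if freq_ref:
        ll += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * ll


def keywords(study_counts, ref_counts, threshold=15.13):
    """Words that are unusually frequent in the study text (positive keywords)."""
    size_study = sum(study_counts.values())
    size_ref = sum(ref_counts.values())
    keys = set()
    for word, freq in study_counts.items():
        ref_freq = ref_counts.get(word, 0)
        overused = freq / size_study > ref_freq / size_ref
        if overused and log_likelihood(freq, size_study, ref_freq, size_ref) >= threshold:
            keys.add(word)
    return keys


def key_keywords(files, ref_counts, min_files=2, threshold=15.13):
    """Keywords that are key in at least min_files of the given text files."""
    hits = Counter()
    for counts in files:
        hits.update(keywords(counts, ref_counts, threshold))
    return {word: n for word, n in hits.items() if n >= min_files}


# Invented frequency lists: a 100,000-token reference and two 10,000-token files.
reference = Counter({"the": 6000, "stress": 5, "solution": 20, "other": 93975})
file_1 = Counter({"the": 600, "stress": 60, "solution": 40, "other": 9300})
file_2 = Counter({"the": 610, "stress": 50, "solution": 3, "other": 9337})

print(keywords(file_1, reference))
print(key_keywords([file_1, file_2], reference, min_files=2))
```

In this toy data solution is key in the first file only, whereas stress is key in both files and is therefore the only key-keyword; the `min_files` parameter plays the role of the requirement, reported below, that the key verbs be key in at least five of the thirteen text files.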

The key-keyword comparison of the SEEC against the BNC Written Sampler⁵ provides some interesting information about the key verbs in the SEEC – they appear to be predominantly from the academic register. The key-key verbs in the SEEC are: act, apply, assume, be, become, calculate, consider, correspond to, define, determine, exert, give, illustrate, indicate, locate, obtain, occur, require, show, sketch, solve, substitute, and use. These verbs are key in at least five (seven on average) text files out of the thirteen that constitute the SEEC (Table 1). Importantly, ten of these key verbs (assume, correspond, define, illustrate, indicate, locate, obtain, occur, require, substitute) are included in Coxhead’s (2000) New Academic Word List; ten (assume, correspond, define, illustrate, indicate, locate, obtain, occur, require, sketch) are also in Xue and Nation’s (1984) University Word List; and the first nine in each appear in both lists. Interestingly, Martin (1976) considered academic and sub-technical vocabulary equivalent terms.

Thus, word frequency analysis of the SEEC in comparison with COBUILD and the BNC suggests that the answer to the research question “Are the most frequent words in a specialist corpus technical or non-technical/sub-technical?” is that they are (a) sub-technical and (b) non-technical from the academic register. This finding has important implications for teaching Engineering English, indicating that more attention should be devoted to academic English and sub-technical vocabulary.

⁴ However, the word form solutions, although very infrequent, does occur more frequently in its chemical sense in the Chemical Engineering Thermodynamics textbook.

⁵ The BNC Written Sampler is a one-million-word written subcorpus of the BNC containing a wide and balanced sampling of texts from the BNC Written. It was used for the key-keyword comparison as the full BNC was too large to be analysed by WordSmith Tools.

Table 3
Technical vs. non-technical senses

Headword: solution (of a problem)
Rank (ABC order): 1032
Rank (frequency order): 29
Frequency: 3776
%: 0.19
Words joined: solution (1899), solutions (185); solve (912), solves (8), solving (341), solved (249); solvable (8), solver (3), solvers (2); unsolved (1); resolve (29), resolving (41), resolved (96); unresolved (2)

Headword: solution (liquid)
Rank (ABC order): 1033
Rank (frequency order): 242
Frequency: 1025
%: 0.05
Words joined: solution (455), solutions (114), solutionizing (1), solutionized (1); solubility (111), soluble (32), insoluble (16); solvent (96), solvents (27), solute (58), solvus (8); dissolve (14), dissolves (11), dissolving (4), dissolved (44), dissolution (16), nondissolvable (1)

Table 4
Dispersion of the word solution in a Chemical Engineering Thermodynamics textbook

Word                  Total  General sense  Chemical sense  Per 1000 words
solution              259    157            102             2.36
solutions             27     6              21              0.25
solution + solutions  286    163            123             2.60

5. Data-driven teaching and learning of Engineering English

Corpus linguistic techniques, such as the use of concordancers, play a major role in data-driven learning and increasingly shape the development of teaching materials (Cheng, Warren, & Xun-feng, 2003; Flowerdew, 1993; Hadley, 2002; Johns, 1991; McKay, 1980; Mudraya, 2004; Murison-Bowie, 1996; Thurstun & Candlin, 1998). Via corpus-based teaching and learning, learners become exposed to authentic real-life language use and no longer rely solely on published instructional material, much of which is inauthentic.

Within the lexical approach too, language activities are directed towards naturally occurring language, and more time is devoted to collocations and idiomatic expressions. Lewis (1993) claims that the basis of language is lexis, while grammar is the search for powerful patterns. There is compelling evidence (Lewis, 1993; McKay, 1980) that the majority of errors made by foreign/second language learners are semantic errors of inappropriate word choice caused by vocabulary deficiency and, particularly, by lack of collocational power. In consequence, Nattinger (1980, p. 341) has suggested that

    Perhaps we should base our teaching on the assumption that, for a great deal of the time anyway, language production consists of piecing together the ready-made units appropriate for a particular situation and that comprehension relies on knowing which of these patterns to predict in these situations. Our teaching, therefore would center on these patterns and the ways they can be pieced together, along with the ways they vary and the situations in which they occur.

I argue for the integration of the lexical approach with data-driven corpus-based methodology in English teaching, including ESP teaching, as I believe that the use of language corpora in the classroom can improve students’ knowledge of the language and their ability to use it effectively. Clearly, the major strength of using a computer corpus in language teaching is the insight it can provide into the unique collocational patterns of a word. This is one of the many persuasive reasons for utilizing computer corpora in the development of vocabulary materials. Although exercises that resemble those of standard vocabulary and grammar teaching practices (i.e., blank-filling, sentence completion, word matching, translation, etc.) can still be put to use, their linguistic focus has fundamentally changed, with many of the activities being of the receptive, awareness-raising kind that can aid language acquisition by providing learners with a tool which enables them to process input more effectively.

I find concordancing a very valuable tool in course design. A case can be made, though, for the use of the specialist corpus for teaching ESP students, since, even where lexis is common to both the general and the specialist corpus, the items in the specialist corpus, as Flowerdew (1993, p. 236) has noted, may have particular uses that will be revealed in concordancing. Keeping this in mind, I have worked out some data-driven exercises based on concordance data that are aimed at helping students acquire language prefabs for technical and non-technical uses in the specialist context. These concordance-based activities are designed not so much to help students understand engineering textbooks as to aid productive use of the language prefabs.

Fig. 1 presents a concordance sample from the SEEC that includes carefully selected examples of the word solution used both in the general sense (e.g., solution of a problem) and in the technical (chemical) sense. Solution was chosen because it figures, in its general sense, as a high-frequency word family and also occurs frequently as a sub-technical item.

Fig. 1. Concordance sample of   solution.
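For readers who wish to prepare a comparable sample from their own texts, a keyword-in-context (KWIC) display of the kind shown in Fig. 1 can be approximated in a few lines of Python. The following is only a minimal sketch: the function name kwic, the context width and the two sample sentences are illustrative assumptions and are not taken from the SEEC or from the concordancing software used in this study.

```python
import re

def kwic(text, node, width=30):
    """Return keyword-in-context lines for every occurrence of `node`."""
    lines = []
    # Match the node as a whole word, ignoring case.
    pattern = r"\b" + re.escape(node) + r"\b"
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so that the node words line up.
        lines.append(f"{left:>{width}} [{m.group(0)}] {right}")
    return lines

# Invented sample text (hypothetical, not from the corpus).
sample = (
    "We seek an analytical solution of the problem. "
    "The plate is immersed in a dilute aqueous solution of acid."
)

for line in kwic(sample, "solution"):
    print(line)
```

Each output line places the node word between a fixed-width left context and right context, which is what allows learners to scan the collocates on either side.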


Activity 1.   Study the concordance data and find the instances of the word  solution

used (1) in the general sense (e.g.,   solution of a problem) and (2) in the technical

(chemical) sense.

Answer key

General: lines 1–3, 5–10, 20, 22–23, 33–34, 38, 41, 43–44, 46–52, 54, 57, 59–61

Chemical: lines 4, 11–19, 21, 24–32, 35–37, 39–40, 42, 45, 53, 55–56, 58

Activity 2a. From the concordance data, supply the adjectives that collocate with the word solution used (1) in the general sense and (2) in the technical (chemical) sense. Underline the adjectives that can be used with both senses of solution.

Answer key

General: adequate, analytical, (not) possible, particular, optimum, similar, following, explicit, known, sensitive, insensitive, straightforward, ideal, alternative

Chemical: aqueous, acid, dilute, concentrated, strong, weak, ideal, saturated, following, similar, liquid, particular, partially miscible
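A first list of candidate collocates for an activity of this kind can be drawn up by counting the words that stand immediately to the left of the node word. The sketch below makes several assumptions: the function name left_collocates is hypothetical, the five concordance lines are invented for illustration (they reuse adjectives from the answer key but are not SEEC lines), and no part-of-speech tagging is performed, so the adjectives still have to be picked out by hand.

```python
import re
from collections import Counter

def left_collocates(lines, node, span=1):
    """Count the words found within `span` positions to the left of `node`."""
    counts = Counter()
    for line in lines:
        # Lower-case word tokens; hyphenated words are kept whole.
        tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", line.lower())
        for i, tok in enumerate(tokens):
            if tok == node:
                counts.update(tokens[max(0, i - span):i])
    return counts

# Invented concordance lines (hypothetical, not from the corpus).
concordance = [
    "an aqueous solution of the salt",
    "a dilute solution is pumped",
    "an analytical solution is not possible",
    "the ideal solution to this problem",
    "behaves as an ideal solution at low pressure",
]

counts = left_collocates(concordance, "solution")
print(counts.most_common())
```

Words that occur to the left of the node in lines of both senses, such as ideal in this toy sample, are the candidates for underlining.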

Activity 2b.  From the concordance data, supply the verbs that collocate with the

word solution  used (1) in the general sense and (2) in the technical (chemical) sense.

Underline the verbs that can be used with both senses of  solution.

Answer key

General: find, obtain, complicate, is/are, attempt, add, yield, exist, take, lead to, give rise to, have, lengthen, print, contain, calculate, simplify

Chemical: add, be/is, has, pump, form, immerse in, enter, flow out of, absorb, plate out of, take, exist, contain

Fig. 2   presents a concordance sample from the SEEC that includes carefully

selected examples of the verb   solve   used only in its general sense (e.g.,   solve a

 problem). It was chosen because it features as a high-frequency word family and a

prominent key verb in the SEEC. Below are concordance-based activities designed

to provide some insight into the syntactic patterns in which the verb  solve  functions

in the SEEC.

Activity 3.  Use the concordance data to exemplify the following syntactic patterns

with the verb  solve + solution method.


Pattern 1:   ‘solve/solves/solving/solved with’   as in   ‘The following problems are

designed to be solved with a computer’.

Answer key

Solve with a vector approach.

the following problems are designed to be solved with a computer.

if this problem were solved with a numerical calculation

Solving this with the characteristic yields

air velocity expressions, each solved as appropriate with the equation

problems which we shall not attempt to solve with the means at our disposal

Many problems can be solved with more than one choice of system.

it solves rapidly with any root-finding program

Pattern 2:   ‘solve/solves/solving/solved by’  as in   ‘The problem may be solved graph-

ically by drawing’.

Answer key

Solve, first, by double integration

the problem may be solved graphically by drawing

transcendental equations which must be solved by successive approximations

Fig. 2. Concordance sample of  solve.


will be an oblique triangle and should be solved by applying the law of sines

can be solved quite simply by the use of 

When the problem is solved simply by moving the disk from

a wide class of problems which are solved by trial and error.

problems that cannot be solved by the Work-Energy Principle

problems in this chapter have been solved by using the Moody diagram.

Such problems are solved by considering a short length of 

equations and as such may be solved by numerical techniques.

set of algebraic equations that can be solved by methods developed earlier

were solved by the application of second law.

Solve by trial the equation

would be relatively simple matter to solve them, say by matrix method.

Solving a problem by following five steps

solved by computer.

Pattern 3: ‘solve/solves/solving/solved using’ as in ‘Alternatively, we can solve such problems using graphical solution’.

Answer key

deformable-body mechanics problems are solved using these work-energy principle

Alternatively, we can solve such problems using graphical solution

the following problems are intended to be solved using the program provided in

13.14 Solve Problem 12.18c, using the method

15.75 Using the method of 15.7, solve 15.49.

we demonstrate the FORTRAN program that solves these using the routine Original

Equations (13.52) can be solved using each of the two sets

we add this reaction to be solved using the ‘‘final’’ moles
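The three patterns exemplified above can also be tallied automatically once the concordance lines are available as plain text. The following is a rough sketch under stated assumptions: the regular expression, the limit of four intervening words and the function name tally are illustrative choices of this sketch, not the procedure used to prepare the answer keys. The five test lines are taken from the answer keys above.

```python
import re
from collections import Counter

# A form of "solve", then up to four intervening words,
# then the first following by / with / using.
PATTERN = re.compile(
    r"\bsolv(?:e|es|ed|ing)\b(?:[\s,]+[\w.\-]+){0,4}?[\s,]+(by|with|using)\b",
    re.IGNORECASE,
)

def tally(lines):
    """Count which of by / with / using follows a form of 'solve' in each line."""
    counts = Counter()
    for line in lines:
        m = PATTERN.search(line)
        if m:
            counts[m.group(1).lower()] += 1
    return counts

lines = [
    "Many problems can be solved with more than one choice of system.",
    "the problem may be solved graphically by drawing",
    "Alternatively, we can solve such problems using graphical solution",
    "Solve by trial the equation",
    "it solves rapidly with any root-finding program",
]

print(tally(lines))
```

A tally of this kind only shows how often each pattern occurs; deciding why one pattern is preferred in a given context still requires reading the lines themselves.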

Another useful activity would be finding out when one syntactic pattern (the use of by, with, using with solve) was preferred over another. It would require examining all the relevant concordance lines in the corpus, but the limitations of space do not allow it to be included here.

By using corpora, students gain direct access to abundant examples of authentic language samples, resulting in a better understanding of the use and patterns of certain linguistic features. Thus, corpus-based teaching can help train multi-skilled autonomous learners who can take charge of their own learning processes.

6. Conclusion

In this paper, I have argued for the integration of the lexical approach with a data-driven corpus-based methodology in ESP teaching, as I believe that the use of language corpora in the classroom can improve students' knowledge of the language and their ability to use it effectively. This leads me to the conclusion that corpora can also improve the way ESP teaching is approached. They can inform teaching and learning, producing students who know what it means to use a corpus, who know how to extract material from it, and who, consequently, can learn a great deal about language via a corpus. After all, as Dlaska (1999, p. 403) observed, ESP teaching need not be “dire and difficult pedagogical ground”, forcing language teachers to surrender their expertise in favour of teaching unfamiliar subjects, but on the contrary, it needs to “address, and eventually bridge, the discrepancy between general language ability and specialized language ability . . . since the two areas are not in opposition but complement each other”.

Appendix A. The one hundred most frequent word families in the Student Engineering Word List

N    Headword Frequency %

1 use 10,313 0.52

2 force 9247 0.46

3 form 7075 0.35

4 flow 7045 0.35

5 pressure 7016 0.35

6 show (v) 7002 0.35

7 determine 6896 0.34

8 figure/configure 6650 0.33

9 section 6404 0.32

10 line 5812 0.29

11 equation 5771 0.29

12 point 5236 0.26

13 angle 4923 0.25

14 act/react/interact/transact/counteract 4666 0.23

15 velocity 4614 0.23

16 system 4540 0.23

17 value 4484 0.23

18 apply 4327 0.22

19 problem 4278 0.21

20 work 4198 0.21

21 give 4103 0.21

22 axis 4053 0.20

23 stress 4033 0.20

24 material 4014 0.20

25 center 3992 0.20

26 length/long 3890 0.19

27 part 3867 0.19

28 surface 3821 0.19

29 solution (of a problem) 3776 0.19

30 type 3606 0.18

31 produce 3582 0.18

32 metal 3457 0.17

33 example 3447 0.17

34 load 3406 0.17

35 other/another 3371 0.16

36 time 3299 0.16

37 high 3252 0.16

38 energy 3245 0.16

39 vary 3232 0.16


40 number 3216 0.16

41 temperature 3119 0.16

42 body 3101 0.16

43 process 3048 0.15

44 chapter 3016 0.15

45 moment 2989 0.15

46 machine 2979 0.15

47 dimension 2938 0.15

48 put 2889 0.14

49 placement 2840 0.14

50 require 2828 0.14

51 area 2827 0.14

52 plane 2820 0.14

53 direction 2784 0.14

54 result 2763 0.14

55 move/remove 2751 0.14

56 all 2741 0.14

57 follow 2731 0.14

58 constant 2719 0.14

59 unit 2661 0.13

60 view 2647 0.13

61 fluid 2639 0.13

62 know 2609 0.13

63 draw 2603 0.13

64 operation 2601 0.13

65 component 2560 0.13

66 expression 2528 0.13

67 beam 2513 0.13

68 end 2484 0.12

69 pipe 2476 0.12

70 make 2467 0.12

71 steel 2429 0.12

72 assume 2424 0.12

73 shear 2409 0.12

74 case (=state) 2351 0.12

75 find 2343 0.12

76 diameter 2341 0.12

77 obtain 2341 0.12

78 mass 2337 0.12

79 air/aero- 2315 0.12

80 define 2276 0.11

81 also 2267 0.11

82 calculate 2266 0.11

83 water 2262 0.11

84 cut 2258 0.11

85 element 2254 0.11

86 rotate 2250 0.11

87 maximum 2246 0.11

88 different 2235 0.11

89 change 2205 0.11

90 equilibrium 2183 0.11

91 structure 2183 0.11

92 position 2177 0.11

93 base/basic 2172 0.11

94 write 2167 0.11

95 consider 2154 0.11


96 design 2125 0.11

97 free 2087 0.10

98 friction 2086 0.10

99 low 2083 0.10

100 method 2070 0.10

Appendix B. The fifty most frequent word forms in the SEEC, the COBUILD Bank of English Corpus and the BNC Written

SEEC (ca. 2 million words) COBUILD (ca. 323 million words) BNC Written (ca. 90 million words)

N    Word %   N    Word %   N    Word %

1 the 8.50 1 the 5.58 1 the 6.43

2 of 4.19 2 of 2.60 2 of 3.11

3 a 2.84 3 to 2.51 3 and 2.70

4 and 2.72 4 and 2.37 4 to 2.60

5 is 2.43 5 a 2.21 5 a 2.18

6 in 2.07 6 in 1.83 6 in 1.95

7 to 2.06 7 that 1.04 7 is 0.99

8 for 1.08 8 is 0.93 8 that 0.99

9 are 0.88 9 it 0.92 9 was 0.94

10 be 0.83 10 for 0.87 10 it 0.93

11 that 0.80 11 i 0.78 11 for 0.88

12 at 0.76 12 was 0.76 12 on 0.72

13 as 0.75 13 on 0.70 13 with 0.67

14 by 0.71 14 he 0.65 14 he 0.67

15 with 0.57 15 with 0.64 15 be 0.67

16 on 0.50 16 as 0.57 16 i 0.66

17 from 0.48 17 you 0.54 17 by 0.55

18 an 0.47 18 be 0.53 18 as 0.55

19 this 0.47 19 at 0.52 19 at 0.49

20 or 0.46 20 by 0.50 20 you 0.47

21 we 0.42 21 but 0.47 21 are 0.47

22 which 0.42 22 have 0.46 22 his 0.47

23 it 0.38 23 are 0.44 23 had 0.46

24 if 0.32 24 his 0.43 24 not 0.46

25 figure 0.31 25 from 0.43 25 this 0.45

26 flow 0.31 26 they 0.43 26 have 0.44

27 can 0.28 27 this 0.39 27 from 0.44

28 determine 0.27 28 not 0.38 28 but 0.43

29 force 0.27 29 had 0.35 29 which 0.39

30 two 0.26 30 has 0.34 30 she 0.38

31 shown 0.25 31 an 0.32 31 they 0.37

32 will 0.25 32 we 0.32 32 or 0.37

33 used 0.23 33 or 0.29 33 an 0.36

34 may 0.22 34 said 0.28 34 her 0.35

35 velocity 0.22 35 one 0.28 35 were 0.33

36 pressure 0.22 36 there 0.27 36 there 0.28

37 its 0.20 37 will 0.27 37 we 0.28

38 when 0.20 38 their 0.27 38 their 0.28


39 have 0.20 39 which 0.27 39 been 0.28

40 has 0.19 40 she 0.26 40 has 0.27

41 equation 0.19 41 were 0.26 41 will 0.26

42 not 0.19 42 all 0.25 42 one 0.26

43 one 0.18 43 been 0.25 43 all 0.25

44 each 0.18 44 who 0.25 44 would 0.25

45 point 0.18 45 her 0.24 45 can 0.22

46 where 0.18 46 would 0.23 46 if 0.21

47 system 0.17 47 up 0.22 47 who 0.21

48 forces 0.17 48 if 0.22 48 more 0.21

49 these 0.16 49 more 0.22 49 when 0.21

50 between 0.16 50 when 0.22 50 said 0.20
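The percentage column in a list of this kind appears to express each word form's occurrences as a share of all running words in the corpus. A minimal sketch of that calculation is given below; the function name frequency_list, the simple letters-only tokenization and the toy sentence are assumptions made for illustration, and the sketch counts word forms only (grouping forms into word families, as in Appendix A, would need an additional mapping that is not shown).

```python
import re
from collections import Counter

def frequency_list(text, top=5):
    """Return (rank, word form, count, percentage of running words) tuples."""
    tokens = re.findall(r"[a-z]+", text.lower())
    total = len(tokens)
    rows = []
    ranked = Counter(tokens).most_common(top)
    for rank, (word, count) in enumerate(ranked, start=1):
        rows.append((rank, word, count, round(100 * count / total, 2)))
    return rows

# Invented toy text (hypothetical, not from the corpus).
toy = "the force acts on the beam and the beam resists the force"

for row in frequency_list(toy):
    print(row)
```

Because the percentages are relative to corpus size, lists drawn from corpora of very different sizes, such as the three compared here, can be set side by side.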

Appendix C. The fifty most frequent open-class word forms in the SEEC, the

COBUILD Bank of English Corpus and the BNC Written

SEEC (ca. 2 million words) COBUILD (ca. 323 million words) BNC Written (ca. 90 million words)

N    Rank Word %   N    Rank Word %   N    Rank Word %

1. 5 is 2.43 1. 8 is 0.93 1. 7 is 0.99

2. 9 are 0.88 2. 12 was 0.76 2. 8 that 0.99

3. 10 be 0.83 3. 18 be 0.53 3. 9 was 0.94

4. 25 figure 0.31 4. 22 have 0.46 4. 15 be 0.67

5. 26 flow 0.31 5. 23 are 0.44 5. 21 are 0.47

6. 27 can 0.28 6. 29 had 0.35 6. 23 had 0.46

7. 28 determine 0.27 7. 30 has 0.34 7. 26 have 0.44

8. 29 force 0.27 8. 34 said 0.28 8. 35 were 0.33

9. 30 two 0.26 9. 35 one 0.28 9. 39 been 0.28

10. 31 shown 0.25 10. 37 will 0.27 10. 40 has 0.27

11. 32 will 0.25 11. 41 were 0.26 11. 41 will 0.26

12. 33 used 0.23 12. 43 been 0.25 12. 42 one 0.26

13. 34 may 0.22 13. 46 would 0.23 13. 44 would 0.25

14. 35 velocity 0.22 14. 55 can 0.20 14. 45 can 0.22

15. 36 pressure 0.22 15. 58 new 0.16 15. 50 said 0.20

16. 39 have 0.20 16. 59 do 0.16 16. 51 do 0.20

17. 40 has 0.19 17. 60 two 0.16 17. 61 could 0.16

18. 41 equation 0.19 18. 62 time 0.15 18. 64 time 0.15

19. 43 one 0.18 19. 63 people 0.15 19. 67 two 0.14

20. 45 point 0.18 20. 64 like 0.15 20. 70 may 0.14

21. 47 system 0.17 21. 68 now 0.15 21. 73 new 0.13

22. 48 forces 0.17 22. 71 year 0.14 22. 74 like 0.13

23. 51 surface 0.16 23. 75 first 0.13 23. 78 first 0.12

24. 52 energy 0.16 24. 76 could 0.13 24. 80 did 0.12

25. 53 stress 0.16 25. 81 last 0.12 25. 81 now 0.12

26. 54 section 0.15 26. 83 well 0.12 26. 83 people 0.11

27. 55 example 0.15 27. 85 years 0.11 27. 85 should 0.11

28. 57 line 0.14 28. 86 know 0.11 28. 86 very 0.11

29. 58 chapter 0.14 29. 89 very 0.10 29. 88 see 0.10


30. 60 use 0.14 30. 91 pound 0.10 30. 91 made 0.10

31. 63 temperature 0.13 31. 92 back 0.10 31. 93 back 0.10

32. 64 problem 0.13 32. 94 get 0.10 32. 94 way 0.09

33. 65 must 0.13 33. 95 may 0.10 33. 96 years 0.09

34. 66 given 0.13 34. 97 think 0.09 34. 97 being 0.09

35. 67 time 0.13 35. 98 even 0.09 35. 100 work 0.09

36. 68 body 0.12 36. 100 way 0.09 36. 107 make 0.08

37. 72 area 0.12 37. 101 right 0.09 37. 108 even 0.07

38. 73 constant 0.12 38. 102 three 0.09 38. 110 still 0.07

39. 75 value 0.12 39. 104 dont 0.09 39. 111 must 0.07

40. 77 number 0.12 40. 106 world 0.09 40. 112 own 0.07

41. 78 solution 0.12 41. 110 being 0.09 41. 113 know 0.07

42. 79 fluid 0.12 42. 111 says 0.09 42. 115 year 0.07

43. 80 shear 0.12 43. 112 government 0.09 43. 116 good 0.07

44. 81 length 0.12 44. 114 dollar 0.08 44. 119 last 0.07

45. 82 moment 0.11 45. 115 should 0.08 45. 120 get 0.07

46. 84 mass 0.11 46. 116 made 0.08 46. 121 three 0.07

47. 85 axis 0.11 47. 117 good 0.08 47. 122 well 0.07

48. 86 maximum 0.11 48. 119 see 0.08 48. 123 take 0.07

49. 88 work 0.11 49. 120 go 0.08 49. 125 go 0.07

50. 89 plane 0.11 50. 121 did 0.08 50. 126 government 0.07

Appendix D. The fifty most frequent content word forms in the SEEC compared

against the COBUILD Bank of English Corpus and the BNC Written

N    Rank in SEEC (ca. 2 m)   Rank in COBUILD (ca. 323 m)   Rank in BNC W (ca. 90 m)   Word   % in SEEC (ca. 2 m)   % in COBUILD (ca. 323 m)   % in BNC W (ca. 90 m)

1. 25 940 546 figure 0.31 0.01 0.02

2. 26 2563 1934 flow 0.31 0.004 0.006

3. 28 3670 2521 determine 0.27 0.003 0.004

4. 29 487 605 force 0.27 0.02 0.02

5. 31 1267 630 shown 0.25 0.008 0.02

6. 33 185 134 used 0.23   0.05 0.06

7. 35   a 7891 velocity 0.22   a 0.001

8. 36 705 837 pressure 0.22   0.01 0.01

9. 41   a 3798 equation 0.19   a 0.003

10. 45 272 230 point 0.18 0.03 0.04

11. 47 298 173 system 0.17 0.03 0.05

12. 48 633 810 forces 0.17 0.02 0.01

13. 51 1752 1092 surface 0.16 0.006 0.01

14. 52 918 793 energy 0.16 0.01 0.01

15. 53 2066 2120 stress 0.16 0.005 0.005

16. 54 1437 497 section 0.15 0.007 0.02

17. 55 469 790 example 0.15 0.02 0.01

18. 57 352 415 line 0.14 0.03 0.02

19. 58 1836 631 chapter 0.14 0.006 0.02

20. 60 208 132 use 0.14 0.04 0.06


21. 63 3116 2318 temperature 0.13 0.003 0.005

22. 64 341 307 problem 0.13 0.03 0.03

23. 66 292 192 given 0.13 0.03 0.04

24. 67 64 66 time 0.13 0.15 0.15

25. 68 399 324 body 0.13 0.02 0.03

26. 72 390 249 area 0.12 0.02 0.04

27. 73 2808 1981 constant 0.12 0.004 0.005

28. 75 774 523 value 0.12 0.01 0.02

29. 77 247 165 number 0.12 0.04 0.05

30. 78 1928 1466 solution 0.12 0.005 0.007

31. 79 6344 5093 fluid 0.12 0.001 0.002

32. 80   a a shear 0.12   a a

33. 81 1997 1418 length 0.12 0.05 0.008

34. 82 528 453 moment 0.12 0.02 0.02

35. 84   1569 1362 mass 0.11   0.007 0.008

36. 85   a 7092 axis 0.11   a 0.001

37. 86 2839 2021 maximum 0.11 0.003 0.005

38. 87 1146 424 thus 0.11 0.009 0.02

39. 88 135 102 work 0.11 0.07 0.09

40. 89 2198 2884 plane 0.11 0.005 0.004

41. 94 1292 708 material 0.11 0.008 0.01

42. 95 8372 6198 diameter 0.11 0.0008 0.001

43. 96 1180 559 type 0.11 0.009 0.02

44. 97 305 242 water 0.11 0.03 0.04

45. 98 178 163 end 0.11 0.05 0.05

46. 99 2521 2242 metal 0.11 0.004 0.005

47. 100 181 157 part 0.11 0.05 0.05

48. 101 7404 6584 beam 0.11 0.001 0.001

49. 102   a 5085 equilibrium 0.11   a 0.002

50. 103 565 358 using 0.11 0.02 0.03

a Not among 10,000 most frequent word forms.

References

Bauer, L., & Nation, P. (1993). Word families.  International Journal of Lexicography, 6 (3), 1–27.

Baker, M. (1988). Sub-technical vocabulary and the ESP teacher: an analysis of some rhetorical items in

medical journal articles.  Reading in a Foreign Language, 4(2), 91–105.

Bolinger, D. (1976). Meaning and memory.  Forum Linguisticum, 1, 1–14.

Cheng, W., Warren, M., & Xun-feng, X. (2003). The language learner as language researcher: putting

corpus linguistics on the timetable.  System, 31(2), 173–186.

Cowan, J. R. (1974). Lexical and syntactic research for the design of EFL reading materials.   TESOL

Quarterly, 8(4), 389–399.

Coxhead, A. (2000). A new Academic Word List.  TESOL Quarterly, 34(2), 213–238.

Dlaska, A. (1999). Suggestions for a subject-specific approach in teaching foreign languages to engineering

and science students.  System, 27 (3), 401–417.

Flowerdew, J. (1993). Concordancing as a tool in course design.  System, 21(2), 231–244.

Ghadessy, M., Henry, A., & Roseberry, R. L. (Eds.). (2001).  Small corpus studies and ELT: theory and 

 practice. Amsterdam/Philadelphia: John Benjamins Publishing Co.


Hadley, G. (2002). An introduction to data-driven learning.  RELC Journal, 33(2), 99–124.

James, G., Davidson, R., Heung-yeung, A. C., & Deerwester, S. (1994). English in Computer Science: a

corpus-based lexical analysis. The Hong Kong University of Science and Technology: Longman Asia

Ltd.

Johns, T. (1991). Should you be persuaded – two examples of data-driven learning materials. ELR Journal,

4, 1–16.

Lewis, M. (1993). The lexical approach: the state of ELT and the way forward. Hove, England: Language

Teaching Publications.

Martin, A. V. (1976). Teaching academic vocabulary to foreign graduate students.   TESOL Quarterly,

10(1), 91–99.

McEnery, A., & Wilson, A. (1997). Teaching and language corpora (TALC).   ReCALL, 9(1),

5–14.

McEnery, A., & Wilson, A. (2001). Corpus linguistics (2nd ed.). Edinburgh: Edinburgh University Press.

McKay, S. L. (1980). Developing vocabulary materials with a computer corpus.  RELC Journal, 11(2),

77–87.

Moudraia, O. (1999). Lexical syllabus foundation for engineering. RELC Journal, 30(2), 140–141.

Moudraia, O. (2001). Lexical approach to second language teaching. Eric Digest EDO-FL-01-02. Washington, DC: ERIC Clearinghouse on Languages and Linguistics. Available from http://www.cal.org/ericcll/digest/0102lexical.html.

Moudraia, O. (2003). The student engineering corpus: analysing word frequency. In: D. Archer, P.

Rayson, A. Wilson, & T. McEnery (Eds.),   Proceedings of the corpus linguistics 2003 conference

(pp. 552–561). UCREL technical paper number 16. UCREL, Lancaster University. ISBN

1862201315.

Moudraia, O. (2004). The student engineering English corpus.  ICAME Journal, 28, 139–143.

Mudraya, O. V. (2004). Using a lexical approach for data-driven instruction of engineering English.  IEEE 

Transactions on Professional Communication, 47 (1), 65–70.

Murison-Bowie, S. (1996). Linguistic corpora and language teaching. Annual Review of Applied Linguistics, 16, 182–199.

Nattinger, J. (1980). A lexical phrase grammar for ESL. TESOL Quarterly, 14, 337–344.

Nattinger, J., & DeCarrico, J. (1992).  Lexical phrases and language teaching . Oxford: Oxford University

Press.

Orr, T., & Takahashi, A. (2002). Constructing a corpus of fundamental engineering English for nonnative

speakers. In J. Williams (Ed.),   Conference proceedings of the IEEE international professional 

communication conference (pp. 403–409). USA: Oregon.

Qi-bo, Z. (1989). A quantitative look at the Guangzhou Petroleum English Corpus. ICAME Journal, 13,

28–38.

Salager, F. (1983). The lexis of fundamental medical English: classificatory framework and rhetorical

function (a statistical approach).  Reading in a Foreign Language, 1, 54–64.

Scott, M. (1996). WordSmith tools. Oxford: Oxford University Press. Available from http://www.lexically.net/wordsmith/.

Scott, M. (1997). PC analysis of key words – and key key words.  System, 25(2), 233–245.

Sinclair, J. M. (Ed.). (1987). Looking up: an account of the COBUILD project in lexical computing . London:

Collins COBUILD.

Sinclair, J. (1991).  Corpus, concordance, collocation. Oxford: Oxford University Press.

Thurstun, J., & Candlin, C. N. (1998). Concordancing and the teaching of the vocabulary of academic

English. English for Specific Purposes, 17 (3), 267–280.

Trimble, L. (1985).   English for science and technology: a discourse approach. Cambridge: Cambridge

University Press.

Willis, D. (1990).   The lexical syllabus: a new approach to language teaching . London: Collins

COBUILD.

Xue, G., & Nation, I. S. P. (1984). A university word list.  Language Learning and Communication, 3(2),

215–229.


Yang, H. (1985a). The JDEST Computer Corpus of texts in English for science and technology.  ICAME 

News, 9, 24–25.

Yang, H. (1985b). The use of computers in English teaching and research in China. In R. Quirk & H. G.

Widdowson (Eds.), English in the world: teaching and learning the language and literature (pp. 86–100).

Cambridge: Cambridge University Press.

Yang, H. (1986). A new technique for identifying scientific/technical terms and describing scientific texts.

Literary and Linguistic Computing, 1(2), 93–103.

Olga Mudraya (Ph.D./Comparative Linguistics; MA (Hon)/Teaching English and Literature) is currently

a Research Associate in the Department of Linguistics and Modern English Language at Lancaster

University, UK. Previously, she was Assistant Professor at Walailak University in Thailand. Her current

research interests include Corpus Linguistics and ESP.
