The Corpus In The Classroom

The Corpus in the Classroom, Colin Graham, THT Seminar – Manila 2008

description

Slides from presentation at first Teachers Helping Teachers seminar in Manila 2008

Transcript of The Corpus In The Classroom

Page 1: The Corpus In The Classroom

The Corpus in the Classroom

Colin Graham

THT Seminar – Manila 2008

Page 2: The Corpus In The Classroom

What is a corpus?

• A corpus is basically a body of knowledge or information of some kind.  In linguistics, it usually means a collection of texts which are taken to represent some aspect of language - for example, fictional writing, radio broadcasts, editorials, etc.  By carrying out research on the corpus (or corpora) the researcher hopes to make generalizations about an aspect of language as a whole.

Page 3: The Corpus In The Classroom

Why is a corpus useful?

• Many of the ideas we have about our own language, which are based on linguistic intuitions, are not correct.

* “Language users cannot accurately report language usage, even their own” [Sinclair, J. (1987) Introduction, in the Collins Cobuild English Language Dictionary, London: Collins]

* “There are many facts about language that cannot be discovered by just thinking about it, or even reading and listening very intently” [Sinclair, J. (1995) Introduction, in the Collins Cobuild English Dictionary, London: HarperCollins]

* “Using a language is a skill that most people are not conscious of; they cannot examine it in detail, but simply use it to communicate” [Sinclair, J. (1995) Introduction, in the Collins Cobuild English Dictionary, London: HarperCollins]

Page 4: The Corpus In The Classroom

Why is a corpus useful?

• People used to think that the Earth was flat and that it was the centre of the Solar System. Galileo’s discovery of the moons of Jupiter, made with better technology (a telescope), forced astronomers and other scientists to think again about their theories and assumptions. We can think of a corpus as being like a telescope: it provides a more clearly focused view of the language we are investigating.

Page 5: The Corpus In The Classroom

What do my intuitions tell me?

• The meanings of words in isolation

• Whether or not a sentence in isolation is well formed

Page 6: The Corpus In The Classroom

What do my intuitions tell me?

• The meanings of words in isolation

• Whether or not a sentence in isolation is well formed

But…!

Page 7: The Corpus In The Classroom

How can I test my intuitions?

• How do you say ‘come’ in Tagalog?

• Bad Company! [“You shall know a word by the company it keeps” – J. R. Firth]

Come mai mangia come un coniglio ogni giorno?

Como vindo come como um coelho cada dia?

How come she eats like a rabbit every day?

• Now, how do you say ‘come’ in Tagalog!

Page 8: The Corpus In The Classroom

The corpus as a bridge

• Long-term exposure

• Many thousands of examples

• Wide variety of contexts and usage

• Self-generated rules, patterns and meanings

• Inductive

• Short-term exposure

• Perhaps only hundreds of examples

• Limited variety of context and usage

• Teacher-provided grammar rules and dictionary definitions

• Deductive

Page 9: The Corpus In The Classroom

The corpus as a bridge

• Long-term exposure

• Short-term exposure

Corpora may compensate for the lack of exposure to sufficiently varied examples by providing a variety of examples in a concentrated form.

They often offer more motivating, interesting or exciting approaches to teaching and learning foreign languages.

Page 10: The Corpus In The Classroom

Starting a language investigation

• What do you see as being the core or primary meaning(s) of the words see and keep?

• How are these words used in English?

• These words have a high frequency of use, mostly because they have a dependent status based on a phrasal role, rather than being used in their core or primary forms:

I’ll keep it in mind.

I see.

You’ll have to keep an eye on her!...
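The point about phrasal roles can be checked even on a tiny sample. The following is a minimal Python sketch, not part of the original slides: it counts the two words that follow each occurrence of keep or see. The first three sample sentences come from the slide; the other three, and the names SAMPLE, tokenize and following_ngrams, are invented for illustration.

```python
import re
from collections import Counter

# Tiny illustrative sample. The first three sentences come from the slide;
# the last three are invented for this sketch.
SAMPLE = [
    "I'll keep it in mind.",
    "I see.",
    "You'll have to keep an eye on her!",
    "Keep an eye on the time.",
    "I see what you mean.",
    "Please keep in touch.",
]


def tokenize(text):
    """Lowercase the text and split it into word tokens (apostrophes kept)."""
    return re.findall(r"[a-z']+", text.lower())


def following_ngrams(sentences, node, n=2):
    """Count the (up to) n tokens that follow each occurrence of `node`."""
    counts = Counter()
    for sentence in sentences:
        tokens = tokenize(sentence)
        for i, token in enumerate(tokens):
            if token == node:
                window = tuple(tokens[i + 1 : i + 1 + n])
                if window:  # skip occurrences at the very end of a sentence
                    counts[window] += 1
    return counts


if __name__ == "__main__":
    for word in ("keep", "see"):
        print(word, following_ngrams(SAMPLE, word).most_common(3))
```

With this sample, ("an", "eye") is the most frequent continuation of keep. A real investigation would need a far larger corpus before any generalization could be made.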

Page 11: The Corpus In The Classroom

The investigation continues

• What do you see as the main facts about the meaning and use of the word listen?

• Corpus research makes us consider pragmatics, in that listen is used, amongst other things, to gain the floor in conversations or discussions

Page 12: The Corpus In The Classroom

Another step in the process…

• When do you use the word which and when that in a sentence?

• British and American usage differ, and usage questions like this are more about personal idiolect (an individual speaker’s own variety of the language)

Page 13: The Corpus In The Classroom

Making a corpus – considerations

• What do you want to know?

• What information do you need access to in your corpus?

• How much text do you need to have a representative sample on which to make confident generalizations?

• Do I need to work with the data or can I present it as-is to students?

• Are there copyright restrictions?

• Do I need sophisticated software to do what I want?

Page 14: The Corpus In The Classroom

Making a corpus – Can I build it?

Corpus design is an art in itself.  However, you can build useful corpora in the classroom.  Basically you need a collection of writings or transcriptions as simple text files and some concordancing software as a minimum tool for analysis.

However, you need to consider copyright and the type of language investigation you want to carry out.

Wichmann, Fligelstone, McEnery and Knowles, Teaching and Language Corpora (Longman, 1997), is a good starting point for most of the further questions you may have.
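To give a sense of what minimal concordancing software does with a folder of plain text files, here is a rough Python sketch. It is an illustration only, not one of the packages mentioned in this talk; the function names load_corpus and kwic, the folder name "corpus" and the demo sentence are all assumptions made for the example.

```python
import re
from pathlib import Path


def load_corpus(folder):
    """Read every .txt file in `folder` (a hypothetical path) into one string."""
    texts = [p.read_text(encoding="utf-8") for p in sorted(Path(folder).glob("*.txt"))]
    return "\n".join(texts)


def kwic(text, node, width=30):
    """Return key-word-in-context (KWIC) lines for each occurrence of `node`."""
    flat = " ".join(text.split())  # collapse line breaks and repeated spaces
    pattern = re.compile(r"\b" + re.escape(node) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(flat):
        left = flat[max(0, match.start() - width) : match.start()]
        right = flat[match.end() : match.end() + width]
        # Right-align the left context so the node word lines up in a column.
        lines.append(f"{left:>{width}} [{match.group()}] {right}")
    return lines


if __name__ == "__main__":
    # Invented demo text; replace with load_corpus("corpus") for real files.
    demo = "Listen, I see what you mean. Listen to this: you will see."
    for line in kwic(demo, "listen", width=20):
        print(line)
```

Sorting the output lines by the word to the right or left of the node word is a common next step, since it makes recurring patterns easier to spot.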

Page 15: The Corpus In The Classroom

Using a corpus for materials

• Use the online corpora for grammar investigations (links on final slide)

• Extract examples of real use from online corpora and build them into Tim Johns’ programs (link on final slide)

• Get the students to do pre-designed investigations

• Get students to write sentences using selected vocabulary and then check online to see if there are similar examples (can highlight usage problems)

• Use printouts from concordance packages to give students examples of areas where they make consistent or 'fixable' errors (prepositions, for example)

• Use a point from a grammar reference and check it out online or using your classroom corpus.
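As one possible way of checking a preposition error against a classroom corpus, the sketch below counts which prepositions follow a given word. It is only a rough illustration in Python: the corpus text, the student's attempt "depend of", the short PREPOSITIONS list and the function name prepositions_after are all invented for the example.

```python
import re
from collections import Counter

# A short, deliberately incomplete list of prepositions (an assumption).
PREPOSITIONS = {"about", "at", "by", "for", "from", "in", "of", "on", "to", "with"}

# Invented stand-in for a classroom corpus.
CLASS_CORPUS = (
    "It will depend on the weather. Prices depend on demand. "
    "Children depend on their parents. She is interested in music."
)


def prepositions_after(text, word):
    """Count the prepositions that immediately follow `word` in `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    # Note: this simple pairing ignores sentence boundaries.
    for current, following in zip(tokens, tokens[1:]):
        if current == word and following in PREPOSITIONS:
            counts[following] += 1
    return counts


if __name__ == "__main__":
    counts = prepositions_after(CLASS_CORPUS, "depend")
    student_word, student_prep = "depend of".split()
    print("Corpus evidence for", student_word, ":", dict(counts))
    print("Student's choice attested:", counts[student_prep] > 0)
```

The absence of a pattern in a small corpus does not prove it is wrong, so results like these are best used as a prompt for discussion rather than as a verdict.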

Page 16: The Corpus In The Classroom

Useful Resources

• HarperCollins: http://www.collins.co.uk/Corpus/CorpusSearch.aspx [Collins - English only; search written, spoken, British, American separately; KWIC format; max 40 examples per search; can specify wordclass, and a few other features; collocation lists also available; 56 Million Words]

• Oxford-BNC (British National Corpus): http://sara.natcorp.ox.ac.uk/lookup.html [British English only; from c 1994; sentence-length examples only, not KWIC format; max 50 examples per search; 100 Million Words]

• Brigham Young University-BNC: http://corpus.byu.edu/bnc/x.asp [a better place to access BNC; KWIC format concordances, etc]

• Corpus of American English: http://www.americancorpus.org/

• David Lee’s Corpora Bookmarks: http://devoted.to/corpora

• Tim Johns’ website has many exercises and useful links, including his “CONTEXTS” program: http://www.eisu2.bham.ac.uk/johnstf/index.html