Post on 23-Jun-2020
LOT Winter School 2018
Contrastive Phraseology
Session 1: Corpus-based Contrastive Analysis and Phraseology
Aims of the course
Give an introduction to:
Contrastive analysis and contrastive phraseology.
Gain insight into:
Different approaches to contrastive analysis;
Bidirectional Contrastive Method;
A bottom-up approach to contrastive phraseology.
Do practical corpus-based contrastive phraseology:
Using existing corpora;
Compiling new corpora.
Outline of the course
Day | Topic | Reading
Mon | Corpus-based Contrastive Analysis and Phraseology | Aijmer 2008; Johansson 2007, ch. 1; Ebeling & Ebeling 2013, ch. 1
Tue | Approaches to CA and the Bidirectional Contrastive Method | Ebeling & Ebeling 2013, ch. 2
Wed | Hands-on – Using multilingual corpora | Ebeling 2014; Ebeling & Ebeling 2015
Thur | Contrastive phraseology: Method and analysis; Compiling a multilingual corpus | Ebeling & Ebeling 2013, ch. 5; Ebeling et al. 2013; Johansson 2007, ch. 2
Fri | Hands-on – Compiling and using multilingual corpora |
https://www.lotschool.nl/about/teacher/t/0/204
Today
Introduction to contrastive analysis
Introduction to corpus-based CA
Brief introduction to phraseology
Contrastive Analysis
Contrastive analysis (CA) is the systematic
comparison of two or more languages,
with the aim of describing their similarities
and differences.
(Johansson 2007)
What is CA?
A linguistic enterprise aimed at producing [...] two-valued typologies (a CA is always concerned with a pair of languages), and founded on the assumption that languages can be compared. (James 1980: 3)
A general approach to the investigation of language, particularly as carried on in certain areas of applied linguistics, such as foreign-language teaching and translation. In a contrastive analysis of two languages, the points of structural differences are identified, and these are then studied as areas of potential difficulty in foreign-language learning. (Crystal 1985: 74)
A branch of linguistics that describes similarities and differences among two or more languages at such a level as phonology, grammar, and semantics, especially in order to improve language teaching and translation. (McArthur 1992)
What is CA?
NB: A methodology for language study, not a theory of
language.
Often associated with the study of a pair of languages,
although more than two languages may be involved.
Generally associated with applied linguistics: linguistic
studies aiming to achieve better language
teaching/learning, better translators, better translations,
etc.
Also useful for insights (e.g. testing hypotheses) in
theoretical linguistics.
Terminology
'Contrastive analysis' and 'contrastive
linguistics' are terms generally used
synonymously for cross-linguistic studies
involving a systematic comparison of two
or more languages with a view to
describing their similarities and
differences. (Hasselgård 2010: 98)
Contrasting English and Norwegian
o Lexical similarity: bring vs. bringe
o Lexical difference: bring vs. ta med 'take with'
o Syntactic similarity: inverted word order in interrogative sentences
"Hva har du gjort idag, Herman?" (LSC1)
"What have you done today, Herman?"
o Syntactic difference: Do periphrasis
"Jeg spiller ikke på døde hester." (LSC2)
*"I bet not on dead horses."
"I don't bet on dead horses."
o Phraseological similarity: throw in the towel vs. kaste inn håndkledet
o Phraseological difference: make a mistake vs. gjøre en feil 'do a mistake'
These may be more or less obvious areas of
contrast between languages. Greater awareness
of the less obvious ones will increase our
understanding of the individual languages and will
provide a better platform for us as language
learners and language researchers.
The objective of CA
Language comparison is of great interest in a
theoretical as well as an applied perspective. It
reveals what is general and what is language
specific and is therefore important both for the
understanding of language in general and for the
study of the individual languages compared.
(Johansson & Hofland 1994: 25)
Why embark on contrastive linguistics? (Lauridsen 1996: 63-64)
Language pedagogy
The comparison of languages in order to prepare language teaching materials or
explain foreign language learner performance (interlanguage analysis)
Cross-linguistic analysis
(a) The analysis of the expression of non-language-specific categories in two or more
languages, eg the semantic category of modality as it is expressed grammatically or
lexically in different languages.
(b) The analysis of linguistic categories in two or more languages which become
apparent only as a result of a contrastive analysis, eg the progressive aspect as it is
expressed lexically in Danish, discovered as a result of a contrastive analysis of
English and Danish.
Contrastive lexicographical or terminological work
The creation of bilingual or multilingual dictionaries and databases.
CA may:
Contribute to a better description of the languages compared;
Contribute to the development of better teaching materials;
Reveal new and unknown areas of contrast;
Contribute to advanced multilingual and cross-cultural
competence.
Compare languages on the basis of what?
Introspection?
A bilingual researcher uses his/her intuition, but “it would be difficult to establish cross-linguistic correspondences on the basis of introspection alone. Introspection cannot provide certainty that we have found all the correspondences and it would be difficult to have intuitions about the cases where the semantic overlap is only partial” (Simon-Vandenbergen & Aijmer 2004: 16).
Elicitation?
Elicitation tests in different language communities
Study of language learning, first language transfer?
Authentic language? (Text-based studies)
Comparable texts (original texts in the languages compared)
Parallel texts (original and translated texts)
Translation is a source of perceived similarities
across languages.
CA and corpora
… there has been a steadily growing realization that cross-linguistic studies cannot rely on introspection or scanty empirical evidence, but must be firmly based on naturally occurring language used in a variety of situations.
(Aijmer & Altenberg 1996)
A corpus can be defined as:
a body of texts in machine-readable form
put together in a principled way
representing authentic language use
for use in linguistic research
After a period when contrastive analysis was rejected by many, there has been a revival, in large measure connected with the new possibilities of contrasting languages using multilingual electronic corpora.
(Johansson 2007: xv)
Corpora for cross-linguistic research
Comparable corpus
Translation corpus
Parallel corpus
Note: the term ‘parallel corpus’ is also used
about the first two.
Comparable corpus (model 1)
A corpus that contains original texts in more than one language and where the texts in each language have been selected according to the same criteria (genre, content, publication date etc.)
Language 1
Genre A
Genre B
Genre C
Genre D
Language 2
Genre A
Genre B
Genre C
Genre D
Language 3
Genre A
Genre B
Genre C
Genre D
Comparable corpus – extreme case
Language 1
President Obama har ikke vært president i 24 timer ennå, men har alt tatt sin første avgjørelse som USAs mektigste mann. Han har bedt aktorene i de militære domstolene, det såkalte tribunalet, i Guantanamo om å be om en 120-dagers pause i alle pågående saker. (Aftenposten)
Language 2
Within hours of taking office, Obama's administration filed a motion to halt the war crimes trials for 120 days, until his new administration completes a review of the much-criticised system for trying suspected terrorists. (The Guardian)
Language 3
Schon wenige Stunden nach seinem Amtsantritt hat Obama einen ersten Schritt zur Beendigung der Militärtribunale in Guantánamo eingeleitet. Der neue Präsident wies die Militärankläger an, in allen 21 laufenden Verfahren eine 120-tägige Aussetzung zu beantragen. (Frankfurter Allgemeine)
Comparable corpus – example (FC)
Language 1
For han visste at i krigen lurte døden, og noen av mennene som dro ut ville aldri vende hjem. Det hadde han vært vitne til flere ganger, og enkene og barna etter de døde gikk gråtende omkring mellom hyttene.
Language 2
The village school for younger children was a bleak brick building called Crunchem Hall Primary School. It had about two hundred and fifty pupils aged from five to just under twelve years old.
Language 3
Sie führten ein äußerst genügsames, strenges und hartes Leben, und ihre Kinder, Knaben wie Mädchen, wurden zur Tapferkeit, zur Großmut und zum Stolz erzogen. Sie mußten Hitze, Kälte und große Entbehrungen ertragen lernen und ihren Mut unter Beweis stellen.
Comparable corpus
L1 (English originals), EO:
Guilt brings us nearer to God.
"Bring in something to eat …
…enough, bring me a taste Sunday.…
…it brings you a smell …
L2 (German originals), GO:
…Schwester Cecilia brachte nichts …
… Leute dazu bringen, sich zu …
… Zeit zurück, oder bringen sie in …
Aber jeder bringt der …
Comparable corpora
… are not ideal for contrastive analysis. Since it is difficult to guarantee the comparability of texts in different languages, a comparable corpus gives a less clear picture of the correspondences of a lexical item than does a parallel corpus.
(Aijmer 2008: 278)
Translation corpus (model 2)
A corpus that contains the ‘same’ texts in more than one language, in other words a corpus with both original and translated texts.
Original text(s)
Translation, language 1
Translation, language 2
Translation, language 3
Translation corpus - example
Foran meg på skrivebordet hjemme i Croydon sitter jeg med et krøllet prospektkort som er datert Barcelona 26. mai 1992. (JG3)
Hier auf meinem Schreibtisch in Croydon liegt eine zerknitterte Ansichtskarte, abgestempelt in Barcelona am 26. Mai 1992. (JG3TD)
On my desk here at home in Croydon is a crumpled picture postcard from Barcelona, dated 26 May 1992. (JGTE)
Je suis dans ma maison à Croydon. Devant moi, sur mon bureau, il y a une carte postale froissée datée du 26 mai 1992, Barcelone. (JG3TF)
Translation corpus (unidirectional)
L1 (English originals), EO:
Guilt brings us nearer to God.
"Bring in something to eat …
…enough, bring me a taste Sunday.…
…it brings you a smell …
L2 (German translations), GT:
… Unsere Schuld führt uns näher...
"Bring was zu essen ...
… ein Schüsselchen mitbringen.
Kommet Wind auf, trägt er den…
The translations ensure that we compare like with like, as a solid
tertium comparationis is present, i.e. "the relationship between a
unit in the source language and its translation in the target
language" (Granger 2010: 5).
However, as it has been claimed that translations "cannot but give
a distorted picture of the language they represent" (Teubert 1996:
247), a combination of translation and comparable corpora is
endorsed ...
Hasselgård & Ebeling (in prep.)
Parallel corpus (model 3)
(aka bidirectional translation corpus)
L1 (English originals), EO:
Guilt brings us nearer to God.
"Bring in something to eat …
…enough, bring me a taste Sunday.…
…it brings you a smell …
L2 (German translations), GT:
… Unsere Schuld führt uns näher...
"Bring was zu essen ...
… ein Schüsselchen mitbringen.
Kommet Wind auf, trägt er den…
L2 (German originals), GO:
…Schwester Cecilia brachte nichts …
… Leute dazu bringen, sich zu …
… Zeit zurück, oder bringen sie in …
Aber jeder bringt der …
L1 (English translations), ET:
Sister Cecilia brought to light no…
… will-o'-the-wisps make others lose their way.
… back for me, or give it a differ…
… each one of us is going to the .. with..
In the parallel corpus model, we draw on translation
data to objectively identify corresponding items in the
languages compared. These translation paradigms
(horizontal lines in the model) may form the basis for
further scrutiny in the comparable data (slant line).
Translation paradigm
The translation paradigm provides a ‘rich’
description of what a lexical element or
construction means, and how it functions, by
considering the translations into one or more
languages. (Aijmer 2008: 284)
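As a rough illustration, a translation paradigm can be tallied once each occurrence of an item has been paired with its correspondence. The sketch below assumes that this pairing has already been done by hand; the four pairs are read off the EO/GT concordance lines for bring shown earlier, and translation_paradigm is a hypothetical helper name, not a function of any corpus tool.

```python
from collections import Counter

# (source form, correspondence) pairs read off the EO/GT concordance
# lines for "bring"; the pairing itself is assumed to be done manually.
observations = [
    ("brings", "führt"),
    ("Bring", "Bring"),
    ("bring", "mitbringen"),
    ("brings", "trägt"),
]


def translation_paradigm(pairs):
    """Return (correspondence, count, percentage) rows, most frequent first."""
    counts = Counter(target for _source, target in pairs)
    total = sum(counts.values())
    return [(target, count, 100 * count / total)
            for target, count in counts.most_common()]


for target, count, share in translation_paradigm(observations):
    print(f"{target:12} {count:3} {share:5.1f}%")
```

With only four observations every correspondence occurs once; in a real study the forms would normally be lemmatised first and the counts drawn from the whole corpus.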
Parallel corpora
It is difficult to see how any other method
could give such a clear and detailed picture
of the relationship between the languages
and contribute to the language-specific
description of the languages compared. (Aijmer 2008: 280)
(Minimal) mark-up of (parallel) texts
Metadata (i.e. descriptive information about the text,
including information about author, translator, year of
publication, etc.)
Sentence alignment
Sentence alignment (1:1 / 1:2)
<s id=FW1.2.s38 corresp=FW1TD.2.s36>Guilt brings us nearer to God.</s>
<s id=FW1TD.2.s36 corresp=FW1.2.s38>Unsere Schuld führt uns näher zu Gott.</s>
<s id=NG1.1.s80 corresp='NG1TD.1.s81 NG1TD.1.s82'>He saw the need to
bring together the school and the community in which it performed an isolated
function — education as a luxury, a privilege apart from the survival
preoccupations of the parents.</s>
<s id=NG1TD.1.s81 corresp=NG1.1.s80>Er wollte die Schule und die Gemeinde
zusammenbringen.</s>
<s id=NG1TD.1.s82 corresp=NG1.1.s80>Er wollte die isolierte Funktion der
Schule — Bildung als Luxus, als Privileg, das über die dem reinen Überleben
dienenden Beschäftigungen der Eltern hinausging, überwinden.</s>
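The alignment mark-up above can be turned into sentence pairs with a few lines of code. This is a minimal sketch, assuming that every unit is an <s> element with an id and a corresp attribute exactly as on the slide, and that a text code ending in TD marks the German translation (an inference from the examples, not a documented convention). The two long sentences are shortened with [...] to keep the sample compact.

```python
import re

# <s id=... corresp=...>text</s>; corresp is bare or single-quoted
# (the quoted form lists several targets, i.e. a 1:2 alignment).
S_TAG = re.compile(
    r"<s\s+id=(?P<id>\S+)\s+corresp=(?P<corresp>'[^']*'|\S+?)>(?P<text>.*?)</s>",
    re.DOTALL,
)

SAMPLE = """
<s id=FW1.2.s38 corresp=FW1TD.2.s36>Guilt brings us nearer to God.</s>
<s id=FW1TD.2.s36 corresp=FW1.2.s38>Unsere Schuld führt uns näher zu Gott.</s>
<s id=NG1.1.s80 corresp='NG1TD.1.s81 NG1TD.1.s82'>He saw the need to
bring together the school and the community [...]</s>
<s id=NG1TD.1.s81 corresp=NG1.1.s80>Er wollte die Schule und die Gemeinde
zusammenbringen.</s>
<s id=NG1TD.1.s82 corresp=NG1.1.s80>Er wollte die isolierte Funktion der
Schule [...] überwinden.</s>
"""


def parse_sentences(markup):
    """Return {sentence_id: (list_of_corresponding_ids, text)}."""
    sentences = {}
    for match in S_TAG.finditer(markup):
        targets = match.group("corresp").strip("'").split()
        text = " ".join(match.group("text").split())  # undo line wrapping
        sentences[match.group("id")] = (targets, text)
    return sentences


def is_original(sentence_id):
    """Assumed convention: a text code ending in 'TD' is a translation."""
    return not sentence_id.split(".")[0].endswith("TD")


def aligned_pairs(sentences, is_source):
    """Yield (source_text, target_text); 1:2 targets are joined with a space."""
    for sid, (targets, text) in sentences.items():
        if is_source(sid):
            target_text = " ".join(sentences[t][1] for t in targets if t in sentences)
            yield text, target_text


sentences = parse_sentences(SAMPLE)
pairs = list(aligned_pairs(sentences, is_original))
for source, target in pairs:
    print(source, "|", target)
```

Joining the two German sentences of the 1:2 case into one target string is a design choice that yields the line:line layout shown on the next slide.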
Sentence alignment (line:line)
Guilt brings us nearer to God.
He saw the need to bring together the school and the community in which it performed an isolated function — education as a luxury, a privilege apart from the survival preoccupations of the parents.
Unsere Schuld führt uns näher zu Gott.
Er wollte die Schule und die Gemeinde zusammenbringen. Er wollte die isolierte Funktion der Schule — Bildung als Luxus, als Privileg, das über die dem reinen Überleben dienenden Beschäftigungen der Eltern hinausging, überwinden.
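With line:line alignment, line i of the original corresponds to line i of the translation, so a search on one side returns the matching line on the other. The sketch below illustrates that idea only; it is not a description of how any particular concordancer is implemented. The function names are hypothetical, and the second sentence pair is shortened with [...].

```python
import re
from itertools import zip_longest


def pair_lines(source_lines, target_lines):
    """Pair line i of the source with line i of the target; fail on a length mismatch."""
    pairs = []
    for number, (src, tgt) in enumerate(zip_longest(source_lines, target_lines), start=1):
        if src is None or tgt is None:
            raise ValueError(f"files differ in length at line {number}")
        pairs.append((src.strip(), tgt.strip()))
    return pairs


def search(pairs, pattern):
    """Return the aligned pairs whose source side matches the regular expression."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [(src, tgt) for src, tgt in pairs if regex.search(src)]


# In practice these lists would come from two files read line by line.
english = [
    "Guilt brings us nearer to God.",
    "He saw the need to bring together the school and the community [...]",
]
german = [
    "Unsere Schuld führt uns näher zu Gott.",
    "Er wollte die Schule und die Gemeinde zusammenbringen. "
    "Er wollte die isolierte Funktion der Schule [...] überwinden.",
]

line_pairs = pair_lines(english, german)
for src, tgt in search(line_pairs, r"\bbr(ing|ings|ought)\b"):
    print(src, "|", tgt)
```

Raising an error on unequal line counts is deliberate: a silent mismatch would shift every later pair by one line.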
Using parallel corpora for research
Particularly well suited for studies of lexis / lexico-grammar (or studies that can take lexis as their starting point), but also lends itself well to the study of (frequent) patterns (in a bottom-up approach).
The methodology is not tied to any particular theoretical approach (SFL, cognitive linguistics, pattern grammar, lexis-driven approach à la Sinclair + traditional grammar…)
A broad range of phenomena has been (and is being) investigated, e.g. the use of individual verbs (bli, få, take, see), modality, particular syntactic constructions, connectives, discourse relations, phraseology.
Some limitations
You can only search for something that is explicit in the text (applies to corpus linguistics in general);
The (small) size of parallel corpora restricts studies of less frequent items / constructions;
Faulty and less successful translations.
Ways around the limitations?
Manual “searches” in running text, e.g. for Theme, subjects;
Identify typical (and searchable!) expressions;
Go for a bottom-up approach;
Errors in the translation: Weed out? Ignore translations that occur only once, or in only one text?
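The last suggestion can be sketched as a simple filter: keep a correspondence only if it recurs and is attested in more than one text. The thresholds of two occurrences and two texts are assumptions (the slide raises them as a question, not a rule), and the text ids and observations are invented for illustration.

```python
from collections import defaultdict


def filter_correspondences(observations, min_freq=2, min_texts=2):
    """Keep correspondences that occur at least min_freq times in at least min_texts texts.

    observations: iterable of (text_id, correspondence) pairs.
    Returns {correspondence: (frequency, number_of_texts)}.
    """
    freq = defaultdict(int)
    texts = defaultdict(set)
    for text_id, correspondence in observations:
        freq[correspondence] += 1
        texts[correspondence].add(text_id)
    return {
        c: (freq[c], len(texts[c]))
        for c in freq
        if freq[c] >= min_freq and len(texts[c]) >= min_texts
    }


# Invented toy data: (text id, German correspondence of English "bring").
toy = [
    ("T1", "bringen"), ("T2", "bringen"), ("T3", "bringen"),
    ("T1", "führen"), ("T1", "führen"),  # recurs, but only in one text
    ("T2", "tragen"),                    # occurs only once
]

print(filter_correspondences(toy))
```

Whether such filtering is appropriate depends on the research question: a one-off translation may be an error, but it may also be a legitimate, rare correspondence.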
Arguments against using translated texts
in contrastive research
Translations distort the target language because of influence from the source language (translationese);
Translated language is different from original language (translation as a “third-code”);
Translators are unreliable and make mistakes;
Translations differ depending upon the individual translator;
Translations represent an interpretation of the original. This may imply interpreting (losing?) ambiguity and polysemy.
Translationese/Translation effect
• Does NOT include translation mistakes such as:
I feel like shit. -- Jeg føler meg som laken ('I feel like sheet').
• DOES include translation features such as:
Overuse or underuse in translations compared to originals, e.g. vær så
snill ('be so kind') in Norwegian original vs. translated texts (influence
from English please).
A systematic influence on TL from SL. Gellerstam (1986)
Differences between choices in original and
translated texts in the same language. Johansson (2007: 32)
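Overuse and underuse are usually judged on normalised frequencies, since the original and translated subcorpora rarely have the same size. The sketch below compares occurrences per million words in the two subcorpora; the counts and corpus sizes are invented and say nothing about any real expression, and the function names are hypothetical.

```python
def per_million(count, corpus_size):
    """Normalised frequency: occurrences per million words."""
    return count * 1_000_000 / corpus_size


def translation_ratio(count_orig, size_orig, count_trans, size_trans):
    """Normalised frequency in translations divided by that in originals.

    A value above 1 points towards overuse in translations, below 1 towards
    underuse. Raises ZeroDivisionError if the item is absent from the originals.
    """
    return per_million(count_trans, size_trans) / per_million(count_orig, size_orig)


# Invented figures: 20 hits in 400,000 words of originals,
# 45 hits in 300,000 words of translations.
orig_pmw = per_million(20, 400_000)
trans_pmw = per_million(45, 300_000)
ratio = translation_ratio(20, 400_000, 45, 300_000)
print(orig_pmw, trans_pmw, ratio)
```

A ratio alone does not show that a difference is systematic; a significance test and a check of dispersion across texts would be the usual next steps.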
Nevertheless…
Some of this criticism "can be met by having many different
translators and a large number of translations as the empirical
foundation for research". Aijmer (2008: 278)
In other words, the translations are "based on the insights of a
large pool of specialist informants, i.e. the translators". Ebeling & Ebeling (2013: 222)
Phraseology
That language to a large extent relies on ‘combinations
of words that customarily co-occur’ (Kjellmer 1991: 112)
is now a generally accepted view in linguistics.
Such combinations are said to constitute the
phraseology, or phrasicon, of a language. (Ebeling & Hasselgård 2015: 207)
Field of study
Set of linguistic units studied in the field.
What is phraseology?
"The study of the structure, meaning and use of word
combinations" (Cowie 1994: 3168).
More or less fixed "word combinations".
The lexico-grammatical pattern belonging to a word.
Examples: arms akimbo; heavy rain; fish and chips; what is more; mind how you go; piece of cake; I don't; I think that; and then he
Phraseology is a field bedevilled by the proliferation
of terms and by conflicting uses of the same term … Cowie (1998: 210)
N-grams; Phraseological units; Restricted collocations; Chunks; Frames; Sentence stems; Patterns; Lexical bundles; Fixed expressions; Idioms; Prefabs; Clusters; Recurrent word combinations; Collocations; Fixed phrases; Multi-word units
Phraseology as defined and studied in
this course
Lexico-grammatical context of lexical items
Specified items as starting point
Pattern: a recurrent multi-word combination that
functions as a semantic unit.
Unspecified items (N-grams) as starting point to identify
and select patterns for research
(n-gram = a contiguous sequence of n items in naturally occurring
text)
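The n-gram definition translates directly into code. This is a minimal sketch with a deliberately crude tokenizer (lowercasing, word characters, internal apostrophes); real studies would have to make explicit choices about punctuation, case and sentence boundaries.

```python
import re


def tokenize(text):
    """Lowercase and split into word tokens (a deliberately crude tokenizer)."""
    return re.findall(r"\w+(?:'\w+)*", text.lower())


def ngrams(tokens, n):
    """Return all contiguous sequences of n tokens, in text order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = tokenize("Guilt brings us nearer to God.")
trigrams = ngrams(tokens, 3)
for gram in trigrams:
    print(" ".join(gram))
```

A sentence of six tokens yields four 3-grams; a sequence shorter than n yields none.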
Approach to phraseology
Frequency-based / Distributional
Bottom-up/inductive approach, identifying items that do not necessarily fit predefined linguistic categories.
Many of these are traditionally considered outside the limits of phraseology (i.e. they may be "free combinations").
Idiom principle (strings of co-selected words that constitute single choices).
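A frequency-based, bottom-up procedure can be sketched as follows: extract every n-gram from every text, then keep those that recur and are spread over several texts. The thresholds (at least two occurrences in at least two texts) are assumptions for illustration, not values prescribed by the course, and the two mini-texts are invented around the example and then he.

```python
from collections import Counter, defaultdict


def recurrent_ngrams(texts, n, min_freq=2, min_texts=2):
    """Return {ngram: (frequency, number_of_texts)} for n-grams above both thresholds.

    texts: {text_id: text}; tokens are obtained by lowercasing and splitting
    on whitespace, which is a simplification.
    """
    freq = Counter()
    spread = defaultdict(set)
    for text_id, text in texts.items():
        tokens = text.lower().split()
        for i in range(len(tokens) - n + 1):
            gram = tuple(tokens[i:i + n])
            freq[gram] += 1
            spread[gram].add(text_id)
    return {
        gram: (freq[gram], len(spread[gram]))
        for gram in freq
        if freq[gram] >= min_freq and len(spread[gram]) >= min_texts
    }


# Invented mini-corpus.
mini_corpus = {
    "t1": "and then he said and then he left",
    "t2": "and then he went home",
}

candidates = recurrent_ngrams(mini_corpus, 3)
print(candidates)
```

The output is only a candidate list: deciding which recurrent sequences function as semantic units (patterns in the sense used here) still requires manual analysis.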
Contrastive Analysis and phraseology
Contrastive analysis (CA) is the systematic
comparison of two or more languages, ...
(Johansson 2007)
Contrastive phraseology is the systematic
comparison of word-combinations across
languages.
Contrastive phraseology angle: "similar-looking expressions in
two languages (e.g. all the way – hele veien 'the whole way') may
be associated with different degrees of metaphorical potential and
thereby different conditions of use" (Ebeling et al. 2013)
Preparation for hands-on sessions
http://folk.uio.no/signeo/Winterschool2018/ContrastivePhraseology.html
• Download Tools:
– AntConc
– AntPConc
– Notepad++ (Windows)
– Visual Studio Code (Mac/Windows)
• Download Texts (one of the following folders):
– En-Du
– En-Ge
– …
• Create an account at https://cqpweb.lancs.ac.uk/
References and further reading
Aijmer, K. 2008. Parallel Corpora and Comparable Corpora. In Lüdeling, A. & M. Kytö (eds.), Corpus Linguistics. An International Handbook, Vol. 1. Berlin / New York: Walter de Gruyter. 275-292.
Aijmer, K. & B. Altenberg. 1996. Introduction. In Aijmer, K., B. Altenberg & M. Johansson (eds.), Languages in Contrast. Lund: Lund University Press. 11-16.
Cowie, A.P. 1981. The treatment of collocations and idioms in learners' dictionaries. Applied Linguistics 2(3): 223–235.
Cowie, A.P. 1994. Phraseology. In R.E. Asher (ed.), The Encyclopedia of Language and Linguistics. Oxford: OUP. 3168–3171.
Cowie, A.P. 1998. Phraseological dictionaries: Some East-West comparisons. In Cowie (ed.), Phraseology. Theory, Analysis, and Applications. Oxford: OUP. 209–228.
Ebeling, J. & S.O. Ebeling. 2013. Patterns in Contrast. Amsterdam: Benjamins.
Ebeling, J., S.O. Ebeling & H. Hasselgård. 2013. Using recurrent word-combinations to explore cross-linguistic differences. In K. Aijmer & B. Altenberg (eds), Advances in Corpus-based Contrastive Linguistics: Studies in honour of Stig Johansson. Amsterdam: Benjamins. 177-199.
Ebeling, S.O. & H. Hasselgård. 2015. Learner corpora and phraseology. In S. Granger, G. Gilquin & F. Meunier (eds), The Cambridge Handbook of Learner Corpus Research. Cambridge: CUP. 207-229.
Fiedler, S. 2007. English Phraseology. Tübingen: narr studienbücher.
Gellerstam, M. 1986. Translationese in Swedish novels translated from English. In L. Wollin and H. Lindquist (eds), Translation Studies in Scandinavia. Lund: CWK Gleerup. 88-95.
Granger, S. & M. Paquot. 2008. Disentangling the phraseological web. In S. Granger & F. Meunier (eds), Phraseology. An Interdisciplinary Perspective. Amsterdam: Benjamins. 27–50.
Granger, S. 2010. Comparable and translation corpora in cross-linguistic research: Design, analysis and applications. Journal of Shanghai Jiaotong University, 2010. Available at: http://sites-test.uclouvain.be/cecl/archives/Granger_Crosslinguistic_research.pdf.
Hasselgård, H. 2010. Contrastive analysis / contrastive linguistics. In K. Malmkjær (ed.). The Routledge Linguistics Encyclopedia. Third Edition. London: Routledge. 98-101.
Hoey, M. 2007. Lexical priming and literary creativity. In M. Hoey, M. Mahlberg, M. Stubbs and W. Teubert (eds), Text, Discourse and Corpora: Theory and Analysis. London: Continuum. 7-29.
James, C. 1980. Contrastive Analysis. London: Longman.
Johansson, S. 2007. Seeing Through Multilingual Corpora. On the Use of Corpora in Contrastive Studies. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Johansson, S. and K. Hofland. 1994. Towards an English-Norwegian parallel corpus. In Fries, U., G. Tottie & P. Schneider (eds.), Creating and Using English Language Corpora. Amsterdam: Rodopi, pp. 25-37.
Lauridsen, K. 1996. Text corpora and contrastive linguistics: Which type of corpus for which type of analysis? In Aijmer, K., B. Altenberg & M. Johansson (eds.), Languages in Contrast. Lund: Lund University Press. 63-72.
McArthur, T. (ed.). 1992. The Oxford Companion to the English Language. Oxford: Oxford University Press.
Simon-Vandenbergen, A.-M. and K. Aijmer. 2004. The expectation marker of course in a cross-linguistic perspective. Languages in Contrast 4:1 (2002/2003, published 2004), 13-44.
Teubert, W. 1996. Comparable or parallel corpora? International Journal of Lexicography 9(3): 238–264.