Shared structural features in Transeurasian languages: borrowed or inherited?
Nataliia Neshcheret
Eurasia3angle, Max Planck Institute for the Science of Human History
29 Aug 2018, SLE Tallinn
Nataliia Neshcheret TEA structural features 29 Aug 2018, SLE Tallinn 1 / 26
Introduction
Language sample
[Figure: map of the 42 sampled languages, coloured by family (Turkic, Mongolic, Tungusic, Japonic, Korean, Ainu, Nivkh, Uralic). Map labels: EvB, EvD, Evk, Nan, Neg, Oroc, Orok, Udi, Olch, Soln, Azer, Bash, Chu, Crim, Gag, Khak, Khal, Shor, Trk, Tuv, Yak, Tat, Tuk, Jap, Ogm, Shu, Tar, Hat, Ike, Oki, Yon, Yuw, Bon, Halh, Mang, Kalm, Bur, Kor, Ain, Niv, Mar, Fin]
How are these methods and this kind of data useful?
Q1 How can my research contribute to the debate on the internal structure of the Transeurasian family?
Q2 Can we define a “Transeurasian” area, based on structural features, which stands out among other language families in the area?
Q3 What methods do other disciplines offer for investigation of questions from macro-typology and historical linguistics?
Q4 What can structural features tell us about the relationships between the languages in question?
Q5 What is the impact of language contact on the structural change of the Transeurasian languages?
Q6 Are there differences in structural features regarding the amount of genealogical signal?
Q7 How does the topology change if structural features with the lowest phylogenetic signal are excluded from the analysis?
The challenge: Structural features
- “where the lexical signal has been lost, a faint structural signal might still be discernible” (Dunn et al. 2005)
- “the most stable structural features of languages could be useful for deep historical reconstruction just like the most conservative portion of the vocabulary” (Dediu and Levinson 2012)
- “Structural features necessarily have a more attenuated historical signal than lexical features, since shared structural features may originate from borrowing and convergent evolution (homoplasy) as well as from inheritance.” (Reesink et al. 2009)
- “[...] on average, most grammatical features actually change faster than items of basic vocabulary” (Greenhill et al. 2017)
Material and methods
What and how?
- data:
  - 38 Transeurasian languages (9,576 data points)
  - 4 non-Transeurasian languages
  - 228 structural features (189 Grambank features, 39 features on phonology and formal representation)
- sources:
  - language descriptions
  - dictionaries
  - native speakers
  - language specialists
- methods:
  - Bayesian tree-sampling
  - neighbour-joining
  - phylogenetic comparative methods
Feature set
- morphosyntactic features:
  - person, number, possession, interrogation, negation, derivation patterns, valency operations, numeral systems, comparison, argument marking, deixis
- phonological features:
  - voicing distinction in plosives/fricatives, l/r distinction, constraints on initial consonants, availability of initial consonant clusters, vowel harmony, vowel length
Coding example
(1) Udehe (Tungusic; Nikolaeva and Tolskaya 2001: 840)
    mamasa     ule:-we   olokto-ini
    old.woman  meat-ACC  cook-3SG
    ‘The old woman is cooking meat.’
(2) Khalkha (Mongolic; Janhunen 2012: 246)
    noxai  mo:r-i:g  barı-eb
    dog    cat-ACC   catch-TERM
    ‘The dog caught the cat.’
- Is pragmatically unmarked word order verb-final for transitive clauses? → yes for both languages, coded 1
- Can the A argument be indexed by a suffix/enclitic on the verb in the simple main clause? → yes for Udehe, coded 1; no for Khalkha, coded 0
Raw data
language   verb-final word order   A argument marked on the verb
Udehe      1                       1
Khalkha    1                       0
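The raw data above can be read as one binary vector per language. As a minimal sketch of how such vectors might feed a distance-based method, the Python snippet below stores the two coded features and computes the proportion of features on which two languages differ. The feature keys, the use of None for missing values and the function name are assumptions made for illustration; the talk does not state which distance measure was used.

```python
# Toy coding of the two example features; the key names are illustrative only.
FEATURES = ["verb_final_word_order", "A_argument_marked_on_verb"]

CODING = {
    "Udehe": [1, 1],
    "Khalkha": [1, 0],
}


def hamming_distance(a, b):
    """Proportion of features coded differently, skipping missing values (None)."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        raise ValueError("no feature is coded for both languages")
    return sum(x != y for x, y in pairs) / len(pairs)


distance = hamming_distance(CODING["Udehe"], CODING["Khalkha"])
print(distance)  # 0.5: the two languages differ on one of the two features
```

Skipping features that are missing in either language keeps uncoded data points from inflating or deflating the distance; other treatments of missing data are possible.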
Transeurasian debate
Q1, Q4: The TEA (=Transeurasian) forest
- lexicostatistics tree (vocabulary) (Starostin et al. 2003)
- classical comparative tree (Robbeets 2015)
- lexicostatistics tree (case suffixes) (Blažek and Schwarz 2014)
- Bayesian tree (Robbeets and Bouckaert 2018)
Q1, Q4: Transeurasian topology based on structural features
[Figure: tree with the clades Japono-Koreanic, Tungusic, Turkic and Mongolic]
- Japono-Koreanic vs. Altaic branches
- Tungusic splits off first from the Altaic ancestor
- this structure is stable across all tested models if neighbouring languages are excluded
TEA vs. non-TEA: neighbour-joining
Q1, Q4: Transeurasian topology based on structural features
[Figure: graph with the groups Tungusic, Turkic, Mongolic, Koreanic and Japonic; weight threshold = 0.00568; SplitsTree4, Huson and Bryant (2010)]
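The section title refers to neighbour-joining. As a minimal sketch of the generic neighbour-joining agglomeration (not the SplitsTree4 procedure behind the figure), the Python function below repeatedly joins the pair of nodes that minimises the Q criterion and returns the topology as nested tuples. The function name, the output format and the five-taxon toy distances are assumptions for illustration; branch lengths are not computed.

```python
from itertools import combinations


def neighbour_joining(names, dist):
    """Return an unrooted topology as nested tuples.

    names: list of taxon labels
    dist:  {frozenset({a, b}): distance} for every pair of labels
    """
    nodes = list(names)
    d = dict(dist)
    while len(nodes) > 2:
        n = len(nodes)
        # Row sums of the current distance matrix.
        r = {a: sum(d[frozenset((a, b))] for b in nodes if b != a) for a in nodes}
        # Join the pair with the smallest Q value (first pair wins on ties).
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        new = (i, j)
        # Distance from the new internal node to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((new, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - d[frozenset((i, j))]
                )
        nodes = [k for k in nodes if k not in (i, j)] + [new]
    return tuple(nodes)


# Toy distances between five unnamed taxa; these are not language data.
toy = {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3,
}
toy_dist = {frozenset(pair): value for pair, value in toy.items()}
tree = neighbour_joining(["a", "b", "c", "d", "e"], toy_dist)
print(tree)
```

In this toy example a and b are joined first, c attaches to that pair, and d and e form the remaining group.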
Q2, Q5: How do non-Transeurasian languages relate to the Transeurasian languages?
[Figure: graph with the groups Tungusic, Turkic, Mongolic, Uralic, Nivkh, Ainu, Koreanic and Japonic; weight threshold = 0.00568; SplitsTree4, Huson and Bryant (2010)]
Q3: What are the languages with the highest conflicting signal?
[Figure: delta score per language (y-axis: Delta score, ticks from 0.30 to 0.39), coloured by language family: Ainu, Japonic, Koreanic, Mongolic, Nivkh, Tungusic, Turkic, Uralic]
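The figure reports a delta score per language as a measure of conflicting signal. The talk does not say how the scores were computed; the Python sketch below follows the quartet-based definition usually attributed to Holland et al. (2002), in which a taxon's score is the mean delta over all quartets that contain it. The function name and the four-taxon toy distances are assumptions for illustration.

```python
from itertools import combinations


def delta_scores(names, dist):
    """Mean quartet delta score per taxon; needs at least four taxa.

    dist: {frozenset({a, b}): distance} for every pair of labels
    """
    totals = dict.fromkeys(names, 0.0)
    counts = dict.fromkeys(names, 0)

    def d(a, b):
        return dist[frozenset((a, b))]

    for x, y, u, v in combinations(names, 4):
        # The three ways of splitting four taxa into two pairs, largest sum first.
        m1, m2, m3 = sorted(
            (d(x, y) + d(u, v), d(x, u) + d(y, v), d(x, v) + d(y, u)),
            reverse=True,
        )
        # 0 for a perfectly tree-like quartet, up to 1 for maximal conflict.
        delta = 0.0 if m1 == m3 else (m1 - m2) / (m1 - m3)
        for taxon in (x, y, u, v):
            totals[taxon] += delta
            counts[taxon] += 1
    return {name: totals[name] / counts[name] for name in names}


# Hypothetical distances for four unnamed taxa, not taken from the talk.
taxa = ["a", "b", "c", "d"]
pairs = {
    ("a", "b"): 2, ("c", "d"): 2,
    ("a", "c"): 3, ("b", "d"): 3,
    ("a", "d"): 3, ("b", "c"): 3,
}
tree_like = {frozenset(p): v for p, v in pairs.items()}
conflicting = dict(tree_like)
conflicting[frozenset(("a", "d"))] = 2.5
conflicting[frozenset(("b", "c"))] = 2.5

tree_like_scores = delta_scores(taxa, tree_like)
conflicting_scores = delta_scores(taxa, conflicting)
print(tree_like_scores["a"], conflicting_scores["a"])
```

With only one quartet every taxon receives the same score; in a real sample each language is averaged over many quartets, which is what lets individual languages stand out.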
TEA vs. non-TEA: Bayesian
Q1-Q5: Transeurasian vs. neighbours: the best-fitting model
[Figure: Bayesian tree of all 42 languages with support values on the nodes; BEAST2, Bouckaert et al. (2014)]
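The values on the nodes of the tree are presumably posterior clade supports, that is, the share of sampled trees in which a clade occurs. As a minimal sketch of that idea, the Python snippet below counts clades over a small sample of trees. The representation of a tree as a set of clades, the function name and the four-tree sample are hypothetical and are not BEAST2 output.

```python
from collections import Counter


def clade_support(sampled_trees):
    """Fraction of sampled trees in which each clade occurs.

    Each tree is a set of clades; each clade is a frozenset of tip names.
    """
    counts = Counter(clade for tree in sampled_trees for clade in tree)
    return {clade: n / len(sampled_trees) for clade, n in counts.items()}


# Hypothetical sample of four trees over four tips (J, S, T, B).
JS = frozenset({"J", "S"})
TB = frozenset({"T", "B"})
sample = [
    {JS, TB},
    {JS, TB},
    {JS, frozenset({"J", "S", "T"})},
    {TB, frozenset({"T", "B", "S"})},
]

support = clade_support(sample)
print(support[JS], support[TB])  # 0.75 0.75
```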
Phylogenetic signal
Q6: How reliable are structural features in deriving phylogenies?
- is there a phylogenetic signal in structural features?
  - calculate Fritz and Purvis’ D (Fritz and Purvis 2010)
  - lower D values indicate a higher phylogenetic signal; higher values are a sign of overdispersion
  - a feature with a high signal will have the same state in sister languages
- how does a phylogeny based on features with a high phylogenetic signal differ from the one based on all the coded features?
  - compare the results of distance-based methods
  - compare the maximum clade credibility trees
- what are the differences in the phylogenetic signal across features? [future research]
  - compare D values across features
  - compare D values across features from different language domains
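For reference, the D statistic of Fritz and Purvis (2010) is usually written as follows. The symbols are chosen here for illustration: the sum of d_obs is the observed sum of sister-clade differences in the binary feature over the tree, and the barred terms are the means of the same sum under random shuffling of the tip states (r) and under a Brownian threshold model (b).

```latex
D = \frac{\sum d_{\mathrm{obs}} - \overline{\sum d_{\mathrm{b}}}}
         {\overline{\sum d_{\mathrm{r}}} - \overline{\sum d_{\mathrm{b}}}}
```

Under this scaling D = 1 corresponds to a phylogenetically random distribution and D = 0 to the distribution expected under Brownian evolution; values below 0 indicate a feature that is more clumped than Brownian, and values above 1 indicate overdispersion.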
Q6: GB028 Is there an inclusive/exclusive distinction? Estimated D = 0.08
[Figure: states of GB028 plotted on the tree tips, in the order: Ainu, Nivkh, Mari, Finnish, Korean, Japanese, Ikema, Tarama, Hateruma, Yonaguni, Ogami, Okinoerabu, Yuwan, Shuri, EvenD, EvenB, Negidal, Evenki, Solon, Nanai, Ulch, Orok, Oroch, Udihe, Buriat, Khalkha, Kalmyk, Baoan, Mangghuer, Chuvash, Khalaj, Yakut, Tuvan, Khakas, Shor, Bashkir, Tatar, CrimTatar, Turkish, Gagauz, Azerbaijani, Turkmen]
Q6: Phylogenetic signal across structural features: Fritz & Purvis’ D
[Figure: histogram of estimated D values; x-axis: Estimated D (ticks from -10 to 5), y-axis: Frequency (ticks from 0 to 60); blue = ideal topology, red = the best-fitting model; function phylo.d, package caper, Orme et al. (2013)]
Q6: Are high D values due to feature uniformity?
- if a feature has the value “0” 41 times and the value “1” once, most of the sister branches will have the same value
[Figure: uniformity plotted against estimated D; x-axis: Estimated D (ticks from -4 to 2), y-axis: Uniformity (ticks from 0.00 to 1.00)]
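The 41-to-1 example can be made concrete with a small Python sketch. It assumes that uniformity means the share of coded languages that carry the most frequent state; the talk does not define the measure, so both the definition and the function name are assumptions.

```python
from collections import Counter


def uniformity(states):
    """Share of coded languages that have the most frequent state (None = missing)."""
    coded = [s for s in states if s is not None]
    return Counter(coded).most_common(1)[0][1] / len(coded)


# The case from the slide: 41 languages coded 0 and one language coded 1.
skewed = [0] * 41 + [1]
print(round(uniformity(skewed), 3))  # 0.976
```

Under this definition a feature split evenly between two states scores 0.5, the lowest possible value for a binary feature.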
Q6: Estimated D < 0.5: core Transeurasian
[Figure: graph with the groups Tungusic, Turkic, Mongolic, Koreanic and Japonic; weight threshold = 0.00568; SplitsTree4, Huson and Bryant (2010)]
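Restricting the analysis to features with an estimated D below 0.5 amounts to a simple filter over the per-feature estimates. The Python sketch below shows that step; only the value for GB028 (0.08) comes from the talk, while the other feature names and values are hypothetical placeholders.

```python
# Estimated D per feature; only GB028 = 0.08 is taken from the slides.
d_values = {
    "GB028": 0.08,
    "FEATURE_A": 0.45,   # hypothetical
    "FEATURE_B": 0.90,   # hypothetical
    "FEATURE_C": -1.20,  # hypothetical
}

THRESHOLD = 0.5


def select_features(estimates, threshold):
    """Names of the features whose estimated D lies below the threshold."""
    return sorted(name for name, d in estimates.items() if d < threshold)


kept = select_features(d_values, THRESHOLD)
print(kept)  # ['FEATURE_A', 'FEATURE_C', 'GB028']
```

The retained feature names can then be used to subset the coding matrix before distances or trees are recomputed.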
Q7: Estimated D < 0.5: Neighbours
[Figure: graph with the groups Uralic, Nivkh, Ainu, Koreanic, Japonic, Tungusic, Turkic and Mongolic; weight threshold = 0.00568; SplitsTree4, Huson and Bryant (2010)]
Q7: Estimated D < 0.5: Neighbours
[Figure: Bayesian tree of all 42 languages with support values on the nodes; BEAST 2, Bouckaert et al. (2014)]
Conclusions
- topology based on structural features = topology based on basic vocabulary, if neighbours are excluded
- a “Transeurasian” area is not clearly definable due to the typological similarity of the Uralic languages
- Bayesian and neighbour-joining tree-building methods are useful for macro-typology
- phylogenetic comparative methods need to be applied with caution
- the phylogenetic signal can be veiled by contact, if extensive
- exclusion of “unstable” features did not provide a topology similar to the one based on basic vocabulary
Acknowledgements
I am grateful to Simon Greenhill, Annemarie Verkerk and Ron Hubler for their methodological support.
Thanks to Aleksandr Savelyev and Sofia Oskolskaya for support in data collection.
The research leading to these results has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 646612) granted to Martine Robbeets.
References I
Baele, Guy, Philippe Lemey, Trevor Bedford, Andrew Rambaut, Marc A Suchard, and Alexander V Alekseyenko. 2012. Improving the accuracy of demographic and molecular clock model comparison while accommodating phylogenetic uncertainty. Molecular biology and evolution 29:2157–2167.
Blažek, Václav, and Michal Schwarz. 2014. Jmenná deklinace v altajských jazycích. Linguistica Brunensia 62.
Bouckaert, Remco, Joseph Heled, Denise Kühnert, Tim Vaughan, Chieh-Hsi Wu, Dong Xie, Marc A Suchard, Andrew Rambaut, and Alexei J Drummond. 2014. BEAST 2: a software platform for Bayesian evolutionary analysis. PLoS computational biology 10:e1003537.
Dediu, Dan, and Stephen C Levinson. 2012. Abstract profiles of structural stability point to universal tendencies, family-specific factors, and ancient connections between languages. PloS One 7:e45198.
Dunn, Michael J., Angela Terrill, Ger P. Reesink, Robert A. Foley, and Stephen C. Levinson. 2005. Structural phylogenetics and the reconstruction of ancient language history. Science 309:2072–2075.
Fritz, Susanne A, and Andy Purvis. 2010. Selectivity in mammalian extinction risk and threat types: A new measure of phylogenetic signal strength in binary traits. Conservation Biology 24:1042–1051.
References II
Greenhill, Simon J, Chieh-Hsi Wu, Xia Hua, Michael Dunn, Stephen C Levinson, and Russell D Gray. 2017. Evolutionary dynamics of language systems. Proceedings of the National Academy of Sciences 201700388.
Hammarström, Harald, Hedvig Skirgård, Jeremy Collins, Hannah Haynie, Alena Witzlack, Stephen C. Levinson, Russell Gray, Jakob Lesage, Richard Kowalik, Robert Forkel, Linda Raabe, Suzanne van der Meer, Jana Winkler, Ger Reesink, Tessa Yuditha, Patience Epps, Luise Dorenbusch, Hilario de Sousa, Cheryl Akinyi Oluoch, Claire Bowern, Giada Falcone, Eloisa Ruppert, Martin Haspelmath, Nataliia Neshcheret, Karolin Abbas, Jesse Peacock, Hugo de Vos, Olga Krasnoukhova, Robert Borges, Stephanie Petit, Michael Dunn, Carolina Kipf, Jay Latarche, Nancy Bakker, Roberto Herrera, Johanna Nickel, Giulia Barbos, Kristin Sverredal, Tim Witte, Ruth Singer, Michael Dunn, Janina Klingenberg, Soren Danielsen, Swintha Pieper, and Damian Blasi. 2017. Grambank: A world-wide typological database. Electronic database under development. Max Planck Institute for the Science of Human History.
Huson, Daniel H., and David Bryant. 2010. SplitsTree4.
Janhunen, Juha A. 2012. Mongolian, volume 19. John Benjamins Publishing.
Nikolaeva, Irina, and Maria Tolskaya. 2001. A grammar of Udihe, volume 22. Walter de Gruyter.
References III
Orme, David, et al. 2013. The caper package: comparative analysis of phylogenetics and evolution in R. R package version 5:1–36.
Reesink, Ger, Ruth Singer, and Michael Dunn. 2009. Explaining the linguistic diversity of Sahul using population models. PLoS Biol 7:e1000241.
Robbeets, Martine. 2015. Diachrony of verb morphology: Japanese and the Transeurasian languages, volume 291. Walter de Gruyter GmbH & Co KG.
Robbeets, Martine, and Remco Bouckaert. 2018. Bayesian phylolinguistics reveals the internal structure of the Transeurasian family. Journal of Language Evolution 3:145–162.
Starostin, Sergei A, Anna Dybo, Oleg Mudrak, and Ilya Gruntov. 2003. Etymological dictionary of the Altaic languages. Brill Leiden.