15 lines representing a bull Traditional statistics Assumes data is independent Comparative methods.
Andrew [email protected] of Reading
15 lines representing a bull
Traditional statistics: assumes data is independent
Comparative methods
English: Fish
Danish: Fisk
Dutch: Visch
Fish / Ryba
Czech: Ryba
Russian: Ryba
Bulgarian: Riba
23 other languages / 34 other languages
1 to 35
Average 17
1: “Who”, “Three”
35: “Person”, “Dirty”
English  here      sea (A)          water   when
German   hier      see, meer (A,B)  wasser  wenn
French   ici       mer (B)          eau     quand
Italian  qui, qua  mare (B)         acqua   quando
Greek    edo       thalasa (C)      nero    pote
Hittite  ka        aruna- (D)       watar   kuwapi
Languages  Meanings
           sea (A)  meer (B)  thalasa (C)  aruna- (D)
English    1        0         0            0
German     1        1         0            0
French     0        1         0            0
Italian    0        1         0            0
Greek      0        0         1            0
Hittite    0        0         0            1
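The presence/absence table above can be produced mechanically from the cognate class labels. The following is a minimal sketch, assuming the classes are supplied as a mapping from language to a set of class labels; the function name `binary_code` and the data layout are illustrative choices, not something given on the slides.

```python
# Cognate classes for the meaning "sea", copied from the table above.
SEA_CLASSES = {
    "English": {"A"},
    "German": {"A", "B"},
    "French": {"B"},
    "Italian": {"B"},
    "Greek": {"C"},
    "Hittite": {"D"},
}


def binary_code(classes_by_language):
    """Return (sorted class labels, {language: [0 or 1 per class]}).

    Hypothetical helper: 1 means the language has a word in that
    cognate class, 0 means it does not.
    """
    labels = sorted(set().union(*classes_by_language.values()))
    matrix = {
        lang: [1 if label in classes else 0 for label in labels]
        for lang, classes in classes_by_language.items()
    }
    return labels, matrix


labels, matrix = binary_code(SEA_CLASSES)
print("Language", *labels)
for lang, row in matrix.items():
    print(lang, *row)
```

Each column then becomes one binary character, so a language with two words for the same meaning (German here) simply scores 1 in two columns.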
States: 0 = Non-cognate, 1 = Cognate
Transition rates: Q01 (0 to 1), Q10 (1 to 0)
Time: 1000 years
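The two rates Q01 and Q10 are the ingredients of a two-state continuous-time Markov model. As a rough sketch, assuming that standard model is what the slide depicts, the probability of moving between states 0 and 1 over a time span t has a closed form. The function name and the rate values below are hypothetical and chosen only for illustration.

```python
import math


def transition_probs(q01, q10, t):
    """Return the 2x2 matrix P(t) of a two-state continuous-time Markov chain.

    State 0 = non-cognate, state 1 = cognate.
    q01 is the rate of 0 -> 1 changes, q10 the rate of 1 -> 0 changes,
    and t is elapsed time in the same units the rates are expressed in.
    """
    total = q01 + q10
    decay = math.exp(-total * t)
    p01 = q01 / total * (1.0 - decay)  # start in 0, end in 1
    p10 = q10 / total * (1.0 - decay)  # start in 1, end in 0
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]


# Hypothetical rates, not taken from the slides.
P = transition_probs(q01=0.3, q10=0.1, t=1.0)
for row in P:
    print([round(p, 4) for p in row])
```

Each row of the matrix sums to 1, and for long time spans the rows converge to the stationary frequencies q01 / (q01 + q10) and q10 / (q01 + q10).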
Results = Data + Method
Most probable; Random tree; -58204 log units; 4.1 x 10^14107
Infinite number of poor trees
Tree clades: Outgroup, Greek, Indo-Iranian, Slavic, Germanic, Celtic, Romance
“Name”, 3 cognate classes
Class A: Gypsy (Alav), Persian (Esm)
Class B: Latvian (Vards), Lithuanian (Vardas)
Class C: all the rest, e.g. Hindi (Nam), Greek (Onoma), Italian (Nome)
Class A, Class B, Class C
Transitions: B to A, A to B, C to A, A to C, B to C, C to B
B to A, C to B, etc.: the estimated instantaneous transition rate
Too many parameters, not enough data
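The parameter problem is easy to quantify. A model that allows a separate rate for every ordered pair of cognate classes needs k(k - 1) rates for k classes. The short sketch below (the function name is hypothetical) counts them for 2 classes, for the 3 classes of “Name”, and for a meaning with 35 classes.

```python
def n_transition_rates(k):
    """Number of distinct ordered transitions among k cognate classes."""
    return k * (k - 1)


for k in (2, 3, 35):
    print(k, "classes:", n_transition_rates(k), "rates")
# 2 classes: 2 rates
# 3 classes: 6 rates
# 35 classes: 1190 rates
```

With only one observed word per language for each meaning, hundreds of rates cannot be estimated reliably, whereas a two-class coding needs just two.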
2 cognate classes
Slow rate Fast rate
Class 1
Class 2
“Red”, “Salt”
“Five”
Mean = 3.05 ± 1.82, Median = 2.74, Min. = 0.09, Max. = 9.27
100-fold difference
Mean rates for the 200 words
Slow: ‘two’, ‘who’, ‘one’, ‘night’, ‘to die’
Fast: ‘dirty’, ‘to turn’, ‘to stab’
Word half-life: 50% chance of the word being replaced by a non-cognate form
Years
Mean 5260
Median 2530
Min 750
Max 76530
Based on Indo-European (IE) being 8000 years old
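The half-lives follow from the replacement rates if replacement is treated as a constant-rate process: the chance that a word survives to time t is exp(-rate x t), so the half-life is ln(2) / rate. The sketch below assumes the rates quoted earlier (median 2.74, maximum 9.27) are expressed per 10,000 years; that unit is an assumption, and the function name is hypothetical.

```python
import math

# Assumption: replacement rates are expressed per 10,000 years.
YEARS_PER_RATE_UNIT = 10_000


def half_life_years(rate):
    """Time after which a word has a 50% chance of having been replaced."""
    return math.log(2) / rate * YEARS_PER_RATE_UNIT


for label, rate in (("median rate", 2.74), ("fastest rate", 9.27)):
    print(label, rate, "->", round(half_life_years(rate), -1), "years")
```

Under that assumption the median rate gives roughly 2530 years and the fastest rate roughly 750 years, in line with the median and minimum half-lives listed above. The mean half-life cannot be recovered from the mean rate in the same way, because the average of ln(2) / rate is not ln(2) divided by the average rate.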
I-E tree showing variation in rates of lexical replacement, per 10k years
“One” 0.43 “Ear” 0.88 “Sand” 4.5
Clade labels: Romance, Germanic, Greek, Slavic, Indo-Iranian
Spoken word frequency, British National Corpus
Histogram axes: count (0 to 350) against log10 of spoken word frequency per million (1 to 4.5)
N = 4840 words, mean = 194, geometric mean = 35.94, median = 25
Distribution of frequency of word use (20-100 million words)
Most words used < 100 times per million
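The gap between the mean (194) and the geometric mean (35.94) is what a strongly right-skewed distribution produces: a few very frequent words pull the arithmetic mean far above the typical word. The sketch below illustrates the effect with a made-up list of frequencies; the numbers are hypothetical and are not taken from the British National Corpus.

```python
import statistics

# Hypothetical frequencies per million words, for illustration only.
freqs = [10, 20, 25, 40, 1000]

mean = statistics.mean(freqs)
geo_mean = statistics.geometric_mean(freqs)  # exp of the mean of the logs
median = statistics.median(freqs)

print("mean:", mean)
print("geometric mean:", round(geo_mean, 1))
print("median:", median)
```

The geometric mean is the arithmetic mean taken on the log scale, which is why it sits much closer to the median when frequencies are plotted as log10 values.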
r = 0.87, r = 0.88, r = 0.87
Frequency of use is very stable throughout IE
Frequency vs rate of lexical evolution
r = -0.37, r = -0.35, r = -0.41, r = -0.32
Parts of speech: conjunctions, prepositions, adjectives, verbs, nouns, special adverbs, pronouns, numbers
R2 = 0.50, R2 = 0.48, R2 = 0.48, R2 = 0.48
Numbers, pronouns, special adverbs
Stronger selection?
Attribute | Genetic systems | Languages
discrete units | nucleotides, genes, individuals | words and other linguistic elements
replication | transcription | teaching, learning, imitation
dominant mode(s) of inheritance | parent-offspring | parent-offspring, generational (including teaching)
horizontal transmission | many mechanisms (e.g., hybridisation, viruses, transposons, insects) | borrowing
mutation | many mechanisms (e.g., slippage, unequal crossing over, point mutations and faulty repair) | mistakes, vowel shifts, innovation
selection of favoured variants | fitness differences among alleles | societal trends
Some similarities between linguistic and genetic systems