Potentials and limits of haplotype trees in exploring
population structure and pathogenicity of mutations
Hans-Jürgen Bandelt (Hamburg)
17th Annual Meeting of the German Society of Human Genetics (Deutsche Gesellschaft für Humangenetik)
Heidelberg, 8–11 March 2006
[Figure: map of human mtDNA, from MITOMAP, with HVS-I (alias HVR1) labelled.]
The perception of evolution as seen through the lenses of laboratories constitutes an overlay of
two different processes:
Perceived evolution =
Natural evolution (of the genome)
+ Artificial evolution (in the lab)
Migrational processes (prehistory)
mtDNA and evolution
α: Natural evolution
[Figure: tree of mtDNA sequences (tips numbered 1 to 25, plus the rCRS), with mutated positions marked along the branches and a time axis in years (0 to 200,000).]
[Figure: tree of basal African mtDNA haplogroups with coding-region mutations marked along the branches. Labelled clades include L0 (L0a, L0a2, L0a2a, L0af, L0ak), L1 (L1c, L1c1’2, L1c2, L1c2a), L2 (L2d), L3 (L3a, L3b, L3c, L3d, L3e, L3e5, L3f, L3f1, L3h, L3i, L3x), L4, L5, L6, L7, M (M1), N and R.]
Clade names given on the slide:
L1’5 = L1’2’3’4’5’6’7
L2’5 = L2’3’4’5’6’7
L2’6 = L2’3’4’6’7
L4’6 = L3’4’6’7
L3’7 = L3’4’7
L3bd = L3bcd
L3ex = L3eix
ML tree of basal African mtDNA haplogroups; coding-region variation displayed
Torroni et al. (TIG, June 2006)
. Ethiopian samples
[Figure annotations: CRS; R and M; all mutations that distinguish haplogroups M and R (part of N); incorrect rooting]
One of the first views of the East Asian mtDNA phylogeny (Ozawa, Herz 1994)
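[Figure: phylogeny of haplogroup N and its subhaplogroup R, with branches grouped by region (West Eurasia, South Asia, East Asia, Oceania). Branch labels: N, N1, N5, N9, A, S, O, W, X; R, R1, R2, R5, R6, R7, R8, R9, R11, R30, R31, B, P, U, pre-HV, JT. Positions marked: 15607, 9140, 6755, 8404.]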
Palanichamy et al (Amer J Hum Genet, 2004)
Star-burst of autochthonous mtDNA lineages in Eurasia (haplogroup N and its subhaplogroup R)
... and a massive burst in haplogroup M, as e.g. seen in India:
Sun et al (Mol Biol Evol, March 2006)
An Out-of-Africa model based on mtDNA analysis
Kivisild et al (Springer-Verlag, April 2006)
[Figure: tree with nodes N, N1, N1a, N1a’I, N1b, I, N2, W, X, R, U, JT, R0, R0a, HV, HV0, HV0a, V, H, H1, H3.]
Nomenclature given on the slide:
R0 = pre-HV
R0a = (pre-HV)1
HV0 = pre-V
Sketch of the phylogeny of basal European mtDNA haplogroups
Torroni et al (TIG, June 2006)
Spatial frequency distributions of haplogroups H1, H3, V, and U5b reveal signature of post-LGM expansions
Torroni et al (TIG, June 2006)
Laboratory-specific processes (error and fraud)
mtDNA and evolution
β: Artificial evolution
Major sources of error in mtDNA sequence data
Artificial recombination: through contamination or sample mix-up (or targeting nuclear inserts of mtDNA)
Phantom mutations: sequencing errors at electrophoresis
Documentation errors: incurred by casual reading or writing
Impurifying selection is the driving force in artificial evolution
inasmuch as incorrect data are more flexible to interpret and can support sexy
stories — seemingly told by DNA — which are then disseminated by high-impact
factor journals (e.g. Science and Nature).
Worst case: mtDNA in cancer research (Salas et al, PLoS Medicine 2005)
Case of mtDNA sample mix-up, misinterpreted as somatic mutations;
data generated with MitoChip by Maitra et al (Genome Res, 2004)
Data re-analysis by Bandelt et al (J Med Genet, 2005)
[Figure: samples NDsq0168, NDsq0167, NDsq0178 and NDsq0015 placed in a tree rooted in L3 (nodes M, M7, M7a, M7a2, N, R, rCRS, R9, F, F1, F1a’c, F1a, F1a1, F1a1b), with mutations marked along the branches. Below it, a schematic of the mtDNA genome (positions 1, 3000, 6000, 9000, 12000, 15000, 16569; fragments A to F) in which segments of NDsq0168 and NDsq0167 are labelled M7a or F1a1b.]
A case of cross-over in the 672 human complete mtDNA sequences from Tanaka et al (2004)
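A mosaic sequence of this kind can be screened for by checking which haplogroup each diagnostic mutation points to as one moves along the genome. The following is a minimal sketch, not the procedure used in the cited re-analyses. The two motifs `hgA` and `hgB`, their positions, and the function names are made up for illustration and do not correspond to the real M7a or F1a1b motifs. The genome is treated as linear from position 1 to 16569 for simplicity.

```python
# Toy diagnostic motifs (hypothetical positions, NOT real haplogroup motifs).
TOY_MOTIFS = {
    "hgA": {1000, 4000, 7000, 12000, 15000},
    "hgB": {2000, 5000, 9000, 13000, 16000},
}


def segment_calls(observed, motifs):
    """Pair each observed diagnostic position with the one haplogroup whose
    motif contains it; positions shared by several motifs are skipped."""
    calls = []
    for pos in sorted(observed):
        owners = [hg for hg, sites in motifs.items() if pos in sites]
        if len(owners) == 1:
            calls.append((pos, owners[0]))
    return calls


def breakpoints(calls):
    """Return (left, right) position pairs between which the haplogroup
    call switches, i.e. candidate cross-over intervals."""
    return [(a[0], b[0]) for a, b in zip(calls, calls[1:]) if a[1] != b[1]]


# A made-up mosaic: first part of the genome looks like hgA, the rest like hgB.
mosaic = {1000, 4000, 7000, 13000, 16000}
print(breakpoints(segment_calls(mosaic, TOY_MOTIFS)))
```

A sequence that carries only one motif yields no breakpoints. A sequence that switches motif partway along the genome, as in the toy mosaic, yields an interval that should be checked for sample mix-up or contamination before any biological interpretation.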
Prime example of a phantom mutation (Brandstätter et al, Electrophoresis 2005)
[Figure: electropherogram from Nasidze and Stoneking (2001), labelled with the rCRS; generated 1997/1998 and presented for the first time in Stoneking and Nasidze (Ann Hum Genet, 2006).]
Phantom mutations can be found in excess in the HVS-I Caucasus data of
Nasidze and Stoneking (2001).
In view of additional problems, this may be regarded as the worst data set ever
published in the realm of molecular anthropology;
see Bandelt and Kivisild (Ann Hum Genet 2006) for data re-analysis
Sequences with phantom transitions at 16280-16281 in those Caucasus data

Code      Mutation (16000+)               Haplogroup
AR31      067 279G 280 281 355            HV1
AR483     069 126 145 280 281 367C        J
AZ2       280 281                         ?
AZ342     280 281 298                     pre-V
AZ6       154 168A 280 281 356 384        ?
CH444     111 214G 249 280 281 327 388    U1b
CH451     280 281 292                     ?
DAR23     129 223 278 280 281             ?
DAR36     258 280 281 384                 ?
KAB408    224 280 281 311                 K
This mutation pair has never been observed in >40,000 HVS-I sequences!
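A check for a suspicious co-occurring pair of this kind is easy to automate. The sketch below is only an illustration, assuming motifs are written as on the slide (positions minus 16000, with an optional base suffix for transversions). The names `CAUCASUS`, `positions` and `carries_pair` are introduced here and are not from any published tool.

```python
import re

# HVS-I motifs transcribed from the slide's table (positions minus 16000).
CAUCASUS = {
    "AR31": "067 279G 280 281 355",
    "AR483": "069 126 145 280 281 367C",
    "AZ2": "280 281",
    "AZ342": "280 281 298",
    "AZ6": "154 168A 280 281 356 384",
    "CH444": "111 214G 249 280 281 327 388",
    "CH451": "280 281 292",
    "DAR23": "129 223 278 280 281",
    "DAR36": "258 280 281 384",
    "KAB408": "224 280 281 311",
}


def positions(motif):
    """Convert a motif string into a set of absolute positions.
    A trailing base letter (e.g. '279G') is ignored here."""
    return {16000 + int(re.match(r"\d+", tok).group()) for tok in motif.split()}


def carries_pair(motif, pair=(16280, 16281)):
    """True if the motif contains both positions of the pair."""
    return set(pair) <= positions(motif)


flagged = [code for code, motif in CAUCASUS.items() if carries_pair(motif)]
print(len(flagged), "of", len(CAUCASUS), "listed sequences carry 16280 and 16281")
```

The same function can be run over any other HVS-I data set to see whether the pair turns up outside the lab in question. Sequences from seemingly unrelated haplogroups that share such a pair point to an artefact rather than to common ancestry.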
Electropherogram presented by Stoneking and Nasidze (Ann Hum Genet, 2006)
rCRS
Phantom mutations in the HVS-I data of Plaza et al (Ann Hum Genet, 2003) (267 samples)

Sample       Mutation (16000+)                              Haplogroup
Algeria      279N 285N                                      ?
Andalusia    129 182C 183C 189 223 249 311 359 371          M1
Andalusia    129 281                                        ?
Andalusia    281                                            ?
Catalonia    093 192 270 281 290A 304 311                   U5b
Catalonia    224 281 311                                    K
Morocco      093 224 242 311 371                            K
Morocco      124 223 284C 285T 300 319 374T                 L2d
Morocco      126 187 189 223 264 270 278 293 311 371 374    L1b
Morocco      126 284C 292 294                               T2
Morocco      183C 189 223 278 382G                          X
Morocco      189 192 270 369T                               U5b
Saharawi     093 172 185 223 327 382G                       L3e1
Saharawi     172 281 311                                    U6?
Saharawi     189 382G                                       ?
Comparison with 1624 complete sequences stored in
the mtDB database
Variation in 16279-16285:
Only 20 transitional variants at 16284
Variation in 16369-16389:
Only 1+1+6 transitional variants at 16371, 16380, and 16381
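The window comparison can be reproduced on the listed Plaza et al sequences with a few lines of code. This is a sketch under the assumption that the motifs are transcribed correctly from the slide. The names `PLAZA`, `WINDOWS` and `hits` are illustrative, and the two windows are those named above (16279-16285 and 16369-16389).

```python
import re

# (population, HVS-I motif as positions minus 16000), transcribed from the slide.
PLAZA = [
    ("Algeria", "279N 285N"),
    ("Andalusia", "129 182C 183C 189 223 249 311 359 371"),
    ("Andalusia", "129 281"),
    ("Andalusia", "281"),
    ("Catalonia", "093 192 270 281 290A 304 311"),
    ("Catalonia", "224 281 311"),
    ("Morocco", "093 224 242 311 371"),
    ("Morocco", "124 223 284C 285T 300 319 374T"),
    ("Morocco", "126 187 189 223 264 270 278 293 311 371 374"),
    ("Morocco", "126 284C 292 294"),
    ("Morocco", "183C 189 223 278 382G"),
    ("Morocco", "189 192 270 369T"),
    ("Saharawi", "093 172 185 223 327 382G"),
    ("Saharawi", "172 281 311"),
    ("Saharawi", "189 382G"),
]

WINDOWS = {"16279-16285": (16279, 16285), "16369-16389": (16369, 16389)}


def hits(motif, lo, hi):
    """Return the motif entries whose absolute position lies in [lo, hi]."""
    found = []
    for tok in motif.split():
        pos = 16000 + int(re.match(r"\d+", tok).group())
        if lo <= pos <= hi:
            found.append(tok)
    return found


# Number of listed sequences with at least one variant in each window.
counts = {
    name: sum(1 for _, motif in PLAZA if hits(motif, lo, hi))
    for name, (lo, hi) in WINDOWS.items()
}
print(counts)
```

On the table as transcribed here, each window is hit by 8 of the 15 listed sequences. These counts, taken from a data set of 267 samples, can then be set against the handful of variants seen in the same windows among the 1624 complete sequences in mtDB.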
[Figure: tree of complete mtDNA sequences, rooted in L3, with mutations marked along the branches. Backbone as labelled: L3 to N (15301, 10873, 10398, 9540, 8701), N to R (16223, 12705), R to pre-HV (11719, 73), pre-HV to HV (14766), HV to H (7028, 2706), H to H2 and the rCRS (4769, 1438, 15326, 8860, 750, 315+C, 263); L3 to M (15043, 14783, 10400, 489). Labelled haplogroups include M8, C, Z, D, D4, D4a, D4a1, D4b, D4b2, D4b2b, M10, M10a, M11, M11a, M11b, A, N9, N9a, N9a1, R9, F, F1, F3, F3b, B, B5, B5a, B5a2. Sequences are numbered 1 to 18 and grouped by source (Qu2005, Li 2004, Li 2005, Zhao2004, Wang2005, Yuan 2005); further sample labels include BJ101 to BJ106, WZ4 to WZ6, WH6967, WH6980, QJ383, Miao271, GD7817, LN7710, SD10324, HNsq0152, KAsq0089, #078, #081 and #101.]
Re-evaluation of the mtDNA data from the lab of Min-Xin Guan
missing mutations
misscored mutations in red
Yao et al (Hum Genet, 2006)
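Missing mutations of this kind can be found by comparing a reported variant list with the mutations expected along the path from the rCRS to the claimed haplogroup. The sketch below uses the backbone positions marked on the tree for a haplogroup M sequence. The grouping of positions by branch is an assumption made for illustration, indels such as 315+C are left out, and `hypothetical_sample` is invented test data, not a published sequence.

```python
# Differences from the rCRS expected in any haplogroup M sequence,
# grouped by branch (grouping is an assumed reading of the tree).
BACKBONE = {
    "H2 to rCRS": [263, 750, 1438, 4769, 8860, 15326],
    "H": [2706, 7028],
    "HV": [14766],
    "pre-HV": [73, 11719],
    "R": [12705, 16223],
    "N": [8701, 9540, 10398, 10873, 15301],
    "M": [489, 10400, 14783, 15043],
}

EXPECTED = sorted(p for sites in BACKBONE.values() for p in sites)


def missing_mutations(reported, branches=BACKBONE):
    """Return, per branch, the expected positions absent from the report."""
    seen = set(reported)
    gaps = {}
    for branch, sites in branches.items():
        absent = [s for s in sites if s not in seen]
        if absent:
            gaps[branch] = absent
    return gaps


# Invented example: a report that omits 8860 and 10400.
hypothetical_sample = [p for p in EXPECTED if p not in (8860, 10400)]
print(missing_mutations(hypothetical_sample))
```

A gap on the backbone is not proof of error, since back mutations do occur, but several gaps in one sequence, or the same gap across many sequences from one lab, suggest that the mutations were overlooked.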
Strategies of authors to deal with errors
1st: Publishing a corrigendum [rare event]
2nd: No correction, but avoiding similar errors in future work [common practice]
3rd: No action, and committing the same errors as before [e.g. as Min-Xin Guan and colleagues do]
4th: Fraudulent action: performing fake analyses and giving false statements [as done by Mark Stoneking and Ivane Nasidze in the Ann Hum Genet]
... only L strand, no H strand information shown!
Stoneking and Nasidze (2006)
Human Mitochondrial DNA and the Evolution of Homo sapiens
Series: Nucleic Acids and Molecular Biology, Vol. 18
Volume package: Human Mitochondrial DNA
Bandelt, Hans-Jürgen; Richards, Martin; Macaulay, Vincent (Eds.)
2006, approx. 250 p., 31 illus., 2 in colour, hardcover
ISBN: 3-540-31788-0
Springer-Verlag
Due: April 2006