NCPA: a critical analysis - INRA
NCPA: a critical analysis
Banyuls Workshop, June 2006
R. Petit
Nested Clade Phylogeographic Analysis (NCPA)
[Figure: citations per year (1996-2006; axis 0-120) for NCPA and Geodis.]
[Figure panels: (a) Data matrix (haplotypes by loci); (b) Nested cladogram; (c) Nested clade analysis.]
Principle of nested clade analysis, after Cruzan and Templeton (2000). (a) Character matrix relating five haplotypes (A-E) characterized by their allelic state at four linked loci (1-4). (b) Nested cladogram. Interior haplotypes, assumed to be more ancestral, are shown as white circles, while more recently derived haplotypes are shown as gray circles. (c) Hypothetical example of geographic association for clade 1-1. Squares represent the geographic location of populations (half-hatched squares represent sites with both haplotypes). The geographic distribution of haplotypes A and B is quantified with two parameters: (i) the level of dispersion (clade distance) around the center of the haplotype, indicated by the dotted circles Dc(A) and Dc(B); (ii) the level of displacement (nested clade distance, Dn(A) and Dn(B)) of the center of the haplotype relative to the geographic center of the whole clade.
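The two distances in panel (c) can be sketched in a few lines of Python. This is a minimal illustration, not the Geodis implementation: it assumes planar (x, y) coordinates and Euclidean distances, and it takes Dn as the mean distance of the carriers of a haplotype to the geographic center of the whole nesting clade, which is one common way of measuring the displacement described in the caption. The function names and the sample data are hypothetical.

```python
from math import hypot

def centroid(points):
    """Arithmetic mean of a list of (x, y) coordinates."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def mean_distance(points, center):
    """Average Euclidean distance from each point to a given center."""
    return sum(hypot(x - center[0], y - center[1]) for x, y in points) / len(points)

def clade_distances(samples, haplotype):
    """Return (Dc, Dn) for one haplotype.

    samples: list of (haplotype, (x, y)), one entry per sampled individual,
    all belonging to the same nesting clade (an assumption of this sketch).
    """
    own = [xy for h, xy in samples if h == haplotype]
    everyone = [xy for _, xy in samples]
    dc = mean_distance(own, centroid(own))       # dispersion around own center
    dn = mean_distance(own, centroid(everyone))  # distance to center of the nesting clade
    return dc, dn

# Hypothetical clade 1-1 with haplotypes A and B (made-up coordinates).
samples = [
    ("A", (0.0, 0.0)), ("A", (2.0, 0.0)),
    ("B", (10.0, 0.0)), ("B", (10.0, 0.0)),
]
dc_a, dn_a = clade_distances(samples, "A")
dc_b, dn_b = clade_distances(samples, "B")
```

In this toy data set B is found at a single site, so its Dc is zero, while both haplotypes sit equally far from the center of the whole clade.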
Criticisms
• A sweeping criticism is that NCPA is ad hoc, non-statistical and not amenable to falsification (Knowles & Maddison 2002, Knowles 2004). This emerging debate has resulted in the coining of the term “Statistical Phylogeography” for alternative model-based approaches.
• The most unorthodox component of NCPA is the use of a key to draw inferences from the correlation statistics. Such an approach can be considered a posteriori as opposed to a priori hypothesis testing, the common mode of natural scientists.
• A major concern of Knowles (2004) is an underappreciation of the stochastic variance inherent in gene genealogies (not reflected in a single haplotype network), which will lead to inaccurate or misleading interpretations and thus must be incorporated into the analytical approach.
Current perspectives in phylogeography and the significance of south European refugia in the creation and maintenance of European Biodiversity
Steve Weiss and Nuno Ferrand
Defenses
• Despite different schools of thought on how to approach phylogeographic analysis, a plea is made to maintain pluralism when dealing with such complex, multi-disciplinary and stochastically influenced data sets.
• Alternative approaches, such as Mantel tests or other regression techniques, are neither nested nor contain a temporal component. Thus, disregarding the third and final step of NCPA (the use of an inference key), the notion that it is non-statistical is misleading to those unfamiliar with the entire approach.
• The use of an inference key is not ad hoc.
A serious problem with NCPA
Reductio ad absurdum
[Figure: distribution of variants 10-11-12.]
Why is a permutation of individuals (haplotypes) not appropriate?
• Different processes operate within populations and at the scale of the range (e.g. patchy structure linked to long-distance (LD) colonization, changes of scale…)
• Pseudoreplication (the same issue as the spatial structuring of homozygous genotypes within a population)
Intrinsic difficulties in spatial statistics when comparing the geographic co-distribution of two variables
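To make the criticized procedure concrete, here is a minimal sketch of an individual-level permutation test on the clade distance Dc. It is an illustration under stated assumptions, not the Geodis code: coordinates are planar, distances are Euclidean, and the function names, sample data, number of permutations and seed are all hypothetical.

```python
import random
from math import hypot

def dc(points):
    """Mean Euclidean distance of points to their own centroid."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return sum(hypot(x - cx, y - cy) for x, y in points) / len(points)

def permutation_p_small_dc(labels, coords, target, n_perm=999, seed=0):
    """One-sided p-value for 'Dc of target is smaller than expected by chance'.

    Haplotype labels are shuffled among individuals, so every individual is
    treated as an independent observation, even when several individuals
    were sampled in the same population.
    """
    rng = random.Random(seed)
    observed = dc([xy for h, xy in zip(labels, coords) if h == target])
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        permuted = dc([xy for h, xy in zip(shuffled, coords) if h == target])
        if permuted <= observed + 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical data: ten carriers of A sampled at one site, ten carriers of B
# sampled at ten different sites.
labels = ["A"] * 10 + ["B"] * 10
coords = [(0.0, 0.0)] * 10 + [(float(i), 5.0) for i in range(1, 11)]
p_value = permutation_p_small_dc(labels, coords, "A")
```

The ten carriers of A all come from a single site, yet the shuffle counts them as ten independent draws, so the test returns a very small p-value from what is effectively one population-level observation. This is the pseudoreplication concern raised in the list above.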
What next?
• Spatially explicit + coalescent
• SPLATCHE is a program that incorporates the influence of the environment into the simulation of the migration of a given species from one origin. In a second phase, the molecular genetic diversity of one or several samples drawn from the simulated species can be generated.
http://cmpg.unibe.ch/software/splatche/
Mono- versus multilocus
• Viewing the last 12 issues of Molecular Ecology (Oct. 2005-Nov. 2004), there were 42 manuscripts with the term “phylogeography” in the title, among which 34 (81%) were based exclusively on organelle genes (mtDNA or cpDNA).
• There is no way to predict the reliability of such single-locus data sets. If theory is followed, many gene genealogies must be sampled in order to have some degree of confidence that the history of an organismal lineage is being reasonably recovered.
• The variance in coalescence trees reveals that the most efficient way of increasing the accuracy of inferences drawn from gene genealogies is to increase the number of independent (i.e. unlinked) loci sampled.
• The best phylogeographic studies sample the entire range of the organism, and using some prior knowledge of an organism’s diversity or predictions of how landscape fragmentation may be molding extant genetic structure, an attempt is made to provide sample coverage of potential major demes.
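The point above about sampling more unlinked loci can be illustrated with a small simulation. This is a sketch under simplifying assumptions (standard neutral coalescent, constant population size, no recombination within loci, complete independence between loci), with time measured in coalescent units. The function names, sample size, replicate count and seed are hypothetical choices.

```python
import random
from statistics import mean, pvariance

def tmrca(n, rng):
    """Time to the most recent common ancestor of n lineages.

    While k lineages remain, the waiting time to the next coalescence is
    exponential with rate k(k-1)/2 (in coalescent time units).
    """
    return sum(rng.expovariate(k * (k - 1) / 2) for k in range(2, n + 1))

def variance_of_locus_average(n_samples, n_loci, n_reps, rng):
    """Variance, across replicate studies, of the TMRCA averaged over loci."""
    averages = [
        mean(tmrca(n_samples, rng) for _ in range(n_loci))
        for _ in range(n_reps)
    ]
    return pvariance(averages)

rng = random.Random(1)
var_1 = variance_of_locus_average(10, 1, 2000, rng)    # single-locus studies
var_10 = variance_of_locus_average(10, 10, 2000, rng)  # ten unlinked loci
```

Because the loci are independent in this model, the variance of the multilocus average shrinks roughly in proportion to the number of loci, whereas a single genealogy carries the full stochastic variance of the coalescent.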