Human genetic diversity. ESHG Barcelona


Page 1: Human genetic diversity. ESHG Barcelona

Human genetic population structure: patterns and underlying processes

Guido Barbujani

Dipartimento di Biologia ed Evoluzione, Università di Ferrara

[email protected]

Page 2: Human genetic diversity. ESHG Barcelona

• Our genome is very small
• Our genome is very large
• Our genomes are very similar
• Our genomes are very different

Human genetic population structure: patterns and underlying processes

Page 3: Human genetic diversity. ESHG Barcelona

There are clear morphological differences (“types”)

Page 4: Human genetic diversity. ESHG Barcelona

But each group harbours extensive diversity

Page 5: Human genetic diversity. ESHG Barcelona

Analyses of morphological traits led to inconsistent lists of races

Linnaeus (1758) 4 (europeus, asiaticus, afer, americanus) [+2]
Blumenbach (1795) 5 (same, + australianus)
Cuvier (1828) 3 (caucasoid, negroid, mongoloid)
Huxley (1875) 4 (mongoloid, xanthocroid, australoid, negroid)
Deniker (1900) 29
Weinert (1935) 17
Von Eickstedt (1937) 38
Museum of Nat. Hist. Chicago (1933) 107
Coon (1967) 5 (negroid, capoid, caucasoid, mongoloid, australoid)
Risch (2002) 5 (different in different articles)

According to Molnar (1975) 20th century lists include from 3 to 200 items

Page 6: Human genetic diversity. ESHG Barcelona

Skin colour

Stature

Variation is continuous and discordant. It is possible to cluster people on the basis of any trait, but the resulting classification does not allow one to predict clustering for other traits

The trouble with morphological traits

Page 7: Human genetic diversity. ESHG Barcelona

1. Estimating variances from sequence comparisons

-TACGAACATCAGGC-
-TATGAACATCAGGC-
-TATGAACATCGGGC-
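As a minimal sketch of what a sequence comparison involves, the snippet below counts the sites at which each pair of the three aligned sequences on this slide differs, then averages those counts. The slide does not say which diversity estimator was used, so the mean number of pairwise differences is an assumption chosen for illustration.

```python
from itertools import combinations

# The three aligned sequences shown on the slide.
sequences = ["TACGAACATCAGGC", "TATGAACATCAGGC", "TATGAACATCGGGC"]


def pairwise_differences(a, b):
    """Count the sites at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


# Differences for every pair, keyed by the indices of the two sequences.
pair_counts = {
    (i, j): pairwise_differences(sequences[i], sequences[j])
    for i, j in combinations(range(len(sequences)), 2)
}
mean_differences = sum(pair_counts.values()) / len(pair_counts)

print(pair_counts)
print(mean_differences)
```

The first and second sequences differ at one site, the second and third at one site, and the first and third at two, so the mean is 4/3 differences per pair.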

Page 8: Human genetic diversity. ESHG Barcelona

Independent studies of genetic variances yield very similar results: 85, 5, 10

Study                             Loci      Within        Among         Among
                                            populations   populations   continents
Lewontin (1972)                   17        85%           8%            6%
Latter (1973)                     18        86%           5%            9%
Barbujani et al. (1997)           109       85%           5%            10%
Jorde et al. (2000)               100       85%           2%            13%
Romualdi et al. (2002)            32        83%           8%            9%
Rosenberg et al. (2002)           377       93%           3%            4%
Excoffier & Hamilton (2003)       377       88%           3%            9%
Ramachandran et al. (2005)        17        90%           5%            5%
Bastos-Rodrigues et al. (2006)    40        86%           2%            12%
Li et al. (2008)                  650 000   89%           2%            9%

MEDIAN                                      85%           5%            10%
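To make the three columns concrete, here is a toy apportionment of diversity at a single biallelic locus. It is only a sketch: the allele frequencies and the names `continent_A` and `continent_B` are invented, populations are weighted equally, and expected heterozygosity is an assumed measure. The studies cited on this slide each used their own estimators, which are not reproduced here.

```python
def heterozygosity(p):
    """Expected heterozygosity at a biallelic locus with allele frequency p."""
    return 2 * p * (1 - p)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Invented allele frequencies: continent -> one frequency per population.
frequencies = {
    "continent_A": [0.50, 0.60],
    "continent_B": [0.30, 0.40],
}

# Pooled frequency of each continent (equal population weights assumed).
continent_means = [mean(pops) for pops in frequencies.values()]

h_within = mean(heterozygosity(p) for pops in frequencies.values() for p in pops)
h_continent = mean(heterozygosity(p) for p in continent_means)
h_total = heterozygosity(mean(continent_means))

# Each level's share of the total diversity; the three shares sum to 1.
shares = {
    "within populations": h_within / h_total,
    "among populations": (h_continent - h_within) / h_total,
    "among continents": (h_total - h_continent) / h_total,
}

for level, share in shares.items():
    print(f"{level}: {share:.1%}")
```

Even with frequencies ranging from 0.30 to 0.60 across these invented populations, roughly 95% of the diversity sits within populations, which is the kind of pattern the table reports.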

Page 9: Human genetic diversity. ESHG Barcelona

What does it mean, in practice?

[Figure labels: 100% (three times), 85% (three times)]

Members of our community are only slightly less different from us than members of distant populations

Page 10: Human genetic diversity. ESHG Barcelona

Mind the numbers

Humans and chimps share >98% of their genomes

Of that 2% difference, 1.9% consists of fixed differences: sites that differ between the two species but do not vary within either

The remaining fraction, 0.1%, contains all human genomic variation

85% of that 0.1% represents differences among members of the same population

The differences among the main continental groups represent 10% of 0.1% of the total, that is, 0.01%

But 0.01% of <3 billion DNA sites means <300 000 variable sites
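The arithmetic on this slide can be checked in a few lines. The sketch below uses only the slide's own figures (3 billion sites as the upper bound, 0.1% variable, 85% within populations, 10% among continents). The within-population count is not quoted on the slide; it follows from the same numbers. Exact fractions are used so that no rounding error creeps in.

```python
from fractions import Fraction

genome_sites = 3_000_000_000            # the slide's upper bound ("<3 billion")
variable = Fraction(1, 1000)            # 0.1% of the genome varies among humans
within_populations = Fraction(85, 100)  # share of that variation within populations
among_continents = Fraction(10, 100)    # share of that variation among continents

# Among-continent variation as a share of the whole genome: 10% of 0.1%.
among_continents_fraction = variable * among_continents
among_continents_sites = genome_sites * among_continents_fraction
within_population_sites = genome_sites * variable * within_populations

print(f"{float(among_continents_fraction):.2%} of the genome")
print(f"{int(among_continents_sites):,} sites differing among continents")
print(f"{int(within_population_sites):,} sites varying within populations")
```

With these inputs the among-continent figure is 300 000 sites at most, against up to 2 550 000 sites that vary within populations.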

Page 11: Human genetic diversity. ESHG Barcelona

2. Clustering genotypes or haplotypes

Rosenberg et al., 2002

Page 12: Human genetic diversity. ESHG Barcelona

Clustering genotypes by algorithms identifying structure

K=3

K=4

Page 13: Human genetic diversity. ESHG Barcelona

SNPs

Haplotypes

CNV

Jakobsson et al. 2008

Structure inferred from SNPs and haplotypes differs from that inferred from Copy Number Variation

Page 14: Human genetic diversity. ESHG Barcelona

Genes, as well as morphology, suggest inconsistent clusterings of genotypes

Africa

Asia, Europe, Australia, Americas

Americas

Africa, Asia, Americas, Oceania

Asia Europe

Africa, Asia,EuropeOceania

Y chromosome: Romualdi et al. 2002

Alu insertions: Romualdi et al. 2002

X chromosome: Wilson et al. 2001

Europe,Ethiopia

S. Africa N. Guinea

Asia

Page 15: Human genetic diversity. ESHG Barcelona

Genes, as well as morphology, suggest inconsistent clusterings of genotypes

377 STR loci: Rosenberg et al. 2005

Melanesia Eurasia N Africa N America

Maya

S. Africa

377 STR loci: Barbujani and Belle 2006

E Africa

C Africa

Piapoco

Suruì

Karitiana

Kalash

W. Eurasia

E. Asia

Africa

Americas

Oceania

Page 16: Human genetic diversity. ESHG Barcelona

Sampling has a large effect on the apparent structuring

Serre and Pääbo 2004

Page 17: Human genetic diversity. ESHG Barcelona

Variation is continuous and discordant. It is possible to cluster people on the basis of any trait, but the resulting classification does not allow one to predict clustering for other traits

The trouble with genetic traits

MCPH D-haplogroup

NAT2 acetylator

Page 18: Human genetic diversity. ESHG Barcelona

Sampling points in the geographic space

3. Identifying genomic boundaries

Page 19: Human genetic diversity. ESHG Barcelona

The sampling points are connected by edges

Page 20: Human genetic diversity. ESHG Barcelona

[Figure: the network of sampling points, with a genetic distance d labelling each edge]

Genetic distances between neighbours are associated with each edge of the reticulation

Page 21: Human genetic diversity. ESHG Barcelona

[Figure: the network of sampling points, with a genetic distance d labelling each edge]

Boundaries are traced perpendicular to the edge showing the highest genetic distance and extended through the adjacent edges

Page 22: Human genetic diversity. ESHG Barcelona

[Figure: the same network, with a distance d on each edge and the first boundary, labelled 1, traced across it]

A boundary is completed when it exits the reticulation or closes on a preexisting boundary
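The tracing rule described on the last few slides can be sketched as a greedy walk across a triangulated network. The slides do not name the algorithm or say how the network is built, so everything specific here is an assumption: the six-point mesh, the distances and the helper names are invented, and the greedy rule is modelled on what is usually called Monmonier's maximum-difference algorithm.

```python
from collections import defaultdict


def edges_of(triangle):
    """Return the three edges of a triangle as unordered vertex pairs."""
    a, b, c = triangle
    return [frozenset((a, b)), frozenset((b, c)), frozenset((a, c))]


def trace_boundary(triangles, distance, crossed=frozenset()):
    """Trace one boundary and return the edges it crosses, in order."""
    edge_triangles = defaultdict(list)          # edge -> triangles containing it
    for triangle in triangles:
        for edge in edges_of(triangle):
            edge_triangles[edge].append(triangle)

    # Start on the not-yet-crossed edge with the highest genetic distance.
    start = max((e for e in edge_triangles if e not in crossed),
                key=lambda e: distance[e])
    used = set(crossed) | {start}

    def extend(edge, triangle):
        steps = []
        while triangle is not None:
            # Of the two other edges of this triangle, cross the more distant one.
            nxt = max((e for e in edges_of(triangle) if e != edge),
                      key=lambda e: distance[e])
            if nxt in used:                     # closes on an existing boundary
                break
            steps.append(nxt)
            used.add(nxt)
            beyond = [t for t in edge_triangles[nxt] if t != triangle]
            # No triangle beyond the edge means the boundary exits the network.
            edge, triangle = nxt, (beyond[0] if beyond else None)
        return steps

    sides = edge_triangles[start]
    forward = extend(start, sides[0])
    backward = extend(start, sides[1]) if len(sides) > 1 else []
    return backward[::-1] + [start] + forward


# Hypothetical mesh: six sampling points, four triangles, invented distances.
triangles = [(0, 1, 3), (1, 3, 4), (1, 2, 4), (2, 4, 5)]
distance = {frozenset(pair): d for pair, d in {
    (0, 1): 0.80, (0, 3): 0.10, (1, 3): 0.90, (3, 4): 0.70, (1, 4): 0.20,
    (1, 2): 0.15, (2, 4): 0.30, (4, 5): 0.10, (2, 5): 0.25}.items()}

first = trace_boundary(triangles, distance)
second = trace_boundary(triangles, distance, crossed=frozenset(first))
print([tuple(sorted(e)) for e in first])
print([tuple(sorted(e)) for e in second])
```

In this toy mesh the first boundary crosses edges 3-4, 1-3 and 0-1 and leaves the network at both ends, isolating points 0 and 3. The second crosses 2-5, 2-4 and 1-4, exiting the network at one end and closing on the first boundary at the other.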

Page 23: Human genetic diversity. ESHG Barcelona

[Figure: the same network, with a distance d on each edge and boundaries 1, 2 and 3 traced across it]

The number of boundaries one may detect is arbitrary, but there are methods for choosing it

Page 24: Human genetic diversity. ESHG Barcelona

[Figure: the network with boundaries 1, 2 and 3]

Four genetic clusters are identified, each separated from the others by a boundary

Page 25: Human genetic diversity. ESHG Barcelona

[Figure: boundaries labelled 1, 2, 4, 5, 6, 7, 8 and 9]

Genomic boundaries inferred from diversity at 377 STR loci

(Barbujani and Belle 2006)

Eight significant boundaries, defining 9 groups of populations

Page 26: Human genetic diversity. ESHG Barcelona

81% of SNPs cosmopolitan.

Alleles present in one continent only: 0.91% in Africa, 0.75% in Eurasia, practically 0 elsewhere.

Hunting-gathering populations distinct from farmers in Africa

Jakobsson et al. 2008 (525910 SNPs, 396 CNVs)
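The categories on this slide amount to a simple classification by where an allele is observed. The sketch below assumes that "cosmopolitan" means present in every continental group sampled and that "continent-specific" means present in exactly one. The five group names and the presence data are invented for illustration and are not taken from the study.

```python
from collections import Counter

# Hypothetical continental groups and allele presence data.
CONTINENTS = frozenset({"Africa", "Eurasia", "East Asia", "Oceania", "Americas"})
observed_in = {
    "allele_1": set(CONTINENTS),
    "allele_2": {"Africa"},
    "allele_3": {"Africa", "Eurasia"},
    "allele_4": {"Eurasia"},
    "allele_5": set(CONTINENTS),
}


def classify(groups):
    """Label an allele by the set of continental groups where it is observed."""
    if set(groups) == CONTINENTS:
        return "cosmopolitan"
    if len(groups) == 1:
        return "continent-specific"
    return "intermediate"


counts = Counter(classify(groups) for groups in observed_in.values())
shares = {label: n / len(observed_in) for label, n in counts.items()}

for label, share in shares.items():
    print(f"{label}: {share:.0%}")
```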

Page 27: Human genetic diversity. ESHG Barcelona

12.4% of haplotypes cosmopolitan, 29% continent-specific, 18% of which in Africa. More than 50% present in 1 or 2 continents

Jakobsson et al. 2008

Page 28: Human genetic diversity. ESHG Barcelona

LD decreasing with physical distance between loci and with geographic distance from East Africa

Jakobsson et al. 2008

Page 29: Human genetic diversity. ESHG Barcelona

Models with an African population replacing previous human continental groups explain the data better than any alternative models

Fagundes et al. (2007)

Page 30: Human genetic diversity. ESHG Barcelona

Patterns of morphological and genetic variation are compatible with the effects of dispersal from Africa

Manica et al. 2007

Page 31: Human genetic diversity. ESHG Barcelona

Fitting a model of isolation by distance to human genetic diversity

Liu et al. (2006)

Page 32: Human genetic diversity. ESHG Barcelona

Average coalescence times and gene diversity decline as a function of distance from Africa

Best fit of the model for an African exit 56,000 years ago

Page 33: Human genetic diversity. ESHG Barcelona

Fagundes et al. (2007)

http://info.med.yale.edu/genetics/kkidd/point.html

The best available estimates place our species’ origin and its exit from Africa in a not-so-remote past

Page 34: Human genetic diversity. ESHG Barcelona

Linguistic and genetic differences are often correlated

Page 35: Human genetic diversity. ESHG Barcelona

Genetic variances are significant among language groups

Correlations between distance measures

               r          r2
GEN-GEO        0.746***   0.557
GEN-LAN        0.311***   0.097
GEO-LAN        0.269***   0.072
GEN-GEO.LAN    0.723***   0.523
GEN-LAN.GEO    0.172***   0.030

Percentages of the total variance

Genetic distance                        Fst     Rst
Among lang. phyla                       2.9     6.7
Among pops. of the same phylum          2.4     2.9
Within populations                      94.7    90.4

Belle and Barbujani 2007
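The two partial correlations in the table on this slide (GEN-GEO.LAN and GEN-LAN.GEO) can be approximated from the three simple correlations. The slide does not say how they were computed; the sketch below assumes the standard first-order partial correlation formula and takes the rounded values from the table as input. The significance stars are not reproduced.

```python
from math import sqrt


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Simple correlations from the table (GEN = genetic, GEO = geographic, LAN = linguistic).
r_gen_geo, r_gen_lan, r_geo_lan = 0.746, 0.311, 0.269

gen_geo_given_lan = partial_correlation(r_gen_geo, r_gen_lan, r_geo_lan)
gen_lan_given_geo = partial_correlation(r_gen_lan, r_gen_geo, r_geo_lan)

print(f"GEN-GEO.LAN  r = {gen_geo_given_lan:.3f}  r2 = {gen_geo_given_lan ** 2:.3f}")
print(f"GEN-LAN.GEO  r = {gen_lan_given_geo:.3f}  r2 = {gen_lan_given_geo ** 2:.3f}")
```

Both results fall within about 0.001 of the tabled 0.723 and 0.172; a gap of that size is what one would expect from starting with inputs rounded to three decimals. Read this way, geography accounts for most of the genetic pattern, and language keeps a small association with genetics once geography is held constant.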

Page 36: Human genetic diversity. ESHG Barcelona

Origins: Attempting a synthesis

• Human genetic population structure is generally weak, with large differences among members of the same population and discordant variation across loci

• Genetic and morphological data agree in indicating an origin of human dispersal in Africa

• At the large geographic scale, patterns fit a model of repeated founder effects during dispersal from Africa

• Zones of relatively sharp genetic change correspond to reproductive barriers, geographic or cultural