JGI: Genome size impacts on plant adaptation
Adaptation in plant genomes: a role for genome size?
Jeffrey Ross-Ibarra @jrossibarra • www.rilab.org
Dept. Plant Sciences • Center for Population Biology • Genome Center • University of California, Davis
photo by lady_lbrty
Kew C-Value Database
Gaut and Ross-Ibarra 2008
Paris japonica: 150 Gb genome
Genlisea aurea: 63 Mb genome (Michal Rubeš)
wide variation in genome size in plants
Kew C-Value Database
what explains genome size variation?
Lynch & Conery 2003 Science; Whitney et al. 2010 Evolution
[Figure: genome size vs. seed weight; genome size vs. leaf area] Knight et al. 2005 AoB
correlates of genome size: phenotypes across species
[Figure: genome size vs. elevation] Bilinski et al. in prep
correlates of genome size: altitude within Zea mays
[Figure: # plants (0 to 30) by genome size (DNA, 100 to 110) for cycle 0 and cycle 6; labels: late flowering, early flowering] Rayburn et al. 1994 Plant Breeding
hard sweep
multiple mutations
polygenic adaptation
standing variation
“soft” sweeps
how do genomes adapt?
M T G P H R L
GGTCGAC ATG ACT GGT CCA CAT CGA CTG TAG
M T D P H R L
GGTCGAC ATG ACT GAT CCA CAT CGA CTG TAG
structural change to protein
M T G P H R L
GGTAAAC ATG ACT GGT CCA CAT CGA CTG TAG
GG---AC ATG ACT GGT CCA CAT CGA CTG TAG
regulatory change to expression
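The contrast between the two kinds of change on these slides can be sketched in a few lines of Python. This is only an illustration: the codon table covers just the codons shown on the slides (standard genetic code, under which GGT to GAT is a glycine-to-aspartate change), and the `classify` helper and its labels are hypothetical names introduced here, treating any upstream difference as "regulatory" for simplicity.

```python
# Codon table restricted to the codons that appear on the slides
# (standard genetic code; "*" marks the stop codon).
CODON_TABLE = {
    "ATG": "M", "ACT": "T", "GGT": "G", "GAT": "D",
    "CCA": "P", "CAT": "H", "CGA": "R", "CTG": "L", "TAG": "*",
}


def translate(cds: str) -> str:
    """Translate a space-separated coding sequence, stopping at the stop codon."""
    protein = []
    for codon in cds.split():
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def classify(ref_upstream: str, ref_cds: str, alt_upstream: str, alt_cds: str) -> str:
    """Hypothetical helper: label a variant as on the slides.

    Simplifying assumption: any difference upstream of the start codon is
    called regulatory, and any difference in the translated protein is
    called structural.
    """
    if translate(ref_cds) != translate(alt_cds):
        return "structural change to protein"
    if ref_upstream != alt_upstream:
        return "regulatory change to expression"
    return "no change"


REF_UP, REF_CDS = "GGTCGAC", "ATG ACT GGT CCA CAT CGA CTG TAG"

if __name__ == "__main__":
    print(translate(REF_CDS))
    print(classify(REF_UP, REF_CDS, REF_UP, "ATG ACT GAT CCA CAT CGA CTG TAG"))
    print(classify(REF_UP, REF_CDS, "GGTAAAC", REF_CDS))
```

The point of the sketch is that a coding substitution alters the protein itself, whereas the upstream substitution and deletion leave the protein untouched and can only act through expression.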
[Figure: log genome size (Mb), split into TE DNA and non-TE DNA, for Arabidopsis thaliana, Arabidopsis lyrata, Brachypodium distachyon, papaya, rice, Lotus japonicus, black cottonwood, grapevine, cabbage, Medicago truncatula, sorghum, soybean, Levant cotton, maize, Aegilops speltoides, and barley; angiosperm average 6400 Mb. Inset: genome size (Mb) vs. TE content (Mb), r = 0.99]
Tenaillon et al. 2010 TIP; Springer et al. 2016 Plant Cell
Ne: effective number of individuals; µ: beneficial mutation rate per trait
bigger genome, larger mutation target, higher µ
selection from standing variation when 2Neµ > 1
predict that larger genomes adapt via noncoding changes, standing variation
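A rough numerical sketch of this argument follows. It assumes, as the slide suggests, that the per-trait beneficial mutation rate µ scales linearly with genome size. The population size and the fraction of sites that can yield a beneficial mutation are hypothetical values chosen only to show the threshold behaviour, and the function names are introduced here for illustration.

```python
def trait_mutation_rate(per_bp_rate: float, genome_bp: float, target_fraction: float) -> float:
    """Per-trait beneficial mutation rate, assuming the mutational target
    is a fixed fraction of the genome (an assumption, not a measured value)."""
    return per_bp_rate * genome_bp * target_fraction


def two_ne_mu(ne: float, mu: float) -> float:
    """Population-scaled beneficial mutation rate, 2*Ne*mu."""
    return 2.0 * ne * mu


def standing_variation_expected(ne: float, mu: float) -> bool:
    """Slide criterion: selection from standing variation when 2*Ne*mu > 1."""
    return two_ne_mu(ne, mu) > 1.0


# Hypothetical illustration values.
NE = 1e5                 # effective population size (assumed)
PER_BP_RATE = 3e-8       # per-bp mutation rate per generation (assumed)
TARGET_FRACTION = 1e-6   # fraction of sites that are beneficial targets (assumed)

if __name__ == "__main__":
    for label, genome_bp in [("130 Mb genome", 130e6), ("2,500 Mb genome", 2500e6)]:
        mu = trait_mutation_rate(PER_BP_RATE, genome_bp, TARGET_FRACTION)
        print(label, round(two_ne_mu(NE, mu), 2), standing_variation_expected(NE, mu))
```

With everything else held equal, only the larger genome crosses the 2Neµ > 1 threshold in this toy setting, which is the intuition behind the prediction on the slide.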
Hancock et al. 2011 Science
[Figure: enrichment (no to yes) by site class: intergenic, synonymous, nonsynonymous]
Arabidopsis adaptation predominantly coding
[Figure: log 1C genome size, with Arabidopsis and maize (2.5 Gb) marked]
Suketoshi
how does adaptation work in maize?
maize • teosinte
standing variation
mutation rate21, strongly suggesting that the Hopscotch insertion (and thus, the older Tourist as well) existed as standing genetic variation in the teosinte ancestor of maize. Thus, we conclude that the Hopscotch insertion likely predated domestication by more than 10,000 years and the Tourist insertion by an even greater amount of time.
We identified four fixed differences in the portion of the proximal and distal components of the control region that show evidence of selection. We used transient assays in maize leaf protoplasts to test all four differences for effects on gene expression. Maize and teosinte chromosomal segments for the portions of the proximal and distal components with these four differences were cloned into reporter constructs upstream of the minimal promoter of the cauliflower mosaic virus (mpCaMV), the firefly luciferase ORF and the nopaline synthase (NOS) terminator (Fig. 4). Each construct was assayed for luminescence after transformation by electroporation into maize protoplast. The constructs for the distal component contrast the effects of the Tourist insertion plus the single fixed nucleotide substitution that distinguish maize and teosinte. Both the maize and teosinte constructs for the distal component repressed luciferase expression relative to the minimal promoter alone. The maize construct with Tourist excised gave luciferase expression equivalent to the native maize and teosinte constructs and less expression than the minimal promoter alone. These results indicate that this segment is functionally important, acting as a repressor of luciferase expression and, by inference, of tb1 expression in vivo. However, we did not observe any difference between the maize and teosinte constructs as anticipated. One possible cause for the lack of differences in expression between the maize and teosinte constructs might be that additional proteins required to cause these differences are not present in maize leaf protoplast. Another possibility is that the factor affecting phenotype in the distal component lies in the unselected region between −64.8 and −69.5 kb, which is not included in the construct. Nevertheless, the results do indicate that the distal component has a functional element that acts as a repressor. The functional importance of this segment is supported by its low level of nucleotide diversity (Fig. 3a), suggesting a history of purifying selection.
The constructs for the proximal component of the control region contrast the effects of the Hopscotch insertion plus a single fixed nucleotide substitution that distinguish maize and teosinte. The construct with the maize sequence including Hopscotch increased expression of the luciferase reporter twofold relative to the teosinte construct for the proximal control region and the minimal promoter alone (Fig. 4). Luciferase expression was returned to the level of the teosinte construct and the minimal promoter construct by deleting the Hopscotch element from the full maize construct. These results indicate that the Hopscotch element enhances luciferase expression and, by
[Figure 3 panels: (a) nucleotide diversity (0 to 0.06) from −67 kb to −58 kb with regions A to D; HKA neutrality tests: P = 0.95, P = 0.41, P = 0.04, P 0.0001; distal component with Tourist (408 bp); proximal component with Hopscotch (4,885 bp). (b) maize cluster haplotype (M) and teosinte cluster haplotype (T)]
Figure 3 Sequence diversity in maize and teosinte across the control region. (a) Nucleotide diversity across the tb1 upstream control region. Base-pair positions are relative to AGPv2 position 265,745,977 of the maize reference genome sequence. P values correspond to HKA neutrality tests for regions A–D, as defined by the dotted lines. Green shading signifies evidence of neutrality, and pink shading signifies regions of non-neutral evolution. Nucleotide diversity (π) for maize (yellow line) and teosinte (green line) were calculated using a 500-bp sliding window with a 25-bp step. The distal and proximal components of the control region with four fixed sequence differences between the most common maize haplotype and teosinte haplotype are shown below. (b) A minimum spanning tree for the control region with 16 diverse maize and 17 diverse teosinte sequences. Size of the circles for each haplotype group (yellow, maize; green, teosinte) is proportional to the number of individuals within that haplotype.
[Figure 4 panel: transient assay constructs (mpCaMV; T-dist, M-dist, ∆M-dist for the distal control region; T-prox, M-prox, ∆M-prox for the proximal control region) vs. relative expression (0 to 2.0)]
Figure 4 Constructs and corresponding normalized luciferase expression levels. Transient assays were performed in maize leaf protoplast. Each construct is drawn to scale. The construct backbone consists of the minimal promoter from the cauliflower mosaic virus (mpCaMV, gray box), luciferase ORF (luc, white box) and the nopaline synthase terminator (black box). Portions of the proximal and distal components of the control region (hatched boxes) from maize and teosinte were cloned into restriction sites upstream of the minimal promoter. “∆” denotes the excision of either the Tourist or Hopscotch element from the maize construct. Horizontal green bars show the normalized mean with s.e.m. for each construct.
[Axes: construct vs. relative expression] Studer et al. 2011 Nat. Gen.; Vann et al. 2015 PeerJ
enhances expression
teosinte branched - tb1
hard sweep
Figure 1. Phenotypes. a. Maize ear showing the cob (cb) exposed at top. b. Teosinte ear with the rachis internode (in) and glume (gl) labeled. c. Teosinte ear from a plant with a maize allele of tga1 introgressed into it. d. Close-up of a single teosinte fruitcase. e. Close-up of a fruitcase from teosinte plant with a maize allele of tga1 introgressed into it. f. Ear of maize inbred W22 (Tga1-maize allele) with the cob exposed showing the small white glumes at the base. g. Ear of maize inbred W22:tga1 which carries the teosinte allele, showing enlarged (white) glumes. h. Ear of maize inbred W22 carrying the tga1-ems1 allele, showing enlarged glumes. For higher magnification copies of f–h see Supplementary Information.
Wang et al. Nature. Author manuscript; available in PMC 2006 May 23.
Wang et al. 2015 Genetics
protein change
teosinte glume architecture - tga1
multiple mutations
Wills et al. 2013 PLoS Genetics
teosinte • maize (Clint Whipple, BYU)
grassy tillers - gt1
5’ control region • 3’ UTR
modifies expression
hard sweep
M T D P H R L
GGTCGA ATG ACT GAT CCA CAT CGA CTG TAG
tga1 gt1 tb1
Multiple Mutations
Standing Variation
M T G P H R L
GGTAAA ATG ACT GGT CCA CAT CGA CTG TAG
Hufford et al. 2012 Nat. Gen.; Chia et al. 2012 Nat. Gen.
genomes: 13 teosinte, 23 maize
5–10% of selected regions do not include genes
genome-wide evidence of adaptation
whereas others are lost after domestication (Fig. 3B). It should be noted that many of these genes have unique coexpression edges in maize that are not observed in teosinte (Fig. S4B).
Expression data provide an opportunity to investigate further functional alterations to genes located within genomic regions that population genomic analyses identify as targets of selective
[Fig. 3 panels A to E: Venn diagram of DE (n=612), AEC (n=1115), and Dom/Imp genes (n=1761); teosinte network edges and maize network edges for GRMZM2G068436, GRMZM2G137947, and GRMZM2G375302; XP-CLR plots with positions in Mb]
Fig. 3. Analysis of genes with altered expression or conservation and targets of selection during improvement and/or domestication. (A) Venn diagram showing the overlap between DE genes, AEC genes, and the genes that occur in genomic regions that have evidence for selective sweeps during maize domestication or improvement (Dom/Imp genes). (B) Teosinte coexpression networks for three genes (GRMZM2G068436, GRMZM2G137947, and GRMZM2G375302). (Right) Edges that are maintained in maize coexpression networks are shown. Although the differentially expressed gene (red node) is highly connected in teosinte, most of these connections are lost in maize. However, some parts of the teosinte network are still conserved in maize. (C) Cross-population composite likelihood ratio test (XP-CLR) plot shows the evidence for a selective sweep that occurs on chromosome 9. The tick marks along the x axis represent genes, and the red tick mark indicates the gene (GRMZM2G448355) that was chosen as the candidate target of selection and is differentially expressed in maize and teosinte. The bar plot underneath the graph shows the expression levels of all maize (blue) and teosinte (red) samples. (D) XP-CLR plot for a large region on chromosome 5. The candidate target of selection is indicated in green and shows similar expression in maize and teosinte. Two other genes (red) exhibit DE. (E) Neighbor-joining tree shows the relationships among the haplotypes at GRMZM2G141858. (Right) Bar plot shows expression levels for each genotype; red bars indicate teosinte genotypes, and blue bars represent maize genotypes. At least one teosinte genotype (TIL15) contains the haplotype that has been selected in maize and has expression levels similar to maize genotypes.
Table 2. Genes in selected regions with evidence for DE or AEC
Gene list               No. genes selected   % up-regulated   Significance   % higher connected   % candidates
                        during dom/imp       in maize                        in maize
AEC and DE (n = 276)    46                   76               0.0002         41.3                 39.1
DE only (n = 336)       44                   61               0.0230         40.9                 22.7
AEC only (n = 839)      89                   54               0.1837         57.3                 32.6
dom, domestication; imp, improvement.
[Panels: genealogy and expression; teosinte vs. maize]
• ~500 selected regions
• 11M shared vs 3000 fixed SNPs
• show differential expression, decreased expression variation
selection on regulatory sequence, standing variation
Hufford et al. 2012 Nat. Gen.; Swanson-Wagner et al. 2012 PNAS
Beissinger et al. bioRxiv
[Figure: nucleotide diversity vs. distance to nearest substitution (cM)]
hard sweeps in genes play minor role in maize
Wallace et al. 2014 PLoS Genetics
QTL alleles enriched for noncoding
Rodgers-Melnick et al. 2016 PNAS
Variance partitioning • GWAS candidate SNPs
Makarevitch et al. 2015 PLoS Genetics
single TE family many genes
new insertions activate expression
Makarevitch et al. 2014 bioRxiv
[Figure: log2(stress/control) expression in lines with vs. without the TE insertion for GRMZM2G071206, GRMZM2G400718, GRMZM2G102447, GRMZM2G108057, and GRMZM2G108149; TE presence (+/-) for genes 1 to 10 in Oh43, B73, and Mo17]
[Figure: percent of conserved genes (0% to 100%) for TE families alaw, dagaf, etug, flip, gyma, ipiki, jeli, joemon, naiba, nihep, odoj, pebi, raider, riiryl, ubel, uwum, Zm00346, Zm02117, Zm03238, and Zm05382 under salt, UV, heat, and cold stress]
single gene, many individuals
how to adapt: Zea mays
M T G P H R L
GGTAAA ATG ACT GGT CCA CAT CGA CTG TAG
regulatory variation (including TEs)
multiple mutations
“soft” sweeps
standing variation
Sattath et al. 2011 PLoS Gen.; Williamson et al. 2014 PLoS Gen.; Hernandez et al. 2011 Science; Ross-Ibarra et al. 2009 Genetics
[Four panels: diversity vs. distance from substitution]
20% nonsyn. adaptive • 10% nonsyn. adaptive
50% nonsyn. adaptive • 40% nonsyn. adaptive
Ne effective number of diploid individuals
s selection coefficient
selection is effective if 2Nes > 1
differences in adaptation due to drift and small population size?
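The 2Nes > 1 rule of thumb can be made concrete with a short sketch. The function names and the selection coefficient below are hypothetical illustration choices. The two population sizes are the Ne values quoted elsewhere in this deck, used here without tying them to particular species.

```python
def two_ne_s(ne: float, s: float) -> float:
    """Population-scaled selection coefficient, 2*Ne*s."""
    return 2.0 * ne * s


def selection_effective(ne: float, s: float) -> bool:
    """Slide criterion: selection is effective if 2*Ne*s > 1."""
    return two_ne_s(ne, s) > 1.0


def smallest_effective_s(ne: float) -> float:
    """Selection coefficient at which 2*Ne*s equals 1; weaker selection is
    expected to be dominated by drift under this rule of thumb."""
    return 1.0 / (2.0 * ne)


S = 1e-5  # hypothetical selection coefficient for a weakly beneficial variant

if __name__ == "__main__":
    for ne in (10_000, 600_000):
        print(ne, two_ne_s(ne, S), selection_effective(ne, S), smallest_effective_s(ne))
```

Under these assumed numbers the same weakly beneficial variant is effectively neutral in the smaller population and visible to selection in the larger one, which is why population size is a natural first explanation to test.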
[Figure: demographic model with population sizes 0.05Na, Na, Na, and 3Na; Ne ~ 450,000] Beissinger et al. bioRxiv
Ne ~ 1,000,000
[Figure: effective population size (1e+05 to 1e+09) vs. years (1e+03 to 1e+05; u=3e−8, generation=1) for populations BKN_4Hap, BKN_6Hap, TIL_4Hap_Jalisco, and TIL_6Hap]
Ne ~ 1,000,000,000
Ne ~ 5,000,000,000
Sattath et al. 2011 PLoS Gen.; Williamson et al. 2014 PLoS Gen.; Hernandez et al. 2011 Science
[Four panels: diversity]
Ne >> 1,000,000 • Ne ~ 10,000*
Ne ~ 2,000,000 • Ne ~ 600,000
µ ∝ 2,500 Mbp • µ ∝ 3,100 Mbp
µ ∝ 130 Mbp • µ ∝ 220 Mbp
Pyhäjärvi et al. 2013 GBE; Hancock et al. 2011 Science; Fraser et al. 2013 Gen. Research
[Figure: enrichment (no to yes) by site class: intergenic, synonymous, nonsynonymous]
[Figure: excess adaptive SNPs; enrichment from intergenic to coding]
large genomes enriched in noncoding adaptive variants
• Adaptation in maize occurs from standing variation and targets regulatory variants
• Large genomes may have more targets, more standing variation, and more regulatory adaptation
• Efforts to identify functional variation should consider genome size in designing experiments and genotyping
Genome Size and Adaptation
Kew C-Value Database
Acknowledgments
Maize Diversity Group: Peter Bradbury, Ed Buckler, John Doebley, Theresa Fulton, Sherry Flint-Garcia, Jim Holland, Sharon Mitchell, Qi Sun, Doreen Ware
Collaborators: CSI Davis, Nathan Springer
Lab Alumni: Tim Beissinger (USDA-ARS, Mizzou), Kate Crosby (Monsanto), Matt Hufford (Iowa State), Tanja Pyhäjärvi (Oulu), Shohei Takuno (Sokendai), Joost van Heerwaarden (Wageningen)