1/23/2018
Shan Wu
Fei Lab Boyce Thompson Institute
PAG 2018
Cucurbita Genome Sequences Provide Insights into Polyploid Genome Evolution and Heterosis in Interspecific Hybrid
Outline
Introduction
- The origins and domestication of C. maxima and C. moschata
- The uses of C. maxima and C. moschata
The genome sequences of C. maxima and C. moschata
- De novo genome assembly, anchoring and quality evaluation
- The two distinguishable paleo-subgenomes in Cucurbita
- Genome and gene evolution after allotetraploidization
Gene expression alteration in the interspecific F1 hybrid
- Higher total carotenoid content in the F1 fruit
Origins and domestication of Cucurbita species
- C. moschata: lowlands of Mexico and northern South America; China and Japan
- C. maxima: southern South America; India and Myanmar
Species shown: C. pepo, C. moschata, C. argyrosperma, C. maxima
Source: Esquinas-Alcazar and Gulick, 1983. Genetic resources of Cucurbitaceae: a global report
Image credits: https://www.marthastewart.com; Harry Potter and the Prisoner of Azkaban
The uses of Cucurbita crops
- The Cucurbita crops are consumed all over the world and are a staple food in many developing countries.
- The fruits are used as ornaments and carved into decorative lanterns around Halloween.
- C. maxima and C. moschata are also used as rootstocks for other cucurbit crops, including watermelon, cucumber and melon, to enhance tolerance to soilborne diseases and abiotic stresses.
- ‘Shintosa’, the interspecific hybrid developed from a cross between C. maxima cv. Rimu and C. moschata cv. Rifu, is a popular rootstock for different cucurbits. It is especially preferred in watermelon grafting for its Fusarium wilt resistance, cold tolerance, and ability to increase fruit weight, fruit quality and plant vigor.
[Images: C. maxima, Rimu; C. moschata, Rifu]
C. maxima and C. moschata genome assemblies
High-quality cleaned Illumina reads
- C. maxima: 109.3 Gb, 283× coverage
- C. moschata: 80.2 Gb, 215× coverage
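The coverage values are consistent with dividing the read bases by the estimated genome size. A minimal sketch of that arithmetic, assuming the two estimated genome sizes given with the assembly statistics (386.8 Mb and 372.0 Mb) belong to C. maxima and C. moschata respectively; the function name is illustrative and not from the talk:

```python
def fold_coverage(read_bases_gb: float, genome_size_mb: float) -> float:
    """Sequencing depth = total read bases / genome size (both converted to Mb)."""
    return (read_bases_gb * 1000.0) / genome_size_mb


# Values from the slide; the species-to-genome-size pairing is an assumption.
cmax_cov = fold_coverage(109.3, 386.8)
cmos_cov = fold_coverage(80.2, 372.0)

print(f"C. maxima:   {cmax_cov:.1f}x")
print(f"C. moschata: {cmos_cov:.1f}x")
```

Both results land within one unit of the 283× and 215× reported on the slide; small differences are expected because the inputs are rounded.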
Assessment of completeness using BUSCO

                     C. maxima   C. moschata
Complete BUSCOs         93%         93%
Fragmented BUSCOs        2%          1%
Missing BUSCOs           5%          6%
Summary statistics of the assemblies

                        Scaffold                  Contig
                 Size (bp)   Number       Size (bp)   Number
C. maxima (estimated genome size: 386.8 Mb)
  Longest       11,871,986        1         400,020        1
  N50            3,717,157       24          40,681    1,813
  N90               52,757      295           6,851    7,555
  Total        271,413,401    8,299     265,448,559   25,524
C. moschata (estimated genome size: 372.0 Mb)
  Longest       11,258,782        1         292,205        1
  N50            3,995,720       24          40,480    1,897
  N90              593,097       93          10,017    6,788
  Total        269,943,085    3,500     262,991,909   17,340
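For readers unfamiliar with the N50 and N90 rows, the sketch below shows the usual definition: sort sequences from longest to shortest and report the length (and count) at which the running total first reaches 50% or 90% of the assembly. The function name and the toy lengths are made up for illustration; this is not the tool used for the talk.

```python
def nx(lengths, fraction=0.5):
    """Return (Nx length, number of sequences needed) for a list of lengths.

    fraction=0.5 gives N50, fraction=0.9 gives N90.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count
    # Only reached through floating-point edge cases when fraction is ~1.
    return ordered[-1], len(ordered)


toy_scaffolds = [50, 40, 30, 20, 10]  # hypothetical lengths, total = 150
n50 = nx(toy_scaffolds, 0.5)
n90 = nx(toy_scaffolds, 0.9)
print("N50 (length, number):", n50)
print("N90 (length, number):", n90)
```

Read this way, the scaffold N50 row for C. maxima says that the 24 longest scaffolds, each at least 3,717,157 bp, cover half of the assembly.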
Repeat sequences account for 40.3% and 40.6% of the C. maxima and C. moschata genomes, respectively.
Anchoring the scaffolds into pseudomolecules
Parental lines: C. maxima, Rimu; C. maxima, SQ026; C. moschata, Rifu; C. moschata, Honey jujube
Genetic maps:
- C. maxima map (intraspecific F2): 2,030 SNPs
- C. moschata map (intraspecific F2): 3,487 SNPs
- Interspecific map (interspecific F1BC1): 13,783 SNPs

                                   C. maxima   C. moschata
Num. of anchored scaffolds              92          98
Num. of oriented scaffolds              64          71
Total size (Mb)                      211.4         235
% of assembled genome anchored        77.9        88.4
Num. of linkage groups                  20          20
Extensive synteny between the two Cucurbita species
Two distinguishable paleo-subgenomes of Cucurbita
[Figure labels: C. maxima chromosomes in the order 01 02 06 07 08 11 12 18 04 03 05 09 10 13 14 15 16 17 19 20, marked as subgenome A and subgenome B; Ks legend: ≤0.35, ≥0.40; melon chromosomes 01 to 12; inset: melon Chr02 with C. maxima Chr11 and Chr10]
Paleo-allotetraploidization in Cucurbita
[Figure: dated phylogeny (Mya) of cucumber, melon, watermelon, C. maxima B, C. moschata B, C. maxima A, C. moschata A, bitter gourd and walnut]
Node ages (Mya) with the intervals shown on the tree:
- 3.34 and 3.37, with intervals [3.04, 3.82] and [3.09, 3.84]: divergence between C. maxima and C. moschata
- 6.51 [6.06, 6.94]
- 19.06 [18.34, 19.75]
- 26.28 [25.54, 27.00]: divergence of progenitor B from Benincaseae
- 30.75 [29.86, 31.60]: divergence of progenitor A from the common ancestor of progenitor B and Benincaseae
- 36.13 [34.94, 37.24]
- 82.04 [79.37, 84.59]
Timeline (Mya): 31, 26, allotetraploidization, 3
Fractionation bias and genome dominance
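[Schematic: after whole-genome duplication (WGD), duplicated genes are lost either by random fractionation or by biased fractionation, in which one subgenome is the dominant subgenome. Example panels from Schnable et al., 2011 PNAS: "Maize1 dominates expression" and "Maize2 dominates expression"]
Random fractionation and lack of genome-wide expression bias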
[Figure: percent of gene pairs (N=5,581) showing subgenome A dominance or subgenome B dominance in fruit, leaf, stem and root of C. maxima; y-axis from 0 to 30]
TE insertions in the 5’ upstream regions of genes
Hypothesis:
"The diploid parent of a tetraploid with the lowest transposon load was
to become the dominant subgenome." -Woodhouse et al., 2014 PNAS
The diploid parental genomes of Cucurbita could have a similar total load of TEs near genes,
leading to the lack of genome dominance and subsequent random fractionation in the polyploid.
Gene expression alteration in the interspecific F1 hybrid
C. maxima ♀ × C. moschata ♂: the interspecific F1 ‘Shintosa’
[Figure: schematic expression level in P1, F1 and P2, two panels]
Dominant (F1 expression equal to that of one parent) and transgressive (higher than the high parent or lower than the low parent) expression patterns might underlie heterotic phenotypes and the combination of favorable characters from both parents.
In Shintosa, 4,002 (fruit), 6,718 (leaf), 6,732 (stem) and 7,067 (root) genes were dominantly or transgressively expressed, representing 12.5-22.0% of the total genes in the genome. The expression patterns of some of these genes are correlated with disease resistance, higher growth vigor and increased carotenoid biosynthesis in Shintosa.
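The dominant and transgressive categories defined on this slide can be sketched as a simple rule on the expression of one gene in P1, F1 and P2. This is only an illustration: the function name, the category labels and the 10% relative tolerance are assumptions, and a real analysis would rely on replicate-based statistical tests rather than a fixed tolerance.

```python
def classify_f1(p1: float, f1: float, p2: float, tol: float = 0.1) -> str:
    """Classify F1 expression relative to its two parents.

    tol is a hypothetical relative tolerance used to decide "equal".
    """
    def close(a: float, b: float) -> bool:
        return abs(a - b) <= tol * max(abs(a), abs(b), 1e-9)

    low, high = min(p1, p2), max(p1, p2)
    if close(p1, p2) and close(f1, p1):
        return "conserved"            # no difference among P1, F1 and P2
    if close(f1, high) or close(f1, low):
        return "dominant"             # F1 equal to one of the parents
    if f1 > high:
        return "transgressive-up"     # higher than the high parent
    if f1 < low:
        return "transgressive-down"   # lower than the low parent
    return "intermediate"             # between the parents, equal to neither


# Made-up expression values (e.g. FPKM) as (P1, F1, P2)
examples = {
    "gene_a": (10.0, 30.0, 12.0),
    "gene_b": (10.0, 10.5, 40.0),
    "gene_c": (10.0, 25.0, 40.0),
    "gene_d": (20.0, 5.0, 40.0),
}
labels = {name: classify_f1(*values) for name, values in examples.items()}
print(labels)
```

Counting genes in the "dominant" and two "transgressive" classes per tissue would give tallies of the kind reported above for fruit, leaf, stem and root.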
Expression patterns of carotenoid biosynthetic genes
[Images: fruit of C. maxima, Shintosa and C. moschata]
[Figure: carotenoid content in C. maxima (Cma), Shintosa (F1) and C. moschata (Cmo); panels: lutein, α-carotene, β-carotene, lycopene; axis units: μg/100 g FW and mg/100 g FW]
[Figure: expression heatmap (FPKM bins: <1, 1-10, 11-20, 21-30, 31-40, 41-50, >50) of carotenoid biosynthetic genes in Cma, F1 and Cmo, with columns A and B per subgenome. Metabolites labeled: GGPP, phytoene, ζ-carotene, lycopene, α-carotene, β-carotene, lutein, zeaxanthin, carlactone. Enzymes labeled: PSY, PDS, Z-ISO, ZDS, CRTISO, LYCE, LYCB, CHYB, CYP97A3, CYP97C1, B-ISO, CCD7, CCD8]
Summary
We assembled high-quality genome sequences of C. maxima and C. moschata,
which are a valuable resource for genetic improvement of the crop.
We provide evidence supporting a lineage-specific ancient allotetraploidization
event in Cucurbita. The two diploid progenitors of Cucurbita successively diverged
from Benincaseae around 31 and 26 million years ago (Mya), and the
allotetraploidization happened earlier than 3 Mya, when C. maxima and C. moschata
diverged.
The subgenomes have largely maintained the chromosome structures of their diploid
progenitors. Such long-term karyotype stability after polyploidization has not been
commonly observed in plant polyploids. The two subgenomes have retained similar
numbers of genes, and neither subgenome is globally dominant in gene expression.
We detected transgressive gene expression changes in the F1 hybrid of C. maxima
and C. moschata correlated with heterosis in important agronomic traits.
Acknowledgement
Beijing Academy of Agriculture and Forestry Sciences
Zhangjun Fei
Honghe Sun
Chen Jiao
Yong Xu
Haizhen Li
Guoyu Zhang
Shaogui Guo
Yi Ren
Jeff Doyle
William Lucas
Supported by grants from the Beijing Scholar Program, the Beijing
Excellent Talents Program, the Ministry of Agriculture of China, the
Beijing Natural Science Foundation, US NSF and USDA NIFA SCRI.