Using Genetics, Genomics, and Breeding to Understand Diverse Maize Germplasm
Sherry A. Flint-Garcia, United States Department of Agriculture–Agricultural Research Service (USDA–ARS)
Corn breeding and genetic analysis of agronomic traits rely on phenotypic variation and on genetic variation in the genes controlling these traits. My research program focuses on understanding genetic diversity in maize so that we can mine beneficial alleles from the appropriate germplasm sources for continued corn improvement. The process of domestication that began 9,000 years ago has had profound consequences for maize: modern corn has moderately reduced genetic diversity across nearly all genes in the genome relative to teosinte, and severely reduced diversity in key genes targeted by domestication. The question that remains is whether these reductions in genetic diversity have impacted our ability to make progress in corn breeding today. As an outcrossing species, maize has tremendous genetic variation compared to most other crops. The complementary combination of genome-wide association mapping (GWAS) approaches, large HapMap datasets, and germplasm resources is leading to important discoveries about the relationship between genetic diversity and phenotypic variation in inbred lines. However, among the traits targeted during domestication and breeding are many yield component traits, including number of ears, kernel row number, seed size, and kernel composition. Therefore, we must reintroduce variation from landraces and/or teosinte if we hope to learn how domestication has impacted these yield component traits, and yield itself.
Sherry Flint-Garcia USDA-ARS Columbia, MO
Outline
Introduction to Maize Domestication & Diversity
Inbred Lines
Teosinte, the wild ancestor
Breeding with Zea
Maize Domestication
Domesticated from Zea mays ssp. parviglumis
Single domestication event ~9,000 years ago in Mexico
Intermediate form of landraces; populations adapted across the Americas to specific microclimates and/or human uses
Artificial Selection
Teosinte (ssp parviglumis) Inbred Lines Landraces
Teosintes
Maize Landraces
Maize Inbred Lines
Unselected (Neutral) Gene Domestication Gene Improvement Gene
Artificial Selection
Plant Breeding
Domestication
98% (~49,000) maize genes
2% (~1,000) maize genes
Nested Association Mapping (NAM) Maize ATLAS Project
Teosinte NILs Teosinte Synthetic Zea Synthetic
Consequences of Artificial Selection
Germplasm Enhancement of Maize (GEM)
Inbred Lines
Linkage-Based QTL Mapping
“Genome Scan”
Identify genomic regions that contribute to variation and estimate QTL effects
[Figure: LOD score plotted against position (cM, 0 to 140) along a chromosome, with marker labels along the map]
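The LOD score on the vertical axis of a genome scan can be illustrated with a minimal single-marker regression sketch. It assumes an F2-style population with genotypes coded 0, 1, 2 (copies of one parental allele) and uses the common regression form LOD = (n/2) log10(RSS0/RSS1). The function name and the genotype codes in the toy data are hypothetical and are not taken from the talk.

```python
import math

def marker_lod(genotypes, phenotypes):
    """Return (LOD, additive slope) for one marker using simple linear regression.

    Assumes the marker is polymorphic and the fit is not perfect
    (otherwise a division by zero would occur).
    """
    n = len(phenotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, phenotypes))
    # Residual sum of squares with no QTL: variation around the overall mean.
    rss0 = sum((y - mean_y) ** 2 for y in phenotypes)
    # Residual sum of squares after fitting the additive marker effect.
    slope = sxy / sxx
    rss1 = rss0 - slope * sxy
    lod = (n / 2) * math.log10(rss0 / rss1)
    return lod, slope

# Toy data: hypothetical genotype codes paired with six example phenotype values (m).
toy_genotypes = [0, 0, 1, 1, 2, 2]
toy_phenotypes = [1.3, 1.5, 1.4, 1.8, 2.0, 2.5]
lod, slope = marker_lod(toy_genotypes, toy_phenotypes)
print(f"LOD = {lod:.2f}, additive effect per allele = {slope:.3f} m")
```

A genome scan repeats this calculation at each marker (or interval) and plots the resulting LOD values against map position.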
Genotype Phenotype Statistics for Mapping
Parent 1
F1
F2 population
Parent 2
[Figure: example phenotype values (1.3 m to 2.5 m) and SNP genotypes at Gene X for Line 1 through Line N]
Association Analysis
Utilize natural populations
Exploit extensive historical recombination
Candidate gene approach:
[Figure: ancestral chromosomes and the subject population]
Linkage vs. Association Analysis

Linkage (QTL) Mapping
Structured population: BC, F2, RIL, etc.
Analysis of 2 alleles
High power
Low resolution: 10-20 cM (10-20 Mb in maize)
Genome scan: don’t know anything about the genetics underlying the trait

Association Mapping
Unstructured population: unrelated or distantly related
Analysis of many alleles
Low power
High resolution: 1000-5000 bp in maize *
Candidate gene testing: pathway or candidates previously identified. Used as a validation method.
Genome scan: don’t know anything about the genetics underlying the trait. Used as a discovery method.

* Depends on the species/population
Nested Association Mapping
NAM Founders
[Figure: relationships among diverse maize inbred lines based on 89 SSR loci (scale bar 0.1), with groups labeled sweet, popcorn, tropical-subtropical, stiff stalk, non stiff stalk, and mixed, plus ssp. parviglumis; individual line labels omitted]
Flint-Garcia, et al. (2005) Plant J.
NAM Development
Yu, et al. (2008) Genetics; McMullen, et al. (2009) Science
Linkage Mapping
Association Mapping
Corn Tracker
Starch
Amylose
Amylopectin
Fiber
Oil
Fatty Acid Profiles
Protein
Zeins
Amino Acid Profiles
Jason Cook
NAM Kernel Composition
NIR conducted by Syngenta Seeds, Inc.
Oil Composition QTL in NAM
[Figure: Joint Linkage Mapping - Oil. LOD plotted against physical distance (bp) and against genetic distance (cM) across chromosomes 1 to 10]
Cook, et al. (2012) Plant Phys.
Chr. 6 Oil Candidate: DGAT1-2
Encodes acyl-CoA:diacylglycerol acyltransferase
Fine mapped by Pioneer-Dupont: Zheng, et al. (2008) Nature Genetics
High parent = 19% oil
High allele = 0.29% additive effect
High allele has Phenylalanine insertion in C-terminus
Phenylalanine insertion
Oil 4.4% 5.3% 3.6% 3.9%
Cook, et al. (2012) Plant Phys.
NAM Genome Wide Association (GWAS)
1.6 Million HapMap.v1 SNPs projected onto NAM
Bootstrap (80%) sampling to test robustness of models
[Figure: GWAS - Oil. RMIP (0 to 100) plotted against physical distance (bp)]
Cook, et al. (2012) Plant Phys.
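The idea behind the 80% resampling can be sketched as a resample model inclusion probability (RMIP): draw a subsample of lines many times, rerun model selection, and record how often each SNP is chosen. In this minimal sketch the selection step is a deliberately simple stand-in (pick the one SNP most correlated with the trait), not the model used in the study; all names and the toy data are hypothetical.

```python
import random

def abs_correlation(x, y):
    """Absolute Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5

def rmip(snps, phenotypes, n_resamples=100, fraction=0.8, seed=1):
    """Fraction of resamples in which each SNP is selected.

    snps maps a SNP name to a list of genotype codes, one per line.
    """
    rng = random.Random(seed)
    n = len(phenotypes)
    k = int(round(fraction * n))
    counts = {name: 0 for name in snps}
    for _ in range(n_resamples):
        idx = rng.sample(range(n), k)  # subsample lines without replacement
        y = [phenotypes[i] for i in idx]
        # Stand-in selection step: keep the single best-correlated SNP.
        best = max(snps, key=lambda s: abs_correlation([snps[s][i] for i in idx], y))
        counts[best] += 1
    return {name: counts[name] / n_resamples for name in snps}

# Toy data: snp_a tracks the trait, snp_b does not.
toy_snps = {
    "snp_a": [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
    "snp_b": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
}
toy_oil = [3.6, 3.7, 3.5, 3.8, 3.6, 4.4, 4.5, 4.3, 4.6, 4.4]
result = rmip(toy_snps, toy_oil)
print(result)
```

A SNP that is selected in most resamples is robust to which lines happen to be included; one that appears only occasionally is more likely to be a sampling artifact.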
DGAT 1-2 (Chr 6: 105,013,351-105,020,258)
Marker  Trait   Population  Analysis Method    BPP  P-Value  Effect
M1      Oil     282 Assn.   MLM (Q+K)          -    1.2E-04  0.18
M2      Oil     282 Assn.   MLM (Q+K)          -    9.9E-04  0.16
M3      Oil     NAM         GWAS - Bootstrap   31   -        0.18
M4      Oil     282 Assn.   MLM (Q+K)          -    4.3E-05  0.19
M4      Starch  NAM         GWAS - Bootstrap   51   -        -0.38
M5      Oil     NAM         GWAS - Bootstrap   67   -        0.13
M5      Starch  NAM         GWAS - Bootstrap   11   -        -0.31
NAM Population: 24 HapMap.v1 SNPs in DGAT
281 Association Panel: 2 55K SNPs in DGAT (plus the 3 bp indel)
M1
M3 M5 M4
M2: Phe Insertion
Cook, et al. (2012) Plant Phys.
DGAT 1-2 (Chr6: 105,013,351-105,020,258)
Rare allele?
= B73 Genotype = Non-B73 Genotype
Cook, et al. (2012) Plant Phys.
M2 (indel)
M3 M5 M4 M1
Summary – NAM Kernel Composition
Genetic Architecture of Kernel Composition Traits
Governed by many QTL (N = 21-26) with small to moderate effects
GWAS results confirm many QTL
DGAT is our favorite gene, but we still don’t have the complete story!
NAM In General
We can identify the common alleles in maize, but still have problems with rare alleles.
Cook, et al. (2012) Plant Phys.
Ames Plant Introduction Station Inbreds
2,815 inbred lines from the Ames PI Station
Genotyping-by-sequencing (GBS) - 681,257 SNPs
Romay, et al. (2013) Genome Biology
Ames Plant Introduction Station Inbreds
More than half of the SNPs in collection are rare!
Romay, et al. (2013) Genome Biology
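What "rare" means in practice can be sketched by computing the minor allele frequency (MAF) of each SNP and counting how many fall below a cutoff. This is a minimal illustration only: it assumes inbred lines with one allele call per SNP (0 or 1, `None` for missing data), and the 0.05 cutoff is a hypothetical choice, not the definition used in the cited study.

```python
def minor_allele_frequency(calls):
    """MAF for one SNP from 0/1 allele calls; None marks missing data."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return None
    freq = sum(observed) / len(observed)
    return min(freq, 1 - freq)

def fraction_rare(snp_matrix, threshold=0.05):
    """Fraction of SNPs (with any data) whose MAF is below the threshold."""
    mafs = [minor_allele_frequency(calls) for calls in snp_matrix]
    mafs = [m for m in mafs if m is not None]
    rare = sum(1 for m in mafs if m < threshold)
    return rare / len(mafs)

# Toy data: three SNPs scored on 40 lines (hypothetical values).
toy_matrix = [
    [1] + [0] * 39,           # one line carries the minor allele: rare
    [0] * 20 + [1] * 20,      # balanced: common
    [1] * 38 + [0, None],     # one minor allele, one missing call: rare
]
print(fraction_rare(toy_matrix))
```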
Expired PVPs
77% 48%
42%
302 Association panel = 75%, NAM founders = 57%.
Teosinte - The Wild Ancestor
Development of Teo Introgression Libraries
B73 × teosinte (parviglumis)
10 accessions
F1 : B73 × teosinte
BC1 BC2 BC3 BC4 B73 F1 teosinte
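The backcross scheme above implies a simple expectation for how much donor (teosinte) genome remains in each generation: the F1 carries one half, and each backcross to B73 halves that on average. The sketch below only computes this textbook expectation; the function name is hypothetical, and real lines vary around the expectation.

```python
def expected_donor_fraction(backcrosses):
    """Expected donor-genome fraction after a given number of backcrosses.

    0 backcrosses corresponds to the F1 (one half donor genome).
    """
    return 0.5 ** (backcrosses + 1)

for generation in range(0, 5):
    label = "F1" if generation == 0 else f"BC{generation}"
    print(label, f"{100 * expected_donor_fraction(generation):.3f}% donor genome")
```

At BC4 the expectation is about 3.1% teosinte genome per line, before any effect of selection or chance.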
Library Development
10 libraries of 887 BC4S2/DH Near Isogenic Lines (NILs)
BC4S4 NILs: GBS & RAD sequencing = 33,000–600,000 SNPs
Lines to be released in 2013
Z031E0035
Z035E0012
Illumina GoldenGate 768 SNP assay
Library Coverage
10 maize-teosinte libraries
804 BC4S2 NILs and 83 BC4DH NILs
Each line:
2.3 chromosomal segments
4.1% of the teosinte genome
3.3X genome coverage
[Figure: library coverage along chromosomes c1 to c10; key 0 to 9]
BC4S2 vs BC4DH from PI384071 donor
[Figure: coverage of S2 and DH lines along chromosomes c1 to c10; key 0 to 9]
Applications of Teosinte NILs
Application 1. Empirical Genetics
1000 Selected Genes
What do these selected genes do?
What traits were targeted by artificial selection during domestication/breeding?
Are selected genes important?
Auxin response factor, ARF1
[Figure: nucleotide diversity (π, 0 to 0.02) along the gene (1 to 3000 bp) for inbreds and teosinte]
Tillering/branching?
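Nucleotide diversity (π), the quantity compared between inbreds and teosinte above, is the average number of pairwise differences per site among aligned sequences. The following is a minimal sketch under simplifying assumptions (equal-length sequences, no gaps or missing data); the toy sequences are hypothetical and exaggerate the contrast for readability.

```python
from itertools import combinations

def nucleotide_diversity(sequences):
    """Average pairwise differences per site among aligned sequences."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(sequences, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)

# Toy alignments (hypothetical): the teosinte sample carries more variants.
toy_teosinte = ["ACGTACGTAC", "ACGTTCGTAC", "ACGAACGTAC", "ACGTACGTCC"]
toy_inbreds = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAC", "ACGTACGTCC"]
pi_teosinte = nucleotide_diversity(toy_teosinte)
pi_inbreds = nucleotide_diversity(toy_inbreds)
print(pi_teosinte, pi_inbreds)
```

A gene targeted by selection would be expected to show a much lower π in inbreds than in teosinte, which is the pattern the slide illustrates for ARF1.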
Application 2. QTL Mapping
Phenotyped at up to 18 reps: maturity, plant & ear height, kernel row number, kernel weight, kernel shape, leaf length-width, grain composition (protein, starch, oil)
Zhengbin Liu
Avi Karn
[Figure: Days to Anthesis, Teosinte NILs vs. NAM RILs]
Kernel Row Number: additive effect (rows) by population, one column per table on the slide
Pop    Table 1  Table 2
Z029   -3.4     ---
Z030   -1.8     -1.4
Z031   -0.6     -1.4
Z032   -1.4     -1.1
Z033   -1.0     -2.1
Z034   -1.1     -2.3
Z035   -1.1     -1.7
Z036   -0.3     -0.6
Z037   0.2      -0.7
Z038   -0.9     -1.4
Seed Weight (50 kernels): additive effect (g) by population, one column per table on the slide
Pop    Table 1  Table 2  Table 3
Z029   -1.2     -0.9     3.7
Z030   -0.9     0.2      0.0
Z031   -0.6     -0.6     -1.2
Z032   -1.2     0.8      -1.2
Z033   -0.4     -1.5     -0.6
Z034   -1.3     3.3      -1.1
Z035   -2.3     2.4      0.6
Z036   -1.2     -0.2     -0.3
β γ
α family
δ
Application 3. Reintroduce Variation
Biological hypothesis: A loss of genetic variation results in a loss of phenotypic variation.
Breeding hypothesis: We can improve modern traits.
[Figure: percent of kernel weight (0 to 80) as carbohydrate, protein, and fat for Teosinte (N = 11), Landraces (N = 17), and Inbred Lines (N = 27)]
Flint-Garcia et al. (2009) TAG
seed size
Zein Profile
Embryo Endosperm
Zeins
Teosinte NIL Collaborations
Hibbard – Corn Rootworm Resistance
Buckler – Flowering
Brutnell – Shade Avoidance
Balint-Kurti – SCLB & GLS
Smith – Corn Smut
Tracy – Germination
AgReliant – Agronomic Traits
Nelson – NCLB
Hoekenga – Iron Bioavailability
Harmon – Circadian Clock
Dallo – Mal de Rio Cuarto Virus
Turlings – Terpenes & Armyworms
Baxter – Ionomics
Breeding With Zea
Synthetic Populations
Zea Synthetic
B73 & NAM parents are inbreds,
teosinte has NEVER been inbred
NAM Synthetic × Teosinte Synthetic
Zea Synthetic Inbreeding Depression
Ginnie Morrison
Selfed (S1) F ≈ 0.5
Full Sib (FS) F ≈ 0
S0 Male S0 Female
924 pairs
38% B73, 2% each NAM parent, 12% teosinte
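The selfed versus full-sib comparison rests on the expected inbreeding coefficient F: selfing a non-inbred plant gives progeny with F = 0.5, while mating unrelated non-inbred plants gives F of about 0. A minimal sketch of both the expectation and a simple inbreeding depression estimate follows; the function names, trait values, and units are hypothetical.

```python
def inbreeding_after_selfing(generations, f0=0.0):
    """Expected inbreeding coefficient after repeated selfing, starting at f0."""
    f = f0
    for _ in range(generations):
        f = 0.5 * (1 + f)  # each selfing generation halves the remaining heterozygosity
    return f

def inbreeding_depression(full_sib_values, selfed_values):
    """Difference between the full-sib (F ~ 0) and selfed (F ~ 0.5) family means."""
    mean_fs = sum(full_sib_values) / len(full_sib_values)
    mean_s1 = sum(selfed_values) / len(selfed_values)
    return mean_fs - mean_s1

print(inbreeding_after_selfing(1))   # S1 progeny of a non-inbred S0 plant
# Toy trait values (hypothetical units) for one S0 pair:
toy_depression = inbreeding_depression([200.0, 210.0], [150.0, 160.0])
print(toy_depression)
```

Because the selfed and full-sib families differ by about 0.5 in F, the difference in means estimates the depression expected for half a unit of inbreeding.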
Zea Synthetic Doubled Haploids
38% B73, 2% each NAM parent, 12% teosinte
Anna Selby
Goal: 2000 DH
First 800+ in 2013
Another 700 in Puerto Rico
AgReliant producing more
2014 trial: MO, NC, NY, IA
Doubled Haploids (DH)
Per se Hybrids
Genotype by GBS
Identity by Descent (IBD)
Association Analysis & Genomic Selection
“Crazy?” Idea
2014
Agronomics Fertilizer Density
Mechanization
?
Select only on yield
ideotype
7000 BC
Teosinte Synthetic 75% B73 (SS), 25% teosinte
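The Zea Synthetic percentages quoted earlier (38% B73, 2% each NAM parent, 12% teosinte) are consistent with a simple expectation for a cross between the two synthetics. The sketch below assumes that the NAM Synthetic draws equally on 25 non-B73 NAM parents and that each synthetic contributes half of the progeny genome; both assumptions are illustrative and are not stated on the slides.

```python
def cross_composition(parent_a, parent_b):
    """Expected ancestry of a cross: the average of the two parental compositions."""
    sources = set(parent_a) | set(parent_b)
    return {
        s: 0.5 * parent_a.get(s, 0.0) + 0.5 * parent_b.get(s, 0.0)
        for s in sources
    }

# Assumption: 25 non-B73 NAM parents contributing equally (4% each).
nam_synthetic = {f"NAM parent {i}": 1 / 25 for i in range(1, 26)}
# From the slide: Teosinte Synthetic is 75% B73 (SS) and 25% teosinte.
teosinte_synthetic = {"B73": 0.75, "teosinte": 0.25}

zea_synthetic = cross_composition(nam_synthetic, teosinte_synthetic)
print(f"B73: {zea_synthetic['B73']:.3f}")
print(f"each NAM parent: {zea_synthetic['NAM parent 1']:.3f}")
print(f"teosinte: {zea_synthetic['teosinte']:.3f}")
```

Under these assumptions the expectation is 37.5% B73, 2% per NAM parent, and 12.5% teosinte, close to the rounded values shown for the Zea Synthetic.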
Acknowledgements
www.panzea.org
NSF Maize Diversity Project
Syngenta Seeds AgReliant Genetics
Susan Melia-Hancock, Kate Guill, Jason Cook, Zhengbin Liu, Ginnie Morrison, Christopher Bottoms, Avi Karn, Anna Selby