MS thesis presentation_FINAL

CHARACTERIZATION OF MICROSTRUCTURAL MUTATION EVENTS IN PLASTOMES OF CHLORIDOID GRASSES (CHLORIDOIDEAE; POACEAE). Thomas J. Hajek III, M.S. Department of Biological Sciences, Northern Illinois University, 2014. Melvin R. Duvall, Director

Transcript of MS thesis presentation_FINAL

Page 1: MS thesis presentation_FINAL

CHARACTERIZATION OF MICROSTRUCTURAL MUTATION EVENTS IN PLASTOMES OF CHLORIDOID GRASSES (CHLORIDOIDEAE; POACEAE)

Thomas J. Hajek III, M.S.
Department of Biological Sciences
Northern Illinois University, 2014

Melvin R. Duvall, Director

Page 2: MS thesis presentation_FINAL

Overview

• Introduction
• Hypotheses
• Research methods
• Results
• Discussion of key findings
• Conclusions

Page 3: MS thesis presentation_FINAL

Dr. M. R. Duvall Laboratory: published results (2009 to present)

NextGen sequencing (NGS) has increased the rate of data collection:

• 2009: 1 complete plastome and a 70% complete draft, using Sanger methods
• 2010: 1 plastome, all Sanger
• 2012: 2 plastomes, all Sanger
• 2013-2015: ≈64 complete plastomes published using NGS, averaging 20/year for the past 3 years (1000% production increase)

...but there are MANY more in the pipeline

Page 4: MS thesis presentation_FINAL

WHY GRASS?

Grasses are BIG BUSINESS

Knowledge
• Knowing, with a high degree of certainty, the evolutionary relationships among these extant species.
• Complete CDS could allow genes of interest to be integrated into existing commercial crops or forage graminoids.

Cereals
• Rice, corn, and wheat supply ≥ 50% of human calorie intake.
• Grasses account for over 70% of all crops grown for human and livestock consumption.

It is important that we understand the evolutionary relationships of grasses at a molecular level in order to manage ecosystems, bio-engineer species resistant to plant pathogens, and produce high-yielding commercial crops.

Page 5: MS thesis presentation_FINAL

A brief background

• Fossil records suggest that some ancestors of the grass family (rice and bamboo) began to diversify as early as 107-129 Mya (Prasad et al., 2011).
• The family has radiated into 11K accepted species.
• It is the fifth largest plant family on Earth (Stevens, 2007).
• It includes 12 subgroups, or subfamilies, of grasses (GPWG II, 2012).
• Grasses dominate over 40% of the land area on Earth (Gibson, 2009).

Page 6: MS thesis presentation_FINAL

Why subfamily Chloridoideae?

• A well-defined plant lineage: a monophyletic subfamily with 1,420 known species of the 11K described grasses (~13%).
• Used for both human and livestock consumption.
• May have a role in the bioengineering of drought-resistant crops and in livestock grazing.
• Members share specific evolutionary adaptations (Peterson et al., 2010), notably C4 photosynthesis (as opposed to C3 and CAM), a more efficient form of photosynthetic carbon fixation that is effective in arid regions.
• Climate change could affect the ability of closely related species to thrive in changing environments (i.e., regions that currently produce commercial and grazing crops could become more arid).
• This knowledge could be used to produce GMOs, via genetic manipulation using closely related species, that are better able to adapt to a changing environment.

Page 7: MS thesis presentation_FINAL

Peterson et al. (2010)

• The Peterson study included only 6 partial gene sequences (6,789 bp) and 814 bp of ITS.

• Advances in sequencing methods have provided larger amounts of data for analysis.

• My study includes the sequence of the entire chloroplast genome (plastome): ≈140 kbp x 10 spp.

Page 8: MS thesis presentation_FINAL

Leseberg and Duvall (2009), on the complete plastome of Coix lacryma-jobi:

Plastome-scale MMEs are a potentially valuable, underutilized resource that can be used for supporting relationships.

THIS STUDY analyzed types of mutations besides substitution mutations, which may help to predict and define genomic relationships among species.

Microstructural Mutation Events (MMEs)

• Slipped-strand mispairing (SSM) insertions/deletions (indels)
• Non-tandem repeat (NTR) indels
• Inversions

Page 9: MS thesis presentation_FINAL

Hypotheses

1. Of the two types of MMEs, indels occur more frequently than inversions.

2. Tandem repeat indels, i.e. those indels occurring in regions of tandemly repeated sequences, occur with greater frequency than indels not associated with such repeats.

3. MMEs that affect fewer nucleotides (shorter indels, smaller inversions) occur with greater frequency than larger MMEs.

4. Plastome-scale MMEs are an effective source of data for the inference of high resolution, highly supported phylogenies consistent with the inference from nucleotide substitutions.


Page 10: MS thesis presentation_FINAL

Research Methods

• DNA sampling
• Sanger sequencing (E. tef)
• NextGen sequencing (NGS)
• Identification of MMEs
• Phylogenomic analyses

Page 11: MS thesis presentation_FINAL

Research methods: DNA sampling

[Figure: sampled taxa, labeled Hilaria, Zoysia, Neyraudia, Eragrostis tef, Bouteloua, Spartina, Distichlis, Sporobolus, E. minor, Centropodia]

Page 12: MS thesis presentation_FINAL

Sanger Method & E. tef: DNA extraction and amplification

Eragrostis tef seedlings were provided by Amanda Ingram of Wabash College, Crawfordsville, IN.

DNA extraction
• Leaf tissues of all four species were ground in liquid nitrogen.
• Extraction was performed using Qiagen DNeasy Plant Mini Kits (Qiagen Inc., Valencia, CA) following the manufacturer's protocol.

Amplification (PCR)
• The plastome was arbitrarily divided into 119 regions (range = 500-1,200 bp), with ~250 primer sites.
• IR primer set from Dhingra and Folta (2005); most other primers from Leseberg and Duvall (2009).
• The target region is “primed” for amplification by FideliTaq (Affymetrix) or Pfu (Stratagene Inc.) polymerases.

Page 13: MS thesis presentation_FINAL

• Electrophoresis was used to verify the size and number of amplified DNA fragments.
• Expected size of amplicons ≈ 1,200 bp.
• Ladders (ThermoFisher, Hanover Park, IL) were used in conjunction with negative controls to confirm the legitimacy and size of the DNA fragments.
• DNA fragments were cleaned and purified (Wizard kit method, Promega Corp., Madison).
• PCR products were sent to Macrogen, Inc. (Seoul, Korea) for capillary Sanger sequencing.

Problems:
• Not all primers yielded amplicons of the desired size.
• Some amplicons yielded unusable sequence.
• Not all available primers actually work (the primer sequence is not conserved in the target sequence).
• Species-specific primers were designed.

Page 14: MS thesis presentation_FINAL

Sanger Sequencing and Assembly

• Macrogen files were imported into Geneious Pro software.
• Check signal strength and distinctness of peaks in the electropherogram.
• Trim ambiguous regions of sequence with weak signals.
• Concatenate forward and reverse sequences for the specific regions that were amplified.
• Assemble contiguous sequence with ≥15 bp overlap between regions.

Also:
• Design primers for regions that failed to amplify with the standard primer set.
• Annotate the complete genome for GenBank submission.
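The overlap rule in the assembly step can be sketched in a few lines. This is only an illustration of joining two sequences on an exact suffix/prefix overlap of at least 15 bp; it is not how Geneious Pro assembles contigs, and the function name and the example reads are invented for the sketch.

```python
from typing import Optional


def merge_with_overlap(left: str, right: str, min_overlap: int = 15) -> Optional[str]:
    """Join two sequences if a suffix of `left` exactly matches a prefix of `right`.

    Returns the merged sequence, or None when no exact overlap of at least
    `min_overlap` bases exists.
    """
    longest = min(len(left), len(right))
    # Try the longest candidate overlap first so the merge is as tight as possible.
    for k in range(longest, min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return left + right[k:]
    return None


# Hypothetical reads sharing an 18 bp overlap (invented for illustration).
shared = "GACCATGCAAGGCTTAGC"
read_a = "ACGTACGTTT" + shared
read_b = shared + "GGATCCAAT"
contig = merge_with_overlap(read_a, read_b)
```

Real assemblers also tolerate mismatches and use base qualities; the exact-match rule here is a deliberate simplification.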

Page 15: MS thesis presentation_FINAL

Eragrostis tef plastome: 134,435 bp

Page 16: MS thesis presentation_FINAL

Research methods: NGS

One chloridoid plastome, from Neyraudia reynaudiana (Wysocki et al., 2014), was previously published.

Species, voucher (herbarium), source:

Bouteloua curtipendula (Michx.) Torr., S. Burke 27 (DEK), NIU
Distichlis spicata var. stricta (Torr.) Scribn., Saarela 677 (CAN)
Centropodia glauca (Nees) T. A. Cope, Linder 5410 (BOL), University of Cape Town, South Africa, Western Cape Province
Eragrostis minor Host, L. Clark 1333 (ISC), Iowa State University
Spartina pectinata Bosc ex Link, P. Peterson 20865 (CAN), Canadian Museum of Nature, Ontario
Sporobolus heterolepis (Gray) A. Gray, M. Duvall s. n. (DEK), NIU
Hilaria cenchroides Kunth, J. T. Columbus 5049 (RSA), Rancho Santa Ana Botanic Garden, CA
Zoysia macrantha Desv., J. T. Columbus 5049 (RSA), Rancho Santa Ana Botanic Garden, CA

Page 17: MS thesis presentation_FINAL

NextGen Sequencing Methods & Materials

Library preparation & NGS sequencing

D. spicata and H. cenchroides:
• Diluted to 2 ng/μl.
• DNA sonication using the Bioruptor sonicator at the University of Missouri.
• Libraries prepared using the TruSeq (Illumina) kit.

B. curtipendula, S. pectinata, S. heterolepis, E. minor, C. glauca, and Z. macrantha:
• Diluted to 2.5 ng/μl.
• Tagmentation vs. sonication.
• Libraries prepared and purified using the Nextera (Illumina) library preparation kit and the DNA Clean and Concentrator kit.

Both library types were submitted to the DNA core facility (Iowa State University, Ames, IA) for bioanalysis and HiSeq 2000 next-generation sequence determination.

Page 18: MS thesis presentation_FINAL

NGS Quality Control
• Illumina reads (1-32 Mbp @ 100 bp each)
• DynamicTrim = (FASTQ) quality score filter
• LengthSort = retain reads ≥ 25 bp

Plastome Assembly
• Velvet (de novo) assembly
• Contig assembly via anchored conserved region extension, ACRE (Wysocki, 2014)
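The two quality-control steps can be pictured with a small sketch. It is a simplified stand-in, not the DynamicTrim or LengthSort programs themselves: the trimming rule (keep the longest run of bases at or above a Phred cutoff) and the cutoff of 20 are assumptions made for illustration; only the 25 bp minimum length comes from the slide.

```python
def dynamic_trim(seq, quals, min_q=20):
    """Return the longest contiguous segment whose Phred scores are all >= min_q.

    min_q=20 is an assumed cutoff, not a value given in the presentation.
    """
    best_start, best_len = 0, 0
    run_start = None
    # A trailing sentinel below any cutoff closes a run that reaches the read end.
    for i, q in enumerate(list(quals) + [float("-inf")]):
        if q >= min_q:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start > best_len:
                best_start, best_len = run_start, i - run_start
            run_start = None
    return seq[best_start:best_start + best_len]


def length_sort(reads, min_len=25):
    """Keep only reads with at least `min_len` bases (25 bp on the slide)."""
    return [r for r in reads if len(r) >= min_len]


# Invented read: 5 low-quality bases, 30 good bases, 3 low-quality bases.
seq = "A" * 5 + "C" * 30 + "G" * 3
quals = [10] * 5 + [35] * 30 + [5] * 3
trimmed = dynamic_trim(seq, quals)
kept = length_sort([trimmed, "ACGT" * 6])  # the second read is only 24 bp
```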

Page 19: MS thesis presentation_FINAL

Gaps in the plastomes that were not resolved using ACRE were closed by extracting sequences from the flanking contigs and matching them to the NGS reads to find the overlapping sequence and complete the plastid genome.

[Figure labels: gap between 104-108; gap between 112-117]

NGS assembly verified against Sanger contigs for N. reynaudiana: Sanger reads aligned to the NGS assembly confirmed sequence identity between both methods.

Page 20: MS thesis presentation_FINAL

Examples of identifying MMEs

• Inversions: ≥ 2 bp with stem ≥ 3 bp
• Indels: ≥ 3 bp
• SSM: with unambiguous tandem repeats
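The distinction between SSM and NTR indels can be illustrated with a minimal classifier. The rule used here (an inserted segment counts as SSM when it exactly repeats the sequence immediately upstream or downstream) is a simplifying assumption for the sketch, not the scoring procedure documented in the thesis; the 3 bp minimum is taken from the slide, and the example sequences are invented.

```python
def classify_indel(upstream, insert, downstream, min_len=3):
    """Classify an inserted segment as 'SSM', 'NTR', or None if too short to score.

    Assumed rule: SSM when the insert is an exact tandem copy of the adjacent
    sequence on either side; otherwise NTR (non-tandem repeat).
    """
    if len(insert) < min_len:
        return None  # below the 3 bp scoring threshold
    n = len(insert)
    if upstream[-n:] == insert or downstream[:n] == insert:
        return "SSM"
    return "NTR"


# Invented examples.
tandem = classify_indel("GGCATTCA", "TTCA", "GGAC")      # repeats the upstream TTCA
non_tandem = classify_indel("GGCATTCA", "ACGT", "GGAC")  # no adjacent copy
too_short = classify_indel("GGCATTCA", "CA", "GGAC")     # only 2 bp
```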

Page 21: MS thesis presentation_FINAL

Scored events with binary matrix

SSM indels

pos type D B H S Sp Z E e N C #BP
7147 SSM 0 0 0 1 1 1 0 0 0 0 3
14466 SSM 0 0 0 0 0 0 0 0 1 0 3
14549 SSM 0 0 0 0 0 0 0 1 0 0 3
33041 SSM 0 0 1 0 0 0 0 0 0 0 3
36425 SSM 1 ? ? ? 1 1 1 1 1 0 3
45802 SSM 0 1 0 0 0 0 0 0 0 0 3
46936 SSM 0 1 0 0 0 0 0 0 0 0 3
59287 SSM 0 0 0 0 0 0 1 0 0 0 3

NTR indels

pos type D B H S Sp Z E e N C #BP
9364 NTR 0 0 0 1 1 ? 0 ? 1 0 3
16559 NTR 1 1 1 1 1 1 1 1 1 0 3
19603 NTR 0 1 0 0 0 0 0 0 0 0 3
22008 NTR 1 0 0 0 0 0 0 0 0 0 3
27774 NTR 1 1 1 1 1 1 1 1 1 0 3
62266 NTR 0 0 0 1 1 0 0 0 0 0 3
68674 NTR 0 0 0 0 0 0 1 1 0 0 3
72573 NTR 0 0 1 0 0 0 0 0 0 0 3

Inversions

POS | OG SEQ | D B H S Sp Z E e N C | #BP | CDS
22 | CC | 0 0 0 0 0 0 0 1 1 0 | 2 |
2390 | TC | 1 1 1 1 0 1 0 0 0 0 | 2 | matK1
52294 | GA | 0 0 0 1 1 1 0 0 0 0 | 2 |
109211 | CA | 0 1 0 0 0 0 1 0 0 0 | 2 |
110074 | AA | 0 1 0 1 1 1 0 0 0 0 | 2 | ndhF
112304 | GA | 1 0 0 0 0 0 0 0 0 0 | 2 |
2667 | TTG (TTC) | 1 1 1 1 0 0 1 1 0 0 | 3 | matK2
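A binary matrix like this one can be screened for parsimony-informative characters. The sketch below uses the usual definition (at least two states, each present in at least two taxa, with "?" ignored) on a few rows copied from the slide's matrices; the function name is invented, and this is not the computation PAUP* performs internally.

```python
def is_parsimony_informative(states):
    """True if at least two character states each occur in at least two taxa.

    Missing data ("?") is ignored.
    """
    counts = {}
    for s in states:
        if s != "?":
            counts[s] = counts.get(s, 0) + 1
    return sum(1 for n in counts.values() if n >= 2) >= 2


# Rows copied from the slide (taxon columns only, in slide order).
rows = {
    7147: "0001110000",   # SSM, shared by three taxa
    14466: "0000000010",  # SSM, present in a single taxon
    36425: "1???111110",  # SSM, absent only in the last column
    16559: "1111111110",  # NTR, absent only in the last column
    62266: "0001100000",  # NTR, shared by two taxa
}
informative = [pos for pos, states in rows.items() if is_parsimony_informative(states)]
```

Characters found in only one taxon, or missing from only one, score as uninformative under parsimony even though they are real events.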

Page 22: MS thesis presentation_FINAL

Phylogenomic Analysis

Phylogenomic analyses were performed using a series of five datasets for ML, MP and BI:

[1] complete plastome sequences
[2] the binary matrix of characterized MMEs
[1-2] plastome sequence + binary matrix
[3] a matrix of CDS: 78 protein CDS, four rRNA sequences, and 32 tRNA sequences
[4] all non-coding sequences: introns and intergenic regions

Page 23: MS thesis presentation_FINAL

Phylogenomic Analyses

• Ten species aligned using the Geneious Pro MAFFT plugin.
• Gaps removed (to eliminate ambiguities).
• One inverted repeat (IRa) removed (to prevent overrepresentation of sequence).
• MMEs added 605 characters to the sequence matrix: 581 indels + 24 inversions.
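Gap removal can be pictured as dropping every alignment column in which any taxon has a gap. The sketch below assumes that interpretation and uses an invented three-sequence alignment; the presentation does not say exactly how gapped positions were excluded.

```python
def strip_gapped_columns(alignment):
    """Remove every column that contains a gap ('-') in at least one sequence.

    `alignment` maps taxon name -> aligned sequence; all sequences must have
    the same length.
    """
    columns = list(zip(*alignment.values()))
    keep = [i for i, col in enumerate(columns) if "-" not in col]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


# Toy alignment (invented): columns 3 and 5 each contain a gap.
toy = {"taxon_a": "AC-GT", "taxon_b": "ACTG-", "taxon_c": "ACTGT"}
ungapped = strip_gapped_columns(toy)
```

With the gaps stripped from the nucleotide matrix, the indel information is carried only by the separately scored binary characters.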

Page 24: MS thesis presentation_FINAL

Phylogenomic Analyses

Five maximum-likelihood (ML) analyses
• jModelTest 2
• RAxML-HPC2 on XSEDE (on CIPRES)
• GTRCAT for plastome sequences
• BINCAT for the MME binary matrix
• 1,000 BS iterations
• MLBVs via the Consense tool (Phylip software package on CIPRES)
• Phylogenomic trees were visualized and edited using FigTree v1.4.0

Centropodia glauca was specified as the outgroup (OG) for all phylogenomic (ML, MP and BI) analyses.

Page 25: MS thesis presentation_FINAL

Phylogenomic Analyses

Five branch-and-bound maximum parsimony (MP) analyses
• PAUP* v4.0b10
• MP branch-and-bound bootstrap analyses were performed using 1,000 replicates in each case.

Five Bayesian inference (BI) analyses
• MrBayes 3.2.2 on XSEDE (on CIPRES)
• Two Markov chain Monte Carlo (MCMC) analyses of 20,000,000 generations each
• The model for among-site rate variation was set to invariant gamma.
• The burn-in fraction for discarding sampled values was set at 0.25 to generate 50% majority-rule consensus trees.
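How a burn-in fraction and a 50% majority rule interact can be shown on toy data. In this sketch each sampled tree is reduced to a set of clades (each clade a frozenset of taxon labels); the first 25% of samples are discarded and only clades found in more than half of the remaining samples are kept. The data structure, function name, and samples are invented for illustration and do not reproduce MrBayes output.

```python
from collections import Counter


def majority_rule_clades(sampled_trees, burnin_frac=0.25, threshold=0.5):
    """Return {clade: frequency} for clades in more than `threshold` of post-burn-in trees."""
    n_burn = int(len(sampled_trees) * burnin_frac)
    kept = sampled_trees[n_burn:]
    counts = Counter(clade for tree in kept for clade in tree)
    return {c: n / len(kept) for c, n in counts.items() if n / len(kept) > threshold}


# Toy chain of 8 samples over placeholder taxa A, B, C (not results from the study).
ab = frozenset({"A", "B"})
bc = frozenset({"B", "C"})
samples = [{ab}, {ab}, {bc}, {bc}, {ab}, {bc}, {bc}, {ab}]
consensus = majority_rule_clades(samples)
```

Here the first two samples are discarded as burn-in, and the B+C clade survives with a frequency of 4/6, while A+B (2/6) is collapsed.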

Page 26: MS thesis presentation_FINAL

RESULTS


Page 27: MS thesis presentation_FINAL

Plastome Assembly, Annotation, and Alignment

• 1,216,882 bases of new plastid sequence were added to the GenBank database.
• The plastomes share a general organization, with highly conserved gene content and gene order consistent with the grass plastome.

Page 28: MS thesis presentation_FINAL

Plastome characterization

Region lengths and totals in bp.

Species LSC IRb IRa SSC Total % AT
B. curtipendula 79309 20975 20975 12606 133865 61.8
E. tef 79802 21026 21026 12581 134435 61.6
C. glauca 80074 21012 21012 12467 134565 61.5
H. cenchroides 80238 21082 21082 12419 134821 61.7
E. minor 80316 21065 21065 12577 135023 61.8
S. heterolepis 80614 21028 21028 12692 135097 61.6
N. reynaudiana 81213 20570 20570 12744 135362 61.7
S. pectinata 80922 20985 20985 12720 135612 62.6
Z. macrantha 81351 20961 20961 12572 135845 61.6
D. spicata 82488 21226 21226 12679 137619 61.7
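A quick consistency check on this table is to confirm that each total equals LSC + IRb + IRa + SSC. Run on the values as transcribed, the sketch below flags two rows, S. heterolepis and N. reynaudiana, whose regions sum to each other's listed totals; that may reflect a transposition on the slide or in this transcript, and the table is left as given. The nine totals other than the previously published N. reynaudiana add up to the 1,216,882 bases of new sequence reported earlier.

```python
# (LSC, IRb, IRa, SSC, Total) in bp, as transcribed from the table.
ROWS = {
    "B. curtipendula": (79309, 20975, 20975, 12606, 133865),
    "E. tef": (79802, 21026, 21026, 12581, 134435),
    "C. glauca": (80074, 21012, 21012, 12467, 134565),
    "H. cenchroides": (80238, 21082, 21082, 12419, 134821),
    "E. minor": (80316, 21065, 21065, 12577, 135023),
    "S. heterolepis": (80614, 21028, 21028, 12692, 135097),
    "N. reynaudiana": (81213, 20570, 20570, 12744, 135362),
    "S. pectinata": (80922, 20985, 20985, 12720, 135612),
    "Z. macrantha": (81351, 20961, 20961, 12572, 135845),
    "D. spicata": (82488, 21226, 21226, 12679, 137619),
}


def mismatched_rows(rows):
    """Names of rows whose listed total differs from the sum of the four regions."""
    return [name for name, (lsc, irb, ira, ssc, total) in rows.items()
            if lsc + irb + ira + ssc != total]


flagged = mismatched_rows(ROWS)
# Sum of listed totals, excluding the previously published N. reynaudiana plastome.
new_sequence = sum(r[4] for name, r in ROWS.items() if name != "N. reynaudiana")
```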

Page 29: MS thesis presentation_FINAL

Microstructural mutation scoring and analysis

Number of bases in slipped-strand mispairing event

Size (bp) 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 27 28 29 31 32 39 40 120 Σ
D. spicata 5 6 22 5 5 2 4 2 1 1 0 0 1 0 0 1 0 1 1 2 0 1 1 1 1 0 1 0 0 0 0 64
B. curtipendula 6 10 30 11 11 6 4 5 2 1 1 0 2 0 1 0 0 1 1 2 0 0 0 0 0 0 1 0 0 0 0 95
H. cenchroides 4 7 39 13 5 4 4 2 1 1 1 1 1 1 0 2 1 0 1 3 0 1 0 0 0 0 0 0 0 1 0 93
S. heterolepis 5 11 33 3 5 3 3 1 1 1 0 2 1 0 0 0 0 0 1 2 1 0 1 0 0 0 0 0 0 0 0 74
S. pectinata 6 11 31 3 4 2 4 0 2 1 0 2 1 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 69
Z. macrantha 7 10 32 2 2 2 4 0 1 1 0 2 1 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 66
E. tef 4 12 27 6 3 0 5 0 1 1 0 1 1 0 1 0 0 1 0 2 0 0 0 0 0 1 0 0 0 0 1 67
E. minor 4 10 24 7 3 0 4 1 1 2 0 1 1 0 0 0 1 2 1 2 0 0 0 0 0 1 0 0 1 0 0 66
N. reynaudiana 4 8 26 5 3 0 3 1 1 1 0 0 2 0 1 0 0 0 0 2 0 0 0 0 0 0 0 1 0 0 0 58

Page 30: MS thesis presentation_FINAL

Microstructural mutation scoring and analysis

Number of bases in indel (NTR)

Size (bp) 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 28 29 30 31 34 35 36 37 39 44 45 46 48 52 55 59 63 67 75 78 84 86 88 94 117 119 121 145 159 182 391 433 Σ
D. spicata 7 9 18 13 3 3 9 6 1 0 3 1 0 2 1 3 2 1 1 0 1 1 0 2 0 0 0 1 1 0 0 0 1 1 2 1 2 0 1 0 0 2 0 1 1 1 0 0 1 1 1 1 1 1 0 1 109
B. curtipendula 5 12 16 19 6 1 8 5 2 0 3 2 0 1 1 1 3 1 1 1 0 1 0 1 0 0 1 1 0 0 0 0 1 1 2 0 1 0 0 1 1 1 1 0 1 0 1 0 0 1 0 0 0 0 0 1 105
H. cenchroides 6 11 23 15 4 2 8 9 2 1 4 1 1 1 1 2 2 2 1 1 0 0 0 1 0 0 1 1 0 1 0 0 1 1 1 0 2 0 0 0 0 1 0 0 1 0 0 0 0 1 0 0 0 0 0 1 110
S. heterolepis 7 11 22 14 3 1 5 6 0 0 6 1 0 1 0 1 2 1 0 1 1 0 0 1 0 0 0 1 0 0 0 0 1 1 2 1 1 0 0 1 0 1 0 0 1 0 0 0 0 1 0 0 0 0 1 1 97
S. pectinata 6 11 22 15 5 2 5 5 1 0 6 1 0 0 0 1 2 1 0 1 1 0 0 2 0 1 0 1 0 0 1 0 1 1 2 1 1 0 0 1 0 1 0 0 1 0 0 0 0 1 0 0 0 0 0 1 101
Z. macrantha 4 10 15 12 3 2 5 5 0 0 5 1 0 0 0 1 2 1 0 0 0 0 0 1 0 1 0 1 0 0 0 0 1 1 2 1 1 0 0 1 1 1 0 0 1 0 0 0 0 1 0 0 0 0 0 1 81
E. tef 5 16 23 10 4 4 8 3 2 0 3 2 0 2 0 1 2 1 0 0 0 0 1 0 1 0 0 1 0 0 0 1 2 1 2 0 0 1 0 0 0 0 0 0 1 1 0 0 0 1 0 0 0 0 0 1 100
E. minor 5 15 23 10 4 4 8 4 2 0 3 2 0 2 0 1 2 1 0 0 1 0 1 0 1 0 0 1 0 0 0 1 2 1 2 0 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0 0 0 1 101
N. reynaudiana 5 9 15 6 2 3 7 4 0 1 2 2 0 1 0 3 2 2 0 1 0 0 0 0 0 0 0 1 0 0 0 0 1 1 1 0 1 0 0 0 1 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 74

Page 31: MS thesis presentation_FINAL

Microstructural mutation scoring and analysis

Page 32: MS thesis presentation_FINAL

Inversion scoring and analysis

Inversion size frequency

Size (bp) 2 3 4 5 6 7 9 Σ
D. spicata 2 6 0 2 0 1 1 12
B. curtipendula 3 6 1 2 1 1 2 16
H. cenchroides 1 7 1 2 1 1 1 14
S. heterolepis 3 5 0 2 1 1 1 13
S. pectinata 2 4 0 2 1 1 1 11
Z. macrantha 3 2 0 2 1 1 0 9
E. tef 1 4 0 2 0 1 1 9
E. minor 1 4 0 2 0 1 1 9
N. reynaudiana 1 2 0 1 0 1 1 6

24 inversions identified

Page 33: MS thesis presentation_FINAL

Indels in CDS

• A total of 581 indels were identified (plastome alignment).
• 28 occur in CDS: rpoB, rps14, rps18, clpP, rpoC1, rpoC2, matK, ycf68, ndhF and ccsA.
• Range: 1-78 bp.
• CDS indels = 4.8% of the total.

Size (bp) 1 3 5 6 9 15 21 30 63 78 Σ
D. spicata 0 3 0 1 2 0 1 0 ? 1 8
B. curtipendula 0 1 0 2 1 1 2 0 ? 0 7
H. cenchroides 0 1 0 1 1 0 0 1 ? 0 4
S. heterolepis 0 1 0 0 1 0 0 0 0 0 2
S. pectinata 0 2 0 0 1 0 0 0 0 0 3
Z. macrantha 0 1 0 1 1 0 1 0 1 0 5
E. tef 3 2 1 2 2 0 0 0 0 0 10
E. minor 0 1 1 1 2 0 1 0 0 0 6
N. reynaudiana 0 2 0 2 0 0 1 0 ? 0 5

Page 34: MS thesis presentation_FINAL

CDS-specific inversions (4/24)

Inv2 matK

Taxa | position | nucleotide sequence | AA sequence | Δ AA properties
D. spicata | 2617 - 2640 | ATTTTCTTTTGAAAAAAGAAAAAT | NEKSFLFI | P,A
B. curtipendula | 2570 - 2593 | ATTTTCTTTTGAAAATAGAAAAAT | NEKSFLFI | P,A
H. cenchroides | 2605 - 2628 | ATTTTCTTTTGAAAAAAGAAAAAT | NEKSFLFI | P,A
S. heterolepis | 2589 - 2612 | ATTTTCTTTTGAAAAAAGAAAAAT | NEKSFLFI | P,A
S. pectinata | 2597 - 2620 | ATTTTCTTTTTTCAAAAGAAAAAT | NEKKLLFI | (+), NP
Z. macrantha | 2596 - 2619 | ATTTTCTTTTTTCAAAAGAAAAAT | NEKKLLFI | (+), NP
E. tef | 2585 - 2608 | ATTTTCTTTTGAAAAAAGAAAAAT | NEKSFLFI | P,A
E. minor | 2580 - 2603 | ATTTTCTTTTGAAAAAAGAAAAAT | NEKSFLFI | P,A
N. reynaudiana | 2559 - 2582 | ATTTTCTTTTTTCAAAAGAAAAAT | NEKKLLFI | (+), NP
C. glauca | 2604 - 2627 | ATTTTCTTTTTTGAAAAGAAAAAT | NEKKFLFI | (+), A

Inv1 matK

Taxa | position | nucleotide sequence | AA sequence | Δ AA properties
D. spicata | 2342 - 2357 | TTTCTTTTGAAAAAGAAG | KKQFLL | P,A
B. curtipendula | 2295 - 2310 | TTTCTTTTGAAAAAGAAG | KKQFLL | P,A
H. cenchroides | 2330 - 2345 | TTTCTTTTGAAAAAGAGG | KKQFLP | P,A
S. heterolepis | 2314 - 2329 | TTTCTTTTGAAAAAGAAG | KKQFLL | P,A
S. pectinata | 2322 - 2337 | TTTCTTTTTCAAAAGAAG | KKKLLL | (+), NP
Z. macrantha | 2321 - 2336 | TTTCTTTTGAAAAAGAAG | KKQFLL | P,A
E. tef | 2310 - 2325 | TTTCTTCTTCAAAAGAAG | KKKLLL | (+), NP
E. minor | 2305 - 2320 | TTTCTTCTTCAAAAGAAG | KKKLLL | (+), NP
N. reynaudiana | 2284 - 2299 | TTTCTTCTTCAAAAGAAG | KKKLLL | (+), NP
C. glauca | 2329 - 2344 | TTTCTTCTTCAAAAGAGG | KKKLLP | (+), NP
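The effect of an inversion on the encoded protein can be checked by translating the sequences in the Inv2 matK table. The amino acid strings listed there are reproduced when the reverse complement of the displayed nucleotides is translated and the peptide is then written right to left; that reading is inferred from the slide values and is not a stated step of the author's workflow. The codon table below is the standard genetic code.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order for each codon position.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def displayed_peptide(seq):
    """Translate the reverse complement, then reverse the peptide so that each
    residue lines up with the nucleotides as printed on the slide."""
    rc = reverse_complement(seq)
    usable = len(rc) - len(rc) % 3
    peptide = "".join(CODON_TABLE[rc[i:i + 3]] for i in range(0, usable, 3))
    return peptide[::-1]


# Inv2 matK: D. spicata versus S. pectinata, sequences copied from the table.
uninverted = displayed_peptide("ATTTTCTTTTGAAAAAAGAAAAAT")
inverted = displayed_peptide("ATTTTCTTTTTTCAAAAGAAAAAT")
```

The two peptides differ at two adjacent residues (SF versus KL), matching the property changes noted in the table's last column.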

Page 35: MS thesis presentation_FINAL

CDS-specific inversions

ndhF

Taxa | position | nucleotide sequence | AA sequence | Δ AA properties
D. spicata | 103962 - 103979 | ATCCAAAAAGAACTTTTGGGG | DLFFKQP | A
B. curtipendula | 100534 - 100551 | ATCAAAAAAGTTCTTTTTTGA | DFFNKKS | P
H. cenchroides | 101573 - 101590 | ATCCAAAAATAACTTTTTTTG | DLFLKKQ | A
S. heterolepis | 102038 - 102055 | ATGCAAAAAGTTCTTTTGGGG | HLFNKQP | P
S. pectinata | 102162 - 102179 | ATGCAAAAAGTTCTTTTTGGA | HLFNKKS | P
Z. macrantha | 102588 - 102605 | ATGCAAAAAGTTCTTTTGGGG | HLFNKQP | P
E. tef | 101078 - 101095 | ATCCAAAAAGAACTTTTTGGG | DLFFKKP | A
E. minor | 101632 - 101649 | ATCCAAAAAGAACTTTTTGGG | DLFFKKP | A
N. reynaudiana | 101895 - 101912 | ATCCAAAAAGAACTTTTTTGG | DLFFKKP | A
C. glauca | 101331 - 101348 | ATCCAAAAAGAACTTTTTTGG | DLFFKKP | A

ccsA

Taxa | position | nucleotide sequence | AA sequence | Δ AA properties
D. spicata | 108168 - 108182 | TTTCGAAATTCTTTCGAT | FRNSFD | P,P
B. curtipendula | 104715 - 104729 | TTTCGAAAGAATTTCGAT | FRKNFD | (+), P
H. cenchroides | 105580 - 105594 | TTTCGAAAGAATTTTGAT | FRKNFD | (+), P
S. heterolepis | 106265 - 106279 | TTTCGAAAGAATTTCTAT | FRKNFY | (+), P
S. pectinata | 106402 - 106416 | TTTCGAAAGAATTTCTAT | FRKNFY | (+), P
Z. macrantha | 106690 - 106704 | TTTCGAAAGAATTTCTAT | FRKNFY | (+), P
E. tef | 105125 - 105139 | TTTCGAAAGAATTTAGAT | FRKNLD | (+), P
E. minor | 105687 - 105701 | TTTCGAAAGAATTTAGAT | FRKNLD | (+), P
N. reynaudiana | 106098 - 106112 | TTTCGAAAGAATTTCGAT | FRKNFD | (+), P
C. glauca | 105314 - 105328 | TTTCGAAAAAATTTCGAT | FRKNFD | (+), P

Page 36: MS thesis presentation_FINAL

Phylogenomic Analysis

Dataset [1]
• ML, MP and BI have identical topology (SPS | MPC).
• All BV = 100 for ML and MP except where indicated with (*), where MPBV = 58.

[Figure: dataset [1] tree of the ten sampled taxa, with paired branch labels]

Page 37: MS thesis presentation_FINAL

Phylogenomic Analysis

Dataset [2]
• ML and MP have identical topology.
• BI was not able to resolve B. curtipendula, H. cenchroides and D. spicata (polytomy).
• MLBV = 100 on all internal nodes except where indicated with (**), where MLBV = 92.
• MPBV = 100 on all internal nodes except (*) MPBV = 75, (**) MPBV = 99, (***) MPBV = 63.

[Figure: dataset [2] tree of the ten sampled taxa, with paired branch labels]

Page 38: MS thesis presentation_FINAL

Phylogenomic Analysis

ML dataset [1-2]: BV = 100 on all internal nodes except (*) MLBV = 85.

[Figure: ML tree of the ten sampled taxa for dataset [1-2]]

MP dataset [1-2]: BV = 100 for all internal nodes except (*) MPBV = 56.

[Figure: MP tree of the ten sampled taxa for dataset [1-2]; scale bar 500 changes]

Page 39: MS thesis presentation_FINAL

Phylogenomic Analysis

Dataset [3]
• ML, MP and BI have identical topology.
• All BV = 100 except (*) MLBV = 59 and (*) MPBV = 79.

[Figure: dataset [3] tree of the ten sampled taxa, with paired branch labels]

Page 40: MS thesis presentation_FINAL

Phylogenomic Analysis

Dataset [4]
• ML, MP and BI have identical topology.
• All BV = 100 except (*) MPBV = 85.

[Figure: dataset [4] tree of the ten sampled taxa, with paired branch labels]

Page 41: MS thesis presentation_FINAL

DISCUSSION & Key Findings


Page 42: MS thesis presentation_FINAL

Indel analysis

Hypothesis: indels occur more frequently than inversions.
• 581 indels
• 24 inversions
• CONFIRMS the hypothesis

Hypothesis: tandem repeat indels, i.e. those indels occurring in regions of tandemly repeated sequences, occur with greater frequency than indels not associated with such repeats.
• NTR indels = 308 occurrences
• SSM indels = 275 occurrences
• REFUTES the hypothesis
• Orton (2015) had a contrary result:
    • Taxa in this study belong to a more ancient lineage than the congeneric species in Orton’s (2015) study.
    • Orton’s species have had less time to accumulate subsequent mutations that obscure tandem repeat patterns.

Page 43: MS thesis presentation_FINAL

Indel analysis

Hypothesis: MMEs that affect fewer nucleotides (shorter indels, smaller inversions) occur with greater frequency than larger MMEs.
• Smaller MMEs require a lower input of energy and so would occur with frequencies inversely proportional to their size (Wu et al. 1991).
• 5 bp indels show a 1.8- to 3.4-fold increase in frequency over 4 bp indels.
• Orton (2015) had a similar result: 5 bp indels showed a ≈1.6-fold increase over 4 bp indels.
• REFUTES the hypothesis.
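The 1.8- to 3.4-fold range can be recovered from the 4 bp and 5 bp columns of the SSM and NTR size tables (pages 29 and 30). The sketch below pools the two indel types per species before taking the ratio; pooling is an assumption that happens to reproduce the quoted range, not a procedure stated on the slides.

```python
# Per species: (SSM 4 bp, SSM 5 bp, NTR 4 bp, NTR 5 bp), copied from the size tables.
COUNTS = {
    "D. spicata": (6, 22, 9, 18),
    "B. curtipendula": (10, 30, 12, 16),
    "H. cenchroides": (7, 39, 11, 23),
    "S. heterolepis": (11, 33, 11, 22),
    "S. pectinata": (11, 31, 11, 22),
    "Z. macrantha": (10, 32, 10, 15),
    "E. tef": (12, 27, 16, 23),
    "E. minor": (10, 24, 15, 23),
    "N. reynaudiana": (8, 26, 9, 15),
}

# Fold increase of 5 bp over 4 bp indels, SSM and NTR pooled.
fold = {
    species: (ssm5 + ntr5) / (ssm4 + ntr4)
    for species, (ssm4, ssm5, ntr4, ntr5) in COUNTS.items()
}
lowest = min(fold, key=fold.get)
highest = max(fold, key=fold.get)
```

Under this pooling the smallest ratio comes from E. tef and the largest from H. cenchroides.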

Page 44: MS thesis presentation_FINAL

Small inversions

Kim and Lee (2005) postulate that small inversions are more common than large inversions.
• 3 bp occurrences = 10
• 2 bp occurrences = 6
• Refutes this hypothesis

Result of:
• steric limitations of loop-forming regions
• errors of inversion size interpretation: the loop was absorbed by the stem regions

TACCCAATATCCTGTTGGAACAAGATATTGGGTA
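The stem and loop of the example sequence can be separated by pairing bases inward from both ends until the complementarity breaks. This is a minimal sketch of that idea (exact Watson-Crick pairing only, function name invented); the thresholds in the final check are the scoring criteria quoted earlier in the presentation (inversion ≥ 2 bp, stem ≥ 3 bp).

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def stem_and_loop(seq):
    """Return (stem_length, loop_sequence) for a candidate hairpin.

    The stem is the number of outermost positions that pair with their
    mirror-image position; whatever remains in the middle is the loop.
    """
    stem = 0
    while 2 * (stem + 1) <= len(seq) and PAIR[seq[stem]] == seq[-1 - stem]:
        stem += 1
    return stem, seq[stem:len(seq) - stem]


stem, loop = stem_and_loop("TACCCAATATCCTGTTGGAACAAGATATTGGGTA")
meets_criteria = stem >= 3 and len(loop) >= 2
```

For the slide's sequence this gives an 11 bp stem around a 12 bp loop. A stem can extend into what was originally loop sequence, so the inverted segment may look smaller than it really was.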

Page 45: MS thesis presentation_FINAL

MME phylogenomics

Hypothesis: Plastome-scale MMEs are an effective source of data for the inference of high-resolution, highly supported phylogenies consistent with the inference from nucleotide substitutions. REFUTED.

• Characterized MMEs weakened the MLBV (from 100 in [1] to 85 in [1-2]) on nodes supporting the internal relationships of the Cynodonteae (B. curtipendula sister to D. spicata).
• MMEs changed the topology of the MP analysis for the relationships of the Cynodonteae (B. curtipendula sister to H. cenchroides), with LOW MPBVs (58 in [1], 56 in [1-2]).

[Figures repeated from page 38: ML and MP trees for dataset [1-2]]

Page 46: MS thesis presentation_FINAL

Phylogenomic analyses

• Topologies were largely stable.
• Largely congruent with the conclusions of Peterson (2010; 2014), EXCEPT in Cynodonteae: the relationships of B. curtipendula, D. spicata, and H. cenchroides changed depending on dataset and method.
• Note that the terminal branches ARE LONG, which could produce faulty phylogenomic inferences through long-branch attraction (Felsenstein, 1978): “homoplasious character state changes on different long terminal branches could be a source of error when conducting phylogenetic analyses”.

[Figure repeated from page 38: MP tree for dataset [1-2]; BV = 100 for all internal nodes except (*) MPBV = 56]

Page 47: MS thesis presentation_FINAL

Phylogenomic analyses: dataset [1]

• Plastome-scale datasets include a larger number of informative characters compared to previous studies.
• Recent findings (Duvall et al., in review) show that the sister relationship between B. curtipendula and D. spicata is more strongly supported under ML, MP and BI when additional plastome sequences from congeneric species are added to the matrix.

[Figure repeated from page 36: dataset [1] tree]

Page 48: MS thesis presentation_FINAL

Phylogenomic analyses: dataset [2]

• Only 605 characters; 212 parsimony informative.
• B. curtipendula and H. cenchroides share more homoplasious MMEs.

[Figure repeated from page 37: dataset [2] tree; (*) MLBV = 100, (*) MPBV = 75]

Page 49: MS thesis presentation_FINAL

[Figures repeated from pages 36 and 38, with support at the (*) node:]
• Dataset [1]: (*) MLBV = 100, (*) MPBV = 58
• ML [1-2]: (*) MLBV = 85
• MP [1-2]: (*) MPBV = 56

Page 50: MS thesis presentation_FINAL

[Figures repeated from pages 36 and 39, with support at the (*) node:]
• Dataset [1]: (*) MLBV = 100, (*) MPBV = 58
• Dataset [3]: (*) MLBV = 59, (*) MPBV = 79

B. curtipendula and H. cenchroides share homoplasious sequence identity in CDS. Note: low BVs.

Page 51: MS thesis presentation_FINAL

[Figures repeated from pages 36 and 40, with support at the (*) node:]
• Dataset [1]: (*) MLBV = 100, (*) MPBV = 58
• Dataset [4]: (*) MLBV = 100, (*) MPBV = 85

B. curtipendula and D. spicata share homologous sequence identity in non-coding regions.

Page 52: MS thesis presentation_FINAL

Conclusions


Page 53: MS thesis presentation_FINAL

Conclusions

Conventional phylogenetic analyses that utilize only CDS no longer appear to be a reliable means of defining lineages.

• The topology from dataset [3] for Cynodonteae is NOT congruent with previous work: ML, MP and BI produced a tree with B. curtipendula sister to H. cenchroides.
• CDS-only analysis produces phylogenomic trees with low BVs: the BVs for B. curtipendula sister to H. cenchroides are low (MLBV = 59 and MPBV = 79).
• Recent studies are showing that B. curtipendula is sister to D. spicata when more congeneric species are added to the matrix (Duvall, unpublished).

Page 54: MS thesis presentation_FINAL

Conclusions

Plastome-scale analysis (dataset [1]) is the most informative type of dataset for drawing inferences.

INCREASED BVs for:

the divergence of Eragrostideae before Zoysieae and Cynodonteae: INCREASED from MLBV = 90 to MLBV|MPBV = 100|100

the relationship between the subtribes Zoysiinae (Z. macrantha) and Sporobolinae (S. heterolepis and S. pectinata): INCREASED from MLBV = 81 to MLBV|MPBV = 100|100

the relationship between the sister tribes Zoysieae (Z. macrantha, S. pectinata and S. heterolepis) and Cynodonteae (B. curtipendula, D. spicata and H. cenchroides): INCREASED from MLBV = 90 to MLBV|MPBV = 100|100

Page 55: MS thesis presentation_FINAL

Conclusions

Plastome-scale analysis (dataset [1]), cont.

INCREASED BVs:

supporting Zoysieae as sister to the Hilariinae (H. cenchroides), Monanthochloinae (D. spicata) and Boutelouinae (B. curtipendula) clade: from MLBV = 85 to MLBV|MPBV = 100|100

for the sister relationship of B. curtipendula with D. spicata: from MLBV = 77 to MLBV = 100. NOTE: MPBV = 58 (LBA artifact)

Page 56: MS thesis presentation_FINAL

Conclusions

Indel analysis

The 5 bp size class of indels occurs with the highest frequency.

It is unknown whether this trend is:

a result of some uncharacterized facet of the energetics of slippage,

a limitation on mutation recognition systems,

some feature of DNA repair mechanisms in the plastid,

or an artifact of indel scoring.

Page 57: MS thesis presentation_FINAL

57 Future applications

The way in which microstructural mutations arise in plastomes is not well understood.

The exact way in which cpDNA repair mechanisms function remains elusive.

Further investigation into identifying the gene products that are responsible for cpDNA damage repair is paramount for a better understanding of the mechanisms responsible for indels and inversions and improving our knowledge of chloroplast genome evolution.

Page 58: MS thesis presentation_FINAL

58 Questions?

Page 59: MS thesis presentation_FINAL

Acknowledgments

Dr. Mel Duvall, Dr. Joel Stafstrom, Dr. Thomas Sims, Bill Wysocki, Sean Burke, Lauren Orton, Joseph Cotton

59

Page 60: MS thesis presentation_FINAL

60 Extra slides

Page 61: MS thesis presentation_FINAL

61

Bouteloua curtipendula Spartina pectinata Distichlis spicata Centropodia glauca

Human

Eragrostis tef (Africa): millet/quinoa

Bouteloua curtipendula: ornamental, drought-tolerant gardens / erosion control

Livestock

Zoysia macrantha (AU): thrives in highly acidic to alkaline soils.

Note: some members of this subfamily (such as Z. macrantha) may have unknown evolutionary adaptations that could benefit the bioengineering of drought-tolerant crops.

Page 62: MS thesis presentation_FINAL

Conclusions: Hypotheses revisited

1) Of the two types of MMEs, indels occur more frequently than inversions. Confirmed

581 indels vs. 24 inversions

2) Tandem repeat indels (SSM) occur with greater frequency than indels not associated with such repeats (NTR). Refuted

Tandem repeats could have been obscured by subsequent substitution events.

[Figure: replicating DNA, SSM]

Tandem repeats can be either excised or duplicated depending on the strand (+/-) involved: 3’→5’ (insertion) or 5’→3’ (deletion).

Page 63: MS thesis presentation_FINAL

Conclusions: Hypotheses revisited

3) Smaller MMEs occur with greater frequency than larger MMEs. Refuted

5 bp indels were 1.8 to 3.4 fold more frequent than 4 bp indels.

Consistent with the findings in Orton’s recent MS (1.6 fold increase).

Unknown if this is the result of:

an uncharacterized facet of the energetics of slippage

a limitation of mutation recognition systems

some feature of plastid DNA repair mechanisms

just an artifact of indel scoring

Page 64: MS thesis presentation_FINAL

64 Primer design

Conserved sequences flanking the incomplete region were selected from the existing sequences so that each newly designed primer satisfied the following criteria:

at least 25 bp

3’ G or C anchor

minimum GC content of 50%

minimum melting temperature (Tm) of 50ºC

hairpin of ΔG > -6.0

self-dimer of ΔG > -6.0

heterodimer of ΔG > -6.0

~80 bp hole
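The sequence-level criteria on this slide can be sketched as a simple screening function. This is a minimal sketch, not the workflow used in the thesis: the function names and the example primer are hypothetical, the Tm value uses a generic GC-content approximation rather than the calculation performed by the design software, and the three ΔG criteria are not checked because they require a thermodynamic tool.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def approx_tm(primer: str) -> float:
    """Rough melting temperature in degrees C.

    ASSUMPTION: simple GC-content approximation,
    Tm = 64.9 + 41 * (G + C - 16.4) / N, used here only for illustration.
    """
    primer = primer.upper()
    gc = primer.count("G") + primer.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(primer)


def check_primer(primer: str) -> dict:
    """Check a primer (written 5' to 3') against the slide's sequence criteria."""
    primer = primer.upper()
    return {
        "length_ok": len(primer) >= 25,          # at least 25 bp
        "gc_anchor_ok": primer[-1] in "GC",      # 3' G or C anchor
        "gc_content_ok": gc_fraction(primer) >= 0.50,
        "tm_ok": approx_tm(primer) >= 50.0,
    }


# Hypothetical 25-mer, not a primer from the thesis
print(check_primer("ATGCGTACCGGTTAGCCATGGCTAC"))
```

Candidates that pass this screen would still need the hairpin, self-dimer and heterodimer ΔG checks in a dedicated oligo analysis tool.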

Page 65: MS thesis presentation_FINAL

65

Primer design (cont’d)

Geneious Pro 5.5.6 (Biomatters Ltd, Auckland, NZ) software was initially used to generate a list of potential primer sequences.

Page 66: MS thesis presentation_FINAL

66

Potential primer sequences were analyzed with a web tool (OligoAnalyzer) from www.idtdna.com/site.

Page 67: MS thesis presentation_FINAL

67 Potential primer sequences were analyzed with a web tool (OligoAnalyzer) from www.idtdna.com/site.

Page 68: MS thesis presentation_FINAL

68 The Grass Phylogeny Working Group II (GPWG II)

This laboratory is involved in a worldwide collaboration of plant systematists and plant biologists, the Grass Phylogeny Working Group II (GPWG II), who pool their research in order to work out a well-supported evolutionary history of the entire family.

The data obtained from the work of this laboratory will aid in determining, on a fine scale, the exact relationships among all ten of the representative grasses.

The data should also provide greater support values for these relationships.

Page 69: MS thesis presentation_FINAL

69 Polymerase chain reactions (PCR) (ASAP01 program)

For primers designed by Dhingra and Folta (2005) and Leseberg and Duvall (2009):

50 μl mixture consisting of 1.5 μl forward primer, 1.5 μl reverse primer (each diluted 1:40 with H2O), 1.5 μl DNA template, 0.4 μl dNTPs (1:1:1:1), 5.0 μl 10x TBE buffer, 39.6 μl H2O and 0.5 μl PfuTurbo polymerase (Stratagene Inc., Carlsbad, CA).

Fidelitaq® was also used when Pfu failed to produce amplicons.

A GeneAmp® PCR System 2700 was used for DNA amplification using program ASAP01 with the following parameters:

94ºC for 4.0 min, with 10 cycles of PCR touchdown (55ºC to 50ºC) at 40 seconds each, so that primer specificity would not preclude DNA amplification.

72ºC for 3.0 min; 35 cycles at 94ºC for 40 sec each, 50ºC for 40 sec, then 72ºC for 3.0 min, with a final extension time of 7.0 min at 72ºC.

Page 70: MS thesis presentation_FINAL

70 Electrophoresis

Electrophoresis methods were used to verify the size and number of amplified DNA fragments.

Expected size of amplicons ≈ 1200 bp

PCR products were run on a 0.8-1.0% agarose gel in TBE buffer for 50 min at 100 V.

High and low ladders (ThermoFisher, Hanover Park, IL) were used in conjunction with negative controls to confirm the legitimacy and size of the DNA fragments.

DNA fragments were cleaned and purified (Wizard kit method, Promega Corp., Madison, WI).

PCR products were sent to Macrogen, Inc. (Seoul, Korea) for capillary Sanger sequencing.

Page 71: MS thesis presentation_FINAL

71 Not all primers amplified...

An alternate PCR program (ASAPCL) was created for use with the newly designed primers.

Parameters for this program: 94ºC for 4.0 min; 40 cycles at 94ºC for 40 sec each, 50ºC for 40 sec, then 72ºC for 3.0 min, with a final extension time of 7.0 min at 72ºC.

NO TOUCHDOWN: the primer sequences are identical to the template, so primer specificity should not preclude DNA amplification.
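The ASAPCL parameters can be written down as a small data structure, which makes the program easy to compare with ASAP01 and to total up. This is only an illustrative sketch: the step labels and variable names are assumptions, and the total covers programmed hold times only, not the ramping time of the thermal cycler.

```python
# ASAPCL program as (label, temperature in C, hold time in seconds).
# Step labels are descriptive assumptions; temperatures and times are from the slide.
INITIAL_DENATURATION = ("initial denaturation", 94, 4 * 60)
CYCLE = [
    ("denaturation", 94, 40),
    ("annealing", 50, 40),
    ("extension", 72, 3 * 60),
]
N_CYCLES = 40
FINAL_EXTENSION = ("final extension", 72, 7 * 60)


def total_hold_seconds(initial, cycle, n_cycles, final):
    """Sum of programmed hold times, ignoring ramping between temperatures."""
    per_cycle = sum(seconds for _, _, seconds in cycle)
    return initial[2] + n_cycles * per_cycle + final[2]


seconds = total_hold_seconds(INITIAL_DENATURATION, CYCLE, N_CYCLES, FINAL_EXTENSION)
print(f"{seconds} s, about {seconds / 60:.0f} min of programmed hold time")
```

With the values above this comes to 11060 s, a little over three hours before ramping is counted.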

Page 72: MS thesis presentation_FINAL

72 Macrogen result example: check and trim

Page 73: MS thesis presentation_FINAL

73 Forward and reverse sequences were pairwise aligned to produce a small consensus sequence.

≥15 bp overlap
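The idea of joining a forward and a reverse read through their overlap can be sketched as follows. This is a simplified illustration under stated assumptions, not the pairwise alignment performed in the assembly software: it requires an exact match in the overlap (real reads would need tolerance for miscalled bases), and the function names and test sequences are hypothetical. Only the 15 bp minimum overlap comes from the slide.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def merge_reads(forward: str, reverse: str, min_overlap: int = 15):
    """Merge a forward read with the reverse complement of a reverse read.

    ASSUMPTION: the overlap must match exactly. Returns the merged
    sequence, or None if no overlap of at least min_overlap is found.
    """
    fwd = forward.upper()
    rev = revcomp(reverse)
    # Try the longest possible overlap first, then shorter ones.
    for k in range(min(len(fwd), len(rev)), min_overlap - 1, -1):
        if fwd[-k:] == rev[:k]:
            return fwd + rev[k:]
    return None


# Hypothetical reads sharing a 20 bp overlap
left, overlap, right = "ACTACTTCAACCTATCATCA", "GATTACAGGCTTAGCCGTAA", "TTGCAGCATCGATCGGATCC"
print(merge_reads(left + overlap, revcomp(overlap + right)))
```

Searching from the longest overlap downward favors the most complete join; a production workflow would score mismatches instead of demanding identity.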

Page 74: MS thesis presentation_FINAL

74

Consensus sequences from adjacent regions were assembled into contigs.

~200 bp overlap

Page 75: MS thesis presentation_FINAL

Continued until

Page 76: MS thesis presentation_FINAL

76 Annotation of CDS

Completed plastomes were pairwise aligned to an already annotated genome, and annotations were transferred with ≥ 70% identity.

CDS were extracted, checked for proper reading frames and manually adjusted when necessary.

Page 77: MS thesis presentation_FINAL

77

CDS sequences were extracted and translated into AA sequence to determine proper reading frames.

Annotations manually adjusted to give proper reading frames
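A reading-frame check of this kind can be sketched in a few lines. This is an illustrative sketch rather than the procedure used in the annotation software: the function names and example sequences are hypothetical, the codon assignments are the standard ones, and start codons are deliberately not checked.

```python
# Standard codon assignments, built from the usual TCAG-ordered string.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a CDS codon by codon; unknown codons become 'X'."""
    cds = cds.upper()
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds) - 2, 3))


def frame_problems(cds: str) -> list:
    """List reasons an extracted CDS may need manual adjustment."""
    problems = []
    if len(cds) % 3 != 0:
        problems.append("length is not a multiple of 3")
    protein = translate(cds)
    if not protein.endswith("*"):
        problems.append("no terminal stop codon")
    if "*" in protein[:-1]:
        problems.append("internal stop codon")
    return problems


# Hypothetical short sequences, not taken from the plastomes
print(frame_problems("ATGAAATAA"))      # no problems reported
print(frame_problems("ATGTAAAAATAA"))   # internal stop codon
```

An empty list suggests the annotation boundaries give a clean frame; any reported problem flags the CDS for manual adjustment.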

Page 78: MS thesis presentation_FINAL

78

Extracted flanking sequence from area around hole was aligned to NextGen sequence reads.

Page 79: MS thesis presentation_FINAL

79

MME Scoring and Analyses

Insertions/deletions (Indels)

• These events were scored if they were ≥3 bp in length
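The ≥3 bp scoring threshold can be illustrated with a short sketch that scans a pairwise alignment for gap runs. The function name, the tuple layout and the example alignment are hypothetical, and treating every run of "-" characters as one indel event is a simplifying assumption; the slides do not describe how overlapping or adjacent gaps were handled.

```python
import re


def score_indels(aligned_a: str, aligned_b: str, min_len: int = 3) -> list:
    """Return (start, length, sequence_with_gap) for each gap run >= min_len.

    ASSUMPTION: both inputs are rows of the same pairwise alignment and
    '-' marks a gap; each uninterrupted gap run counts as one event.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    events = []
    for name, seq in (("a", aligned_a), ("b", aligned_b)):
        for match in re.finditer(r"-+", seq):
            length = match.end() - match.start()
            if length >= min_len:
                events.append((match.start(), length, name))
    return sorted(events)


# Hypothetical alignment: the 1 bp gap in row a is below the threshold
row_a = "ACGT---ACGT-ACGTACGT"
row_b = "ACGTTTTACGTAACG----T"
print(score_indels(row_a, row_b))
```

Whether a scored gap is an insertion or a deletion cannot be read from a pairwise alignment alone; that call needs an outgroup or a phylogeny.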

Page 80: MS thesis presentation_FINAL

80 Inversions: reverse complement base pairing

• Sequence was manually searched for inversions and annotated with complementary, loop-forming regions.

• Scored if the inversion was ≥2 bp with a stem ≥3 bp
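The scoring rule can be sketched as a search for short loops flanked by reverse-complementary stems. This is a minimal sketch, not the manual procedure used in the thesis: the function names are hypothetical and the 9 bp upper limit on loop size is an assumption taken from the largest inversion size class in this deck. The two test sequences are the matK Inv1 sequences of D. spicata and S. pectinata from Table 12-a.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(PAIR[base] for base in reversed(seq.upper()))


def find_stem_loops(seq: str, min_loop: int = 2, max_loop: int = 9, min_stem: int = 3) -> list:
    """Return (loop_start, loop_length, stem_length) candidates (0-based)."""
    seq = seq.upper()
    hits = []
    for loop_start in range(1, len(seq)):
        for loop_len in range(min_loop, max_loop + 1):
            left, right, stem = loop_start - 1, loop_start + loop_len, 0
            # Extend the stem outward while the flanking bases are complementary.
            while left >= 0 and right < len(seq) and PAIR.get(seq[left]) == seq[right]:
                stem, left, right = stem + 1, left - 1, right + 1
            if stem >= min_stem:
                hits.append((loop_start, loop_len, stem))
    return hits


def invert_loop(seq: str, loop_start: int, loop_len: int) -> str:
    """Replace the loop with its reverse complement, leaving the stems intact."""
    loop = seq[loop_start:loop_start + loop_len]
    return seq[:loop_start] + revcomp(loop) + seq[loop_start + loop_len:]


d_spicata = "TTTCTTTTGAAAAAGAAG"
s_pectinata = "TTTCTTTTTCAAAAGAAG"
print((8, 2, 7) in find_stem_loops(d_spicata))       # 2 bp loop with a 7 bp stem
print(invert_loop(d_spicata, 8, 2) == s_pectinata)   # inverting "GA" gives "TC"
```

The search reports overlapping candidates around the same stem, so its output is a list to review by eye rather than a final score.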

Page 81: MS thesis presentation_FINAL

81 Each event type scored separately

#BP  2 2 2 2 2 2  Σ | 3 3 3 3 3 3 3 3 3 3  Σ | 4 4  Σ | 5 5  Σ | 6  Σ | 7  Σ | 9 9  Σ
D    0 1 0 0 0 1  2 | 1 1 1 1 0 0 0 0 1 1  6 | 0 0  0 | 1 1  2 | 0  0 | 1  1 | 0 1  1
B    0 1 0 1 1 0  3 | 1 1 1 1 0 0 1 0 0 1  6 | 0 1  1 | 1 1  2 | 1  1 | 1  1 | 1 1  2
H    0 1 0 0 0 0  1 | 1 1 1 1 1 0 1 0 0 1  7 | 1 0  1 | 1 1  2 | 1  1 | 1  1 | 0 1  1
S    0 1 1 0 1 0  3 | 1 0 1 1 0 0 0 1 0 1  5 | 0 0  0 | 1 1  2 | 1  1 | 1  1 | 0 1  1
Sp   0 0 1 0 1 0  2 | 0 1 1 1 0 0 0 0 0 1  4 | 0 0  0 | 1 1  2 | 1  1 | 1  1 | 0 1  1
Z    0 1 1 0 1 0  3 | 0 0 1 1 0 0 0 0 0 ?  2 | 0 0  0 | 1 1  2 | 1  1 | 1  1 | 0 ?  0
E    0 0 0 1 0 0  1 | 1 0 0 1 0 1 1 0 0 0  4 | 0 0  0 | 1 1  2 | 0  0 | 1  1 | 0 1  1
e    1 0 0 0 0 0  1 | 1 0 0 1 0 1 0 0 0 1  4 | 0 0  0 | 1 1  2 | 0  0 | 1  1 | 0 1  1
N    1 0 0 0 0 0  1 | 0 0 1 1 0 0 0 0 0 0  2 | 0 0  0 | ? 1  1 | 0  0 | 1  1 | 0 1  1
C    0 0 0 0 0 0  0 | 0 0 0 0 0 0 0 0 0 0  0 | 0 0  0 | 0 0  0 | 0  0 | 0  0 | 0 0  0

Inversion Size Frequency

Taxon            2  3  4  5  6  7  9   Σ
D. spicata       2  6  0  2  0  1  1  12
B. curtipendula  3  6  1  2  1  1  2  16
H. cenchroides   1  7  1  2  1  1  1  14
S. heterolepis   3  5  0  2  1  1  1  13
S. pectinata     2  4  0  2  1  1  1  11
Z. macrantha     3  2  0  2  1  1  0   9
E. tef           1  4  0  2  0  1  1   9
E. minor         1  4  0  2  0  1  1   9
N. reynaudiana   1  2  0  1  0  1  1   6

Page 82: MS thesis presentation_FINAL

Phylogenomic Analysis

Maximum Parsimony (MP) results from all datasets

Dataset  Total characters  Parsimony-informative characters  Tree length  CI (excluding uninformative characters)  RI
[1]      104,248           3143                              11647        0.7463                                   0.7597
[2]      605               212                               674          0.7544                                   0.7971
[1-2]    104,853           3355                              12328        0.746                                    0.7611
[3]      62,486            1437                              5191         0.7205                                   0.7311
[4]      41,012            1688                              6356         0.7722                                   0.7852

Page 83: MS thesis presentation_FINAL

Indels in CDS

Only 5.2% of indels occur in CDS. This supports the assumption that noncoding sequences are more likely to retain mutations, since they do not directly affect gene function.

Indels in CDS cause frameshift mutations, alter AA sequences and introduce internal stop codons = deleterious

Purifying selection acts against deleterious mutations.

Page 84: MS thesis presentation_FINAL

CDS-specific inversions

Inversions were found in the CDS of matK, ndhF and ccsA.

They changed the physical properties of the AA at these loci from the ancestral condition.

All are essential for cell metabolism.

Infer that these mutations do not affect protein function.

Reversion to the ancestral condition has been observed.

Dynamic process

Table 12-a. Inv1, matK

Taxon            Position     Nucleotide sequence  AA sequence  Δ AA properties
D. spicata       2342 - 2357  TTTCTTTTGAAAAAGAAG   KKQFLL       P,A
B. curtipendula  2295 - 2310  TTTCTTTTGAAAAAGAAG   KKQFLL       P,A
H. cenchroides   2330 - 2345  TTTCTTTTGAAAAAGAGG   KKQFLP       P,A
S. heterolepis   2314 - 2329  TTTCTTTTGAAAAAGAAG   KKQFLL       P,A
S. pectinata     2322 - 2337  TTTCTTTTTCAAAAGAAG   KKKLLL       (+), NP
Z. macrantha     2321 - 2336  TTTCTTTTGAAAAAGAAG   KKQFLL       P,A
E. tef           2310 - 2325  TTTCTTCTTCAAAAGAAG   KKKLLL       (+), NP
E. minor         2305 - 2320  TTTCTTCTTCAAAAGAAG   KKKLLL       (+), NP
N. reynaudiana   2284 - 2299  TTTCTTCTTCAAAAGAAG   KKKLLL       (+), NP
C. glauca        2329 - 2344  TTTCTTCTTCAAAAGAGG   KKKLLP       (+), NP
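The link between the nucleotide and AA columns of Table 12-a can be checked with a short sketch. The reading used here is an inference, not something stated on the slide: the AA column appears to match a translation of the reverse complement of each nucleotide sequence, with the residues listed in reverse so that they line up with the strand shown. Function names are hypothetical and the codon assignments are the standard ones.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def aa_as_listed(nucleotides: str) -> str:
    """Translate the reverse complement, then reverse the residue order.

    ASSUMPTION: this mirrors how the AA column of Table 12-a was laid out.
    """
    coding = revcomp(nucleotides)
    protein = "".join(CODON_TABLE[coding[i:i + 3]] for i in range(0, len(coding) - 2, 3))
    return protein[::-1]


# Nucleotide sequences and AA strings copied from Table 12-a
TABLE_12A = {
    "D. spicata": ("TTTCTTTTGAAAAAGAAG", "KKQFLL"),
    "H. cenchroides": ("TTTCTTTTGAAAAAGAGG", "KKQFLP"),
    "S. pectinata": ("TTTCTTTTTCAAAAGAAG", "KKKLLL"),
    "E. tef": ("TTTCTTCTTCAAAAGAAG", "KKKLLL"),
    "C. glauca": ("TTTCTTCTTCAAAAGAGG", "KKKLLP"),
}
for taxon, (nucleotides, listed) in TABLE_12A.items():
    print(taxon, aa_as_listed(nucleotides), aa_as_listed(nucleotides) == listed)
```

Under this reading the inversion swaps the two middle residues, Q and F, for K and L, which is the change summarized in the Δ AA properties column.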

Page 85: MS thesis presentation_FINAL

85 Predictive power?

Page 86: MS thesis presentation_FINAL

86 Predictive power?

Hypothetical sequence with potential to form loop structures