2 Interrelated Modules on Bioinformatics

Transcript of 2 Interrelated Modules on Bioinformatics

Page 1: 2 Interrelated Modules on Bioinformatics
Page 2: 2 Interrelated Modules on Bioinformatics

2 Interrelated Modules on Bioinformatics

Module 1: To show the ways in which the NCBI online database classifies and organizes information on DNA sequences, evolutionary relationships, and scientific publications.

Module 2: To identify an unknown nucleotide sequence from the Wolbachia endosymbiont by using the NCBI search tool BLAST.

Teaching Time – 45 minutes

Page 3: 2 Interrelated Modules on Bioinformatics

OH NO!

Page 4: 2 Interrelated Modules on Bioinformatics

1. No programming skills needed

2. Familiarity with a personal computer and internet browser

3. Customizable and free

Page 5: 2 Interrelated Modules on Bioinformatics

What are the broad goals of this lab?

To provide an introduction to bioinformatics (NCBI)

To introduce you to searching for articles, sequences, scientists (perhaps yourself)

To use phylogenies

To put your Wolbachia research in the context of what's been published

Page 6: 2 Interrelated Modules on Bioinformatics

What are the specific goals of this lab?

To look for brand new Wolbachia (W) strains

To make a phylogenetic tree of W

To ultimately compare the W tree to an insect phylogeny to infer lateral vs. vertical transmission of your W strains

To contribute to a national "student" sequence database on the genetic diversity of the W 16S rRNA gene

Page 7: 2 Interrelated Modules on Bioinformatics

Wolbachia – Host Interactions: Mutualism and Reproductive Parasitism

Reproductive parasitism:

Parthenogenesis in wasps

Male-killing in insects

Feminization in isopods

Cytoplasmic incompatibility in arthropods

Mutualism:

Required for nematode fertility and larval development

Required for insect oogenesis (Dedeine et al. 2001)

Page 8: 2 Interrelated Modules on Bioinformatics

Application of Bioinformatics to Wolbachia

[Figure: phylogeny of the Alpha Proteobacteria (Wolbachia, Anaplasma, Ehrlichia, Neorickettsia, Rickettsia), obligatory intracellular bacteria in arthropods, marking the Wolbachia–Anaplasma split and the Wins–Wnem split (~120 MY). Dunning Hotopp et al. 2006]

Page 9: 2 Interrelated Modules on Bioinformatics

Wolbachia: Mutualist / Parasite

Page 10: 2 Interrelated Modules on Bioinformatics

Outcomes: A New Wolbachia Species?

Page 11: 2 Interrelated Modules on Bioinformatics

Wolbachia: Complete genomic sequences from a related parasite and mutualist

wBm (Foster et al. 2005): 1.08 Mb, 806 genes

wMel (Wu et al. 2004): 1.27 Mb, 1270 genes

696 shared genes

Page 12: 2 Interrelated Modules on Bioinformatics

ORIGIN
        1 ttcttgtatc ccaaacatct cgagcttctt gtacaccaaa ttaggtattc actatggaat
       61 tcagagttca cttgcaagct gataatgagc agaaaatttt tcaaaaccag atgaaacccg
      121 aacctgaagc ctcttacttg attaatcaaa gacggtctgc aaattacaag ccaaatattt
      181 ggaagaacga tttcctagat caatctctta tcagcaaata cgatggagat gagtatcgga
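The sequence above is laid out the way GenBank prints it: the keyword ORIGIN, then a running position number followed by blocks of ten bases. A minimal sketch of how that layout could be turned back into one continuous string is shown here; the function name `origin_to_sequence` is hypothetical and chosen only for illustration.

```python
def origin_to_sequence(origin_text):
    """Drop the ORIGIN keyword, position numbers, and spaces from a sequence block."""
    letters = []
    for token in origin_text.split():
        # Skip everything that is not a block of bases.
        if token.upper() == "ORIGIN" or token.isdigit() or token == "//":
            continue
        letters.append(token.lower())
    return "".join(letters)


# The four sequence lines shown on the slide.
ORIGIN_BLOCK = """
ORIGIN
        1 ttcttgtatc ccaaacatct cgagcttctt gtacaccaaa ttaggtattc actatggaat
       61 tcagagttca cttgcaagct gataatgagc agaaaatttt tcaaaaccag atgaaacccg
      121 aacctgaagc ctcttacttg attaatcaaa gacggtctgc aaattacaag ccaaatattt
      181 ggaagaacga tttcctagat caatctctta tcagcaaata cgatggagat gagtatcgga
"""

sequence = origin_to_sequence(ORIGIN_BLOCK)
print(len(sequence))   # 240 bases: four lines of six ten-base blocks
print(sequence[:10])   # ttcttgtatc
```

A continuous string like this is the form you would paste into a BLAST query box.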

Page 13: 2 Interrelated Modules on Bioinformatics

BLAST:

Query a database for sequences homologous to an input (i.e., query) sequence.

• Compare new genes to old ones
• Compare genes from different species or hosts
• Identify possible functions based on similarities to known sequences

[Figure: a query sequence aligned against a database sequence]
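To make the idea of querying a database for similar sequences concrete, here is a toy sketch that ranks database entries by how many short words they share with the query. It illustrates only the word-matching idea; it is not how NCBI BLAST is implemented, which also extends matches and scores them statistically. The function names, the word length of 4, and the example sequences are all assumptions made for this illustration.

```python
def kmers(seq, k):
    """Map each k-letter word in seq to the positions where it starts."""
    index = {}
    for i in range(len(seq) - k + 1):
        index.setdefault(seq[i:i + k], []).append(i)
    return index


def seed_hits(query, subject, k=4):
    """List (query_pos, subject_pos) pairs where the same k-letter word occurs."""
    subject_index = kmers(subject, k)
    hits = []
    for q_pos in range(len(query) - k + 1):
        for s_pos in subject_index.get(query[q_pos:q_pos + k], []):
            hits.append((q_pos, s_pos))
    return hits


def rank_database(query, database, k=4):
    """Rank database entries by their number of shared words, most first."""
    scores = {name: len(seed_hits(query, seq, k)) for name, seq in database.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up sequences, for illustration only.
toy_database = {"same": "ACGTACGT", "unrelated": "TTTTTTTT"}
ranking = rank_database("ACGTACGT", toy_database)
print(ranking)   # [('same', 7), ('unrelated', 0)]
```

The entry that shares the most words with the query comes out on top, which is the intuition behind reading the first few lines of a BLAST results page.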

Page 14: 2 Interrelated Modules on Bioinformatics
Page 15: 2 Interrelated Modules on Bioinformatics

National Center for Biotechnology Information (NCBI)
http://www.ncbi.nlm.nih.gov

Page 16: 2 Interrelated Modules on Bioinformatics

Release 2008: 99 billion base pairs, 99 million sequences

Page 17: 2 Interrelated Modules on Bioinformatics

Target database: Adjustable using the pull-down menu

Page 18: 2 Interrelated Modules on Bioinformatics
Page 19: 2 Interrelated Modules on Bioinformatics
Page 20: 2 Interrelated Modules on Bioinformatics
Page 21: 2 Interrelated Modules on Bioinformatics

A Traditional GenBank Record: The Flatfile Format

LOCUS       AY182241                1931 bp    mRNA    linear   PLN 04-MAY-2004
DEFINITION  Malus x domestica (E,E)-alpha-farnesene synthase (AFS1) mRNA,
            complete cds.
ACCESSION   AY182241
VERSION     AY182241.2  GI:32265057
KEYWORDS    .
SOURCE      Malus x domestica (cultivated apple)
  ORGANISM  Malus x domestica
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
            rosids; eurosids I; Rosales; Rosaceae; Maloideae; Malus.
REFERENCE   1  (bases 1 to 1931)
  AUTHORS   Pechous,S.W. and Whitaker,B.D.
  TITLE     Cloning and functional expression of an (E,E)-alpha-farnesene
            synthase cDNA from peel tissue of apple fruit
  JOURNAL   Planta 219, 84-94 (2004)
REFERENCE   2  (bases 1 to 1931)
  AUTHORS   Pechous,S.W. and Whitaker,B.D.
  TITLE     Direct Submission
  JOURNAL   Submitted (18-NOV-2002) PSI-Produce Quality and Safety Lab,
            USDA-ARS, 10300 Baltimore Ave. Bldg. 002, Rm. 205, Beltsville, MD
            20705, USA
REFERENCE   3  (bases 1 to 1931)
  AUTHORS   Pechous,S.W. and Whitaker,B.D.
  TITLE     Direct Submission
  JOURNAL   Submitted (25-JUN-2003) PSI-Produce Quality and Safety Lab,
            USDA-ARS, 10300 Baltimore Ave. Bldg. 002, Rm. 205, Beltsville, MD
            20705, USA
  REMARK    Sequence update by submitter
COMMENT     On Jun 26, 2003 this sequence version replaced gi:27804758.
FEATURES             Location/Qualifiers
     source          1..1931
                     /organism="Malus x domestica"
                     /mol_type="mRNA"
                     /cultivar="'Law Rome'"
                     /db_xref="taxon:3750"
                     /tissue_type="peel"
     gene            1..1931
                     /gene="AFS1"
     CDS             54..1784
                     /gene="AFS1"
                     /note="terpene synthase"
                     /codon_start=1
                     /product="(E,E)-alpha-farnesene synthase"
                     /protein_id="AAO22848.2"
                     /db_xref="GI:32265058"
                     /translation="MEFRVHLQADNEQKIFQNQMKPEPEASYLINQRRSANYKPNIWK
                     NDFLDQSLISKYDGDEYRKLSEKLIEEVKIYISAETMDLVAKLELIDSVRKLGLANLF
                     EKEIKEALDSIAAIESDNLGTRDDLYGTALHFKILRQHGYKVSQDIFGRFMDEKGTLE
                     DFLHKNEDLLYNISLIVRLNNDLGTSAAEQERGDSPSSIVCYMREVNASEETARKNIK
                     GMIDNAWKKVNGKCFTTNQVPFLSSFMNNATNMARVAHSLYKDGDGFGDQEKGPRTHI
                     LSLLFQPLVN"
ORIGIN
        1 ttcttgtatc ccaaacatct cgagcttctt gtacaccaaa ttaggtattc actatggaat
       61 tcagagttca cttgcaagct gataatgagc agaaaatttt tcaaaaccag atgaaacccg
      121 aacctgaagc ctcttacttg attaatcaaa gacggtctgc aaattacaag ccaaatattt
      181 ggaagaacga tttcctagat caatctctta tcagcaaata cgatggagat gagtatcgga
      241 agctgtctga gaagttaata gaagaagtta agatttatat atctgctgaa acaatggatt
//
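In the flatfile layout, each top-level section begins with a keyword at the start of a line, and the lines that follow it are indented. The sketch below groups lines by that rule. It is a simplified illustration, not a full GenBank parser: the name `split_sections` is hypothetical, the sample is an abbreviated copy of the record above, and indented sub-keywords such as ORGANISM would simply be kept as continuation lines of their parent section.

```python
def split_sections(record_text):
    """Group flatfile lines under the keyword that starts at the left margin."""
    sections = []
    for line in record_text.splitlines():
        if not line.strip() or line.startswith("//"):
            continue                      # skip blank lines and the end-of-record mark
        if not line[0].isspace():
            keyword, _, rest = line.partition(" ")
            sections.append((keyword, [rest.strip()]))
        elif sections:
            sections[-1][1].append(line.strip())   # indented line continues the section
    return sections


# Abbreviated copy of the record shown on the slide.
SAMPLE = """\
LOCUS       AY182241                1931 bp    mRNA    linear   PLN 04-MAY-2004
DEFINITION  Malus x domestica (E,E)-alpha-farnesene synthase (AFS1) mRNA,
            complete cds.
ACCESSION   AY182241
VERSION     AY182241.2  GI:32265057
ORIGIN
        1 ttcttgtatc ccaaacatct cgagcttctt gtacaccaaa ttaggtattc actatggaat
//
"""

for keyword, lines in split_sections(SAMPLE):
    print(keyword, "->", " ".join(lines).strip())
```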

Page 22: 2 Interrelated Modules on Bioinformatics

The Header

LOCUS       AY182241                1931 bp    mRNA    linear   PLN 04-MAY-2004
DEFINITION  Malus x domestica (E,E)-alpha-farnesene synthase (AFS1) mRNA,
            complete cds.
ACCESSION   AY182241
VERSION     AY182241.2  GI:32265057
KEYWORDS    .
SOURCE      Malus x domestica (cultivated apple)
  ORGANISM  Malus x domestica
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
            rosids; eurosids I; Rosales; Rosaceae; Maloideae; Malus.
REFERENCE   1  (bases 1 to 1931)
  AUTHORS   Pechous,S.W. and Whitaker,B.D.
  TITLE     Cloning and functional expression of an (E,E)-alpha-farnesene
            synthase cDNA from peel tissue of apple fruit
  JOURNAL   Planta 219, 84-94 (2004)
REFERENCE   2  (bases 1 to 1931)
  AUTHORS   Pechous,S.W. and Whitaker,B.D.
  TITLE     Direct Submission
  JOURNAL   Submitted (18-NOV-2002) PSI-Produce Quality and Safety Lab,
            USDA-ARS, 10300 Baltimore Ave. Bldg. 002, Rm. 205, Beltsville, MD
            20705, USA
REFERENCE   3  (bases 1 to 1931)
  AUTHORS   Pechous,S.W. and Whitaker,B.D.
  TITLE     Direct Submission
  JOURNAL   Submitted (25-JUN-2003) PSI-Produce Quality and Safety Lab,
            USDA-ARS, 10300 Baltimore Ave. Bldg. 002, Rm. 205, Beltsville, MD
            20705, USA
  REMARK    Sequence update by submitter
COMMENT     On Jun 26, 2003 this sequence version replaced gi:27804758.

Page 23: 2 Interrelated Modules on Bioinformatics

Header: Locus Line

LOCUS       AY182241                1931 bp    mRNA    linear   PLN 04-MAY-2004

Locus name: AY182241

Length: 1931 bp

Molecule type: mRNA

Division: PLN

Modification date: 04-MAY-2004
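The labelled parts of the locus line can be pulled apart with a few lines of code. This is a minimal sketch that assumes the fields are separated by whitespace and appear in the same order as in this record; the function name `parse_locus` and the dictionary keys are hypothetical, and other records may lay the line out differently.

```python
def parse_locus(line):
    """Split a LOCUS line like the one on the slide into named fields."""
    fields = line.split()
    # Expected order: LOCUS, name, length, "bp", molecule type, topology, division, date
    return {
        "locus_name": fields[1],
        "length_bp": int(fields[2]),
        "molecule_type": fields[4],
        "topology": fields[5],
        "division": fields[6],
        "modification_date": fields[7],
    }


locus = parse_locus(
    "LOCUS       AY182241                1931 bp    mRNA    linear   PLN 04-MAY-2004"
)
print(locus["locus_name"], locus["length_bp"], locus["division"])   # AY182241 1931 PLN
```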

Page 24: 2 Interrelated Modules on Bioinformatics

Header: Database Identifiers

ACCESSION   AY182241
VERSION     AY182241.2  GI:32265057

Accession
• Stable
• Reportable
• Universal
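The VERSION line combines the stable accession with a version number after the dot, which changes when the sequence itself is updated (this record is version 2, and its COMMENT line notes that it replaced an earlier entry). A small sketch of separating the parts is shown here; the function name `parse_version` is hypothetical, and the code assumes the line looks like the one on the slide.

```python
def parse_version(line):
    """Return (accession, version number, GI number or None) from a VERSION line."""
    parts = line.split()
    accession, _, version = parts[1].partition(".")
    gi = None
    for part in parts[2:]:
        if part.startswith("GI:"):
            gi = int(part[3:])       # numeric part after the "GI:" prefix
    return accession, int(version), gi


print(parse_version("VERSION     AY182241.2  GI:32265057"))   # ('AY182241', 2, 32265057)
```

Citing the accession together with its version tells a reader exactly which sequence you used.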

Page 25: 2 Interrelated Modules on Bioinformatics

Header: Organism

SOURCE      Malus x domestica (cultivated apple)
  ORGANISM  Malus x domestica
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
            rosids; eurosids I; Rosales; Rosaceae; Maloideae; Malus.

NCBI-controlled taxonomy

Page 26: 2 Interrelated Modules on Bioinformatics

The Feature Table

FEATURES             Location/Qualifiers
     source          1..1931
                     /organism="Malus x domestica"
                     /mol_type="mRNA"
                     /cultivar="'Law Rome'"
                     /db_xref="taxon:3750"
                     /tissue_type="peel"
     gene            1..1931
                     /gene="AFS1"
     CDS             54..1784
                     /gene="AFS1"
                     /note="terpene synthase"
                     /codon_start=1
                     /product="(E,E)-alpha-farnesene synthase"
                     /protein_id="AAO22848.2"
                     /db_xref="GI:32265058"
                     /translation="MEFRVHLQADNEQKIFQNQMKPEPEASYLINQRRSANYKPNIWK
                     NDFLDQSLISKYDGDEYRKLSEKLIEEVKIYISAETMDLVAKLELIDSVRKLGLANLF
                     EKEIKEALDSIAAIESDNLGTRDDLYGTALHFKILRQHGYKVSQDIFGRFMDEKGTLE
                     NHHFAHLKGMLELFEASNLGFEGEDILDEAKASLTLALRDSGHICYPDSNLSRDVVHS
                     LELPSHRRVQWFDVKWQINAYEKDICRVNATLLELAKLNFNVVQAQLQKNLREASRWW
                     ANLGIADNLKFARDRLVECFACAVGVAFEPEHSSFRICLTKVINLVLIIDDVYDIYGS
                     EEELKHFTNAVDRWDSRETEQLPECMKMCFQVLYNTTCEIAREIEEENGWNQVLPQLT
                     KVWADFCKALLVEAEWYNKSHIPTLEEYLRNGCISSSVSVLLVHSFFSITHEGTKEMA
                     DFLHKNEDLLYNISLIVRLNNDLGTSAAEQERGDSPSSIVCYMREVNASEETARKNIK
                     GMIDNAWKKVNGKCFTTNQVPFLSSFMNNATNMARVAHSLYKDGDGFGDQEKGPRTHI
                     LSLLFQPLVN"

Coding sequence: start (atg), stop (tag)
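The CDS feature says the coding sequence begins at base 54, and the slide marks the start codon atg. The sketch below checks this against the first 120 bases of the ORIGIN section and translates the codons that follow. It uses the standard genetic code; the names `CODON_TABLE` and `translate_from` are hypothetical, and only the first two sequence lines of the record are included, so only the beginning of the protein is produced.

```python
# Standard genetic code, with codons ordered t, c, a, g at each position.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_from(sequence, cds_start):
    """Translate complete codons starting at the 1-based position cds_start."""
    coding = sequence[cds_start - 1:]        # GenBank positions count from 1
    protein = []
    for i in range(0, len(coding) - 2, 3):
        amino_acid = CODON_TABLE[coding[i:i + 3]]
        if amino_acid == "*":                # a stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


# Bases 1 to 120 of the record, with position numbers and spaces removed.
FIRST_120_BASES = (
    "ttcttgtatcccaaacatctcgagcttcttgtacaccaaattaggtattcactatggaat"
    "tcagagttcacttgcaagctgataatgagcagaaaatttttcaaaaccagatgaaacccg"
)

print(FIRST_120_BASES[53:56])                # atg
print(translate_from(FIRST_120_BASES, 54))   # MEFRVHLQADNEQKIFQNQMKP
```

The printed amino acids match the opening of the /translation qualifier in the feature table, which is a quick way to see how the DNA coordinates and the protein sequence relate.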

Page 27: 2 Interrelated Modules on Bioinformatics

[Figure: the central dogma (DNA, RNA, protein, phenotype) alongside the matching data: cDNA, DNA sequences and genomes, protein sequence databases]

Bioinformatics is NOT just information technology. It can teach the central dogma of molecular biology.

Page 28: 2 Interrelated Modules on Bioinformatics

Outcomes: Lateral Transfer?

[Figure: insect phylogeny beside the top 5 Wolbachia BLAST matches, each shown as a 100% match]

Page 29: 2 Interrelated Modules on Bioinformatics

Let's Begin Our Bioinformatic Exercise: Lab 5