This presentation uses animations and is best viewed as a slide show. To start the presentation, click Slide Show on the top tool bar and then View show
Date posted: 18-Dec-2015


Page 1:

This presentation uses animations and is best viewed as a slide show. To start the presentation, click Slide Show on the top tool bar and then View show.

Page 2:

Welcome to Introduction to Bioinformatics

Wednesday, 28 February 2007

Introduction to Viral Metagenome Project

• Discussion of Edwards & Rohwer (2005)*

• Exam retrospective (Problem 12)

• Other matters?

*Unless otherwise noted, all figures herein are from: Edwards RA, Rohwer F (2005). Viral metagenomics. Nature Rev Microbiol 3:504-510.

Page 3:

Edwards & Rohwer (2005): Phage phylogeny and taxonomy

Placement of unknown phage into phylogeny

SQ11. How to test? Result of test?

[Figure labels: ~50,000 nt; Blast; ~500 nt]

Page 4:

Edwards & Rohwer (2005): The proviral metagenome

SQ11. What's a provirus or prophage? Why would a virus do such a thing?

Page 5:

Edwards & Rohwer (2005): The proviral metagenome

[Diagram labels: Infection; Phage; Bacterial chromosome; Phage genome; Lysogenic pathway; Lytic pathway; Death; General transduction]

Page 6:

Edwards & Rohwer (2005): The proviral metagenome

[Diagram labels: Infection; Phage; Bacterial chromosome; Phage genome; Lytic pathway; Lysogenic pathway; Life!]

Page 7:

Edwards & Rohwer (2005): Viral community structure and ecology

SQ14. What does it mean to have ~10^12 viruses but only ~1000 viral genotypes? Two scenarios?

Page 8:

Edwards & Rohwer (2005): Viral community structure and ecology

SQX. How to measure complexity?

- Sample 1000
  - How many counted once?
  - How many counted twice?
  - How many counted zero times?

- Model the process
  - Use different numbers of types
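The modelling step can be tried out with a small simulation. The sketch below is only an illustration under stated assumptions: every type is taken to be equally abundant, the sample is drawn with replacement, and the function name `encounter_distribution` and its parameters are hypothetical, not from the slides. It reports what fraction of types turn up zero, one, two, ... times in a sample of 1000.

```python
import random
from collections import Counter

def encounter_distribution(n_types, sample_size=1000, seed=0):
    """Fraction of types seen 0, 1, 2, ... times in one random sample."""
    rng = random.Random(seed)
    # Assumption: each sampled virus is drawn uniformly from n_types.
    hits = Counter(rng.randrange(n_types) for _ in range(sample_size))
    # Tally how many types were encountered k times.
    times_seen = Counter(hits.values())
    # Types that never appeared in the sample were encountered 0 times.
    times_seen[0] = n_types - len(hits)
    return {k: times_seen[k] / n_types for k in sorted(times_seen)}

# Compare a low-complexity and a high-complexity community.
for n in (200, 5000):
    dist = encounter_distribution(n)
    print(n, {k: round(p, 3) for k, p in list(dist.items())[:4]})
```

With few types, most types are seen several times; with many types, most are never seen at all, so the counts of singletons and doubletons carry information about the number of types.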

Page 9:

Edwards & Rohwer (2005): Viral community structure and ecology

SQX. How to measure complexity?

[Plot: Probability (0 to 1) vs. Times encountered (0 to 15); 200 types]

Page 10:

Edwards & Rohwer (2005): Viral community structure and ecology

SQX. How to measure complexity?

[Plot: Probability (0 to 1) vs. Times encountered (0 to 15); 200 types and 5000 types]

Page 11:

Edwards & Rohwer (2005): Viral community structure and ecology

SQX. How to measure complexity?

[Plot: Probability (0 to 0.4) vs. Times encountered (0 to 20)]

Page 12:

Edwards & Rohwer (2005): Bioinformatics and viral metagenomics

1. How to identify genes?

2. How to identify genes' viruses?

Page 13:

Edwards & Rohwer (2005): Bioinformatics and viral metagenomics

How to identify genes?

Sequence

Open reading frames

Sequence            151  TATTTCGTAG TTATGTTGAA CCGATGAAAC TTGTTTGTTC TCAAATTGAG
Translation-Frame-1 151  Y F V V M L N R * N L F V L K L S
Translation-Frame-2 151  I S * L C * T D E T C L F S N * A
Translation-Frame-3 151  F R S Y V E P M K L V C S Q I E
Complement          151  ATAAAGCATC AATACAACTT GGCTACTTTG AACAAACAAG AGTTTAACTC
Translation-Frame-4 151  I E Y N H Q V S S V Q K N E F Q
Translation-Frame-5 151  Y K T T I N F R H F K N T R L N L
Translation-Frame-6 151  T N R L * T S G I F S T Q E * I S

Sequence            201  CTCAATACAG CTCTTCAACT AGTTAGTAGA GCTGTAGCCA CTAGGCCTTC
Translation-Frame-1 201  S I Q L F N * L V E L * P L G L R
Translation-Frame-2 201  Q Y S S S T S * * S C S H * A F
Translation-Frame-3 201  L N T A L Q L V S R A V A T R P S
Complement          201  GAGTTATGTC GAGAAGTTGA TCAATCATCT CGACATCGGT GATCCGGAAG
Translation-Frame-4 201  A * Y L E E V L * Y L Q L W * A K
Translation-Frame-5 201  E I C S K L * N T S S Y G S P R
Translation-Frame-6 201  S L V A R * S T L L A T A V L G E

Open reading frame finder + ORF characteristics
E.g. GeneMark
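A translation in all reading frames like the one on this slide can be sketched in a few lines. This is a minimal illustration, not GeneMark and not the tool that produced the slide: it assumes the standard genetic code, the helper names (`translate`, `six_frames`, `longest_open_stretch`) are hypothetical, and the "+1".."-3" frame labels are our own convention that need not match the slide's Frame-4 to Frame-6 numbering. The test fragment is the first 50 nt row shown on the slide.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(seq, offset=0):
    """Translate complete codons starting at the given offset (0, 1 or 2)."""
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)

def six_frames(seq):
    """Translate three offsets on each strand (labels are our own)."""
    rc = reverse_complement(seq)
    frames = {f"+{i + 1}": translate(seq, i) for i in range(3)}
    frames.update({f"-{i + 1}": translate(rc, i) for i in range(3)})
    return frames

def longest_open_stretch(protein):
    """Longest run of amino acids not interrupted by a stop codon (*)."""
    return max(protein.split("*"), key=len)

# First sequence row from the slide (positions 151-200).
fragment = ("TATTTCGTAG" "TTATGTTGAA" "CCGATGAAAC" "TTGTTTGTTC" "TCAAATTGAG")
for label, protein in six_frames(fragment).items():
    print(label, protein, longest_open_stretch(protein))
```

A real gene finder goes further than this, weighing ORF length against characteristics such as start codons and codon usage.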

Page 14:

Edwards & Rohwer (2005): Bioinformatics and viral metagenomics

How to identify genes?

Sequence

Open reading frames

Predicted function

BlastP

Page 15:

Edwards & Rohwer (2005): Bioinformatics and viral metagenomics

How to identify genes?

Sequence

Open reading frames

Predicted function

BlastN?

SQ16. Other Blasts? TBlastX? Why so much time?

Page 16:

Edwards & Rohwer (2005): Bioinformatics and viral metagenomics

How to identify genes' viruses?

Page 17:

Edwards & Rohwer (2005): Bioinformatics and viral metagenomics

How to identify genes' viruses?

Page 18:

Page 19:

Codon usage in different organisms

SQ16. What does "codon usage" mean? How useful is it?

Page 20:

GC content in different organisms

SQ18. GC/AT differences in cyanobacterial genomes?

Page 21:

GC content in different organisms

Genome    GC fraction
S6301     0.5548433
S7942     0.554378
P9313     0.50739753
S6803     0.47359636
Npun      0.4135452
A7120     0.4126833
Tery      0.34196815
PRO1375   0.3644214
S8102     0.594126
Gvi       0.6199786
TeBP1     0.5391793
PMED4     0.3079916
Cwat      0.37098223
A29413    0.4141176
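Values like these are fractions of G plus C among the bases of a genome. A minimal sketch of the calculation follows; the function name `gc_content` is hypothetical, and ignoring ambiguous symbols such as N is an assumption of this sketch, not something the slide specifies.

```python
def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0  # empty or fully ambiguous sequence
    return (seq.count("G") + seq.count("C")) / counted

# Toy sequences, made up for illustration only.
print(gc_content("ATGCGCGGTA"))
print(gc_content("ATATTTAAGC"))
```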

Page 22:

Constancy of sequence characteristics

- GC content

- Codon frequencies

- Dinucleotide frequencies

DNA sequence
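Codon and dinucleotide frequencies are both counts of short words in the DNA sequence, so one helper can illustrate the two. The sketch below is a hedged example: `kmer_frequencies` and its `step` parameter are hypothetical names, and reading codons from position 0 assumes the sequence is already in frame.

```python
from collections import Counter

def kmer_frequencies(seq, k, step=1):
    """Relative frequency of each length-k word, sampled every `step` bases."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(0, len(seq) - k + 1, step))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}

# Toy sequences, made up for illustration only.
dinucleotides = kmer_frequencies("ATATGCGC", k=2)        # overlapping pairs
codons = kmer_frequencies("ATGAAAATGTAA", k=3, step=3)   # in-frame triplets
print(dinucleotides)
print(codons)
```

Dinucleotides are counted with overlap (step 1), while codons are counted in non-overlapping triplets (step 3).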

Page 23:

Constancy of sequence characteristics

DNA sequence

- GC content

- Codon frequencies

- Dinucleotide frequencies

Page 24:

Constancy of sequence characteristics

DNA sequence

- GC content

- Codon frequencies

- Dinucleotide frequencies

Page 25:

Constancy of sequence characteristics

Karlin S (2001). Trends Microbiol 9:335-343

Page 26:

Edwards & Rohwer (2005): Bioinformatics and viral metagenomics

How to identify genes' viruses?

- GC content

- Codon frequencies

- Dinucleotide frequencies

Virus #1, Virus #2, Virus #3, Virus #4, Virus #5, Virus #6, . . .

Viral fragment
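One way to picture this comparison is to compute a compositional signature for the viral fragment and for each known virus, then pick the virus whose signature is closest. The sketch below is only an illustration under assumptions of our own: it uses dinucleotide frequencies alone, Euclidean distance as the similarity measure, and hypothetical names (`signature`, `closest_genome`) with made-up toy sequences. It is not the method of Edwards & Rohwer or of Karlin.

```python
from collections import Counter
from itertools import product
from math import dist

# All 16 dinucleotides in a fixed order, so signatures are comparable vectors.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]

def signature(seq):
    """Vector of overlapping dinucleotide frequencies for a sequence."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts[d] for d in DINUCLEOTIDES) or 1  # avoid dividing by 0
    return [counts[d] / total for d in DINUCLEOTIDES]

def closest_genome(fragment, genomes):
    """Name of the reference whose signature is nearest to the fragment's."""
    frag_sig = signature(fragment)
    return min(genomes, key=lambda name: dist(frag_sig, signature(genomes[name])))

# Toy, made-up reference sequences purely for illustration.
reference = {
    "virus_AT_rich": "ATATTAATTTAAATATTTTAATAT",
    "virus_GC_rich": "GCGGCCGCGCCGGGCGCCGCGGCC",
}
print(closest_genome("TTAATATAAATT", reference))
```

Short fragments give noisy frequency estimates, so in practice the reliability of such an assignment depends strongly on fragment length.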

Page 27:

Page 28: