

Procedure Goals

Download all complete prokaryotic genomes from the NCBI RefSeq database.

Extract 16S rRNA sequences from each genome.

Use the UCLUST algorithm to cluster 16S sequences into OTUs at four different levels of similarity: 97, 98, 99, and 100 percent.

Run each cluster through a pipeline that compares all genes within each OTU and identifies genes that are shared among genomes and genes that are unique to specific genomes.
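The clustering step can be illustrated with a minimal Python sketch of greedy centroid clustering, the general idea behind UCLUST. This is not the UCLUST implementation itself: `identity_percent` is a simplified stand-in that assumes equal-length, pre-aligned sequences, and the genome names and sequences are hypothetical toy data rather than anything from the poster.

```python
from typing import Dict, List


def identity_percent(a: str, b: str) -> float:
    """Percent of matching positions between two equal-length sequences.

    Assumption: sequences are already aligned; real tools compute identity
    from an alignment rather than a position-by-position comparison.
    """
    if len(a) != len(b):
        raise ValueError("toy identity expects equal-length, pre-aligned sequences")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100 * matches / len(a)


def greedy_cluster(seqs: Dict[str, str], threshold: float) -> List[List[str]]:
    """Greedy centroid clustering.

    Each sequence joins the first existing centroid it matches at or above
    the threshold; otherwise it becomes the centroid of a new cluster.
    """
    centroids: List[str] = []
    clusters: List[List[str]] = []
    for name, seq in seqs.items():
        for i, centroid in enumerate(centroids):
            if identity_percent(seq, seqs[centroid]) >= threshold:
                clusters[i].append(name)
                break
        else:
            # No centroid was similar enough: start a new cluster.
            centroids.append(name)
            clusters.append([name])
    return clusters


# Hypothetical 100-nt toy sequences: B has 1 mismatch to A, C has 3.
base = "ACGTACGTAC" * 10
toy_16s = {
    "genome_A": base,
    "genome_B": "T" + base[1:],
    "genome_C": "TTT" + base[3:],
}

# Cluster at the four similarity levels named in the procedure.
for threshold in (97, 98, 99, 100):
    print(threshold, greedy_cluster(toy_16s, threshold))
```

Because the assignment is greedy, the result depends on input order; the sketch simply processes sequences in dictionary order.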

Figure (Venn diagrams): two OTUs with different numbers of unique and related genes. Each circle represents the genes within a single genome, with shared genes in cyan and unique genes in magenta.
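The regions of such a Venn diagram can be sketched with Python sets. The sketch assumes each gene has already been assigned a gene-family identifier, so a gene counts as shared when its family occurs in both genomes. The family names, the function names, and the choice to express percentages over the union of families are all illustrative assumptions, not the poster's pipeline.

```python
from typing import Dict, Set, Tuple


def venn_regions(genome_a: Set[str], genome_b: Set[str]) -> Dict[str, Set[str]]:
    """Split the gene families of two genomes into the three Venn regions."""
    return {
        "unique_to_a": genome_a - genome_b,
        "shared": genome_a & genome_b,
        "unique_to_b": genome_b - genome_a,
    }


def percent_unique_and_shared(genome_a: Set[str], genome_b: Set[str]) -> Tuple[float, float]:
    """Return (% unique, % shared) over the union of both genomes' families.

    Assumption: percentages are taken over the union; the poster does not
    state which denominator its pipeline used.
    """
    regions = venn_regions(genome_a, genome_b)
    total = len(genome_a | genome_b)
    if total == 0:
        return 0.0, 0.0
    unique = len(regions["unique_to_a"]) + len(regions["unique_to_b"])
    shared = len(regions["shared"])
    return 100 * unique / total, 100 * shared / total


# Hypothetical two-genome cluster described by gene-family identifiers.
genome_a = {"fam1", "fam2", "fam3", "fam4"}
genome_b = {"fam3", "fam4", "fam5"}

print(venn_regions(genome_a, genome_b))
print(percent_unique_and_shared(genome_a, genome_b))
```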

Results

Two of the most important observations were the numbers of unique and related genes. We found that the number of unique genes and the number of genes in families at a given percent similarity were each distributed approximately normally. At all levels of cluster similarity, the median percentages of unique and shared genes in each two-genome cluster were approximately 30% and 65%, respectively. Our hypothesis was rejected: genomes with similar 16S sequences were not nearly identical in gene content. The graph "Mean Percentage of Unique/Shared Genes for Two Genome Clusters" shows the mean percentages of unique and shared genes in every two-genome cluster at each level of similarity.
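Summary statistics of this kind (a median and a mean per similarity level) could be computed roughly as follows. This is a sketch using Python's standard `statistics` module; the function name `summarize` and the input values are hypothetical placeholders chosen for easy arithmetic and are not the poster's data.

```python
from statistics import mean, median
from typing import Dict, List, Tuple


def summarize(clusters: List[Tuple[float, float]]) -> Dict[str, float]:
    """Summarize (% unique, % shared) pairs, one pair per two-genome cluster."""
    unique = [u for u, _ in clusters]
    shared = [s for _, s in clusters]
    return {
        "median_unique": median(unique),
        "median_shared": median(shared),
        "mean_unique": mean(unique),
        "mean_shared": mean(shared),
    }


# Placeholder values only: each list holds (% unique, % shared) per cluster,
# keyed by the 16S similarity level used for clustering.
toy_results = {
    97: [(10.0, 90.0), (30.0, 70.0), (50.0, 50.0)],
    100: [(20.0, 80.0), (40.0, 60.0)],
}

for level, clusters in sorted(toy_results.items()):
    print(level, summarize(clusters))
```

Reporting both statistics is useful because the text above quotes medians while the graph plots means; the two agree only when the distribution is roughly symmetric.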

Determining the reliability of 16S ribosomal RNA sequences as a proxy for whole genome taxonomic characterization

Background

16S ribosomal RNA is often used as a marker to cluster prokaryotes functionally and taxonomically into quasi-phylogenetic clusters known as Operational Taxonomic Units (OTUs). However, organisms with similar or even identical 16S sequences can differ greatly in their protein-coding sequences.

Objective

To investigate the similarity of genomes within OTUs on a gene-by-gene basis.

Hypothesis

We hypothesized that genomes belonging to prokaryotes with similar 16S sequences would be nearly identical in gene content.

Shane Kochvi (1), Jordan Ramsdell (2), Phillip Hatcher (1), W. Kelley Thomas (2)

1. Department of Computer Science, University of New Hampshire
2. Department of Genetics, University of New Hampshire

This poster was made possible by the IDeA Program, NIH Grant No. P20GM103506 (National Institute of General Medical Sciences)


Figure: Mean Percentage of Unique/Shared Genes for Two Genome Clusters. Horizontal axis: Cluster % Similarity (97, 98, 99, 100). Vertical axis: Percentage Unique/Shared (0% to 80%). Series: % Unique Genes, % Shared Genes.