Doug Brutlag 2011

Next Generation Sequencing and Human Genome Databases

Doug Brutlag

Professor Emeritus of Biochemistry & Medicine

Stanford University School of Medicine

Genomics, Bioinformatics & Medicine

http://biochem158.stanford.edu/

Illumina Solexa Sequencing Technology

Emulsion Based Clonal Amplification

Anneal DNA template to capture beads
+ PCR Reagents
+ Emulsion Oil
“Water-in-oil” emulsion
Micro-reactors
Adapter carrying library DNA
Perform emulsion PCR
Break micro-reactors
Isolate DNA containing beads

Single test tube generation of millions of clonally amplified sequencing templates
No cloning and colony picking

Pacific Biosciences SMRT Sequencing

Pacific Biosciences Sequencing

Phospholinked Fluorophores

Processive Synthesis

Synthesis of Long Duplex DNA

Circular Templates Give Redundant Sequencing and Accuracy

Ion Torrent Sequencing

The Human Genome

How fast is the cost going down?

• 2006: $50 million
• 2008: $500,000
• 2009: $50,000
• 2010: $20,000
• 2011: $5,000
• 2012: ??? $1,000

Thanks to Serafim Batzoglou
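
The rate of decline can be made concrete with a little arithmetic. The sketch below uses only the figures listed on the slide and computes the fold reduction between two years, plus the constant yearly factor that would give the same overall drop. The function names are illustrative, and the 2012 figure is left out because the slide marks it as uncertain.

```python
# Cost per human genome in US dollars, as listed on the slide.
# The 2012 value is omitted because the slide marks it with "???".
COST_BY_YEAR = {
    2006: 50_000_000,
    2008: 500_000,
    2009: 50_000,
    2010: 20_000,
    2011: 5_000,
}


def fold_reduction(start_year, end_year):
    """How many times cheaper a genome was in end_year than in start_year."""
    return COST_BY_YEAR[start_year] / COST_BY_YEAR[end_year]


def yearly_factor(start_year, end_year):
    """Constant per-year factor that would give the same overall reduction."""
    years = end_year - start_year
    return fold_reduction(start_year, end_year) ** (1 / years)


if __name__ == "__main__":
    listed = sorted(COST_BY_YEAR)
    for earlier, later in zip(listed, listed[1:]):
        print(f"{earlier} -> {later}: {fold_reduction(earlier, later):,.0f}-fold cheaper")
    print(f"2006 -> 2011: {fold_reduction(2006, 2011):,.0f}-fold overall, "
          f"about {yearly_factor(2006, 2011):.1f}-fold per year")
```

For the listed figures this works out to a 10,000-fold drop from 2006 to 2011, or roughly 6.3-fold per year.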

Archon Genomics X-Prize

Components of a Typical Human Gene

Figure labels: Gene (TFBS, Promoter, Exon, Intron, Exon, Intron, Exon, Terminator)

Active Genes are Transcribed into RNA

Figure labels: Gene (Promoter, Exon, Intron, Exon, Intron, Exon, Terminator); Primary Transcript

Splicing Transcript Yields Mature mRNA

Figure labels: Gene (Promoter, Exon, Intron, Exon, Intron, Exon, Terminator); Transcript (5’, 3’); Splicing; mRNA

Mature mRNA contains Coding Region and 5’ and 3’ Untranslated Regions

Figure labels: Gene (Promoter, Exon, Intron, Exon, Intron, Exon, Terminator); Transcript (5’, 3’); Splicing; mRNA; 5’ UTR; Coding Region; 3’ UTR

Mature mRNA contains 7-Methyl-Guanosine 5’ Cap and 3’ Poly A Tail

Figure labels: Gene (Promoter, Exon, Intron, Exon, Intron, Exon, Terminator); Transcript; Splicing; mRNA; 7-Me-G-Cap; 5’ UTR; Coding Region; 3’ UTR; 3’ Poly A Tail
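
The path from primary transcript to mature mRNA on the preceding slides can be sketched in a few lines. This is a toy model under stated assumptions: the exon and intron sequences are invented, the coding region is taken to run from the first AUG to the first in-frame stop codon, and the 5’ cap is not modeled because it is a modified nucleotide rather than a sequence feature. All function names are illustrative.

```python
# Toy primary transcript assembled from hypothetical exon and intron sequences.
PARTS = [
    ("exon", "GGACAUGGCU"),
    ("intron", "GUAAGUUUUCAG"),
    ("exon", "GAUCGUAAA"),
    ("intron", "GUAAGCCCUAG"),
    ("exon", "CCUUGAGCAAUAAAGC"),
]
STOP_CODONS = {"UAA", "UAG", "UGA"}


def build_primary_transcript(parts):
    """Concatenate all parts and record each exon span (0-based, end-exclusive)."""
    primary, exon_spans = "", []
    for kind, seq in parts:
        if kind == "exon":
            exon_spans.append((len(primary), len(primary) + len(seq)))
        primary += seq
    return primary, exon_spans


def splice(primary, exon_spans):
    """Remove introns by joining the exon segments in order."""
    return "".join(primary[start:end] for start, end in exon_spans)


def split_regions(mrna):
    """Split spliced mRNA into 5' UTR, coding region, and 3' UTR.

    Simplification: the coding region runs from the first AUG to the
    first in-frame stop codon.
    """
    start = mrna.find("AUG")
    if start == -1:
        raise ValueError("no start codon found")
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return mrna[:start], mrna[start:i + 3], mrna[i + 3:]
    raise ValueError("no in-frame stop codon found")


def add_poly_a_tail(mrna, length=20):
    """Append a poly A tail; the 5' cap is not represented in the string."""
    return mrna + "A" * length


primary, exon_spans = build_primary_transcript(PARTS)
mrna = splice(primary, exon_spans)
utr5, coding, utr3 = split_regions(mrna)
mature = add_poly_a_tail(mrna)
print("5' UTR:", utr5)
print("Coding region:", coding)
print("3' UTR:", utr3)
```

Splicing by recorded exon coordinates, rather than by searching for intron boundaries in the sequence, keeps the sketch simple and mirrors how annotation databases store gene structure.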

ESTs, Full Length cDNA, UniGene & RefSeq Databases

Figure labels: Gene (Promoter, Exon, Intron, Exon, Intron, Exon, Terminator); Transcript (5’, 3’); Splicing; mRNA; 5’ UTR; 3’ UTR; 5’ ESTs; 3’ ESTs; Full Length cDNA; Proteins; Protein
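
As a rough illustration of how ESTs relate to a full length cDNA, the sketch below treats a 5’ EST and a 3’ EST as short reads taken from the two ends of one cDNA. The sequence, read length, and function names are hypothetical. As a simplification, both reads are returned in mRNA orientation, whereas real 3’ reads are often reported as the reverse complement.

```python
def ests_from_cdna(cdna, read_length):
    """Return (5' EST, 3' EST): end reads of a cDNA, both in mRNA orientation."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    return cdna[:read_length], cdna[-read_length:]


def unsequenced_middle(cdna, read_length):
    """Part of the cDNA covered by neither end read (empty if the reads meet)."""
    if 2 * read_length >= len(cdna):
        return ""
    return cdna[read_length:-read_length]


# Hypothetical 35-nucleotide cDNA; real cDNAs are far longer than one read.
cdna = "GGACATGGCTGATCGTAAACCTTGAGCAATAAAGC"
est5, est3 = ests_from_cdna(cdna, 8)
gap = unsequenced_middle(cdna, 8)
print("5' EST:", est5)
print("3' EST:", est3)
print("Bases seen only in the full length cDNA:", len(gap))
```

The uncovered middle is the reason a full length cDNA carries information that a pair of ESTs does not.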

GENSCAN Gene Model

http://genes.mit.edu/GENSCAN.html

Hidden Markov models of gene structure
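
To give a feel for what a hidden Markov model of gene structure does, here is a minimal two-state sketch decoded with the Viterbi algorithm. It is not GENSCAN's model, which is far richer. The two states, every probability, and the test sequence are invented for illustration, with exons assumed GC-rich and introns AT-rich.

```python
import math

# Hypothetical toy model: all numbers below are illustrative assumptions.
STATES = ("exon", "intron")
START = {"exon": 0.5, "intron": 0.5}
TRANS = {
    "exon": {"exon": 0.9, "intron": 0.1},
    "intron": {"exon": 0.1, "intron": 0.9},
}
EMIT = {
    "exon": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "intron": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
}


def viterbi(seq):
    """Return the most probable state path for a DNA string of A, C, G, T."""
    if not seq:
        return []
    # Log probabilities avoid numerical underflow on long sequences.
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    back = []
    for base in seq[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = (score[prev] + math.log(TRANS[prev][s])
                            + math.log(EMIT[s][base]))
            pointers[s] = prev
        back.append(pointers)
        score = new_score
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


print(viterbi("GCGCGCATATAT"))
```

On the example string the decoded path labels the GC-rich half as exon and the AT-rich half as intron, which is the kind of segmentation a gene-finding HMM produces at much larger scale.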

Genome Databases

Genomic DNA: Assembled contigs
A Mapping: uniSTS, dbSNP
B Gene Prediction: GrailEXP, GenScan, FGENESH, FGENESH+, GeneMark
C Expression Data: Mouse ESTs, Human ESTs, Entrez Gene Mouse, RefSeq Mouse, UniGene Human, RefSeq Human, Ensembl, cDNA
D Protein Similarity: nrPRO, pFAM Motifs
E Additional Data: Promoters
F Summary: Entrez Gene, UCSC Browser, Ensembl

Entrez Gene Loci

http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene

NR-Pro

Genscan

GrailEXP

FGENESH

Entrez Gene

UniGene

ESTs

Alternative Splicing Generates Distinct Proteins in Different Tissues

Figure labels: Gene (Promoter, Exon, Intron, Exon, Intron, Exon, Terminator); Transcript (5’, 3’); Splicing; mRNA-1; Transcript (5’, 3’); Alternate Splicing; mRNA-2
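
A small sketch of the idea: the same three exons, joined in two different ways, give two mRNAs that translate into different proteins. The exon sequences are hypothetical, and the codon table is deliberately partial, covering only the codons used here. Tissue-specific regulation is not modeled.

```python
# Hypothetical exon sequences; skipping exon2 keeps the reading frame intact.
EXONS = {"exon1": "AUGGCU", "exon2": "GAUCGU", "exon3": "AAAUGA"}

ISOFORMS = {
    "mRNA-1": ["exon1", "exon2", "exon3"],  # all exons retained
    "mRNA-2": ["exon1", "exon3"],           # exon2 skipped
}

# Partial standard genetic code: only the codons that occur in this example.
CODON_TABLE = {
    "AUG": "M", "GCU": "A", "GAU": "D",
    "CGU": "R", "AAA": "K", "UGA": "*",
}


def assemble(exon_names):
    """Join the named exons in order to form a spliced mRNA."""
    return "".join(EXONS[name] for name in exon_names)


def translate(mrna):
    """Translate from position 0 until the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


for name, exon_names in ISOFORMS.items():
    mrna = assemble(exon_names)
    print(name, mrna, translate(mrna))
```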

NCBI Genomes

http://www.ncbi.nlm.nih.gov/sites/entrez?db=genome

Eukaryote Genome Projects

http://www.ncbi.nlm.nih.gov/genomes/leuks.cgi

Canis lupus familiaris Genome

http://www.ncbi.nlm.nih.gov/sites/entrez?db=bioproject&cmd=Retrieve&dopt=Overview&list_uids=10726

NCBI Entrez Gene

http://www.ncbi.nlm.nih.gov/sites/entrez?db=gene

NCBI Entrez Gene: Human Opsin

http://www.ncbi.nlm.nih.gov/gene?term=human%20opsin
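
The Entrez Gene addresses on these slides follow two simple patterns, a text search and a record lookup by numeric gene ID. The sketch below only builds such addresses as strings and makes no network request. The helper names are hypothetical, and the URL patterns are copied from the slides, so they reflect the site as it was in 2011 and may have changed since.

```python
from urllib.parse import quote

# URL patterns as they appear on the slides (assumed unchanged here).
ENTREZ_GENE_SEARCH = "http://www.ncbi.nlm.nih.gov/gene?term="
ENTREZ_GENE_RECORD = "http://www.ncbi.nlm.nih.gov/gene/"


def gene_search_url(term):
    """Build a text-search address; spaces are percent-encoded as %20."""
    return ENTREZ_GENE_SEARCH + quote(term)


def gene_record_url(gene_id):
    """Build the address of a single gene record from its numeric ID."""
    return f"{ENTREZ_GENE_RECORD}{int(gene_id)}"


print(gene_search_url("human opsin"))
print(gene_record_url(2652))  # 2652 is the OPN1MW gene ID used on the slides
```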

Entrez Gene: Human Opsin OPN1MW

http://www.ncbi.nlm.nih.gov/gene/2652

MapViewer: Human Opsin OPN1MW

http://www.ncbi.nlm.nih.gov/gene/2652

Evidence Viewer for OPN1MW

http://www.ncbi.nlm.nih.gov/sutils/evv.cgi?taxid=9606&contig=NT_167198.1&gene=OPN1MW&lid=2652&from=4366022&to=4380289

OMIM Home Page

http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM

Colorblindness in OMIM

http://omim.org/entry/303800

Human Genome Resources

http://www.ncbi.nlm.nih.gov/genome/guide/human/

RefSeq

http://www.ncbi.nlm.nih.gov/RefSeq/

RefSeq Gene

http://www.ncbi.nlm.nih.gov/refseq/rsg/

NCBI UniGene

http://www.ncbi.nlm.nih.gov/sites/entrez?db=unigene

NCBI Homologene Database

http://www.ncbi.nlm.nih.gov/homologene

Comparative Genomics

Ensembl Home Page

http://www.ensembl.org/

EBI Genomes Home Page

http://www.ensembl.org/

Ensembl Human Genome

http://www.ensembl.org/Homo_sapiens/

Ensembl Human Opsin Search

http://uswest.ensembl.org/Homo_sapiens/Search/Results?species=Homo_sapiens;idx=;q=opsin

Ensembl Human Opsin Genes

http://uswest.ensembl.org/Homo_sapiens/Search/Results?species=Homo_sapiens;idx=;q=opsin

Ensembl Human OPN1MW Gene

http://uswest.ensembl.org/Homo_sapiens/Gene/Summary?g=ENSG00000147380;r=X:153448107-153461633

Ensembl Opsin OPN1MW Gene Location

http://uswest.ensembl.org/Homo_sapiens/Location/View?h=Havana%20gene;r=X:153448107-153461633#r=X:153448109-153461632

Ensembl OPN1MW Transcripts

http://uswest.ensembl.org/Homo_sapiens/Location/View?h=Havana%20gene;r=X:153448107-153461633#r=X:153448109-153461632

Ensembl OPN1MW Opsin Protein

http://uswest.ensembl.org/Homo_sapiens/Transcript/ProteinSummary?db=core;g=ENSG00000147380;r=X:153448107-153461633;t=ENST00000369935

Ensembl Tutorials

http://uswest.ensembl.org/info/website/tutorials/index.html

UCSC Genome Home Page

http://genome.ucsc.edu/

UCSC Genome Browser

http://genome.ucsc.edu/cgi-bin/hgGateway

UCSC Genome Browser

http://genome.ucsc.edu/cgi-bin/hgTracks?position=chrX:153485203-153499469&hgsid=216983641&knownGene=pack&hgFind.matches=uc004fkd.2,
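
Both browsers address a genomic region with a string of the form chromosome:start-end, as in the `position=` value above and the `r=` value in the Ensembl addresses. The sketch below parses such strings and reports the length of the region. The function names are hypothetical, and the length calculation assumes the coordinates are 1-based and inclusive. The slides quote different coordinates for the two browsers, and the sketch makes no attempt to reconcile them.

```python
import re

# Accepts "X:153448107-153461633" and "chrX:153485203-153499469".
REGION_PATTERN = re.compile(r"^(?:chr)?(\w+):(\d+)-(\d+)$")


def parse_region(text):
    """Return (chromosome, start, end) from a browser region string."""
    match = REGION_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a region string: {text!r}")
    chrom = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError("start must not exceed end")
    return chrom, start, end


def region_length(text):
    """Length in bases, assuming 1-based inclusive coordinates."""
    _, start, end = parse_region(text)
    return end - start + 1


for region in ("X:153448107-153461633", "chrX:153485203-153499469"):
    print(region, parse_region(region), region_length(region))
```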

UCSC Proteome Browser

http://genome.ucsc.edu/cgi-bin/pbGateway

UCSC Help File

http://genome.ucsc.edu/goldenPath/help/hgTracksHelp.html