Doug Brutlag 2011 Next Generation Sequencing and Human Genome Databases Doug Brutlag Professor...

Page 1:

Doug Brutlag 2011

Next Generation Sequencing and Human Genome Databases

Doug Brutlag

Professor Emeritus of Biochemistry & Medicine

Stanford University School of Medicine

Genomics, Bioinformatics & Medicine

http://biochem158.stanford.edu/

Page 2:

Illumina Solexa Sequencing Technology

Page 3:

Emulsion Based Clonal Amplification

[Figure, panels A and B: emulsion PCR workflow]

• Adapter-carrying library DNA
• Anneal DNA template to capture beads
• “Water-in-oil” emulsion (micro-reactors): + PCR reagents, + emulsion oil
• Perform emulsion PCR
• Break micro-reactors
• Isolate DNA-containing beads

• Single test tube generation of millions of clonally amplified sequencing templates
• No cloning and colony picking

Page 4:

Pacific Biosciences SMRT Sequencing

Page 5:

Pacific Biosciences Sequencing

Page 6:

Phospholinked Fluorophores

Page 7:

Processive Synthesis

Page 8:

Synthesis of Long Duplex DNA

Page 9:

Circular Templates Give Redundant Sequencing and Accuracy

Page 10:

Circular Templates Give Redundant Sequencing and Accuracy
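
The point of these two slides, that reading the same circular template several times improves accuracy, can be illustrated with a toy majority vote. This is only a sketch under strong assumptions: the passes are taken to be already aligned and of equal length, and the `consensus` helper and the example reads are made up for illustration. It is not the consensus algorithm Pacific Biosciences actually uses.

```python
from collections import Counter


def consensus(passes):
    """Majority-vote consensus over aligned, equal-length passes (toy model)."""
    if not passes:
        raise ValueError("need at least one pass")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("passes must be pre-aligned to the same length")
    # For each aligned column, keep the base seen in the most passes.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*passes)
    )


# Hypothetical passes over the same insert; each contains one error
# at a different position, so the vote recovers the underlying sequence.
passes = ["ACGTTGCA", "ACGTAGCA", "ACCTAGCA", "ACGTAGGA"]
result = consensus(passes)
```

Because independent errors rarely hit the same position in most passes, each extra trip around the circle makes a wrong base in the vote less likely.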

Page 11:

Ion Torrent Sequencing

Page 12:

Ion Torrent Sequencing

Page 13:

Ion Torrent Sequencing

Page 14:

The Human Genome

How fast is the cost going down?

• 2006: $50 million
• 2008: $500,000
• 2009: $50,000
• 2010: $20,000
• 2011: $5,000
• 2012: ??? $1,000

Thanks to Serafim Batzoglou
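
To put a number on "how fast", the figures on this slide can be turned into an average fold-drop per year between consecutive data points. This is a rough sketch: the `fold_drop_per_year` helper is hypothetical, it assumes a smooth geometric decline between the listed years, and it leaves out the 2012 figure because the slide marks it as uncertain.

```python
# Cost per human genome in US dollars, copied from the slide.
costs = {2006: 50_000_000, 2008: 500_000, 2009: 50_000, 2010: 20_000, 2011: 5_000}


def fold_drop_per_year(costs):
    """Average yearly fold-drop between each pair of consecutive data points."""
    years = sorted(costs)
    drops = {}
    for prev, curr in zip(years, years[1:]):
        total_fold = costs[prev] / costs[curr]
        # Geometric mean per year when the points are more than a year apart.
        drops[(prev, curr)] = total_fold ** (1 / (curr - prev))
    return drops


drops = fold_drop_per_year(costs)
for (prev, curr), fold in drops.items():
    print(f"{prev} to {curr}: about {fold:.1f}-fold cheaper per year")
```

On these numbers the cost fell roughly 10-fold per year from 2006 to 2009, then 2.5-fold and 4-fold in the following two years.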

Page 15:

Archon Genomics X-Prize

Page 16:

Archon Genomics X-Prize

Page 17:

Components of a Typical Human Gene

[Diagram labels: Gene; TFBS, Promoter, Exon, Intron, Exon, Intron, Exon, Terminator]

Page 18:

Active Genes are Transcribed into RNA

[Diagram labels: Gene (Promoter, Exon, Intron, Exon, Intron, Exon, Terminator); Primary Transcript]

Page 19:

Splicing Transcript Yields Mature mRNA

[Diagram labels: Gene (Promoter, Exon, Intron, Exon, Intron, Exon, Terminator); Transcript; Splicing; mRNA (5’ to 3’)]
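
The splicing step on this slide amounts to cutting the introns out of the primary transcript and joining the exons in order. The sketch below shows that on a made-up sequence; the `splice` helper, the transcript, and the exon coordinates are all hypothetical, and real splicing is carried out by the spliceosome at sequence signals rather than at coordinates given in advance.

```python
def splice(transcript, exons):
    """Join exon intervals (0-based, end-exclusive) of a primary transcript."""
    previous_end = 0
    pieces = []
    for start, end in exons:
        if not (previous_end <= start < end <= len(transcript)):
            raise ValueError("exons must be ordered, non-overlapping and in range")
        pieces.append(transcript[start:end])
        previous_end = end
    return "".join(pieces)


# Hypothetical primary transcript: exons in upper case, introns in lower case.
# The toy introns begin with "gu" and end with "ag", as most real introns do.
primary = "AUGGCC" + "guaaguuuuag" + "GAUUAC" + "guaugcccag" + "AUAUGA"
exons = [(0, 6), (17, 23), (33, 39)]
mrna = splice(primary, exons)
```

Here three exons separated by two introns, as in the diagram, collapse into a single continuous mRNA.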

Page 20:

Mature mRNA contains Coding Region and 5’ and 3’ Untranslated Regions

[Diagram labels: Gene (Promoter, Exon, Intron, Exon, Intron, Exon, Terminator); Transcript (5’ UTR, 3’ UTR); Splicing; mRNA (5’ UTR, Coding Region, 3’ UTR; 5’ to 3’)]
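
The three regions named on this slide can be located in a toy mRNA by finding a start codon and the first stop codon in the same reading frame. This is a simplified sketch: the `split_mrna` helper and the sequence are made up, and it assumes translation starts at the first AUG, which real mRNAs do not always do.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def split_mrna(mrna):
    """Return (5' UTR, coding region, 3' UTR) for a toy mRNA string.

    Assumes the first AUG is the start codon and the coding region ends
    at the first in-frame stop codon (the stop codon is included).
    """
    start = mrna.find("AUG")
    if start == -1:
        raise ValueError("no start codon found")
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            end = i + 3
            return mrna[:start], mrna[start:end], mrna[end:]
    raise ValueError("no in-frame stop codon found")


# Hypothetical mRNA: 5' UTR + coding region + 3' UTR.
mrna = "GGCACC" + "AUGGCCGAUUACUGA" + "CCUUAAUAAA"
utr5, coding, utr3 = split_mrna(mrna)
```

The UTRs are part of the mature mRNA but are not translated; only the coding region between them specifies the protein.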

Page 21:

Mature mRNA contains 7-Methyl-Guanosine 5’ Cap and 3’ Poly A Tail

[Diagram labels: Gene (Promoter, Exon, Intron, Exon, Intron, Exon, Terminator); Transcript (5’ UTR, 3’ UTR); Splicing; mRNA (7-Me-G Cap, 5’ UTR, Coding Region, 3’ UTR, 3’ Poly A Tail)]

Page 22:

ESTs, Full Length cDNA, UniGene & RefSeq Databases

[Diagram labels: Gene (Promoter, Exon, Intron, Exon, Intron, Exon, Terminator); Transcript; Splicing; mRNA (5’ to 3’); 5’ ESTs; 3’ ESTs; Full Length cDNA]

Page 23:

ESTs, Full Length cDNA, UniGene & RefSeq Databases

[Diagram labels: Gene (Promoter, Exon, Intron, Exon, Intron, Exon, Terminator); Transcript (5’ UTR, 3’ UTR); Splicing; mRNA (5’ UTR, 3’ UTR; 5’ to 3’); 5’ ESTs; 3’ ESTs; Full Length cDNA; Protein; Proteins]

Page 24:

GENSCAN Gene Model

http://genes.mit.edu/GENSCAN.html

Hidden Markov models of gene structure
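
To give a feel for what a hidden Markov model of gene structure does, here is a toy two-state model decoded with the Viterbi algorithm. Everything in it is an assumption made for illustration: the two states, the probabilities, and the test sequence are invented, and the model is far simpler than the one GENSCAN uses. It only shows the general idea of labelling each base with its most likely hidden state.

```python
import math

# Toy parameters (hypothetical): "exon" favours G/C, "intron" favours A/T,
# and both states tend to persist from one base to the next.
STATES = ("exon", "intron")
START = {"exon": 0.5, "intron": 0.5}
TRANS = {
    "exon": {"exon": 0.9, "intron": 0.1},
    "intron": {"exon": 0.1, "intron": 0.9},
}
EMIT = {
    "exon": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
    "intron": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
}


def viterbi(seq):
    """Most likely state path for seq under the toy model (log-space Viterbi)."""
    if not seq:
        return []
    # score[s] = best log-probability of any path that ends in state s.
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    backpointers = []
    for base in seq[1:]:
        new_score, pointer = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            pointer[s] = best_prev
            new_score[s] = (
                score[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][base])
            )
        backpointers.append(pointer)
        score = new_score
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointer in reversed(backpointers):
        state = pointer[state]
        path.append(state)
    return path[::-1]


path = viterbi("GCGCGCGCATATATAT")
```

On this sequence the decoder labels the GC-rich half as "exon" and the AT-rich half as "intron", switching state once because one transition costs less than eight poorly fitting emissions.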

Page 25:

Genome Databases

[Diagram panels]

• Genomic DNA: Assembled contigs
• A Mapping: uniSTS, dbSNP
• B Gene Prediction: GrailEXP, GenScan, FGENESH, FGENESH+, GeneMark
• C Expression Data
• D Protein Similarity: nrPRO, pFAM Motifs
• E Additional Data: Promoters
• F Summary: Entrez Gene, UCSC Browser, Ensembl
• Other labels: Mouse ESTs, Human ESTs, Entrez Gene Mouse, RefSeq Mouse, UniGene Human, RefSeq Human, Ensembl, cDNA

Page 26:

Entrez Gene Loci

http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene

[Figure tracks: NR-Pro, Genscan, GrailEXP, FGENESH, Entrez Gene, UniGene, ESTs]

Page 27:

Alternative Splicing Generates Distinct Proteins in Different Tissues

[Diagram labels: Gene (Promoter, Exon, Intron, Exon, Intron, Exon, Terminator); Transcript; Splicing; mRNA-1 (5’ to 3’); Transcript; Alternate Splicing; mRNA-2 (5’ to 3’)]
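
One simple way a single gene yields more than one mRNA is exon skipping, where an internal exon is left out of the mature message. The sketch below enumerates such variants for a made-up three-exon gene. The `exon_skipping_isoforms` helper and the exon sequences are hypothetical, exon skipping is only one of several kinds of alternative splicing, and nothing here models which variant a given tissue produces.

```python
from itertools import combinations


def exon_skipping_isoforms(exons):
    """All mRNAs that keep the first and last exon plus any subset of
    internal exons, always in their original order (toy model)."""
    if len(exons) < 2:
        raise ValueError("need at least a first and a last exon")
    internal = range(1, len(exons) - 1)
    isoforms = []
    # Start with every internal exon kept, then skip progressively more.
    for n_kept in range(len(internal), -1, -1):
        for kept in combinations(internal, n_kept):
            indices = [0, *kept, len(exons) - 1]
            isoforms.append("".join(exons[i] for i in indices))
    return isoforms


# Hypothetical exons of one gene, already free of introns.
exons = ["AUGGCC", "GAUUAC", "AUAUGA"]
mrna_1, mrna_2 = exon_skipping_isoforms(exons)
```

With three exons there are two variants: `mrna_1` keeps all of them and `mrna_2` skips the middle one, so the two messages would encode different proteins.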

Page 28:

NCBI Genomes

http://www.ncbi.nlm.nih.gov/sites/entrez?db=genome

Page 29:

Eukaryote Genome Projects

http://www.ncbi.nlm.nih.gov/genomes/leuks.cgi

Page 30:

Canis lupus familiaris Genome

http://www.ncbi.nlm.nih.gov/sites/entrez?db=bioproject&cmd=Retrieve&dopt=Overview&list_uids=10726

Page 31:

NCBI Entrez Gene

http://www.ncbi.nlm.nih.gov/sites/entrez?db=gene

Page 32:

NCBI Entrez Gene: Human Opsin

http://www.ncbi.nlm.nih.gov/gene?term=human%20opsin

Page 33:

Entrez Gene: Human Opsin OPN1MW

http://www.ncbi.nlm.nih.gov/gene/2652

Page 34:

Entrez Gene: Human Opsin OPN1MW

http://www.ncbi.nlm.nih.gov/gene/2652

Page 35:

MapViewer: Human Opsin OPN1MW

http://www.ncbi.nlm.nih.gov/gene/2652

Page 36:

Evidence Viewer for OPN1MW

http://www.ncbi.nlm.nih.gov/sutils/evv.cgi?taxid=9606&contig=NT_167198.1&gene=OPN1MW&lid=2652&from=4366022&to=4380289

Page 37:

OMIM Home Page

http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM

Page 38:

Colorblindness in OMIM

Page 39:

Colorblindness in OMIM

http://omim.org/entry/303800

Page 40:

Human Genome Resources

http://www.ncbi.nlm.nih.gov/genome/guide/human/

Page 41:

RefSeq

http://www.ncbi.nlm.nih.gov/RefSeq/

Page 42:

RefSeq Gene

http://www.ncbi.nlm.nih.gov/refseq/rsg/

Page 43:

NCBI UniGene

http://www.ncbi.nlm.nih.gov/sites/entrez?db=unigene

Page 44:

NCBI Homologene Database

http://www.ncbi.nlm.nih.gov/homologene

Page 45:

Comparative Genomics

Page 46:

Ensembl Home Page

http://www.ensembl.org/

Page 47:

EBI Genomes Home Page

http://www.ensembl.org/

Page 48:

Ensembl Human Genome

http://www.ensembl.org/Homo_sapiens/

Page 49:

Ensembl Human Opsin Search

http://uswest.ensembl.org/Homo_sapiens/Search/Results?species=Homo_sapiens;idx=;q=opsin

Page 50:

Ensembl Human Opsin Genes

http://uswest.ensembl.org/Homo_sapiens/Search/Results?species=Homo_sapiens;idx=;q=opsin

Page 51:

Ensembl Human OPN1MW Gene

http://uswest.ensembl.org/Homo_sapiens/Gene/Summary?g=ENSG00000147380;r=X:153448107-153461633
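
The `r=` parameter in the URL above names a genomic region as chromosome, start, and end. A short sketch of how such a string might be taken apart follows; the `parse_region` and `region_length` helpers are hypothetical, and the length calculation assumes the coordinates are 1-based and inclusive at both ends.

```python
import re

# Matches strings such as "X:153448107-153461633".
REGION = re.compile(r"^(?P<chrom>[^:]+):(?P<start>\d+)-(?P<end>\d+)$")


def parse_region(text):
    """Split a 'chromosome:start-end' string into (chromosome, start, end)."""
    match = REGION.match(text)
    if match is None:
        raise ValueError(f"not a region string: {text!r}")
    start, end = int(match["start"]), int(match["end"])
    if start > end:
        raise ValueError("start must not exceed end")
    return match["chrom"], start, end


def region_length(text):
    """Number of bases in the region, assuming 1-based inclusive coordinates."""
    _, start, end = parse_region(text)
    return end - start + 1


chrom, start, end = parse_region("X:153448107-153461633")
length = region_length("X:153448107-153461633")
```

For the OPN1MW region in the URL this gives a span of 13,527 bases on chromosome X.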

Page 52:

Ensembl Opsin OPN1MW Gene Location

http://uswest.ensembl.org/Homo_sapiens/Location/View?h=Havana%20gene;r=X:153448107-153461633#r=X:153448109-153461632

Page 53:

Ensembl OPN1MW Transcripts

http://uswest.ensembl.org/Homo_sapiens/Location/View?h=Havana%20gene;r=X:153448107-153461633#r=X:153448109-153461632

Page 54:

Ensembl OPN1MW Opsin Protein

http://uswest.ensembl.org/Homo_sapiens/Transcript/ProteinSummary?db=core;g=ENSG00000147380;r=X:153448107-153461633;t=ENST00000369935

Page 55:

Ensembl Tutorials

http://uswest.ensembl.org/info/website/tutorials/index.html

Page 56:

UCSC Genome Home Page

http://genome.ucsc.edu/

Page 57:

UCSC Genome Browser

http://genome.ucsc.edu/cgi-bin/hgGateway

Page 58:

UCSC Genome Browser

http://genome.ucsc.edu/cgi-bin/hgGateway

Page 59:

UCSC Genome Browser

http://genome.ucsc.edu/cgi-bin/hgTracks?position=chrX:153485203-153499469&hgsid=216983641&knownGene=pack&hgFind.matches=uc004fkd.2,

Page 60:

UCSC Genome Browser

http://genome.ucsc.edu/cgi-bin/hgTracks?position=chrX:153485203-153499469&hgsid=216983641&knownGene=pack&hgFind.matches=uc004fkd.2,

Page 61:

UCSC Proteome Browser

http://genome.ucsc.edu/cgi-bin/pbGateway

Page 62:

UCSC Proteome Browser

http://genome.ucsc.edu/cgi-bin/pbGateway

Page 63:

UCSC Help File

http://genome.ucsc.edu/goldenPath/help/hgTracksHelp.html