RNA Sequencing - Netherlands Bioinformatics Centre… · From next to 3rd generation sequencing...
![Page 1: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/1.jpg)
RNA Sequencing
05-06-2013, Elio Schijlen
Next gen insight into transcriptomes
![Page 2: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/2.jpg)
Transcriptome: the complete set of transcripts in a cell, and their quantity, for a specific developmental stage or physiological condition.
Understanding the transcriptome is essential for interpreting the functional elements of the genome.
The key aims of transcriptomics are:
• to catalogue all species of transcripts, including mRNAs, non-coding RNAs and small RNAs;
• to determine the transcriptional structure of genes, in terms of their start sites, 5′ and 3′ ends, splicing patterns and other post-transcriptional modifications;
• to quantify the changing expression levels of each transcript during development and under different conditions.
![Page 3: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/3.jpg)
Recently, the development of novel high-throughput DNA sequencing methods has provided a new method for determining, mapping and quantifying transcriptomes. This method, termed RNA-Seq (RNA sequencing), has clear advantages over previous approaches and is revolutionizing the manner in which eukaryotic transcriptomes are analysed.
![Page 4: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/4.jpg)
454
Illumina HiSeq2000
Pacbio RS
Ion proton
SOLiD 5500
![Page 5: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/5.jpg)
From next to 3rd generation sequencing
Illumina HiSeq Fluorescent nt scanning
SOLiD Ligation fluorescent oligos
454 Pyrosequencing
Ion proton Hydrogen detection
Pacbio Real time fluorescent detection
![Page 6: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/6.jpg)
From next to 3rd generation sequencing
Illumina HiSeq ssDNA sequence template
• Clonally amplified into clusters on glass slide (flow cell)
SOLiD 5500 idem
454 ssDNA sequence template
• Clonally amplified on beads (emPCR)
Ion proton idem
Pacbio dsDNA sequence template
• Single molecule/polymerase molecule complex
![Page 7: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/7.jpg)
Illumina HiSeq2000
[Figure: HiSeq2000 instrument and flow cell. Labels: syringe pumps, reagents compartment, optics, flow cell access door, flow cell with 8 channels]
![Page 8: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/8.jpg)
Illumina HiSeq2000
[Figure: Illumina workflow. Panels: library preparation from DNA (0.1-5.0 μg), single molecule array, cluster growth, sequencing, image acquisition, base calling]
![Page 9: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/9.jpg)
Eusol BACs: 177.14 M PF clusters; 33.8 Gb > Q30

| Lane | Sample ID | Sample Ref | Index | Description | Yield (Mbases) | % PF | # Reads | % of raw clusters per lane |
|---|---|---|---|---|---|---|---|---|
| 1 | lane1 | unknown | Undetermined | Clusters with unmatched barcodes for lane 1 | 3,234 | 87.47 | 36,608,108 | 9.74 |
| 1 | plate10 | EUsol_fill_gaps | TAGCTT | | 3,359 | 94.77 | 35,088,534 | 9.34 |
| 1 | plate1 | EUsol_fill_gaps | ATCACG | | 4,150 | 95.35 | 43,091,246 | 11.47 |
| 1 | plate2 | EUsol_fill_gaps | CGATGT | | 3,480 | 95.66 | 36,020,422 | 9.59 |
| 1 | plate3 | EUsol_fill_gaps | TTAGGC | | 3,496 | 95.27 | 36,331,200 | 9.67 |
| 1 | plate4 | EUsol_fill_gaps | TGACCA | | 4,674 | 95.4 | 48,508,022 | 12.91 |
| 1 | plate5 | EUsol_fill_gaps | ACAGTG | | 2,305 | 93.65 | 24,365,574 | 6.49 |
| 1 | plate6 | EUsol_fill_gaps | GCCAAT | | 1,895 | 94.83 | 19,783,144 | 5.27 |
| 1 | plate7 | EUsol_fill_gaps | CAGATC | | 3,366 | 94.9 | 35,115,836 | 9.35 |
| 1 | plate8 | EUsol_fill_gaps | ACTTGA | | 2,592 | 95.29 | 26,934,126 | 7.17 |
| 1 | plate9 | EUsol_fill_gaps | GATCAG | | 3,232 | 94.59 | 33,829,830 | 9.01 |
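The "% of raw clusters per lane" column appears to be each sample's share of all reads in the lane. A minimal Python sketch that recomputes that share from the "# Reads" values in the summary; the function name `lane_shares` is illustrative and not part of any Illumina tool.

```python
# Raw read counts per sample for lane 1, copied from the lane summary.
reads_per_sample = {
    "Undetermined": 36_608_108,
    "plate10": 35_088_534,
    "plate1": 43_091_246,
    "plate2": 36_020_422,
    "plate3": 36_331_200,
    "plate4": 48_508_022,
    "plate5": 24_365_574,
    "plate6": 19_783_144,
    "plate7": 35_115_836,
    "plate8": 26_934_126,
    "plate9": 33_829_830,
}


def lane_shares(counts):
    """Return each sample's percentage of all reads in the lane."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


shares = lane_shares(reads_per_sample)

# Print samples from largest to smallest share of the lane.
for name, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<13}{pct:6.2f} %")
```

A large spread between samples, or a large "Undetermined" share, is a quick indication of uneven pooling or barcode problems.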
![Page 10: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/10.jpg)
SOLiD 5500
![Page 11: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/11.jpg)
454 sequencing technology & workflow
![Page 12: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/12.jpg)
NGS - 454 pyrosequencing raw read
GCTAAG
![Page 13: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/13.jpg)
Ion semiconductor sequencing
![Page 14: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/14.jpg)
Ion Torrent PGM & Proton
![Page 15: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/15.jpg)
3rd Gen Sequencing: PacBio
SMRT sequencing
kb read length
<50,000 reads
<100 Mb
![Page 16: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/16.jpg)
Pacbio sequencing
Phospholinked
Cleavage by DNA polymerase
• Fluorophore clipped off by polymerase
• DNA synthesized is natural
• No steric hindrance or accumulation of background signal
ZMW: Zero Mode Waveguide
![Page 17: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/17.jpg)
Sequence read length (raw), quality
Illumina HiSeq fixed 50 or 100 nt, SR and PE
SOLiD 5500 fixed 75 nt
454 range 50-1,000 nt (av~750)
Ion torrent range 50-200 nt (av ~170)
Pacbio range 50-20,000 nt (av ~3-4 kb)
![Page 18: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/18.jpg)
Sequence read quality
Illumina HiSeq HQ reads, systematic errors
• Lower quality 3’ends
• Low GC coverage
SOLiD very HQ reads
• Lower quality 3’ends
454 HQ reads, systematic errors
• Homopolymer problems
• Clonality
• Lower quality 3’ends
Ion torrent idem, but lower overall quality
Pacbio Low quality (raw read accuracy 0.8-0.85)
• Random errors
• No decrease in read quality at the 3’ end
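To compare the Pacbio figure with the Q30 values quoted for Illumina, per-base accuracy can be converted to a Phred quality score, Q = -10 · log10(error probability). The sketch below assumes that the 0.8-0.85 on the slide is a per-base accuracy; under that assumption it corresponds to roughly Q7 to Q8, whereas Q30 means an accuracy of 0.999. The function names are illustrative.

```python
import math


def phred_from_accuracy(accuracy):
    """Convert per-base accuracy (0..1, exclusive of 1) to a Phred score."""
    error = 1.0 - accuracy
    return -10.0 * math.log10(error)


def accuracy_from_phred(q):
    """Convert a Phred score back to per-base accuracy."""
    return 1.0 - 10.0 ** (-q / 10.0)


for acc in (0.80, 0.85, 0.999):
    print(f"accuracy {acc:.3f} -> Q{phred_from_accuracy(acc):.1f}")
```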
![Page 19: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/19.jpg)
Sequence reads & throughput/run
Illumina HiSeq 1.5E+09 reads, full flow cell, 12 days/run
• Up to 550 Gb (2 cells)
SOLiD 5500XL 1.5E+09 reads, full flow cell, 6 days/run
• Up to 240 Gb (2 flow chips)
454 1E+06 reads, full PTP, 1 day/run
• Up to 1 Gb
Ion torrent 60-80E+06 reads, ionPI chip, 4 hours/run
• Up to 10 Gb
Pacbio 300,000 reads (8 cell strip), 1 day/run
• Up to 0.75 Gb
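As a back-of-the-envelope check, yield per run is roughly the number of reads times the read length. The sketch below combines read counts from this slide with read lengths from the read length slide; which length goes with which read count, and the paired-end factor for Illumina, are assumptions, so the results are order-of-magnitude estimates and need not match the "up to" figures.

```python
def approx_yield_gb(n_reads, read_length_nt, ends=1):
    """Rough yield in Gb: reads x read length x ends (2 for paired-end)."""
    return n_reads * read_length_nt * ends / 1e9


# (label, reads per unit, assumed read length in nt, ends)
examples = [
    ("Illumina HiSeq, one flow cell, 100 nt PE", 1.5e9, 100, 2),
    ("454, one PTP, av ~750 nt", 1e6, 750, 1),
    ("Ion torrent, one chip, av ~170 nt", 70e6, 170, 1),
    ("Pacbio, 8 cell strip, av ~3.5 kb", 300_000, 3500, 1),
]

for label, n_reads, length, ends in examples:
    print(f"{label}: ~{approx_yield_gb(n_reads, length, ends):.2f} Gb")
```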
![Page 20: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/20.jpg)
Transcript coverage
![Page 21: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/21.jpg)
DNA Samples for sequencing
Sample types: mRNA; small RNA; genomic DNA; active chromatin (ChIP-sequencing); other applications
Library preparation: ligate adapters to both ends of the fragmented nucleic acid
![Page 22: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/22.jpg)
RNA input requirements
RNA must be DNA-free, RNase-free, non-degraded and free of contaminants (proteins, polysaccharides)
![Page 23: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/23.jpg)
![Page 24: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/24.jpg)
Protocol variations
Fragmentation methods
• RNA: nebulization, hydrolysis
• cDNA: sonication, DNase I treatment
Depletion of highly abundant transcripts
• Positive selection of mRNA: poly(A) selection or target-specific
• Negative selection: RiboMinus, RNase H
Strand specificity
• Most RNA sequencing is not strand-specific
Single-end or paired-end sequencing
![Page 25: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/25.jpg)
(Illumina) RNA seq workflow
![Page 26: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/26.jpg)
Aligning the millions of reads to a "reference genome"
Many tools are available for aligning genomic reads to a reference genome (sequence alignment tools). However, special attention is needed when aligning a transcriptome to a genome, mainly when dealing with genes that contain introns. As discussed above, the sequence libraries are created by extracting mRNA via its poly(A) tail, which is added to the mRNA molecule post-transcriptionally, so splicing has already taken place. The library and the short reads obtained therefore should not contain intronic sequence. When such reads are aligned to a reference genome with an aligner that does not allow for introns, only reads lying entirely within an exon will be matched, while reads spanning an exon-exon junction will not. Several software packages exist for short read alignment, and specialized tools for RNA-seq data have recently been developed, e.g. TopHat for splice-aware alignment and Cufflinks for assembling and quantifying transcripts from the aligned reads.
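The junction problem can be illustrated with a toy example in Python. The sequences and the function `find_spliced` are made up for illustration; this exact-match search only sketches the idea of splitting a read across an intron and is not how TopHat or any real aligner is implemented.

```python
def find_spliced(read, genome, min_anchor=4):
    """Place `read` on `genome`, allowing at most one intron.

    Returns ("contiguous", start) for an exact unspliced match,
    ("spliced", left_start, left_end, right_start) when the read must be
    split into two exact pieces of at least `min_anchor` bases,
    or None when no placement is found.
    """
    pos = genome.find(read)
    if pos != -1:
        return ("contiguous", pos)
    for cut in range(min_anchor, len(read) - min_anchor + 1):
        left, right = read[:cut], read[cut:]
        left_pos = genome.find(left)
        while left_pos != -1:
            # The right piece must start downstream of the left piece,
            # leaving a gap of at least one base (the intron).
            right_pos = genome.find(right, left_pos + cut + 1)
            if right_pos != -1:
                return ("spliced", left_pos, left_pos + cut, right_pos)
            left_pos = genome.find(left, left_pos + 1)
    return None


# Toy gene: two exons separated by an intron (GT...AG).
exon1 = "ATGGCCATTGTAATGGGCCGC"
intron = "GTAAGTCTTTTCCCTTTTCAG"
exon2 = "TGAACGTCCGATAGCTTAACC"
genome = exon1 + intron + exon2

exonic_read = exon1[2:14]                # lies entirely inside exon 1
junction_read = exon1[-8:] + exon2[:8]   # spans the exon-exon junction

print(find_spliced(exonic_read, genome))
print(junction_read in genome)           # no contiguous match exists
print(find_spliced(junction_read, genome))
```

The exonic read matches contiguously, while the junction read is only placed once it is split at the exon boundary, with the gap between the two pieces corresponding to the intron.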
![Page 27: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/27.jpg)
Sequence coverage
A. thaliana: approx. 60E+06 mapped reads result in a plateau of unique gene models expressed (approx. 20,000)
![Page 28: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/28.jpg)
![Page 29: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/29.jpg)
Multi-mapped 50 nt SR reads (A. thaliana ~5%) can cause inaccurate expression estimates
Tubulin B chain: reads mapped to the reference genome (gray)
• Blue lines: intron-spanning reads
• Histograms: read coverage
• Blue: contributed by multi-mapped reads
• Green: contributed by uniquely mapped reads
Including multi-mapped reads artificially increases the expression value
Read mapping of 2 genes sharing a genome region at their 3’ ends on opposite strands
Multi-mapped reads derived from the + strand would severely overestimate expression of the – strand gene.
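A small Python sketch of the effect, using made-up alignments in which each read lists the genes it maps to. The data and the function `count_reads` are hypothetical; real pipelines work on alignment files rather than dictionaries.

```python
from collections import Counter

# Hypothetical toy data: read name -> genes the read aligns to.
alignments = {
    "read1": ["geneA"],
    "read2": ["geneA"],
    "read3": ["geneA", "geneB"],  # multi-mapped
    "read4": ["geneA", "geneB"],  # multi-mapped
    "read5": ["geneB"],
}


def count_reads(alignments, include_multimapped=False):
    """Count reads per gene, optionally crediting multi-mapped reads
    to every gene they align to."""
    counts = Counter()
    for hits in alignments.values():
        if len(hits) == 1:
            counts[hits[0]] += 1
        elif include_multimapped:
            for gene in hits:
                counts[gene] += 1
    return counts


unique_only = count_reads(alignments)
with_multi = count_reads(alignments, include_multimapped=True)

print("unique only:     ", dict(unique_only))
print("with multimapped:", dict(with_multi))
```

In this toy set, crediting the two multi-mapped reads to both genes raises the count for geneB from 1 to 3, although only 5 reads were sequenced in total.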
![Page 30: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/30.jpg)
![Page 31: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/31.jpg)
![Page 32: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/32.jpg)
![Page 33: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/33.jpg)
Ekblom et al., 2012 Comparative and Functional Genomics doi:10.1155/2012/281693
![Page 34: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/34.jpg)
Wenger and Galliot BMC Genomics 2013, 14:204 doi:10.1186/1471-2164-14-204
![Page 35: RNA Sequencing - Netherlands Bioinformatics Centre€¦ · From next to 3rd generation sequencing ... Ion proton idem Pacbio dsDNA sequence template •Single molecule/polymerase](https://reader034.fdocuments.net/reader034/viewer/2022042319/5f08572e7e708231d42186c9/html5/thumbnails/35.jpg)
Some considerations
The information gathered by RNA-seq has similar limitations as other RNA expression analysis pipelines.
RNA status dependent
• Biological variable: tissue specific; time dependent. Triplicates!
• During a cell's lifetime and context, its gene expression levels change.
• Strongly RNA quality dependent
Library prep method dependent
Sequencing technology dependent
Analysis method dependent
Because of this, care must be taken when drawing conclusions from the sequencing experiment. Results must be verified using independent technology.