Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.


Page 1: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

Gene 760

Jun Lu, PhD

2013-02-25

SMALL RNA ANALYSIS

Page 2: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

OVERVIEW

• Small RNA Basics

• Types of Small RNAs

• miRNAs and Other Small RNAs

• Chemical Structures of Small RNAs

• Non-templated Modification

• Small RNA Deep Sequencing

• Other Methods to Quantify miRNAs

• Data Analysis

Page 3: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

SMALL RNA BASICS: TYPES OF SMALL RNAS

• miRNAs and their precursors

• piRNAs

• Endogenous and exogenous siRNAs

• snoRNAs and their derivatives

• tRNAs and their derivatives

• Transcriptional start site associated small RNAs

• Enhancer Associated RNAs (eRNAs)

• Repeat associated small RNAs

• Many other types of small RNAs (often without deep understanding)

• Breakdown products from longer RNAs

• Artificial biochemical products

Page 4: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

MICRORNAS ARE PROCESSED FOR MATURATION

[Figure: miRNA maturation, from primary miRNA to precursor miRNA to mature miRNA, with Ago proteins. Winter et al., Nat Cell Biol 2009]

Page 5: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

SMALL RNA BASICS: MIRNAS

• The same mature miRNA can be produced from multiple loci in the genome

Hsa-let-7a-1, chr 9

Hsa-let-7a-2, chr 11

Hsa-let-7a-3, chr 22

Page 6: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

SMALL RNA BASICS: MIRNAS

• Sequence isoforms (variation in length and in start/end position)

Page 7: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

PIRNAS

• PIWI-interacting RNAs

• Generally larger than miRNAs (~26 to 31 bases; different size range in different species)

Khurana et al, JCB 2010

Page 8: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

SMALL RNA BASICS: TYPES OF SMALL RNAS

Rother and Meister, Biochimie 2011

Page 9: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

SMALL RNA BASICS: TYPES OF SMALL RNAS (ARTIFICIAL REACTION PRODUCTS)

• Example: HITS-CLIP

Chi et al. Nature 2009

Page 10: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

SMALL RNA BASICS: CHEMICAL STRUCTURES

• RNase III products have a 5’ phosphate group and a 3’ OH group

• But not all small RNAs have the same chemical structure

• Without 5’ phosphate

• 5’ Gppp cap instead of 5’ phosphate

• 2’-OMe modification at 3’ end

[Figure: small RNA with 5’-P and 3’-OH ends]

Page 11: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

SMALL RNA BASICS: NON-TEMPLATED MODIFICATIONS

• 3’ tailing

• Single or multiple nucleotide additions, such as U addition at the 3’ end

• Can use the target as a template, but not the generating locus

• RNA editing

• ADAR enzymes

• A-to-I editing: inosine is reverse transcribed as if it were G

Page 12: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

OVERVIEW

• Small RNA Basics

• Small RNA Deep Sequencing

• Ligation-mediated Amplification

• Illumina Small RNA Library Preparation

• Considerations when using the Standard Library Prep Protocol

• Alternative Bench-Level Preparations and Choices in Sequencing Parameters

• Other Methods to Quantify miRNAs

• Data Analysis

Page 13: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

SMALL RNA DEEP SEQUENCING: LIGATION-MEDIATED AMPLIFICATION

[Figure: miRNAs (5’-P, 3’-OH) are ligated to a 3’ adaptor with T4 RNA Ligase and ATP, and the product is gel-purified; a 5’ adaptor is then ligated with T4 RNA Ligase and ATP, and the product is gel-purified again, followed by RT and PCR]

Page 14: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

SMALL RNA DEEP SEQUENCING: GEL PURIFICATION TO AVOID ADAPTOR DIMER

[Figure: the 5’ adaptor (3’-OH) and the 3’ adaptor (5’-P) joined by T4 RNA Ligase with ATP, then carried through RT-PCR]

Page 15: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

SMALL RNA DEEP SEQUENCING: USE OF PRE-ADENYLATED 3’ ADAPTOR

[Figure, two routes. With T4 RNA Ligase and ATP: miRNAs (5’-P, 3’-OH) plus 3’ adaptor (5’-P), with a self-circularization product shown. With T4 RNA Ligase 2 Truncated and no ATP: miRNAs plus pre-adenylated (App) 3’ adaptor]

Page 16: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

SMALL RNA DEEP SEQUENCING: CURRENT ILLUMINA WORKFLOW

[Figure: current Illumina workflow. Labels: total RNA or purified small RNA; Tobacco Acid Pyrophosphatase; small RNA with 5’-P and 3’-OH; pre-adenylated (App) 3’ adaptor; 5’ adaptor; T4 RNA Ligase, ATP; RT; PCR]

Page 17: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

SMALL RNA DEEP SEQUENCING: CONSIDERATIONS WHEN USING STANDARD LIBRARY PREPARATION

• Relies on the presence of a 5’ phosphate (whether this matters depends on the needs of the analysis)

• Use of pyrophosphatase may introduce some capped small RNAs into the library

• T4 RNA Ligase has some sequence preferences for substrates; truncated or mutant forms of T4 RNA Ligase 2 may have a different spectrum of sequence preference, so sequencing read counts do not exactly reflect relative abundance

• Use of total RNA or purified small RNAs may generate quantitatively different profiles

Page 18: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

SMALL RNA DEEP SEQUENCING: ALTERNATIVES AND SEQUENCING PARAMETERS

• Gel purification of small RNAs in a specific size range (use a denaturing polyacrylamide gel)

• Phosphatase treatment followed by T4 polynucleotide kinase to capture small RNAs without 5’ phosphorylation

• Use polyA tailing + RT instead of a sequence-specific 3’ adaptor

• Length of sequencing run

• 50-base single-end sequencing is common on Illumina

Page 19: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

OVERVIEW

• Small RNA Basics

• Small RNA Deep Sequencing

• Other Methods to Quantify miRNAs

• Microarray

• qRT-PCR

• Data Analysis

Page 20: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

OTHER METHODS OF MIRNA QUANTIFICATION

• Microarrays

• Use ligation-mediated amplification to label miRNAs

• E.g., with a biotinylated primer during PCR

• Use other labeling techniques (use different criteria)

[Figure: Agilent method]

Page 21: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

OTHER METHODS OF MIRNA QUANTIFICATION

• qRT-PCR

• Key-lock-like RT strategy

• PolyA tailing strategy

[Figure: ABI method and Qiagen method]

Page 22: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

OVERVIEW

• Small RNA Basics

• Small RNA Deep Sequencing

• Other Methods to Quantify miRNAs

• Data Analysis

• Existing Tools

• Adaptor Removal

• Mapping

• Quantification of Expression

• Small RNAs other than miRNAs

Page 23: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

DATA ANALYSIS: AVAILABLE TOOLS

• miRDeep

• miRDeep2

• miRCat

• miRanalyzer

• mirTools

• And others

Page 24: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

DATA ANALYSIS: AVAILABLE TOOLS (MIRDEEP2)

• Runs in a Unix/Linux environment

• Perl-based

• Uses Bowtie (v1) for mapping and RNAfold for predicting RNA secondary structures

Page 25: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

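DATA ANALYSIS, STEP 1: REMOVE ADAPTORS

• This step is fairly specific to small RNA sequencing analysis: the sequenced RNAs are shorter than the read length, so reads run into the 3’ adaptor

[Figure: a 50-base read from the sequencing primer covers the miRNA, which sits between the 5’ adaptor and the 3’ adaptor, and continues into the 3’ adaptor]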

Page 26: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

DATA ANALYSIS, STEP 1: REMOVE ADAPTORS (DETAILS MATTER)

• Adaptors are not synthesized to 100% purity!

• The standard miRDeep2 package allows removing only a single adaptor sequence.

• It matches the first 6 bases of the adaptor to each read, searching after the first 18 nt

• If there is no match, it sequentially matches the first 5, 4, 3, 2, 1 adaptor bases to the end of each read.

• Some issues with such an algorithm

• Single-adaptor removal may lead to loss of reads and a changed size distribution

• A 6 nt match may be too short and may cut off real RNA sequence.

• Small RNAs shorter than 18 nt are ignored, although they may help in understanding small RNA mechanisms

• Reads in the 47, 48, 49 nt range are created artificially by non-stringent adaptor matches at the end of reads
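The matching rule in the bullets above can be sketched in Python. This is a reading of the slide's description, not the miRDeep2 source code: the function name `clip_single_adaptor`, its parameters, and the adaptor sequence used in the example are assumptions made for illustration.

```python
def clip_single_adaptor(read, adaptor, prefix_len=6, min_insert=18):
    """Trim one 3' adaptor from a read, following the rule described above.

    Hypothetical helper written from the slide text, not taken from miRDeep2.
    """
    # Step 1: look for the first `prefix_len` adaptor bases at or after
    # position `min_insert`; everything from the match onward is removed.
    hit = read.find(adaptor[:prefix_len], min_insert)
    if hit != -1:
        return read[:hit]
    # Step 2: no match, so try shorter adaptor prefixes (5, 4, 3, 2, 1 bases)
    # against the very end of the read.
    for k in range(prefix_len - 1, 0, -1):
        if read.endswith(adaptor[:k]):
            return read[:-k]
    return read


# Assumed adaptor sequence, used only for this example.
ADAPTOR = "TGGAATTCTCGGGTGCCAAGGAACTCC"
MIRNA = "AACCCGTAGATCCGAACTTGTGA"

# Read with the expected adaptor: trimmed back to the small RNA.
clean = clip_single_adaptor(MIRNA + ADAPTOR, ADAPTOR)
# Read with a one-base adaptor variant: the 6-base prefix no longer matches.
variant = clip_single_adaptor(MIRNA + "TGGACTTCTCGGGTGCCAAGGAACTCC", ADAPTOR)
# 50-nt read with no adaptor that happens to end in "T": loses its last base.
no_adaptor = clip_single_adaptor("ACGT" * 12 + "AT", ADAPTOR)
print(len(clean), len(variant), len(no_adaptor))
```

The second case illustrates how a single adaptor variant leaves reads untrimmed, and the third how a 1-base match at the read end turns an untrimmed 50-nt read into a 49-nt one.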

Page 27: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

DATA ANALYSIS, STEP 1: REMOVE ADAPTORS

• Single adaptor removal drawbacks

• Lose ~16% of reads in the following example, which can distort the size distribution for specific small RNAs

• TAGCTTATCAGACTGATGTTGACT 533006 reads

• TAGCTTATCAGACTGATGTTGACTTGGACTTCTCGGGTGCCAAGGAACTC 87857 reads

• Different small RNAs show different ratios of adaptor variants, likely a sequence-dependent phenomenon

• AACCCGTAGATCCGAACTTGTGA 666783 reads

• AACCCGTAGATCCGAACTTGTGATGGACTTCTCGGGTGCCAAGGAACTCC 69 reads

• Ratio: ~0.01%
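The percentages on this slide can be checked directly from the read counts. The short calculation below is only a worked check; the variable names are hypothetical, and it shows the ratio both relative to the trimmed reads (which appears to be how the slide's figures were obtained) and relative to the combined total.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Read counts copied from the slide (trimmed vs. still carrying the adaptor variant).
first_trimmed, first_untrimmed = 533006, 87857
second_trimmed, second_untrimmed = 666783, 69

# Untrimmed reads relative to trimmed reads.
first_ratio = percent(first_untrimmed, first_trimmed)
second_ratio = percent(second_untrimmed, second_trimmed)
# Untrimmed reads as a share of all reads of that small RNA.
first_share = percent(first_untrimmed, first_trimmed + first_untrimmed)

print(round(first_ratio, 1), round(first_share, 1), round(second_ratio, 2))
```

The two small RNAs differ by roughly three orders of magnitude in how often the adaptor variant is attached, which is the sequence-dependent effect the slide points to.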

Page 28: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

DATA ANALYSIS, STEP 1: REMOVE ADAPTORS

• Adaptors are not synthesized to 100% purity!

• Standard miRDeep2 package allows removing only a single adaptor sequence.

• Single adaptor removal drawbacks

• Modification

• 1. Allow removing 2 (or more) adaptor sequence variants.

• 2. Use a user-defined length of adaptor for the sequence match (e.g. 10 nt).

• 3. Do not require the small RNA to be 18 nt or longer; instead, let the user define the minimum size.

• 4. Do not remove end bases if only 3 or fewer nt match the adaptor; this cutoff is also user-definable.
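A minimal Python sketch of the four modifications listed above follows. It is an illustration under stated assumptions, not the modified package itself: the function `clip_adaptors`, its parameter names and defaults, and the first adaptor sequence are hypothetical, while the second adaptor is the variant visible in the example reads on the earlier slide.

```python
def clip_adaptors(read, adaptors, match_len=10, min_insert=0, min_tail_match=4):
    """Trim any of several 3' adaptor variants from a read.

    Hypothetical sketch of the four listed modifications:
    1. several adaptor variants, 2. user-defined match length,
    3. user-defined minimum insert size, 4. minimum partial match at read end.
    """
    # Full-prefix search: take the earliest hit over all adaptor variants.
    hits = [read.find(a[:match_len], min_insert) for a in adaptors]
    hits = [h for h in hits if h != -1]
    if hits:
        return read[:min(hits)]
    # Partial match at the read end, longest first, but never shorter than
    # `min_tail_match` bases (so 3 or fewer matching bases are left alone).
    for k in range(match_len - 1, min_tail_match - 1, -1):
        for a in adaptors:
            if read.endswith(a[:k]) and len(read) - k >= min_insert:
                return read[:-k]
    return read


ADAPTORS = [
    "TGGAATTCTCGGGTGCCAAGGAACTCC",  # assumed main adaptor (illustrative)
    "TGGACTTCTCGGGTGCCAAGGAACTCC",  # variant seen in the example reads
]

# Example read copied from the slides: small RNA followed by the variant adaptor.
trimmed = clip_adaptors(
    "TAGCTTATCAGACTGATGTTGACTTGGACTTCTCGGGTGCCAAGGAACTC", ADAPTORS
)
# A 50-nt read without adaptor that ends in "T" is no longer shortened.
untouched = clip_adaptors("ACGT" * 12 + "AT", ADAPTORS)
print(trimmed, len(untouched))
```

Taking the earliest hit across variants is one possible design choice; it avoids leaving adaptor sequence behind when more than one variant prefix occurs in a read.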

Page 29: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

DETAILS MATTER! BY REMOVING ONE EXTRA ADAPTOR VARIANT

[Figure, two panels. Panel 1: % difference from miRDeep2 (y axis, -5 to 20) versus read length in nt (x axis, 15 to 45). Panel 2: number of reads (y axis, 0 to 12,000,000) versus read length in nt (x axis, 0 to 60), comparing miRDeep2 and Modified]

Page 30: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

DATA ANALYSIS: MAPPING

• Many reads are identical copies of the same RNA, often miRNAs.

• E.g., TCGTACGACTCTTAGCGG appeared 5,733,052 times in one run (~10% of all reads!)

• “Collapsing” reads of the same sequence into one entry can save significant alignment time

• Can reduce the number of sequences by >20 fold, depending on miRNA abundance in the cell

• Reads can align to different regions of the genome, i.e. mapping is not unique

• If a sequence is too short, it may generate too many hits in the genome

• Consider non-templated modifications

• Non-templated tailing in small RNAs

• Need to distinguish tailing vs. adaptor impurity

• RNA editing
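Read collapsing as described above can be sketched in a few lines of Python. The function names are hypothetical and the toy read list is made up for illustration; real pipelines usually also write the collapsed sequences to a FASTA file with the count stored in the header.

```python
from collections import Counter


def collapse_reads(reads):
    """Return (sequence, count) pairs, most abundant sequence first."""
    return Counter(reads).most_common()


def fold_reduction(reads):
    """How many times fewer sequences need to be aligned after collapsing."""
    reads = list(reads)
    return len(reads) / len(set(reads))


# Toy input: adaptor-trimmed reads, with one sequence dominating.
reads = (
    ["TCGTACGACTCTTAGCGG"] * 5
    + ["TAGCTTATCAGACTGATGTTGACT"] * 3
    + ["AACCCGTAGATCCGAACTTGTGA"]
)
collapsed = collapse_reads(reads)
for seq, count in collapsed:
    print(seq, count)
print(fold_reduction(reads))
```

Each unique sequence is aligned once and its count is carried along, so the alignment cost scales with the number of distinct sequences rather than the number of reads.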

Page 31: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

DATA ANALYSIS: MAPPING

• Bowtie or Bowtie2

• Mapping to known small-RNA-generating-sequence collections

• E.g. precursor miRNA collection (downloadable from miRBase)

• Or snoRNA collections, or tRNA collections

• Benefit:

• can reduce mapping time;

• can allow all non-unique mapping instances;

• Can tolerate more mismatches for understanding of non-templated modifications

• Drawback: can only inform those at known loci

• Mapping to genome directly

• Can help interpret modifications vs imperfect mapping conditions

• Can help identify new small RNA regions

Page 32: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

DATA ANALYSIS: MAPPING

• What the mapping cannot tell:

• Whether a mismatch is a real RNA editing event: because many small RNAs have defined start sites, reads covering a position are largely identical rather than staggered, so it is harder to distinguish real RNA editing from errors introduced by sequencing or PCR.

• Which locus a read came from, when one miRNA can come from multiple loci: the loci cannot be told apart, although reads from the opposite strand can still be distinguished.

Hsa-miR-125b-1

Hsa-miR-125b-2

Page 33: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

DATA ANALYSIS: QUANTIFICATION OF EXPRESSION

• Problem: how to normalize sequencing data? This can be especially problematic for small RNA data

[Figure: 0 Hour vs. 12 Hour]

Page 34: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

DATA ANALYSIS: QUANTIFICATION OF EXPRESSION

• Problem: how to normalize sequencing data?

[Figure: 0 Hour vs. 12 Hour]

Page 35: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

DATA ANALYSIS: QUANTIFICATION OF EXPRESSION

• Problem: how to normalize sequencing data?

• Use total reads to normalize: most commonly used, but may introduce artifacts.

• Assume total/mean miRNA is the same

• Quantile normalization

• Use Spike-in controls

• Spike-in controls are artificial small RNA sequences that can be used as “loading controls”

• Spiked into initial RNA samples

• Multiple spike-in RNAs should be used simultaneously to avoid relying on a single sequence to normalize data
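The difference between total-read and spike-in normalization can be illustrated with a toy example in Python. Everything here is hypothetical: the counts, the names `miR-x`, `miR-y`, `spike-1`, `spike-2`, and the assumption that the same amounts of each spike-in were added to equal amounts of input RNA in both samples.

```python
def reads_per_million(counts):
    """Total-read normalization: scale counts so they sum to one million."""
    total = sum(counts.values())
    return {name: 1e6 * n / total for name, n in counts.items()}


def spike_in_normalize(counts, spike_ins):
    """Divide endogenous counts by the mean count of several spike-ins."""
    spike_mean = sum(counts[s] for s in spike_ins) / len(spike_ins)
    return {n: c / spike_mean for n, c in counts.items() if n not in spike_ins}


def endogenous(counts, spike_ins):
    """Drop spike-in entries, keeping only endogenous small RNAs."""
    return {n: c for n, c in counts.items() if n not in spike_ins}


SPIKES = ("spike-1", "spike-2")
# Toy data: only miR-y truly changes between the two time points.
t0 = {"miR-x": 100, "miR-y": 100, "spike-1": 40, "spike-2": 60}
t12 = {"miR-x": 100, "miR-y": 900, "spike-1": 40, "spike-2": 60}

rpm_t0 = reads_per_million(endogenous(t0, SPIKES))
rpm_t12 = reads_per_million(endogenous(t12, SPIKES))
spk_t0 = spike_in_normalize(t0, SPIKES)
spk_t12 = spike_in_normalize(t12, SPIKES)

# Total-read scaling makes the unchanged miR-x look 5-fold lower at 12 h.
print(rpm_t0["miR-x"], rpm_t12["miR-x"])
# Spike-in scaling keeps miR-x constant.
print(spk_t0["miR-x"], spk_t12["miR-x"])
```

Because a few abundant miRNAs can dominate a small RNA library, a large change in one of them shifts every other value under total-read scaling, which is the kind of artifact the slide warns about.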

Page 36: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

DATA ANALYSIS: QUANTIFICATION OF EXPRESSION

• How to summarize counts given positional variation

• Allow some flanking bases for tolerance

• Depending on the aim of the analysis (e.g. seed sequence)
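One way to implement the two options above is sketched below in Python. The coordinate convention (0-based start and end on the precursor), the function names, and the toy alignments are assumptions for illustration; the second function reflects the idea that reads sharing the exact 5’ start share the same seed.

```python
def count_with_flank(alignments, mature_start, mature_end, flank=2):
    """Sum reads whose start and end both lie within `flank` nt of the
    annotated mature miRNA position on the precursor."""
    return sum(
        count
        for start, end, count in alignments
        if abs(start - mature_start) <= flank and abs(end - mature_end) <= flank
    )


def count_same_five_prime(alignments, mature_start):
    """Sum only reads with the annotated 5' start, i.e. the same seed,
    regardless of where the 3' end falls."""
    return sum(count for start, _end, count in alignments if start == mature_start)


# Toy alignments on one precursor: (start, end, read count).
alignments = [
    (10, 32, 100),  # annotated mature form
    (11, 32, 20),   # 5' isoform shifted by one base (different seed)
    (10, 33, 5),    # 3' isoform, same seed
    (15, 37, 50),   # unrelated fragment
]
tolerant = count_with_flank(alignments, 10, 32, flank=1)
seed_specific = count_same_five_prime(alignments, 10)
print(tolerant, seed_specific)
```

Which summary is appropriate depends on the question: a tolerant window suits overall expression estimates, while a seed-specific count suits analyses of targeting.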

Page 37: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

DATA ANALYSIS: SMALL RNAS OTHER THAN MIRNAS

• Use transcriptional start site associated small RNAs as an example

• Adaptor removal

• Collapse reads based on sequence

• Map to known small RNA generating loci

• Map the leftover sequences to genome

• Align the mapped positions relative to transcriptional start sites
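The last step, placing mapped reads relative to transcriptional start sites, can be sketched as follows. This is a simplified illustration for a single chromosome: the function name, the `(position, strand)` representation of TSSs, and the example coordinates are hypothetical, and the TSS list is assumed to be sorted and non-empty.

```python
import bisect
from collections import Counter


def offset_from_nearest_tss(read_pos, tss_sorted):
    """Signed distance from a read's 5' position to the nearest TSS.

    `tss_sorted` is a non-empty list of (position, strand) tuples sorted by
    position. Positive values are downstream in the direction of transcription.
    """
    positions = [p for p, _ in tss_sorted]
    i = bisect.bisect_left(positions, read_pos)
    # Only the TSS just before and just after the read can be the nearest one.
    candidates = tss_sorted[max(i - 1, 0): i + 1]
    tss_pos, strand = min(candidates, key=lambda t: abs(read_pos - t[0]))
    offset = read_pos - tss_pos
    return offset if strand == "+" else -offset


# Toy annotation and mapped read positions with collapsed read counts.
tss = [(1000, "+"), (5000, "-")]
mapped = [(1050, 12), (950, 3), (4980, 7)]

profile = Counter()
for pos, count in mapped:
    profile[offset_from_nearest_tss(pos, tss)] += count
print(sorted(profile.items()))
```

Aggregating such offsets over all TSSs gives a positional profile of small RNA density around start sites.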

Page 38: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

DATA ANALYSIS: SMALL RNAS OTHER THAN MIRNAS

• Use transcriptional start site associated small RNAs as an example

Page 39: Gene 760 Jun Lu, PhD 2013-02-25 SMALL RNA ANALYSIS.

SUMMARY

• Small RNA Basics

• Variations associated with small RNAs

• Small RNA Deep Sequencing

• Biochemical reactions determine interpretation of analysis

• Other Methods to Quantify miRNAs

• Useful in validating results

• Data Analysis

• Key steps in processing small RNA data

• Pay attention to details in bench and bioinformatic methods