2014 wcgalp

Exploring Marek’s Disease Resistance with RNAseq C. Titus Brown Michigan State University


WCGALP talk, August 18th, 2014.

Transcript of 2014 wcgalp

Page 1: 2014 wcgalp

Exploring Marek’s Disease Resistance with RNAseq

C. Titus Brown, Michigan State University

Page 2: 2014 wcgalp

Genetic resistance to Marek’s Disease

• MHC (B) locus has a major influence on MD resistance

• Several haplotypes of the B locus have been found to correlate with resistance

  – B21: most resistant

  – B19: susceptible

• Lines 6 and 7 (ADOL*) are B2 homozygous, but line 6 is resistant and line 7 is susceptible to MD

• Relatively few non-MHC genes have been identified

*Avian Disease and Oncology Laboratory, East Lansing

Page 3: 2014 wcgalp

Research Goal

• Identify non-MHC genes influencing MD resistance from a genome-wide gene and isoform expression analysis based on RNA-Seq data

• Generate hypotheses for studying the mechanism controlling MD resistance

Collaboration with Hans Cheng (ADOL) and Jerry Dodgson (MSU)

Dr. Likit Preeyanon

Page 4: 2014 wcgalp

Research Plan

[Figure: research plan workflow. Samples: lines 6 and 7, control and infected (4 dpi). Sequencing: Illumina, single-end and paired-end. Analyses: differential gene expression, pathway analysis, differential exon usage.]

Dr. Likit Preeyanon

Page 5: 2014 wcgalp

RNA-Seq Method

[Figure: poly(A) transcripts are fragmented and sequenced as short reads (<200 bp).]

Adapted from Shirley et al Nat Methods 2009

Page 6: 2014 wcgalp

Gene models and isoforms are woefully incomplete – e.g., ENSEMBL is missing many exon-exon junctions.

De novo reconstruction

Ab initio reconstruction

Dr. Likit Preeyanon

Page 7: 2014 wcgalp

GIMME: Software for Merging Gene Models

Assembly-based

Local Assembly

GIMME

Reference-guided

Merged Models

In-house software

Dr. Likit Preeyanon

Page 8: 2014 wcgalp

Merged Gene Models

Global Assembly

Local Assembly

Reference-guided

Merged (consensus) Model

Newly predicted isoform

Page 9: 2014 wcgalp

Merged models connect fragmented gene models & provide new isoforms

Merged models can glue fragmented gene models and include unannotated isoforms.

Gene B

Gene A

Gene A

Reference-guided

Merged model
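
The "gluing" of fragmented gene models on this slide can be illustrated with a toy example. This is a minimal sketch only: it clusters transcript spans that overlap on the genome, ignoring strand, exon structure and splice junctions. It is not GIMME's algorithm, which these slides do not describe. The function name `merge_into_loci` and all coordinates are hypothetical.

```python
def merge_into_loci(transcripts):
    """Cluster (start, end) transcript spans that overlap into single loci."""
    loci = []
    for start, end in sorted(transcripts):
        if loci and start <= loci[-1][1]:
            # Overlaps the current locus: extend it.
            loci[-1][1] = max(loci[-1][1], end)
        else:
            # No overlap: start a new locus.
            loci.append([start, end])
    return [tuple(locus) for locus in loci]

# Hypothetical coordinates, for illustration only.
reference_guided = [(100, 500), (700, 1200)]  # two fragmented models
assembly_based = [(450, 800)]                 # a transcript spanning the gap

fragmented = merge_into_loci(reference_guided)
merged = merge_into_loci(reference_guided + assembly_based)
print(fragmented)
print(merged)
```

With the reference-guided models alone, the two fragments stay separate; adding the assembly-based transcript that bridges them yields one locus.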

Page 10: 2014 wcgalp

IDH3A Gene – now with both UTRs!

Merged

RefSeq

ENSEMBL

UTR

Page 11: 2014 wcgalp

IDH3A – different models, different predicted expression…

SE: single-end; PE: paired-end

Not significant

Significant

Page 12: 2014 wcgalp

Differentially Expressed Genes from Different Gene Model Sets … Differ.

DE genes by EBSeq, FDR < 0.05

Ref-guided

Ref-guided

Page 13: 2014 wcgalp

In addition, many of the differentially expressed genes are not annotated in KEGG

Ref-guided

Page 14: 2014 wcgalp

GOseq FDR < 0.05

Chicken + Human KEGG Pathway

40 pathways

Must merge in human KEGG annotations

Page 15: 2014 wcgalp
Page 16: 2014 wcgalp

Enriched KEGG Pathways by GOseq

GOseq FDR < 0.05
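
The slides do not say how the FDR threshold was applied. A common choice for enrichment p-values is the Benjamini-Hochberg adjustment, so the sketch below assumes that method. The function name and the input p-values are hypothetical, not results from this talk.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-pathway p-values, for illustration only.
raw = [0.001, 0.2, 0.01, 0.03]
adj = benjamini_hochberg(raw)
enriched = [i for i, q in enumerate(adj) if q < 0.05]
print(adj)
print(enriched)
```

Pathways whose adjusted value falls below 0.05 would be reported as enriched under this assumption.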

Page 17: 2014 wcgalp

Biological Processes (BP) categories involved in Adaptive Immune Responses are Enriched in Line 7 (susceptible)

GO ID    Description                             Adjusted p-value
0009615  Response to virus                       0.00023
0050670  Regulation of lymphocyte proliferation  0.00048
0002252  Immune effector process                 0.00068
0051249  Regulation of lymphocyte activation     0.0027
0042129  Regulation of T cell proliferation      0.0032
0002250  Adaptive immune response                0.0106

At an early stage of infection, elicitation of the adaptive immune response appears to be delayed in line 6.

Page 18: 2014 wcgalp

Isoform Expression Estimation

Sample A: gene expression = 400x; isoform proportions 20% / 80%

Sample B: gene expression = 405x; isoform proportions 2% / 98%
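
The point of this slide is that nearly identical gene-level expression (400x vs 405x) can hide a large change at the isoform level. A small worked calculation, assuming Sample A is the 400x sample with a 20% / 80% split and Sample B is the 405x sample with a 2% / 98% split (the pairing follows the order on the slide); the function name is hypothetical:

```python
def isoform_expression(gene_expression, proportions):
    """Split a gene-level expression value across isoforms by proportion."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("isoform proportions must sum to 1")
    return [gene_expression * p for p in proportions]

sample_a = isoform_expression(400, [0.20, 0.80])
sample_b = isoform_expression(405, [0.02, 0.98])

# Fold change of the minor isoform between the two samples.
minor_fold_change = sample_a[0] / sample_b[0]
print(sample_a, sample_b, round(minor_fold_change, 1))
```

The gene-level values differ by about 1%, while the minor isoform differs roughly tenfold (about 80x vs about 8x).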

Page 19: 2014 wcgalp

How to Estimate Isoform Expression

Spliced reads

Page 20: 2014 wcgalp

Differential Exon Usage of ITGB2 Gene from MISO

Spliced reads

Percent Spliced In (Ψ)

Read coverage
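
For intuition about Percent Spliced In, Ψ can be approximated as the fraction of informative reads that support inclusion of an exon. This is a naive ratio only; MISO itself estimates Ψ with a Bayesian model rather than a simple count ratio. The function name and read counts below are hypothetical.

```python
def naive_psi(inclusion_reads, exclusion_reads):
    """Fraction of informative reads supporting exon inclusion, or None."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        return None  # no informative reads, so psi is undefined
    return inclusion_reads / total

# Hypothetical counts: 30 reads support inclusion, 10 support skipping.
psi = naive_psi(30, 10)
print(psi)
```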

Page 21: 2014 wcgalp

Genes with predicted differential splicing can be categorized into four groups

Cutoff = 0.2

[Figure: Ψ (0 to 1) in 6 Ctrl, 6 Inf, 7 Ctrl and 7 Inf for each group.]

Group I: 11 genes

Group II: 19 genes

Group III: 20 genes

Group IV: 1 gene
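
The slide gives a cutoff of 0.2 but does not define how genes were assigned to Groups I to IV. The sketch below assumes the cutoff applies to the absolute difference in Ψ between two conditions and simply reports which comparisons pass it. It does not reproduce the grouping, and all Ψ values are hypothetical.

```python
CUTOFF = 0.2  # value from the slide; applying it to |delta psi| is an assumption

def differs(psi_a, psi_b, cutoff=CUTOFF):
    """True when two psi estimates differ by at least the cutoff."""
    return abs(psi_a - psi_b) >= cutoff

def compare(psi):
    """Compare psi across the four conditions named on the slide."""
    return {
        "line 6, ctrl vs inf": differs(psi["6 Ctrl"], psi["6 Inf"]),
        "line 7, ctrl vs inf": differs(psi["7 Ctrl"], psi["7 Inf"]),
        "ctrl, line 6 vs 7": differs(psi["6 Ctrl"], psi["7 Ctrl"]),
        "inf, line 6 vs 7": differs(psi["6 Inf"], psi["7 Inf"]),
    }

# Hypothetical psi values for one gene, for illustration only.
example_gene = {"6 Ctrl": 0.90, "6 Inf": 0.85, "7 Ctrl": 0.88, "7 Inf": 0.40}
result = compare(example_gene)
print(result)
```

In this made-up example, splicing changes on infection in line 7 but not in line 6.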

Page 22: 2014 wcgalp

The main point

• We are completely at the mercy of annotations to interpret our large-scale data.

• Need more experimental information!

• But also, better methods => better signal

Page 23: 2014 wcgalp

Concluding thoughts (I)

• Computational analysis of high-throughput sequencing data can help refine hypotheses, but cannot conclusively resolve mechanism.

• Don’t knock “refining hypotheses”, though! Complex biological phenomena like disease are refractory to simplifying assumptions.

Page 24: 2014 wcgalp

Concluding thoughts (II)

• Much of the -omic data being gathered by all of you has utility far beyond your specific research question.

• This is particularly true in “semi-model” organisms where annotations are generally poor and not species-specific, and where there may be significant intra-species variation.

• How can we better share this data, to make faster and better progress?

Page 25: 2014 wcgalp

Where should we spend our –omics money?

• Improving genomes is still expensive and requires significant technical expertise.

• mRNAseq is inexpensive, broadly useful and wonderful for building better gene models.

• Proteomics and metabolomics?

• Better tools, annotation, and data sharing and exploration portals are critically important to the future of (agricultural) genomics.

Thanks!