Epigenomics - Washington University Genetics...

118
Epigenomics Bio5488 Ting Wang 2.15, 2.17/2016

Transcript of Epigenomics - Washington University Genetics...

Epigenomics

Bio5488

Ting Wang

2.15, 2.17/2016

Outline

• Epigenetic phenomenon– A tale of two mice.

– Where is the queen bee?

– The Monarch!

• What is epigenetics/epigenomics?

• Epigenetic mechanisms

• Epigenome in health, in disease and with environment

• Modern epigenomic technology and resources

A tale of two mice

Yellow mouse

• High risk of cancer,

diabetes, obesity;

• Reduced lifespan

Agouti mouse

• Low risk of cancer,

diabetes, obesity;

• Prolonged lifespanMaternal supplements

with Zinc, methionine,

betaine, choline, folate,

B12

Morgan, Whitelaw,

1999

Waterland, Jirtle,

2004

Where is the Queen?

Larvae

Queen Worker

Figure 1. CpG (o/e) bias of protein-coding regions in the honey bee genome. Since the profiles for both queens and workers are virtuallyidentical, only the queen profile is shown.doi:10.1371/journal.pbio.1000506.g001

Figure 2. Comparison of CpG methylat ion profiles in dif ferentially methylated genes generated by two technologies, Solexagenome-wide shotgun sequencing and 454 sequencing of PCR produced amplicons. The ‘‘heat maps’’ represent the 454 sequencing ofPCRamplified segments, whereas the bars illustrate the Solexa reads. The eight nuclear genes for this experiment were chosen from the list of DMGsshown in Tables 3 and S2, taking into account the availability of convenient CpG-containing regions for primer design. Six genes are shown in thisfigure and the others in Figure 3. Gene annotations: GB18602 - membrane protein; GB18207 - cadherin; GB15132 - TAP42 (TORsignaling); GB14848 -clathrin assembly protein; GB15356 - syd, chromosome segregation; GB11061 - seryl-tRNA synthetase.doi:10.1371/journal.pbio.1000506.g002

Brain Methylomes in Honey Bees

PLoS Biology | www.plosbiology.org 4 November 2010 | Volume 8 | Issue 11 | e1000506

Workers are more methylatedInhibition of Dnmt3 phenocopy Royal Jelly

The King of Butterflies

The Monarch

Same Genome

Different Epigenome

Different Phenotype

What is Epigenetics/Epigenomics?

• A mitotically or meiotically heritable state of different gene activity

and expression (phenotype) that is independent of differences in

DNA sequence (genotype) – based on Conrad Waddington, 1942

• The sum of the alterations to the chromatin template that

collectively establish and propagate different patterns of gene

expression (transcription) and silencing from the same genome.

• Epigenetic changes influence the phenotype without altering the

genotype.

• While epigenetics often refers to the study of single genes or

sets of genes, epigenomics refers to more global analyses of

epigenetic changes across the entire genome.

The EpigenomeThe complete number, location, and types of

epigenetic modifications that occur in a given cell.

Conrad Hal Waddington

(1905–1975)

Developmental biologist

Paleontologist

Geneticist

Embryologist

Philosopher

Founder for systems biology

Epigenetic Landscape

Inheritance, broad definition of

epigenetics

Epigenetics Mechanisms

Gene Expression

RNA Interference

Histone Modifications

Nucleosome PositioningDNA Methylation

• DNA methylation

– Normal cells: role in gene expression and chromosome stability

– Cancer cells: consequences of aberrant hypo- and hyper-methylation

• Histone modification

– Normal cells - the histone code

– Cancer cells - consequences of altered histone modifying enzymes

• Interaction between DNA methylation, histone modifications and small RNAs

• Cell/tissue type specificity

• Gene/Environment interaction, disease susceptibility

Epigenetic mechanisms

What is DNA Methylation?

The 5th base

The mystery of DNA demethylation

DNA methylation is not distributed evenly

in the mammalian genome

• In human somatic cells, 60%-80% of all CpGs

(~1% of total DNA bases) are methylated

– Most methylation is found in “repetitive” elements

• “CpG islands”, GC-rich regions that possess a high density of CpGs, remain methylation-free

– The promoter regions of ~70% of genes are embedded in CpG islands

DNA Methylation is Heritable

me C

me C

G

G

SAM

SAH

DNMT

5’

3’

3’

5’

me C G

G C me

TT G

C

A

me

One-carbon

donors

DNMT: DNA methyltransferaseSAM: S-adenosyl-methionineSAH: S-adenosyl-L-homocysteine

Two classes of DNA methyltransferases

(DNMTs)

Jones and Liang, 2009

Nature Review Genetics

DNMT1 maintenance lethal

DNMT2 remains mystery, possibly a RNA methyl-transferase

DNMT1o maternal imprints lethal*

DNMT3a de novo lethal

DNMT3b de novo lethal

DNMT3L maternal imprints lethal *

(indirect)

function knock-out

Two classes of DNA methyltransferases

(DNMTs)

Normal pattern and function of

DNA Methylation

Function of DNA Methylation in

Mammalian System

• Host defense - endogenous parasitic sequence

(repeats, etc.)

• Imprinting

• X chromosome inactivation

• Heterochromatin maintenance, chromosome

stability, telomere length

• Gene expression controls

DNA methylation changes during

developmental epigenetic reprogramming

Normal Patterns of DNA Methylation

1 2 1 2 3

CpG islands and gene expression

Pericentromeric regions - stability

Transcription

Methylated CpG

Unmethylated CpG

DNA methylation pattern across a gene

structure

Lister et. al, Nature 2009

Methylation and Gene Expression

Gene

Gene

Transcription

No TranscriptionMethylated

CpG dinucleotides

Unmethylated

Mechanisms of gene silencing by

methylation

Direct mechanism:

Inhibition of transcription factor binding (eg.CTCF, UBF)

Not a universal mechanism since not all transcription factor binding sites contain CG dinucleotides

Indirect mechanism:

Inhibition mediated by methyl-CpG binding proteins MeCP1/ MeCP2

Recruitment of corepressor complexes including histone deacetylases(HDAC)

Change in chromatin conformation

MBD

me

Protein Mutant phenotype

MeCP2 Rett syndrome

MBD1

MBD2 behaviour abnormalities

MBD3 lethal

MBD4 increased mutation frequency

(a glycosylase for mismatch repair)

Methylated DNA binding proteins

DNA methylation in cancer

Point mutations

Deletions

CpG island hypermethylation

Chromosomal instability

Gene amplification

Genomic hypomethylation

Translocations

Copy numberchanges

Tumor Suppressor Gene Inactivation in

Human Cancer

Normal Tumor

Methylation

x

Mutation

And

Deletion

Homozygous

Deletion

CH3

Knudsen’s 2-hit hypothesis, 1971

Aberrant DNA Methylation in Cancer

1 2 1 2 3

CpG islands hypermethylation

Pericentromeric hypomethylation

STOP

Methylated CpG

Unmethylated CpG

Robertson, Nature Reviews Genetics, Vol6, 597

Different regions of the genome are

hypermethylated or hypomethylated in cancer

Hypomethylation and Cancer

• The overall level of 5-methylcytosine is decreased in a proportion of human tumors. (Gama-Sosa et al. 1983)

• Hypomethylation of c-Ha-ras and c-Ki-ras, two cellular oncogenes, was identified in primary colon tumors. (Feinberg and Vogelstein 1983)

• Loss of IGF2 imprinting increases colorectal cancer risk (Cui et al. 2003)

• Consequences of Hypomethylation?

– Chromosome instability (ICF syndrome)

– Transcriptional activation of retrotransposons

– Transcriptional activation of oncogenes

Methylation promotes mutation

Jones & Baylin (2002)

DirectIndirect

DNA methylation and other

diseases

-- Imprinting Disorder:

• Beckwith-Wiedemann syndrom (BWS)

• Prader-Willi syndrome (PWS)

• Transient neonatal diabetes mellitus (TNDM)

-- Repeat-instability diseases

• Fragile X syndrome (FRAXA)

• Facioscapulohumeral muscular dystroph

-- Defects of the methylation machinery

• Systemic lupus erythemtosus (SLE)

• Immunodeficiency, centromeric instability and

facial anomalies (ICF) syndrome

Imprinting Diseases:

Angelman and Prader-Willi Syndromes

Angelman Syndrome

• “Happy puppet”

• Severe mental retardation

• Absence of speech

• Happy disposition

• Excessive laughing

• Hyperactive, with jerky repetitive motions

• Red cheeks, large jaw and mouth

Prader Willi Syndrome

• Small hands and feet

• Underactive gonads,

tiny external genitals

• Short stature

• Mentally retarded

• Slow-moving

• Compulsive overaters

• Obese

Imprinting Diseases:

Angelman and Prader-Willi Syndromes

Chr15 deletion, ~4Mb

Loss of maternal contribution

UBE3A

Angelman Syndrome

Loss of paternal contribution

SNRPN

Prader-Willi Syndrome

Genomic imprinting

• Imprinting is unique to mammals and flowering plants. In mammals, about 1% of genes are imprinted.

• For imprinted genes, one allele is expressed and the other is silent.

• This is typically controlled epigenetically. The expressed alleles are unmethylated and associated with loosely packed chromatin.

• Imprinted genes bypass epigenetic reprogramming.

• Imprinting is required for normal development

Why imprinting?• The Genetic Conflict Hypothesis

– Many imprinted genes are involved in growth and metabolism.

– Paternal imprinting favors the production of larger offspring, and maternal imprinting favors smaller offspring

• Imprinted genes are under greater selective pressure.– No back up!

– Any variation in single gene is expressed

– Closely related species have different imprinting patterns

• Liger and Tigon

• Imprinted genes are sensitive to environmental signals.

What we don’t know about imprinting

• What targets a gene for imprinting?

– Why are some genes expressed from both alleles and other expressed from only one allele?

• How are the imprints imposed?

– Do males and females have different mechanisms for imprinting genes?

Rett Syndrome

• X-linked trait

• Mainly girls affected

• Normal at birth

• At 6-18 months, begin

losing purposeful

movement

• Persistent wringing of

hands

• Loss of speech, gait

• Mental retardation

ensues

Nature Genetics (1999) 23:127-128

Rett’s is due to defect in MeCP2

• Methyl-cytosine binding protein 2 (MeCP2) binds methylated DNA and recruits binding of a histonedeacetylase

• Normal role is tightening chromatin packing, leading to gene silencing

Mouse model for Rett’s

• Mice have a gene that is homologous to MeCP2

• Knocking out the gene in mouse gives a phenotype similar to human Rett’s

• This model offers good experimental system for studying the human disease

Male mice with MeCP2 knockout

develop normally for a while (middle),

but at 6 weeks of age, they begin to

develop neurological symptoms, such

as hindlimb clasping (right).

Natu

re G

enetics (

2001)

27:3

32

-336

Why is the phenotype neurological?

• The phenotype suggests that the targets are genes in the brain

• Normal neurological differentiation requires silencing of MeCP2 gene target(s)

• The target(s) of MeCP2 are not known

DNA Methylation with age,

environment

Differences in DNA methylation patterns between

identical twins increase with aging

Fraga et alPNAS 102:10604, 2005

Maternal Bisphenol A (BPA) Exposure

• Monomer that makes up polycarbonate plastic

• Endocrine active compound

• Found in commonly used products

• Present in 95% of humans tested

• Some animal studies reveal negative health outcomes

p-value =0.0074

Yellow SlightlyMottled

Mottled HeavilyMottled

PseudoAgouti

Perc

ent

of

Offsp

ring

0

5

10

15

20

25

30

35

Control Diet (n=60)

BPA Diet (n=73)

Maternal BPA Exposure Results: Coat Color Distribution

OHOH

HO

O

O

OHOH

HO

OOHOH

HO

O

O

OHOH

HO

OOHOH

HO

O

O

OHOH

HO

OOHOH

HO

O

O

Maternal Genistein Supplementation

• Plant phytoestrogen

• Found in soy and soy products

• Selective estrogen receptor modulator

• Worldwide exposure varies by diet

• Chemoprevention and decreased adipocyte deposition

Maternal Genistein SupplementationResults: Coat Color Distribution

0

10

20

30

40

Yellow SlightlyMottled

Mottled HeavilyMottled

PseudoAgouti

Perc

ent

of

Offsp

ring

Control (n=52)

Genistein Diet (n=44)

p-value =0.0005

Dolinoy, D et al (2006) Environ Health Perspect

Maternal Nutritional SupplementationResults: Genistein – Coat Color

Control Diet (n=60)

BPA Diet (n=73)

Perc

ent

of

Offsp

ring

0

5

10

15

20

25

30

35

Genistein Supplemented BPA Diet (n=39)

Yellow Slightly Mottled Heavily Pseudo-Mottled Mottled agouti

p-value=0.96 (control versus genistein supplementation)

DNA methylation

+

Histone modification

Chromatin

- 2 each of histones:

H2A,H2B, H3 and H4

Chromatin

DNA plus Protein in cells with nuclei

146 bp of DNA

Nucleosome

The Nucleosome core particle

Nucleosome

H3

H4

http://www.nature.com/nsmb/journal/v14/n11/images/nsmb1337-F1.gif

Post-translational Histone Modifications

Post-translational Histone Modifications

H3 tail Modifications:

Active

HDACsHATs

=Acetylation

=Methylation

KMTases

Repressive

Li e. al. (2007) Cell 128, 707

Histone Modifications in Relation to Gene Transcription

Histone Modifications in Relation to Gene Transcription

SRPK1 SLC26A8

MAPK14

Bisulfite-Seq

H3K27ac

H3K4me1

H3K4me3

H3K36me3

H3K27me3

H3K9me3

RefSeq genes

Predicting non-coding RNA?

• From sequence?

– Not clear which properties can be exploited

– Sequence features such as promoters are too

weak

• Histone modifications + conservation worked

DNA methylation mediated repression

Repression independent of DNA methylation

H3K9 methylation

“condensed” chromatin

H3K27 methylation mediated repression

1. H3K27 methylation

1. DNA methylation

Mechanisms of Epigenetic Crosstalk

“Epigenetic cancer therapy”

DNA-methylation and HDAC inhibitors

in clinical trials

Summary

• Dnmt1, Dnmt3A, Dnmt3b - the mammalian DNMTs

• Chromatin structure is influenced by covalent

modification of histone tails

• Multiple chromatin modification pathways involved in silencing

of genes which may show “crosstalk” with DNA methylation

How to detect epigenetic marks?

Restriction Landmark Genome Scanning (RLGS)

NotI

NotI

NotI

NotI

NotI

NotI

UnmethylatedPartial

MethylationHomozygous

Methylation

EcoRV, 1-D, HinfI, 2-D

1st-D

2nd-D

1st-D

2nd-D

1st-D

2nd-D

NotINotI

NotI

NotI digest endlabeling

Scaling to high coverage, high resolution

• Enrichment based methods– MeDIP-seq

– MBD-seq/MethylMiner

• Restriction enzyme based methods – MRE-seq

– HELP, Methyl-MAPS, Methyl-seq

• Bisulfite based methods– MethylC-seq

– RRBS, bisulfite padlock

• Direct reading of modified nucleotides– SMRT

– Nanopore sequencing

Enriching for methylated DNA

targets

CCGG

GGCC

CCGG

GGCC

Me

Me

Me MeMe

Me

MeMe

Sonicate gDNA

Size selection, end repair

Adapter ligation

Denaturation

Size selection

IP

MeDIP-seq and MBD-seq

5’ CpG islands

are unmethylated

3’ CpG island is

partially methylated

Methylated

Unmethylated

Typical MeDIP data on a genome browser

Taking advantage of methylation

dependent restriction enzymes

MRE-seq

CCGG

GGCC

CCGG

GGCC

Me

Me

CCGG

GGCC

C CGG

GGC C

Me

MeDigest gDNA (here with Hpa II)

Combine parallel digest

Size selection, end repair

Adapter ligation

Size selection

5’ CpG islands

are unmethylated

3’ CpG island is

partially methylated

Methylated

Unmethylated

Typical MeDIP/MRE data on a genomebrowser

Scale

chr3:

CpG Islands

CpG sites

5 kb

37033000 37034000 37035000 37036000 37037000 37038000 37039000 37040000 37041000EPM2AIP1

MLH1

MLH1

MLH1

MLH1

hMLH1

MLH1

MLH1

MLH1

Endo Normal Medip Density

Endo Normal MRE CpG

Endo 1099 Medip Density

Endo 1099 MRE CpG

Endo 1113 Medip Density

Endo 1113 MRE CpG

Endo 1647 Medip Density

Endo 1647 MRE CpG

Endo 2242 Medip Density

Endo 2242 MRE CpG

Endo 2249 Medip Density

Endo 2249 MRE CpG

Endo 2263 Medip Density

Endo 2263 MRE CpG

MLH1 Promoter

Normal

Type II

Type I

Allele-specific methylation at imprinted genes

Chr 7:

SNURF/SNURP

MeDIP-seq

MRE-seq

Chr 7:

MEST

MeDIP-seq

MRE-seq

Chr 18:

SMAD4

MeDIP-seq

MRE-seq

imprinted

imprinted

not

imprinted

DMR

DMR

Gold standard: bisulfite

sequencing

Bisulfite sequencing

Methylation call

ACGGGCTTACTTGCTTTCCTACGGGCTTACTTGCTTTCCTACGGGCTTACTTGCTTTCCTACGGGCTTACTTGCCGGGTTTATTTGCTTTTTTATGGGCTGGGTTTATTTGCTTTTTTATGGGCTGGGTTTATTTGCTTTCCTATGGGCCGGGCTTATTTGCTTTCCTATGGGCCGGGCTTATTTGCTTTCCTATGGGC

3/5 0/5

60% methylated 0% methylated

CpG islands

98% of the genome

1 CpG/100bp

majority methylated

<2% of the genome

1 CpG/10bp short stretches (~1000bp)

majority unmethylated

RRBS: Reduced RepresentationsAllow Enrichment of CpG Dinucleotides

Direct detection of modified

nucleotides

Single-molecule, real-time (SMRT) sequencing

Nanopore sequencing

Modern DNA MethylomicsD

ete

ctio

n R

eagent

Sequencin

g P

latfo

rm

Microarray Massively Parallel Beads Arrays

Sequencing

Methylation-sensitive

Restriction Digest

Bisulfite

(C to T;mC to C)

5-meC Antibody

Immunoprecipitation

Methylated DNA

Affinity Column

Methylation-sensitive

Restriction Digest

MRE-seqMicroarray Massively Parallel Beads Arrays

Sequencing

Methylation-sensitive

Restriction Digest

Bisulfite

(C to T;mC to C)

5-meC Antibody

Immunoprecipitation

Methylated DNA

Affinity Column

5-meC Antibody

Immunoprecipitation

Methylated DNA

Affinity Column

MeDIP-seqMBD-seq

(MethylMiner)Microarray Massively Parallel Beads Arrays

Sequencing

Methylation-sensitive

Restriction Digest

Bisulfite

(C to T;mC to C)

5-meC Antibody

Immunoprecipitation

Methylated DNA

Affinity Column

Bisulfite

(C to T;mC to C)

Microarray Massively Parallel Beads Arrays

Sequencing

Methylation-sensitive

Restriction Digest

Bisulfite

(C to T;mC to C)

5-meC Antibody

Immunoprecipitation

Methylated DNA

Affinity Column

MethylC-seq

RRBS

Direct sequencing

(Pol. kinetics)

SMRT

Nanopore

CpG coverageCpG island

coverage

Resolution

(bp)

Illumina

Lanes

Genome

total28 M 28 K NA NA

BS

shot gun26 M 27 K 1

207 (2009)

10 (now)

RRBS 0.2-1M 15 K 1 1

MRE-seq

and

MeDIP-seq

25 M 27 K 1 and 200 8 (2009)

1 (now)

Golden-

Gate1,500 800 1 NA

Infinium27,500

480,000 (2012)12,000 1 NA

Comparison of Methylome coverage

Predicting single CpG methylation level with

Conditional Random Field (methylCRF)

95

Theoretical MRE

fragments

MeDIP

MRE

Infinium

RRBS

Lister BS

GIS BS

methylCRF

Technologies for Interrogating Chromatin States

ChIP-chip

Antibody specific

to one type of

histone

modification

Histone Modifications

ChIP-seq

Deep sequencing

Chromatin-IP Sequencing

K4me1K4me2K4me3K27me3 “repressive”

“active”

A. Align reads

B. Infer positions of ChIP fragments

C. Count fragments at each genomic position

K4me3

K27me3

Silent developmental geneTranscribed gene

K9me3

K20me3

Constitutive heterochromatin

FoxP1 Olig1

Olig1

K4me3

K27me3

‘Poised’ developmental gene

Histone methylation and transcriptional state

Nucleosome Positioning from Histone ChIP-seq

• Barski et al, Cell 2007– Nucleosome resolution ChIP-seq of 21 histone

marks in CD4+ T-cells

– Total 185.7 M

25 nt tags sequenced

– Analysis not at

nucleosome resolution

to map nucleosomes

at specific regions

Antibody for

MNase digest

Combine Tags From All ChIP-Seq

Extend Tags 3’ to 150 nt

Check Tag Count Across Genome

Take the middle 75 nt

DNase I hypersensitivity ~ Regulatory DNA

Inaccessible Inaccessible

Accessible

Precise delineation of the accessible regulatory DNA compartment

Digital DNaseI profiling

Digital DNaseI profiling: direct access to

regulatory sequences

ATAC-seq(assay for transposase-accessible chromatin)

Epigenetic annotation of GWAS hits

ChromHMM

Transcriptio

n Start SiteEnhance

rDNA

Observed

chromatin

marks.

Called based

on a Poisson

distribution

Most likely

Hidden State

Transcribed Region

1 6 53 4 6 6 6 6 5

1:

3:

4:

5:

6:

5

High Probability Chromatin Marks in State

2:

0.8

0.9

0.9

0.80.7

0.9

200bp

interval

s

All probabilities are

learned from the

data

2

K4me3 K36me3 K36me3 K36me3 K36me3K4me1 K4me3 K4me1

K27ac

0.8

K4me1

K36me3

K27ac

K4me1K4me3

K4me3

K4me1 K4me1

ChromHMM

110

Pro

mo

ter

Tran

scri

be

dA

ctiv

e in

terg

en

icR

epet

itiv

eR

epre

sse

d

Chromatin Marks from (Barski et al, Cell 2007; Wang et al Nature Genetics, 2008); DNAseI hypersensitivity from (Boyle et al, Cell 2008); Expression Data from (Su et al, PNAS 2004); Lamina data from (Guelen et al; Naature 2008)

Application of ChromHMM to 41 chromatin marks in CD4+ T-cells (Barski’07, Wang’08)

3D chromatin (HiC-seq)

Other interesting topics

• RNA component

– RNAi, miRNA, X inactivation, HOTAIR,

PiwiRNA

• Reprogramming

• Cloning

• Population epigenetics

• Evolution of DNA methylation

• Evolution of epigenome

Engaging today’s epigenomic

technologies and resources

WUSTLhttp://VizHub.wustl.edu

Accessing the community resource

Multiple Sclerosis:

• an autoimmune disease that affects the brain and

spinal cord (central nervous system).

• a collaborative GWAS involving 9,772 cases of

European descent collected by 23 research groups

working in 15 different countries

Example – got my GWAS hits! Now what?