Ensembl training materials are protected by a CC BY ... · EBI is an Outstation of the...


Training materials

• Ensembl training materials are protected by a CC BY license

• http://creativecommons.org/licenses/by/4.0/

• If you wish to re-use these materials, please credit Ensembl for their creation

• If you use Ensembl for your work, please cite our papers

• http://www.ensembl.org/info/about/publications.html


EBI is an Outstation of the European Molecular Biology Laboratory.

Annotating your variants:

Ensembl Variant Effect Predictor (VEP)

Helen Sparrow

Ensembl

EMBL-EBI

2nd November 2016


Course materials

http://tinyurl.com/VEPCrete

• VEP Presentation

• VEP Coursebook (screenshots of demo)

• VEP Exercises


Ensembl Features

- Gene builds for ~70 species

- Gene trees

- Regulatory build

- Variation display and VEP

- Display of user data

- BioMart (data export)

- Programmatic access via the APIs

- Completely Open Source


What is the VEP?

Determine the effect of variants (SNPs, insertions, deletions, CNVs or structural variants):

Input:

- Variant coordinates

- VCF

- HGVS

- Variant IDs

Output:

- Affected gene, transcript and protein sequence

- Pathogenicity

- Frequency data

- Regulatory consequences

- Splicing consequences

- Literature citations

The VEP is available as:

- Web interface: ensembl.org/Tools/VEP

- Perl script

- REST API: rest.ensembl.org
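The REST API can be called from any language. Below is a minimal Python sketch, assuming the `GET /vep/:species/hgvs/:hgvs_notation` endpoint of the Ensembl REST service. The helper names `vep_hgvs_url` and `fetch_vep` are illustrative only, and `fetch_vep` needs network access.

```python
import json
import urllib.parse
import urllib.request

REST_SERVER = "https://rest.ensembl.org"  # REST API host named on the slide


def vep_hgvs_url(hgvs, species="human"):
    """Build the URL for a VEP lookup by HGVS notation (illustrative helper)."""
    # Percent-encode the notation so characters such as '>' are safe in a URL path.
    encoded = urllib.parse.quote(hgvs, safe=":")
    return f"{REST_SERVER}/vep/{species}/hgvs/{encoded}"


def fetch_vep(hgvs, species="human"):
    """Query the service and return the decoded JSON (requires network access)."""
    request = urllib.request.Request(
        vep_hgvs_url(hgvs, species),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    # HGVS string taken from the input examples in this course.
    print(vep_hgvs_url("5:g.140532T>C"))
```

Building the URL separately from the request keeps the first step easy to check offline before any call is sent to the server.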

Features of VEP

● Web, Perl script and REST API

● Over 5,000 species

● Input

● Transcript sets

● Regulatory regions

● Known Variants

● Plugins

● Allele frequencies

● Associated phenotype, disease or trait

● Clinical significance states

● Important projects which use VEP

Input limits:

● Web: 50 MB (~2 million variants)

● Script: unlimited variants

Input formats:

● VCF

● rsID

● HGVS

● BED

● Pileup

Transcript sets:

● GENCODE

● GENCODE Basic

● RefSeq

● GENCODE & RefSeq

Regulatory regions:

The Ensembl Regulatory Build:

● ENCODE

● BLUEPRINT

● NIH Epigenomics Roadmap

Can be limited to regulatory regions observed in specific cell types.

Known variants:

● dbSNP

● COSMIC

● ClinVar

● ESP

● HGMD-Public

● PhenCode

Plugins, e.g.:

● Splicing predictions

● Loss of Function predictions

● Expression levels across transcripts

Anything - customisable!

Allele frequencies:

● 1000 Genomes

● ESP

● ExAC projects

● GnomAD - coming soon!

Associated phenotype, disease or trait:

● OMIM

● Orphanet

● GWAS Catalog

● others

Clinical significance states:

Assigned by ClinVar

Important projects which use VEP:

● 1000 Genomes

● ExAC

● DECIPHER

● OpenTargets

● LRG

● GnomAD

Your own variant data

Variant coordinates

1 881907 881906 -/C +
5 140532 140532 T/C +
12 1017956 1017956 T/A +
2 946507 946507 G/C +
14 19584687 19584687 C/T -

HGVS notation

ENST00000285667.3:c.1047_1048insC
5:g.140532T>C
NM_153681.2:c.7C>T
ENSP00000439902.1:p.Ala2233Asp
NP_000050.2:p.Ile2285Val

VCF

#CHROM POS ID REF ALT
20 14370 rs6054257 G A
20 17330 . T A
20 1110696 rs6040355 A G,T
20 1230237 . T .

Variant IDs

rs41293501
COSM327779
rs146120136
FANCD1:c.475G>A
rs373400041
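Each variant-coordinate line above holds five whitespace-separated columns: chromosome, start, end, alleles and strand. The Python sketch below parses one such line. The names `VariantLine`, `parse_variant_line` and `is_insertion` are illustrative, not part of any Ensembl library. It assumes the convention that an insertion is written with start = end + 1 and "-" as the first allele, as in the first example line.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantLine:
    chrom: str
    start: int
    end: int
    alleles: str  # "reference/alternative"; "-" stands for no sequence
    strand: str   # "+" (forward) or "-" (reverse)


def parse_variant_line(line):
    """Split one whitespace-separated coordinate line into typed fields."""
    fields = line.split()
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 columns, got {len(fields)}: {line!r}")
    chrom, start, end, alleles, strand = fields[:5]
    if strand not in ("+", "-"):
        raise ValueError(f"strand must be '+' or '-', got {strand!r}")
    return VariantLine(chrom, int(start), int(end), alleles, strand)


def is_insertion(variant):
    """Assumed convention: insertions have start = end + 1 and '-' as first allele."""
    return variant.start == variant.end + 1 and variant.alleles.startswith("-/")


if __name__ == "__main__":
    for text in ("1 881907 881906 -/C +", "14 19584687 19584687 C/T -"):
        parsed = parse_variant_line(text)
        print(parsed, "insertion" if is_insertion(parsed) else "not an insertion")
```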


Variant types

1) Small scale: changes in one or a few nucleotides of a gene

• Small insertions and deletions (DIPs or indels)

• Single nucleotide polymorphism (SNP)

A G A C T T G A C C T G T C T - A A C T G G A
T G A C T T G A C - T G T C T G A A C G G G A

2) Large scale: changes in chromosomal structure (structural variants)

• Copy number variants (CNV)

• Large deletions/duplications, insertions, translocations

[Figure: deletion, duplication, insertion, translocation]
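The small-scale variant types can be read off the two aligned sequences on this slide column by column. The sketch below does that in Python. It assumes the first sequence is the reference, which the slide does not state, and `compare_aligned` is an illustrative name.

```python
def compare_aligned(reference, sample):
    """Report each column where two gapped, equal-length sequences differ."""
    if len(reference) != len(sample):
        raise ValueError("aligned sequences must have the same length")
    differences = []
    for column, (ref_base, sample_base) in enumerate(zip(reference, sample), start=1):
        if ref_base == sample_base:
            continue
        if ref_base == "-":
            kind = "insertion"   # base present only in the sample
        elif sample_base == "-":
            kind = "deletion"    # base missing from the sample
        else:
            kind = "SNP"         # single-base substitution
        differences.append((column, ref_base, sample_base, kind))
    return differences


if __name__ == "__main__":
    # The two rows from the slide, with spaces removed.
    top = "AGACTTGACCTGTCT-AACTGGA"
    bottom = "TGACTTGAC-TGTCTGAACGGGA"
    for column, ref_base, sample_base, kind in compare_aligned(top, bottom):
        print(column, f"{ref_base}/{sample_base}", kind)
```

With the slide's sequences this reports SNPs at columns 1 and 20, a deletion at column 10 and an insertion at column 16.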


Variant consequences

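[Figure: gene model running from the ATG start codon to the poly-A tail (AAAAAAA), marking where each consequence type falls: regulatory, 5’ upstream, 5’ UTR, splice site, coding (missense or synonymous), intronic, 3’ UTR, 3’ downstream]

● Identify transcripts that overlap the coordinates of the variants - GENCODE, RefSeq or both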

● Predict the consequences of the variants
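The first step, finding transcripts that overlap a variant, is an interval-overlap test. The Python sketch below illustrates the idea only and is not how the VEP itself is implemented. The transcript names and coordinates are made up, and `Transcript` and `overlapping_transcripts` are hypothetical names.

```python
from typing import NamedTuple


class Transcript(NamedTuple):
    name: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def overlapping_transcripts(transcripts, chrom, start, end):
    """Return transcripts on `chrom` whose span shares at least one base with the variant."""
    # Order the coordinates so that inputs written with start > end are still handled.
    low, high = sorted((start, end))
    return [
        t for t in transcripts
        if t.chrom == chrom and t.start <= high and low <= t.end
    ]


if __name__ == "__main__":
    # Made-up transcripts, for illustration only.
    demo = [
        Transcript("tx_A", "9", 1000, 5000),
        Transcript("tx_B", "9", 4500, 9000),
        Transcript("tx_C", "2", 1000, 5000),
    ]
    hits = overlapping_transcripts(demo, "9", 4800, 4800)
    print([t.name for t in hits])
```

A variant can overlap several transcripts of the same or different genes, so a consequence is predicted for each transcript separately.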


http://www.ensembl.org/info/docs/variation/predicted_data.html

Consequence terms


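Missense variants - pathogenicity

[Figure: SIFT and PolyPhen score scales, each running from 0 to 1. SIFT: cut-off at 0.05 between Deleterious and Tolerated. PolyPhen: three bands labelled Benign, Possibly damaging and Probably damaging, with the values 0.1 and 0.2 printed beside Possibly damaging and Probably damaging.]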
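Turning a score into a qualitative call is a simple threshold test. The sketch below uses the SIFT cut-off of 0.05 shown on the slide, with lower scores being deleterious. The PolyPhen cut-offs of 0.446 and 0.908 are an assumption taken from Ensembl's documentation rather than from this slide, so they are left as parameters. The function names are illustrative.

```python
def sift_call(score, cutoff=0.05):
    """SIFT: scores below the cut-off are called deleterious, all others tolerated."""
    return "deleterious" if score < cutoff else "tolerated"


def polyphen_call(score, possibly_cutoff=0.446, probably_cutoff=0.908):
    """PolyPhen: higher scores are more damaging (cut-offs are assumed defaults)."""
    if score > probably_cutoff:
        return "probably damaging"
    if score > possibly_cutoff:
        return "possibly damaging"
    return "benign"


if __name__ == "__main__":
    for sift, polyphen in [(0.01, 0.95), (0.30, 0.50), (0.80, 0.10)]:
        print(sift, sift_call(sift), "|", polyphen, polyphen_call(polyphen))
```

The two scales run in opposite directions: a low SIFT score and a high PolyPhen score both point towards a damaging substitution.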


VEP plugins

• Plugins add extra functionality to the VEP

• They may extend, filter or manipulate the output of the VEP

• Plugins may make use of external data or code

• Available on the web tool and with the script


Pathogenicity Prediction Plugins

• dbNSFP - annotation database for missense SNPs

• Condel - consensus deleteriousness from SIFT and PolyPhen

• LoFtool - ranks genes by susceptibility to disease, based on the ratio of Loss of Function to synonymous variants in ExAC data


Hands on

We have identified four variants on human chromosome 9: an A deletion at 128328461, C->A at 128322349, C->G at 128323079 and G->A at 128322917.

We will use the Ensembl VEP to determine:

- Whether our variants have already been annotated in Ensembl

- Which genes are affected by our variants

- Whether any of our variants affect gene regulation
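To submit the four variants, they first need to be written in one of the accepted input formats. The Python sketch below writes them as variant-coordinate lines (chromosome, start, end, alleles, strand). It assumes all four are reported on the forward strand, which the exercise does not state, and `to_coordinate_line` is an illustrative helper.

```python
CHROMOSOME = "9"

# (position, reference allele, alternative allele); "-" marks the deleted base.
VARIANTS = [
    (128328461, "A", "-"),
    (128322349, "C", "A"),
    (128323079, "C", "G"),
    (128322917, "G", "A"),
]


def to_coordinate_line(chrom, position, ref, alt, strand="+"):
    """Format a single-base change or single-base deletion as one input line."""
    # For a one-base variant, start and end are the same coordinate.
    return f"{chrom} {position} {position} {ref}/{alt} {strand}"


def build_input(variants, chrom=CHROMOSOME):
    """Return one input line per variant, in the order given."""
    return [to_coordinate_line(chrom, pos, ref, alt) for pos, ref, alt in variants]


if __name__ == "__main__":
    print("\n".join(build_input(VARIANTS)))
```

The printed lines can be pasted into the web interface or saved to a file for the script.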


Questions?


Help and documentation

Course online http://www.ebi.ac.uk/training/online/subjects/11

Tutorials www.ensembl.org/info/website/tutorials

Videos www.youtube.com/user/EnsemblHelpdesk

Email us [email protected]


Host a FREE Workshop!

• Invite one of our outreach team to teach at your institution for free (except trainer’s expenses)

• E-mail us: [email protected]

Browser Course

Half-day to 2-day course on the Ensembl browser, aimed at wet-lab scientists. 1-2 trainers.

API course

2-4 day course on the Ensembl APIs (Perl or REST), aimed at bioinformaticians. 1-4 trainers.


Acknowledgements

The Entire Ensembl Team

Funding

Co-funded by the European Union
