Variation and the VEP: Ensembl Online Webinar series

Denise Carvalho-Silva, Ensembl Outreach team, European Molecular Biology Laboratory, European Bioinformatics Institute. Ensembl online training series 2016

Transcript of Variation and the VEP: Ensembl Online Webinar series

Page 1: Variation and the VEP: Ensembl Online Webinar series

Denise Carvalho-Silva, Ensembl Outreach team

European Molecular Biology Laboratory

European Bioinformatics Institute

Ensembl online training series 2016

Page 2: Variation and the VEP: Ensembl Online Webinar series

Course Objectives

What is Ensembl?

What type of data can you get in Ensembl?

How to navigate the Ensembl browser website?

How to connect with Ensembl

Page 3: Variation and the VEP: Ensembl Online Webinar series

This online course

Date | Webinar topic | Instructor
24th March | Introduction to Ensembl | Emily Perry
31st March | Ensembl genes | Denise Carvalho-Silva
7th April | Data export with BioMart | Helen Sparrow
14th April | Variation data in Ensembl and the Ensembl VEP | Denise Carvalho-Silva
21st April | Comparing genes and genomes with Ensembl Compara | Helen Sparrow
28th April | Finding features that regulate genes – the Ensembl Regulatory Build | Emily Perry
5th May | Uploading your data to Ensembl and advanced ways to access Ensembl data | Ben Moore

http://www.ebi.ac.uk/training/events/2016/ensembl-online-training-series-2016

Page 4: Variation and the VEP: Ensembl Online Webinar series

Our Polls: finding more about you

• Previous webinars: Introduction, Genes and Transcripts, BioMart
• Poll 1: Attendance
• Poll 2: Exercises

Page 5: Variation and the VEP: Ensembl Online Webinar series

Structure for this one-hour webinar

Presentation: SNPs, CNVs and SVs available in Ensembl; VEP: tool for variant annotation

Demo: View variants / run the VEP

Exercises: On the Train online course

Page 6: Variation and the VEP: Ensembl Online Webinar series

Questions?

• We’ve muted all the microphones
• Ask questions in the Chat box in the webinar interface
• My Ensembl colleagues will respond during my talk
• Please respond with @username

Helen Sparrow, Ben Moore, Emily Perry

Page 7: Variation and the VEP: Ensembl Online Webinar series

EBI is an Outstation of the European Molecular Biology Laboratory.

Ensembl Features

• Comparative Genomics
• Gene models
• Regulation
• Variation
• Custom data display
• Programmatic access
• Toolkit

Page 8: Variation and the VEP: Ensembl Online Webinar series


Ensembl Features

• Comparative Genomics
• Gene models
• Regulation
• Variation
• Custom data display
• Programmatic access
• Toolkit

Page 9: Variation and the VEP: Ensembl Online Webinar series


Module 4: Genetic Variation in Ensembl

Page 10: Variation and the VEP: Ensembl Online Webinar series

Outline

• Classes of variation, species and sources
• Browsing variation data, some entry points: Location tab, Gene tab, Variation tab
• Phenotype and population genetics data
• How to annotate your own variants

Page 11: Variation and the VEP: Ensembl Online Webinar series

Genetic variation

1) Large scale: structural (> 50 base pairs)

duplication, deletion, inversion, translocation, loss

2) Short scale: SNPs (or SNVs), indels

G A C T G A C T A T C G G G G T T T C C C A A A

G A A T G A C T T T C G G - G - T T C C - A A A

Page 12: Variation and the VEP: Ensembl Online Webinar series

Species with variation data

Understand the types of genetic variation data and how to view them in the context of our genomes

Page 13: Variation and the VEP: Ensembl Online Webinar series

Sources of variation data

• Import alleles and frequencies
• Annotate variants

http://www.ensembl.org/info/docs/variation/sources_documentation.html

Page 14: Variation and the VEP: Ensembl Online Webinar series

Location tab: across a region

SVs, SNPs

Ensembl genes

Page 15: Variation and the VEP: Ensembl Online Webinar series

Gene tab: gene-centric

SNPs, SVs

Page 16: Variation and the VEP: Ensembl Online Webinar series

Variation tab: variant centric

summary data

SNP or SV

Page 17: Variation and the VEP: Ensembl Online Webinar series

Variants on the karyotype

Page 18: Variation and the VEP: Ensembl Online Webinar series

Phenotype data in Ensembl: species and sources

Page 19: Variation and the VEP: Ensembl Online Webinar series

Population data for variants

http://hapmap.ncbi.nlm.nih.gov/

http://www.1000genomes.org

Page 20: Variation and the VEP: Ensembl Online Webinar series

Human Population Genetics

Pie charts: 1KG super populations

Page 21: Variation and the VEP: Ensembl Online Webinar series

Live demo

Coffee intake is a worldwide phenomenon, with Finland at the top and the UK in 44th place. Is caffeine consumption in our genes?

A) What are the chromosome locations of variants associated with this phenotype?

B) Which variant has the most significant association?

C) What is the ancestral allele of this variant? Is it conserved in eutherian mammals?

D) What is the most frequent allele in Great Britain?

E) Can you download this variant and 200 nt of upstream and downstream flanking sequence in RTF (Rich Text Format)?

Page 22: Variation and the VEP: Ensembl Online Webinar series

You can annotate your SNPs and SVs too!

• Variant Effect Predictor
• Different input formats
• SIFT/PolyPhen for missense variants

PMID: 20562413

Perl script, Web interface, REST API

XML
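As a rough illustration of the REST route mentioned on this slide, the sketch below builds a request URL for the VEP region endpoint of the public Ensembl REST service. The host names and the `vep/:species/region/:region/:allele` path follow the public REST documentation as we understand it; the helper names `vep_region_url` and `fetch_vep` are made up for this example, and the endpoint details should be checked against the current documentation before use.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: public Ensembl REST hosts; GRCh37 data is served from a separate host.
SERVERS = {
    "GRCh38": "https://rest.ensembl.org",
    "GRCh37": "https://grch37.rest.ensembl.org",
}


def vep_region_url(chrom, start, end, strand, allele,
                   species="human", assembly="GRCh38"):
    """Build a GET URL for the VEP region endpoint (hypothetical helper)."""
    region = f"{chrom}:{start}-{end}:{strand}"
    return (f"{SERVERS[assembly]}/vep/{quote(species)}/region/"
            f"{quote(region, safe=':-')}/{quote(allele)}"
            "?content-type=application/json")


def fetch_vep(url):
    """Fetch and decode the JSON response (needs network access; not run here)."""
    with urlopen(url) as response:
        return json.load(response)


# First variant of the tutorial slide: chromosome 9, GRCh37, C/A on the forward strand.
url = vep_region_url("9", 131084628, 131084628, 1, "A", assembly="GRCh37")
print(url)
```

Only the URL is built here; calling `fetch_vep(url)` would send the request.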

Page 23: Variation and the VEP: Ensembl Online Webinar series

Mapping variants on transcripts

Identify transcripts that overlap variants and predict the consequence of these on Ensembl (or RefSeq) transcripts using the VEP.

[Figure: transcript diagram with variant positions labelled Regulatory, 5' upstream, 5' UTR, ATG, coding (synonymous), coding (missense), intronic, splice sites, 3' UTR, AAAAAAA, 3' downstream]

Page 24: Variation and the VEP: Ensembl Online Webinar series

Consequence terms* for variants

www.ensembl.org/info/genome/variation/predicted_data.html#consequence_type_table

* defined by the Sequence Ontology (SO) project (http://www.sequenceontology.org/)

Page 25: Variation and the VEP: Ensembl Online Webinar series

Consequence: missense

GAG > GGG, Glu > Gly

SIFT: sift.jcvi.org/

PolyPhen-2: genetics.bwh.harvard.edu/pph2/

Condel

dbNSFP

Page 26: Variation and the VEP: Ensembl Online Webinar series

Ensembl tools: http://www.ensembl.org/tools.html

http://www.ensembl.org/vep

Page 27: Variation and the VEP: Ensembl Online Webinar series

Inputting data into the VEP

Chromosome, Start, End, Alleles, Strand

Page 28: Variation and the VEP: Ensembl Online Webinar series

Output options in the VEP

GAG > GGG Glu > Gly

GAG > GAA Glu > Glu
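The two codon changes on this slide can be checked with a few lines of code. The sketch below translates the reference and alternate codons with the standard genetic code and returns a Sequence Ontology term. It is a simplified illustration, not the VEP's own logic: it ignores start codons, frameshifts and transcript context, and the function name `coding_consequence` is invented for this example.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def coding_consequence(ref_codon, alt_codon):
    """Classify a single-codon substitution (simplified, hypothetical helper)."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        # Same residue; a stop codon that stays a stop has its own SO term.
        return "stop_retained_variant" if ref_aa == "*" else "synonymous_variant"
    if alt_aa == "*":
        return "stop_gained"
    if ref_aa == "*":
        return "stop_lost"
    return "missense_variant"


for ref, alt in [("GAG", "GGG"), ("GAG", "GAA")]:
    print(ref, ">", alt, CODON_TABLE[ref], ">", CODON_TABLE[alt],
          coding_consequence(ref, alt))
```

GAG and GAA both encode glutamate (E), so the second change is synonymous, while GAG to GGG swaps glutamate for glycine (G) and is missense.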

Page 29: Variation and the VEP: Ensembl Online Webinar series

Ticket system in the VEP

Queued, Running, Done, Failed

• Save to your account (log in)
• Edit and resubmit your job
• Delete job

Ticket identifier, Job name

Page 30: Variation and the VEP: Ensembl Online Webinar series

Viewing the VEP results

SO consequence terms*

* http://www.sequenceontology.org/index.html

Page 31: Variation and the VEP: Ensembl Online Webinar series

Viewing the VEP results

ensembl.org/info/docs/tools/vep/online/results.html#summary

Table
• Before / after filtering
• Novel / existing variants

Pie charts (consequence terms)
• Total observed (more than one per variant)
• Separate chart: coding consequences

Page 32: Variation and the VEP: Ensembl Online Webinar series

Results table

Navigate results (one row per variant/transcript overlap)

Show/hide columns in results table; more columns: scroll right

• Download results
• Send results to BioMart

Create and edit filters

ensembl.org/info/docs/tools/vep/online/results.html#table

Page 33: Variation and the VEP: Ensembl Online Webinar series

Filtering the VEP results

Filters consist of three components:

Field
• e.g. Consequence, biotype

Operator
• e.g. is, matches (partial string matches)

Value
• the value to compare against
• some fields have autocomplete values

Multiple filters allowed with logical relationship (AND, OR). Active filters can be edited too!

ensembl.org/info/docs/tools/vep/online/results.html#filter
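The field / operator / value structure described on this slide maps naturally onto a small predicate builder. The sketch below is only an illustration of that idea applied to rows held as dictionaries. It does not reproduce the web tool's actual filter engine: the column names are examples, and "matches" is modelled here as a case-insensitive substring test, which is an assumption.

```python
import operator

# Assumption: "is" means exact equality, "matches" means partial string match.
OPERATORS = {
    "is": operator.eq,
    "matches": lambda cell, value: value.lower() in cell.lower(),
}


def make_filter(field, op, value):
    """Return a predicate testing one row against field/operator/value."""
    test = OPERATORS[op]
    return lambda row: test(str(row.get(field, "")), value)


def all_of(*filters):
    """Combine filters with logical AND."""
    return lambda row: all(f(row) for f in filters)


def any_of(*filters):
    """Combine filters with logical OR."""
    return lambda row: any(f(row) for f in filters)


# Hypothetical rows, one per variant/transcript overlap.
rows = [
    {"Consequence": "missense_variant", "BIOTYPE": "protein_coding"},
    {"Consequence": "intron_variant", "BIOTYPE": "protein_coding"},
    {"Consequence": "splice_region_variant,intron_variant", "BIOTYPE": "lncRNA"},
]

coding_missense = all_of(
    make_filter("Consequence", "is", "missense_variant"),
    make_filter("BIOTYPE", "is", "protein_coding"),
)
intronic = make_filter("Consequence", "matches", "intron")

print([r["Consequence"] for r in rows if coding_missense(r)])
print(len([r for r in rows if any_of(coding_missense, intronic)(r)]))
```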

Page 34: Variation and the VEP: Ensembl Online Webinar series

Tutorial: VEP

I’ve got a list of genetic variants from my resequencing project of a cohort study of breast cancer in Cambridge. The positions are all on chromosome 9, GRCh37 assembly:

131084628 C/A (positive strand)

131085358 C/G (positive strand)

131085196 G/A (positive strand)

1) Do any of these cause a change at the amino acid level?

2) Are these predicted to be deleterious?

3) Can I get the flanking sequence (200 nucleotides both up and downstream) for the known variants in this set?
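For question 3, the coordinates of a 200-nucleotide flank on each side are simple arithmetic on the variant position. The sketch below computes them for the three tutorial positions and prints them in the chromosome:start-end location style used in the browser. The helper name `flanking_region` is invented for this example, and the sequence itself would still need to be exported from Ensembl.

```python
def flanking_region(pos, flank=200):
    """Return (start, end) covering `flank` bases either side of a 1-based position."""
    start = max(1, pos - flank)  # never run off the start of the chromosome
    end = pos + flank
    return start, end


# Tutorial positions, all on chromosome 9 (GRCh37).
positions = [131084628, 131085358, 131085196]
regions = [f"9:{start}-{end}" for start, end in map(flanking_region, positions)]

for region in regions:
    print(region)
```

Each window spans 401 bases: the variant position plus 200 on each side.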

Page 35: Variation and the VEP: Ensembl Online Webinar series

Tutorial: VEP

1) Have I got genomic coordinates? Ensembl default format:

9 131084628 131084628 C/A +
9 131085358 131085358 C/G +
9 131085196 131085196 G/A +
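Writing these lines by hand is fine for three variants; for a longer list, a short script can produce the same five-column layout (chromosome, start, end, alleles, strand). The sketch below handles single-nucleotide variants only, where start equals end, and the function name `to_vep_line` is invented for this example.

```python
def to_vep_line(chrom, pos, ref, alt, strand="+"):
    """Format one SNV as: chromosome start end ref/alt strand."""
    return f"{chrom} {pos} {pos} {ref}/{alt} {strand}"


# Tutorial variants: position, reference allele, alternate allele.
variants = [
    (131084628, "C", "A"),
    (131085358, "C", "G"),
    (131085196, "G", "A"),
]

vep_input = "\n".join(to_vep_line("9", pos, ref, alt) for pos, ref, alt in variants)
print(vep_input)
```

The resulting text can be pasted into the input box of the web interface or saved to a file for upload.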

Page 36: Variation and the VEP: Ensembl Online Webinar series

VEP video

http://tinyurl.com/vep-video

Page 37: Variation and the VEP: Ensembl Online Webinar series

Things to bear in mind

1) No distinction between polymorphisms and mutations. Exception: HGMD and COSMIC, which are all mutations;

2) C/T → the first allele is the one in the reference genome, not necessarily the major or the ancestral;

3) Ensembl reports all alleles on the forward strand (different from dbSNP).
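Point 3 matters when alleles come from a source that reports them on the reverse strand: they need to be reverse-complemented before they can be compared with Ensembl's forward-strand alleles. A minimal sketch, assuming alleles are written as slash-separated strings such as C/T and that a dash marks a deletion; the function name `to_forward_strand` is invented for this example.

```python
# Translation table for complementing DNA bases; other characters (e.g. "-") pass through.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def to_forward_strand(alleles, strand):
    """Return an allele string such as 'C/T' expressed on the forward strand."""
    if strand in ("+", 1, "1"):
        return alleles
    # Reverse strand: complement each allele, then reverse it.
    return "/".join(a.translate(COMPLEMENT)[::-1] for a in alleles.split("/"))


print(to_forward_strand("C/T", "-"))   # G/A
print(to_forward_strand("AG/-", "-"))  # CT/-
print(to_forward_strand("C/T", "+"))   # C/T
```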

Page 38: Variation and the VEP: Ensembl Online Webinar series

Next webinar – Comparing genes and genomes

Ensembl allows you to perform detailed analysis of gene models between species. During this webinar we will take a look at the gene trees and homologues of a set of genes, and at whole genome alignments between pairs and groups of species. See you next week, same time.

Helen Sparrow

Page 39: Variation and the VEP: Ensembl Online Webinar series

Course exercises

http://www.ebi.ac.uk/training/online/course/ensembl-browser-webinar-series-2016

This text will be replaced by a YouTube (link to YouKu too) video of the webinar and a pdf of the slides.

The “next page” will be the exercises

A link to exercises and their solutions will appear in the page hierarchy

The “previous page” contains the solution of previous modules’ exercises

Page 40: Variation and the VEP: Ensembl Online Webinar series

Get help with the exercises

• Use the exercise solutions in the online course
• Join our Facebook group and discuss the exercises with everybody (see the online course for the link)
• Email us [email protected]

Page 41: Variation and the VEP: Ensembl Online Webinar series

Connect with Ensembl

[email protected]

www.youtube.com/user/EnsemblHelpdesk

www.ensembl.org/info/genome/genebuild/index.html

Page 42: Variation and the VEP: Ensembl Online Webinar series

Acknowledgements

The Entire Ensembl Team

Funding

Co-funded by the European Union