ParaSite Ensembl Genomes and UCSC Assembly Hub WormBase Workshop International Worm Meeting 2015


Page 1:

WormBase beyond www.wormbase.org

WormBase ParaSite

• New home for parasitic worm genomes in WormBase

UCSC WormBase assembly hub

• View current WormBase data on UCSC genome browser

Page 2:

WormBase ParaSite

Motivation

• Hundreds of parasitic nematode genome sequences available or imminent

• Helminth genomes scattered across a number of resources

• Much of the data is “draft” quality

Introducing WormBase ParaSite (parasite.wormbase.org)

• Consistent, integrated access to hundreds of parasitic nematode draft genomes

• Encompass all parasitic worms (i.e. nematodes and flatworms)

Page 3:

WormBase ParaSite genomes (v2)

Nematodes

• 63 species (70 genomes)

  • Clade I – 7 species (9 genomes)

  • Clade III – 22 species (24 genomes)

  • Clade IV – 16 species (16 genomes)

  • Clade V – 18 species (21 genomes)

• Largest and smallest

  • Teladorsagia circumcincta (700 Mb)

  • Parastrongyloides trichosuri (42 Mb)

Platyhelminthes

• 25 species (26 genomes)

  • Cestodes – 12 species

  • Trematodes – 11 species

  • Other – 2 species

• Largest and smallest

  • Spirometra erinaceieuropaei (1250 Mb)

  • Hydatigera taeniaeformis (100 Mb)

Orthologs and paralogs

• Ensembl “Compara” protein-tree pipeline

• 118 genomes

  • 9 additional nematode genomes (free-living)

  • 13 comparator genomes

    • Including human, mouse, zebrafish

• ~150,000 protein multiple alignments

• ~1000 CPU days
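The counts on this slide can be cross-checked with a little arithmetic. The sketch below only tallies the numbers quoted above; it assumes the bracketed figures after each clade are genome counts, and that the 118 Compara genomes are the ParaSite genomes plus the free-living and comparator sets.

```python
# Species and genome counts per nematode clade, as quoted on the slide.
# Assumption: the bracketed figure for each clade is its genome count.
nematode_clades = {
    "Clade I": (7, 9),
    "Clade III": (22, 24),
    "Clade IV": (16, 16),
    "Clade V": (18, 21),
}

nematode_species = sum(species for species, _ in nematode_clades.values())
nematode_genomes = sum(genomes for _, genomes in nematode_clades.values())

# Platyhelminthes: cestodes + trematodes + other.
platyhelminth_species = 12 + 11 + 2
platyhelminth_genomes = 26  # stated directly on the slide

# Extra genomes fed into the Compara protein-tree pipeline.
free_living_nematodes = 9
comparators = 13

compara_total = (
    nematode_genomes
    + platyhelminth_genomes
    + free_living_nematodes
    + comparators
)

print(nematode_species, nematode_genomes, platyhelminth_species, compara_total)
```

Under those assumptions the totals agree with the slide: 63 nematode species, 70 nematode genomes, 25 platyhelminth species and 118 genomes in the Compara run.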

Page 4:

http://parasite.wormbase.org

Page 5:

http://parasite.wormbase.org

Page 6:

ParaSite Downloads

Page 7:

ftp://ftp.wormbase.org/pub/wormbase/parasite

• Consistent file naming and data organisation

• Genome project (NCBI BioProject) disambiguation

• Files for each genome

• Genome fasta(s)

• Protein fasta

• Transcript fasta

• Annotation GFF3
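To illustrate why a BioProject accession helps disambiguate genomes, here is a minimal sketch that derives the four per-genome file names from a species, a BioProject and a release number. The naming pattern, the suffixes and the example identifiers are all assumptions made for illustration; the slide only says that naming is consistent, so check the FTP site for the real layout.

```python
# Hypothetical suffixes for the four file types listed on the slide.
FILE_SUFFIXES = {
    "genome": "genomic.fa.gz",
    "protein": "protein.fa.gz",
    "transcript": "mRNA_transcripts.fa.gz",
    "annotation": "annotations.gff3.gz",
}


def genome_files(species: str, bioproject: str, release: int) -> dict:
    """Return an assumed file name for each file type of one genome."""
    # Including the BioProject keeps two assemblies of one species apart.
    stem = f"{species}.{bioproject}.WBPS{release}"
    return {kind: f"{stem}.{suffix}" for kind, suffix in FILE_SUFFIXES.items()}


# Two placeholder genome projects for the same placeholder species.
first = genome_files("example_species", "PRJ_A", 2)
second = genome_files("example_species", "PRJ_B", 2)

for kind in FILE_SUFFIXES:
    print(first[kind], second[kind])
```

With a scheme like this, a script can fetch the same file type for every genome without special-casing species that have more than one assembly.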

Page 8:

http://parasite.wormbase.org

Page 9:

ParaSite Mart

• Table-based data-mining tool

• Like WormMine, but with a different interface

• Complementary to WormMine

  • Less depth for C. elegans, but…

  • Comprehensive species set (all nematode genomes)

  • Some additional functionality
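A table-based query of this kind boils down to a dataset, some filters and a list of output columns. The sketch below builds such a query as XML, assuming ParaSite Mart accepts the usual BioMart-style query document (the slides do not say so). The dataset, filter and attribute names are hypothetical placeholders, not real ParaSite Mart identifiers.

```python
import xml.etree.ElementTree as ET


def build_mart_query(dataset, filters, attributes):
    """Build a BioMart-style XML query string (no network access)."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",   # tab-separated table output
        "header": "1",        # include a header row
        "uniqueRows": "1",    # drop duplicate rows
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    # Filters restrict which rows are returned.
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    # Attributes are the columns of the result table.
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


# All names below are hypothetical placeholders.
query_xml = build_mart_query(
    dataset="hypothetical_gene_dataset",
    filters={"hypothetical_species_filter": "example_species"},
    attributes=["hypothetical_gene_id", "hypothetical_ortholog_id"],
)
print(query_xml)
```

The web interface offers the same three choices (dataset, filters, attributes) as form fields, so a query composed there should map onto this structure.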

Page 10:

ParaSite Mart - orthologs

Page 11:

ParaSite Mart – sequence extraction

Page 12:

The UCSC WormBase genome Hub

Background

● Many researchers like the UCSC genome browser

○ Familiar interface

○ Comparative genomics (alignments / conservation)

● Worm data at UCSC is 5 years out of date

UCSC hubs

● A new mechanism for remote hosting of collections of genome browser tracks

● Emerging standard for cross-browser compatibility

● The WormBase hub

○ View up-to-date WormBase data on UCSC!

○ View some data not viewable anywhere else: genomic alignments
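A hub is described by a small plain-text file of `key value` lines that the browser reads from the remote host. The sketch below parses one such stanza; the field names follow the UCSC track hub format, but the sample values are made up for illustration and are not the contents of the WormBase hub file.

```python
def parse_hub_stanza(text: str) -> dict:
    """Parse 'key value' lines from a hub.txt-style stanza into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # The key is the first word; the rest of the line is the value.
        key, _, value = line.partition(" ")
        settings[key] = value.strip()
    return settings


# Made-up example stanza; values are placeholders, not the real hub.
SAMPLE_HUB_TXT = """
# example hub description
hub exampleHub
shortLabel Example hub
longLabel Example assembly hub for illustration
genomesFile genomes.txt
email someone@example.org
"""

hub = parse_hub_stanza(SAMPLE_HUB_TXT)
print(hub["shortLabel"], "->", hub["genomesFile"])
```

The `genomesFile` entry points to a second file listing the assemblies, which is how a single hub can serve many genomes.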

Page 13:

Nematode genomic alignments

Progressive Cactus (Nguyen et al, 2014)

• New tool (UCSC) for genome multiple alignments (100s of genomes)

• Creates “virtual” ancestor genomes

• Output = HAL file (HDF5 database)

WormBase cactus alignments

• 29 nematode genomes (more in future)

• Viewable on UCSC browser (“SNAKE” tracks)

Page 15:

UCSC

New Dropdowns:

● Nematodes

● core genomes

● WormBase assembly identifiers

Search:

● seq. names

● WBGeneIDs

● gene symbols

Page 16:

UCSC

Release Tracks:

● transcripts

● current + reference

● pseudogenes

● ncRNAs

● mRNA alignments

● WormBase links

Assembly Tracks:

● repeats

● conservation

● comparative hub

Page 17:

UCSC

Page 18:

UCSC

Page 19:

EnsEMBL (and friends)

Development Hub

http://ftp.ebi.ac.uk/pub/databases/wormbase/releases/current-development-release/COMPARATIVE_ANALYSIS/hub/parasite_hub.txt

Production Hub

http://ftp.ebi.ac.uk/pub/databases/wormbase/releases/current-production-release/COMPARATIVE_ANALYSIS/hub/parasite_hub.txt

current: metazoa.ensembl.org

coming soon: parasite.wormbase.org, ensembl.org

configure tracks
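Either hub URL above can be handed to a browser as a query parameter. As a minimal sketch for the UCSC browser, the code below builds a link that passes the hub file through the `hubUrl` parameter of `hgTracks`; the choice of the production hub here is only an example, and other browsers attach hubs through their own interfaces.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

UCSC_HGTRACKS = "https://genome.ucsc.edu/cgi-bin/hgTracks"

# Production hub URL as listed on the slide.
PRODUCTION_HUB = (
    "http://ftp.ebi.ac.uk/pub/databases/wormbase/releases/"
    "current-production-release/COMPARATIVE_ANALYSIS/hub/parasite_hub.txt"
)


def hub_link(hub_url: str) -> str:
    """Return a UCSC browser link that asks it to attach the given hub."""
    # urlencode percent-escapes the nested URL so it survives as one value.
    return f"{UCSC_HGTRACKS}?{urlencode({'hubUrl': hub_url})}"


link = hub_link(PRODUCTION_HUB)

# Round-trip check: decoding the query string gives back the hub URL.
recovered = parse_qs(urlsplit(link).query)["hubUrl"][0]
print(link)
```

Swapping in the development hub URL gives a link to the pre-release data instead.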

Page 20:

… and more

GBrowse

BioDalliance

JBrowse

Page 21:

Summary

WormBase ParaSite

• parasite.wormbase.org

• Poster 952C (Saturday)

UCSC WormBase assembly hub

• ftp.ebi.ac.uk/pub/databases/wormbase/releases/current-production-release/COMPARATIVE_ANALYSIS/hub/hub.txt

• blog.wormbase.org

More information

• [email protected]

• Come and see us for a tutorial!