EBI web resources I: databases and tools (bcb.unl.edu/yyin/teach/PBB/ebi-go.pdf)


Page 1

EBI web resources I: databases and tools

Yanbin Yin

Page 2

Outline

• Intro to EBI

• Databases and web tools
  – UniProt
  – Gene Ontology

• Hands-on practice

Most materials are from: http://www.ebi.ac.uk/training/online/course-list

Page 3

Three international nucleotide sequence databases

Page 4

The European Bioinformatics Institute (EBI)

Created in 1992 as part of the European Molecular Biology Laboratory (EMBL)

EMBL was created in 1974 and is a molecular biology research institution supported by 20 European countries and Australia

Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
Neighbor of the Wellcome Trust Sanger Institute

Page 5

http://www.ebi.ac.uk/

Page 6

Research groups in EBI

InterPro

UniProt

miRBase

Page 7

Major databases in EBI

EMBL-Bank (DNA and RNA sequences)
Ensembl (genomes)
ArrayExpress (microarray-based gene-expression data)
UniProt (protein sequences)
InterPro (protein families, domains and motifs)
PDBe (macromolecular structures)

Others, such as
IntAct (protein–protein interactions)
Reactome (pathways)
ChEBI (small molecules)
IntEnz (enzyme classification)
GO (gene ontology)

Slide labels: GenBank; Genome; MapView; GEO; nr (GenPept); CDD; MMDB; Swiss Institute of Bioinformatics; Sanger Institute

Page 8

http://www.ebi.ac.uk/training/online/course/nucleotide-sequence-data-resources-ebi

Figure label: chromatograms

Page 9

Sequence might first enter ENA as SRA (Sequence Read Archive) fragmented sequence reads; it might be re-submitted as assembled WGS (Whole Genome Shotgun) sequence overlap contigs; it might be re-submitted again with further assembly as CON (Constructed) sequence entries, with the older WGS entries being consigned to the Sequence Version Archive.

Page 10

Data is first split into classes, then it is split into intersecting slices by taxonomy

Page 11

UniProt

http://www.uniprot.org/help/uniparc

Page 12

Sources of annotation for the UniProt Knowledgebase

Page 13

Life as a Scientific Curator
http://www.ebi.ac.uk/about/jobs/career-profiles/scientific-curator

Scientific Database Curator job: Cambridge, United Kingdom
http://www.nature.com/naturejobs/science/jobs/589083-hgnc-gene-nomenclature-advisor

Curation generation
http://cys.bios.niu.edu/yyin/teach/PBB/Bioinformatics%20Curation%20generation.pdf

Page 14

Hands-on practice 1: UniProt

Page 15

www.uniprot.org
http://www.uniprot.org/help/about
http://www.uniprot.org/docs/uniprot_flyer.pdf

Page 16

We are going to do ID mapping

Page 17

http://cys.bios.niu.edu/yyin/teach/PBB/at-id.txt

Choose Araport here and UniProtKB here
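Once the mapping result has been downloaded, it can be inspected offline. The sketch below assumes a two-column, tab-separated export with a header row (`From`, `To`); that layout, the helper name `parse_id_mapping`, and the sample identifiers are illustrative assumptions and are not taken from the slides.

```python
from collections import defaultdict


def parse_id_mapping(text):
    """Parse a two-column tab-separated mapping table into {source_id: [target_ids]}.

    Assumes the first line is a header and that one source ID may map to
    several target IDs (one pair per line).
    """
    mapping = defaultdict(list)
    lines = text.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        if not line.strip():
            continue
        source_id, target_id = line.split("\t")[:2]
        mapping[source_id].append(target_id)
    return dict(mapping)


# Made-up identifiers, used only to show the expected shape of the table.
EXAMPLE_TABLE = "From\tTo\nGENE_A\tPROT_1\nGENE_A\tPROT_2\nGENE_B\tPROT_3\n"

if __name__ == "__main__":
    for source_id, targets in parse_id_mapping(EXAMPLE_TABLE).items():
        print(source_id, "->", ", ".join(targets))
```

Keeping a list per source ID matters because one gene locus can map to more than one protein entry.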

Page 18

These are UniProt IDs

Page 19

Select the PAL proteins and align them

The Clustal Omega program will be called to align the selected protein sequences
It may take about 1 minute to finish

Page 20

This is the MSA result page
Toggling these options on will add colors to the alignment

Page 21

Go back to the protein list page
Selecting one protein will enable the BLAST button

Choosing Advanced will allow you to change the BLAST parameters

Page 22

Here you can make changes

Page 23

We are going to search UniProt proteomes for the human protein set
Click on Advanced and you will see a pop-out window

Here you can specify search terms

Page 24

Click here to get help

Click here to open a new page

Page 25

Gene Ontology

The Gene Ontology (GO) project is a collaborative effort to address the need for consistent descriptions of gene products in different databases.

The project began in 1998 as a collaboration between three model organism databases: FlyBase (Drosophila), the Saccharomyces Genome Database (SGD) and the Mouse Genome Database (MGD).

GO provides three structured controlled vocabularies (ontologies) that describe gene products in terms of their associated biological processes, cellular components and molecular functions in a species-independent manner.

There are three separate aspects to this effort:

1. the development and maintenance of the ontologies themselves;
2. the annotation of gene products, which entails making associations between the ontologies and the genes and gene products in the collaborating databases; and
3. the development of tools that facilitate the creation, maintenance and use of ontologies.

http://geneontology.org/page/documentation

Page 26

The scope of GO

GO is not a database of gene sequences, nor a catalog of gene products. Rather, GO describes how gene products behave in a cellular context.

GO is not a dictated standard, mandating nomenclature across databases. Groups participate because of self-interest, and cooperate to arrive at a consensus.

GO is not a way to unify biological databases (i.e. GO is not a 'federated solution'). Sharing vocabulary is a step towards unification, but is not, in itself, sufficient.

Gene Ontology covers three domains:

cellular component, the parts of a cell or its extracellular environment;

molecular function, the elemental activities of a gene product at the molecular level, such as binding or catalysis;

biological process, operations or sets of molecular events with a defined beginning and end, pertinent to the functioning of integrated living units: cells, tissues, organs, and organisms

Page 27

The structure of GO can be described in terms of a graph, where each GO term is a node, and the relationships between the terms are edges between the nodes. GO is loosely hierarchical, with 'child' terms being more specialized than their 'parent' terms, but unlike a strict hierarchy, a term may have more than one parent term.

http://geneontology.org/page/ontology-structure
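The graph description can be made concrete with a toy example. This is a minimal sketch, not the GO data model: the term names and the `PARENTS` table are hypothetical, chosen only to show a term with two parents and how all of its ancestors can be collected.

```python
# Toy ontology: each term lists its parent terms. The term names are
# hypothetical and only illustrate that a term may have several parents.
PARENTS = {
    "root": [],
    "term_a": ["root"],
    "term_b": ["root"],
    "term_c": ["term_a", "term_b"],  # two parents: not a strict tree
    "term_d": ["term_c"],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent edges from `term`."""
    seen = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


if __name__ == "__main__":
    print(sorted(ancestors("term_d")))
```

The `seen` set is what keeps the walk correct when two paths lead to the same ancestor, which cannot happen in a strict tree but is routine in a graph like GO.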

Page 28

http://www.ebi.ac.uk/training/online/course/go-quick-tour/what-can-i-do-go

id: GO:0000016
name: lactase activity
namespace: molecular_function
def: "Catalysis of the reaction: lactose + H2O = D-glucose + D-galactose." [EC:3.2.1.108]
synonym: "lactase-phlorizin hydrolase activity" BROAD [EC:3.2.1.108]
synonym: "lactose galactohydrolase activity" EXACT [EC:3.2.1.108]
xref: EC:3.2.1.108
xref: MetaCyc:LACTASE-RXN
xref: Reactome:20536
is_a: GO:0004553 ! hydrolase activity, hydrolyzing O-glycosyl compounds
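A term record like this one is a list of tag-value pairs, so it can be read with a few lines of code. The sketch below assumes one pair per line separated by the first ": "; the function names `parse_stanza` and `parent_ids` are illustrative, and a real ontology file has more syntax than this handles.

```python
from collections import defaultdict

# The term record from the slide, one tag-value pair per line.
STANZA = """\
id: GO:0000016
name: lactase activity
namespace: molecular_function
def: "Catalysis of the reaction: lactose + H2O = D-glucose + D-galactose." [EC:3.2.1.108]
synonym: "lactase-phlorizin hydrolase activity" BROAD [EC:3.2.1.108]
synonym: "lactose galactohydrolase activity" EXACT [EC:3.2.1.108]
xref: EC:3.2.1.108
xref: MetaCyc:LACTASE-RXN
xref: Reactome:20536
is_a: GO:0004553 ! hydrolase activity, hydrolyzing O-glycosyl compounds
"""


def parse_stanza(text):
    """Collect tag-value pairs; tags such as synonym and xref may repeat."""
    record = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        tag, value = line.split(": ", 1)  # split at the first ": " only
        record[tag].append(value)
    return dict(record)


def parent_ids(record):
    """Return the is_a targets with the trailing '! name' comment removed."""
    return [value.split(" ! ")[0] for value in record.get("is_a", [])]


if __name__ == "__main__":
    term = parse_stanza(STANZA)
    print(term["id"][0], term["name"][0], "is_a", parent_ids(term))
```

Every tag is stored as a list because `synonym` and `xref` repeat within one record.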

Page 29

Enrichment analysis: use a statistical test, e.g. Fisher's exact test

Example: in the human genome background (20,000 genes in total), 40 genes are involved in the p53 signaling pathway. In a given gene list, 3 out of 300 genes belong to the p53 signaling pathway. We then ask whether 3/300 is more than expected by random chance, compared to the human background of 40/20,000.

http://david.abcc.ncifcrf.gov/helps/functional_annotation.html#E4
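The example can be worked through numerically. This is a minimal sketch of a one-sided Fisher's exact test, computed as the upper tail of the hypergeometric distribution; it assumes the gene list is a subset of the background, and the function name `enrichment_p_value` is illustrative. DAVID reports a modified version of this test, so its values may differ from the plain test shown here.

```python
from math import comb


def enrichment_p_value(list_hits, list_size, background_hits, background_size):
    """One-sided Fisher exact test (upper hypergeometric tail).

    Probability of drawing at least `list_hits` pathway genes when
    `list_size` genes are sampled without replacement from a background
    of `background_size` genes, `background_hits` of which are in the pathway.
    """
    total = comb(background_size, list_size)
    upper = min(list_size, background_hits)
    tail = sum(
        comb(background_hits, k)
        * comb(background_size - background_hits, list_size - k)
        for k in range(list_hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Numbers from the slide: 3 of 300 list genes vs 40 of 20,000 background genes.
    p = enrichment_p_value(3, 300, 40, 20000)
    print(f"fold enrichment = {(3 / 300) / (40 / 20000):.1f}")
    print(f"one-sided p-value = {p:.4f}")
```

The expected number of pathway genes in a random list of 300 is 300 × 40/20,000 = 0.6, so the test asks how often a random draw would contain 3 or more.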

Page 30

UniProt-GO annotation (GOA)

http://www.ebi.ac.uk/training/online/course/uniprot-goa-quick-tour/what-uniprot-goa

Page 31

UniProt-GOA format

The reference used to make the annotation (e.g. a journal article)
An evidence code denoting the type of evidence upon which the annotation is based
The date and the creator of the annotation

Gene product: Actin, alpha cardiac muscle 1, UniProtKB:P68032
GO term: heart contraction ; GO:0060047 (biological process)
Evidence code: Inferred from Mutant Phenotype (IMP)
Reference: PMID 17611253
Assigned by: UniProtKB, June 6, 2008
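The fields of one annotation can be held in a small record type. The sketch below is an illustration only: the class `GoAnnotation`, its field names, and the date string format are assumptions, not the GOA file layout. The example values are the ones shown on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GoAnnotation:
    """One gene-product-to-GO-term association with its supporting metadata."""
    gene_product: str
    go_id: str
    go_name: str
    aspect: str
    evidence_code: str
    reference: str
    assigned_by: str
    date: str


# Values copied from the example on the slide.
EXAMPLE = GoAnnotation(
    gene_product="UniProtKB:P68032",
    go_id="GO:0060047",
    go_name="heart contraction",
    aspect="biological process",
    evidence_code="IMP",
    reference="PMID:17611253",
    assigned_by="UniProtKB",
    date="2008-06-06",
)


def is_experimental(annotation):
    """True when the evidence code is one of the classic experimental GO codes."""
    return annotation.evidence_code in {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


if __name__ == "__main__":
    print(EXAMPLE.go_id, EXAMPLE.evidence_code, is_experimental(EXAMPLE))
```

Filtering on the evidence code is a common first step when only experimentally supported annotations are wanted.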

Page 32

The idea of GO annotation for new sequences

If you have a new genome or transcriptome sequenced, how do you perform a GO annotation for it?

1. Find the closest model organism that has been annotated with GO
2. BLAST your data against this closest organism
3. Transfer the GO annotation of the best match to your query sequences

For instance, if we want to annotate a fern transcriptome with GO function descriptions:

1. Find the Arabidopsis UniProt protein dataset
2. Find the Arabidopsis GOA association file
3. BLASTx the fern reads (or assembled UniGenes) against the UniProt set
4. Analyze the BLAST result to link fern reads to GO terms
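The last step, linking reads to GO terms through their best BLAST hit, can be sketched as follows. The code assumes BLAST was run with the 12-column tabular output and that the association file has already been reduced to a dictionary from protein ID to GO IDs. The read names, protein names, GO IDs and scores are placeholders made up for illustration.

```python
def best_hits(blast_tabular):
    """Keep the highest-bit-score subject for each query.

    Expects BLAST tabular output (12 tab-separated columns, with the query in
    column 1, the subject in column 2 and the bit score in column 12).
    """
    best = {}
    for line in blast_tabular.strip().splitlines():
        fields = line.split("\t")
        query, subject, bit_score = fields[0], fields[1], float(fields[11])
        if query not in best or bit_score > best[query][1]:
            best[query] = (subject, bit_score)
    return {query: subject for query, (subject, _) in best.items()}


def transfer_go(blast_tabular, subject_to_go):
    """Give each query the GO terms annotated to its best-matching subject."""
    return {
        query: sorted(subject_to_go.get(subject, set()))
        for query, subject in best_hits(blast_tabular).items()
    }


# Hypothetical data: two fern reads searched against two reference proteins.
BLAST_EXAMPLE = (
    "read1\tprotA\t85.0\t100\t15\t0\t1\t300\t1\t100\t1e-50\t200\n"
    "read1\tprotB\t60.0\t100\t40\t0\t1\t300\t1\t100\t1e-20\t90\n"
    "read2\tprotB\t70.0\t120\t36\t0\t1\t360\t5\t124\t1e-30\t130\n"
)
GO_EXAMPLE = {"protA": {"GO:0000001"}, "protB": {"GO:0000002", "GO:0000003"}}

if __name__ == "__main__":
    for read, terms in transfer_go(BLAST_EXAMPLE, GO_EXAMPLE).items():
        print(read, terms)
```

In practice an E-value or identity cutoff would usually be applied before transferring annotations, since a weak best hit is poor evidence of shared function.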

Page 33

Hands-on practice 2: GO annotation

Page 34

http://geneontology.org/

Page 35

http://amigo1.geneontology.org/cgi-bin/amigo/blast.cgi

Get an example protein sequence file from http://cys.bios.niu.edu/yyin/teach/PBB/csl-pr.fa

Page 36

Page 37

This is easy. Now let’s try to get a list of differentially expressed genes and then find what the genes in this list have in common in terms of function.

We are going to use the NCBI GEO website to get the gene list and then feed the gene list to GO enrichment analysis tools

Page 38

Go to the NCBI home page, search GEO DataSets with the keyword “GDS4831”, and hit Search

Page 39

Choose “Compare 2 sets of samples”

Choose “Value means difference”
Choose “8+ fold”
Choose “higher”

Then go to Step 2

Select group A: three samples for COP1 depletion and Huh7 cell line

Group B: three samples for negative control and Huh7 cell line

Hit OK, and go to Step 3
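The comparison chosen here can be pictured as a filter on group means. The sketch below is an assumption about what "Value means difference", "8+ fold" and "higher" amount to (group A mean at least 8 times the group B mean); the exact calculation used by GEO may differ, and the gene names and expression values are hypothetical.

```python
from statistics import mean


def higher_in_a(profiles, fold=8.0):
    """Return gene names whose group A mean is at least `fold` times the group B mean.

    `profiles` maps a gene name to a pair (group_a_values, group_b_values).
    Values are assumed to be positive, untransformed expression values.
    """
    selected = []
    for gene, (group_a, group_b) in profiles.items():
        if mean(group_a) >= fold * mean(group_b):
            selected.append(gene)
    return selected


# Hypothetical expression values for three samples per group.
EXAMPLE_PROFILES = {
    "geneX": ([800.0, 900.0, 1000.0], [100.0, 110.0, 90.0]),  # 9-fold higher
    "geneY": ([200.0, 210.0, 190.0], [100.0, 100.0, 100.0]),  # 2-fold higher
}

if __name__ == "__main__":
    print(higher_in_a(EXAMPLE_PROFILES))
```

A means-based filter like this ignores variability within each group, which is one reason a statistical test is often preferred for real analyses.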

Page 40

A total of 256 gene profiles are found with 8+ fold higher expression in COP1 depletion than in the negative control in the Huh7 cell line

To get the list of genes, choose the Gene database and hit Find items

Page 41

A total of 225 genes correspond to the 256 gene profiles
To download the list of Gene IDs, hit Send to, choose UI List as the format, and hit Create File

A file named “gene_result.txt” will be automatically downloaded to your local computer
Find out where it was downloaded to and open it using Notepad++
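The downloaded list can also be checked with a short script before it is uploaded elsewhere. This sketch assumes the file holds one numeric Gene ID per line; the function name `read_gene_ids` and the sample numbers are placeholders for illustration.

```python
def read_gene_ids(text):
    """Return the unique numeric IDs from a one-ID-per-line list, in file order."""
    seen = set()
    gene_ids = []
    for line in text.splitlines():
        token = line.strip()
        if token.isdigit() and token not in seen:
            seen.add(token)
            gene_ids.append(token)
    return gene_ids


# Placeholder content standing in for the downloaded file.
EXAMPLE_TEXT = "101\n202\n101\n\n303\n"

if __name__ == "__main__":
    ids = read_gene_ids(EXAMPLE_TEXT)
    print(len(ids), "unique Gene IDs:", ", ".join(ids))
```

For the real file, the text would come from reading `gene_result.txt` from wherever the browser saved it.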

Page 42

View the file using Notepad++

Next we will use DAVID to perform functional enrichment analysis

Page 43

The Database for Annotation, Visualization and Integrated Discovery (DAVID)

Hit Start Analysis

Page 44

Upload the list of Gene IDs

Select ENTREZ_GENE_ID

Click on Gene List

Page 45

Check the submitted gene list

This allows you to view functional annotation from various resources including GO

Page 46

If you have clicked on the Functional Annotation tool, you are at this page

All of these can be changed by users (whether to show them, and what to show)

Uncheck this

Page 47

Select just GO

Clicking here will open a new window showing which GO terms the 225 differentially expressed genes are enriched in

Page 48

Which GO categories are the genes enriched in (compared to the genome background)?

Page 49

Next lecture: EBI web resources II (Ensembl and InterPro)