EBI is an Outstation of the European Molecular Biology Laboratory. Amaia Sangrador, InterPro curator, amaia@ebi.ac.uk. Introduction to InterPro.

Transcript
Page 1: EBI is an Outstation of the European Molecular Biology Laboratory. Amaia Sangrador InterPro curator amaia@ebi.ac.uk Introduction to InterPro.

EBI is an Outstation of the European Molecular Biology Laboratory.

Amaia Sangrador, InterPro curator, amaia@ebi.ac.uk

Introduction to InterPro

Page 2:

What is InterPro?

DIAGNOSTICS RESOURCE:

• InterPro uses signatures from several different databases (referred to as member databases) to predict information about proteins

• Provides functional analysis of proteins by classifying them into families and predicting domains and important sites

• Adds information about the signatures and the types of proteins they match

Page 3:

InterPro Consortium

Consortium of 11 major signature databases

Page 4:

Why do we need predictive annotation tools?

Page 5:

What is UniProt?

Based on the original work on PIR, Swiss-Prot and TrEMBL

Collaboration between EBI, SIB and PIR

The mission of UniProt is to provide the scientific community with a comprehensive, high-quality and freely accessible resource of protein sequence and functional information.

Page 6:

UniProtKB: Protein knowledgebase

• UniProtKB/Swiss-Prot: Reviewed. High-quality manual annotation

• UniProtKB/TrEMBL: Unreviewed. Automatic annotation

UniRef: Sequence clusters

• UniRef100

• UniRef90

• UniRef50

UniParc: Sequence archive. Current and obsolete sequences

UniMES: Metagenomic and environmental sample sequences

Sources: EMBL/GenBank/DDBJ, Ensembl, RefSeq, PDB, other resources

Page 7:

Annotation using InterPro

[Diagram labels: Swiss-Prot (manually annotated sequence); groups of related proteins (same family or share domains); protein signatures; InterPro; automatic annotation pipeline; TrEMBL (uncharacterised sequence)]

Page 8:

Protein family classification

• Given a set of sequences, we usually want to know:

– what are these proteins; to what family do they belong?

– what is their function; how can we explain this in structural terms?

Page 9:

Protein family classification: BLAST (pairwise comparisons)

Page 10:

Protein family classification: BLAST

Page 11:

Limitations with Pairwise comparisons

• BLAST alignment of 2 proteins: 60S acidic ribosomal protein P0 from 2 species

Page 12:

Limitations with Pairwise comparisons

Page 13:

Protein family classification: signature databases

• Alternatively, we can seek ‘patterns’ that will allow us to infer relationships with previously-characterised sequences

• This is the approach taken by ‘signature’ databases

Page 14:

Protein signatures

• More sensitive homology searches

• Each member database creates signatures using different methods and methodologies:

– manually-created sequence alignments

– automatic processes with some human input and correction

– entirely automatically

Page 15:

What are protein signatures?

[Diagram labels: Protein family/domain; Multiple sequence alignment; Build model; Search; UniProt; Significant match; Mature model; Protein analysis]

Example aligned sequences:

ITWKGPVCGLDGKTYRNECALL

AVPRSPVCGSDDVTYANECELK

Page 16:

Member databases

Hidden Markov Models, Fingerprints

Profiles, Patterns, Sequence Clusters

Structural Domains

Functional annotation of families/domains

Prediction of conserved domains

Protein features (active sites…)

METHODS

Page 17:

Diagnostic approaches (sequence-based)

• Single motif methods: Regex patterns (PROSITE)

• Multiple motif methods: Identity matrices (PRINTS)

• Full domain alignment methods: Profiles (Profile Library), HMMs (Pfam)

Page 18:

Patterns

Sequence alignment

Define pattern: Motif

Extract pattern sequences: xxxxxxxxxxxxxxxxxxxxxxxx

Build regular expression: C-C-{P}-x(2)-C-[STDNEKPI]-x(3)-[LIVMFS]-x(3)-C

Pattern signature: PS00000

Page 19:

Patterns

Patterns are mostly directed against functional residues: active sites, PTM, disulfide bridges, binding sites

Advantages

• Anchoring the match to the extremity of a sequence, e.g. <M-R-[DE]-x(2,4)-[ALT]-{AM}

• Some amino acids can be forbidden at specific positions, which can help to distinguish closely related subfamilies

• Short motif handling: a pattern with very little variability and with forbidden positions can produce significant matches, e.g. conotoxins (very short toxins with few conserved cysteines): C-{C}(6)-C-{C}(5)-C-C-x(1,3)-C-C-x(2,4)-C-x(3,10)-C

Drawbacks

• Simple but less powerful

Page 20:

Prosite patterns

EXAMPLE: PS00296; Chaperonins cpn60 signature (PATTERN)

Regular expression: A-[AS]-{L}-[DEQ]-E-{A}-{Q}-{R}-x-G(2)-[GA]

Pattern/motif in sequence: AAVEEGILPGGG

>sp|P29197|CH60A_ARATH Chaperonin CPN60, mitochondrial OS=Arabidopsis thaliana
MYRFASNLASKARIAQNARQVSSRMSWSRNYAAKEIKFGVEARALMLKGVEDLADAVKVT
MGPKGRNVVIEQSWGAPKVTKDGVTVAKSIEFKDKIKNVGASLVKQVANATNDVAGDGTT
CATVLTRAIFAEGCKSVAAGMNAMDLRRGISMAVDAVVTNLKSKARMISTSEEIAQVGTI
SANGEREIGELIAKAMEKVGKEGVITIQDGKTLFNELEVVEGMKLDRGYTSPYFITNQKT
QKCELDDPLILIHEKKISSINSIVKVLELALKRQRPLLIVSEDVESDALATLILNKLRAG
IKVCAIKAPGFGENRKANLQDLAALTGGEVITDELGMNLEKVDLSMLGTCKKVTVSKDDT
VILDGAGDKKGIEERCEQIRSAIELSTSDYDKEKLQERLAKLSGGVAVLKIGGASEAEVG
EKKDRVTDALNATKAAVEEGILPGGGVALLYAARELEKLPTANFDQKIGVQIIQNALKTP
VYTIASNAGVEGAVIVGKLLEQDNPDLGYDAAKGEYVDMVKAGIIDPLKVIRTALVDAAS
VSSLLTTTEAVVVDLPKDESESGAAGAGMGGMGGMDY
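The pattern syntax on these slides maps fairly directly onto ordinary regular expressions. The following is a minimal Python sketch of that mapping, assuming only the syntax elements that appear in the examples here (residues, x, [allowed], {forbidden}, repeat counts and the < and > anchors). The function name prosite_to_regex is illustrative and is not part of any PROSITE tool; the test fragment is a short stretch of the CH60A_ARATH sequence around the match.

```python
import re

# One pattern element: a residue (or x), an allowed set [..] or a forbidden
# set {..}, optionally followed by a repeat count (n) or a range (n,m).
_ELEMENT = re.compile(r"([A-Zx]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression."""
    regex_parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        element = element.strip()
        prefix = suffix = ""
        if element.startswith("<"):      # anchor to the start of the sequence
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # anchor to the end of the sequence
            suffix, element = "$", element[:-1]
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residues, low, high = match.groups()
        if residues == "x":              # any amino acid
            core = "."
        elif residues.startswith("{"):   # forbidden amino acids
            core = "[^" + residues[1:-1] + "]"
        else:                            # single residue or allowed set
            core = residues
        if low:                          # repeat count or range
            core += "{" + low + ("," + high if high else "") + "}"
        regex_parts.append(prefix + core + suffix)
    return "".join(regex_parts)


cpn60 = prosite_to_regex("A-[AS]-{L}-[DEQ]-E-{A}-{Q}-{R}-x-G(2)-[GA]")
fragment = "DALNATKAAVEEGILPGGGVALLYA"  # excerpt of the sequence on this slide
hit = re.search(cpn60, fragment)
print(cpn60)
print(hit.group() if hit else "no match")
```

A plain regular expression search like this reports only yes/no matches, which is one way to see why patterns are described as simple but less powerful than profile-based methods.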

Page 21:

Fingerprints

Sequence alignment

Define motifs: Motif 1, Motif 2, Motif 3

Extract motif sequences:

xxxxxxxxxxxxxxxxxxxxxxxx

xxxxxxxxxxxxxxxxxxxxxxxx

xxxxxxxxxxxxxxxxxxxxxxxx

Weight matrices

Fingerprint signature: motifs 1, 2, 3 in correct order with correct spacing

PR00000
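To make the "weight matrices" step more concrete, here is a minimal Python sketch that turns an ungapped set of motif sequences into a per-column residue frequency matrix and then slides it along a sequence. This is a simplification for illustration only: the function names are hypothetical, the two-sequence motif is toy data built from the example sequences shown earlier in the deck, and the actual PRINTS scoring scheme is not reproduced.

```python
from collections import Counter


def frequency_matrix(motif_seqs):
    """Per-column residue frequencies for an ungapped motif alignment."""
    length = len(motif_seqs[0])
    if any(len(seq) != length for seq in motif_seqs):
        raise ValueError("all motif sequences must have the same length")
    matrix = []
    for column in zip(*motif_seqs):
        counts = Counter(column)
        matrix.append({aa: n / len(motif_seqs) for aa, n in counts.items()})
    return matrix


def best_window(matrix, sequence):
    """Return (score, start) for the best-scoring window of the sequence.

    The score of a window is the sum of the column frequencies of its residues.
    """
    width = len(matrix)
    best = (float("-inf"), -1)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(col.get(aa, 0.0) for col, aa in zip(matrix, window))
        best = max(best, (score, start))
    return best


# Toy motif: the same conserved stretch taken from two aligned sequences.
motif = frequency_matrix(["PVCGL", "PVCGS"])
score, start = best_window(motif, "ITWKGPVCGLDGKTYRNECALL")
print(score, start)
```

Unlike a regular expression, a matrix of this kind gives a graded score, so a residue that is rare in the motif lowers the score instead of ruling the match out.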

Page 22:

The significance of motif context

Motif context: order and interval

• Identify small conserved regions in proteins

• Several motifs characterise a family

• Offer improved diagnostic reliability over single motifs by virtue of the biological context provided by motif neighbours
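The role of motif context (order and interval) can be sketched as a simple check on where each motif was found. The Python below is an assumption-laden illustration, not the PRINTS algorithm: the function name, the (start, end) hit representation and the maximum-gap thresholds are all hypothetical, and real fingerprint searches also report partial matches where some motifs are missing.

```python
from typing import Optional, Sequence, Tuple

Hit = Tuple[int, int]  # (start, end) of one motif match, 0-based, end exclusive


def check_fingerprint(hits: Sequence[Optional[Hit]], max_gaps: Sequence[int]) -> bool:
    """True when every motif is found, in order, and within the allowed spacing.

    hits[i] is the match for motif i+1 (None if the motif was not found);
    max_gaps[i] is the largest allowed gap between motif i+1 and motif i+2.
    """
    if len(max_gaps) != len(hits) - 1:
        raise ValueError("need one maximum gap per pair of neighbouring motifs")
    if any(hit is None for hit in hits):
        return False
    for previous, following, max_gap in zip(hits, hits[1:], max_gaps):
        gap = following[0] - previous[1]
        if gap < 0 or gap > max_gap:   # wrong order/overlap, or too far apart
            return False
    return True


# Three motifs in the expected order, with gaps of 15 and 30 residues.
print(check_fingerprint([(10, 20), (35, 50), (80, 95)], max_gaps=[30, 40]))
```

A single short motif can match by chance; requiring its neighbours to appear in the right order and at a plausible distance is what gives the fingerprint its extra reliability.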

Page 23:

PRINTS families are hierarchical. Different motifs describe subfamilies.

G protein-coupled receptors

– rhodopsin-like

    – adenosine receptors

    – opsin receptors

    – dopamine receptors

    – somatostatin receptors

        – somatostatin receptor type 1

        – somatostatin receptor type 2

        – somatostatin receptor type 3

        – etc

    – histamine receptors

    – etc

– secretin-like

– cAMP receptors

– metabotropic glutamate receptors

– etc

Page 24:

Profiles & HMMs

Sequence alignment

Define coverage: entire domain or whole protein

Use entire alignment for domain or protein

Build model: models insertions and deletions

Profile or HMM signature

Page 25:

Hidden Markov Models (HMM)

Models insertions and deletions

More flexible (can use partial alignments)

Profiles

Built using weight matrices

More sophisticated algorithm

Page 26:

PROSITE and HAMAP profiles: a functional annotation perspective

• PROSITE domains: high quality manually curated seeds (using biologically characterized UniProtKB/Swiss-Prot entries), documentation and annotation rules. Oriented toward functional domain discrimination.

• HAMAP families: manually curated bacterial, archaeal and plastid protein families (represented by profiles and associated rules), covering some highly conserved proteins and functions.

Page 27:

HMM databases

Sequence-based

• PIR SUPERFAMILY: families/subfamilies reflect the evolutionary relationship

• PANTHER: families/subfamilies model the divergence of specific functions

• TIGRFAM: microbial functional family classification

• PFAM: families & domains based on conserved sequence

• SMART: functional domain annotation

Structure-based

• SUPERFAMILY: models correspond to SCOP domains

• GENE3D: models correspond to CATH domains

Page 28:

Why we created InterPro

By uniting the member databases, InterPro capitalises on their individual strengths, producing a powerful diagnostic tool & integrated database

– to simplify & rationalise protein analysis

– to facilitate automatic functional annotation of uncharacterised proteins

– to provide concise information about the signatures and the proteins they match, including consistent names, abstracts (with links to original publications), GO terms and cross-references to other databases

Page 29:

InterPro entry

Page 30:

InterPro entry

Page 31:

The InterPro entry: types

Family: Proteins share a common evolutionary origin, as reflected in their related functions, sequences or structure

Domain: Distinct functional, structural or sequence units that may exist in a variety of biological contexts

Repeats: Short sequences typically repeated within a protein

Sites: PTM, Active Site, Binding Site, Conserved Site

Page 32:

InterPro Entry

Adds extensive annotation

Links to other databases

Structural information and viewers

Groups similar signatures together

Quality control

Removes redundancy

Page 33:

InterPro Entry

Adds extensive annotation

Links to other databases

Structural information and viewers

Groups similar signatures together

Hierarchical classification

Page 34:

InterPro hierarchies: Families

FAMILIES can have parent/child relationships with other Families

Parent/Child relationships are based on:

• Comparison of protein hits

– child should be a subset of parent

– siblings should not have matches in common

• Existing hierarchies in member databases

• Biological knowledge of curators
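The two protein-hit rules above (a child should be a subset of its parent, and siblings should not have matches in common) can be expressed as set operations. The following Python sketch assumes each entry is represented simply by the set of protein accessions it matches; the function names and the accessions P1 to P6 are made up for illustration, and in practice curators also weigh member database hierarchies and biological knowledge.

```python
from itertools import combinations


def child_is_subset(parent_hits, child_hits):
    """Rule 1: every protein matched by the child is also matched by the parent."""
    return set(child_hits) <= set(parent_hits)


def overlapping_siblings(sibling_hits):
    """Rule 2: return sibling pairs that share proteins (ideally an empty list)."""
    clashes = []
    for (name_a, hits_a), (name_b, hits_b) in combinations(sorted(sibling_hits.items()), 2):
        shared = set(hits_a) & set(hits_b)
        if shared:
            clashes.append((name_a, name_b, sorted(shared)))
    return clashes


# Toy data: hypothetical accessions, not real InterPro matches.
parent = {"P1", "P2", "P3", "P4", "P5"}
children = {
    "child_A": {"P1", "P2"},
    "child_B": {"P3", "P4"},
    "child_C": {"P4", "P6"},   # breaks both rules: P6 not in parent, P4 shared
}

for name, hits in sorted(children.items()):
    print(name, child_is_subset(parent, hits))
print(overlapping_siblings(children))
```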

Page 35:

InterPro hierarchies: Domains

DOMAINS can have parent/child relationships with other domains

Page 36:

Domains and Families may be linked through Domain Organisation Hierarchy

Page 37:

InterPro Entry

Adds extensive annotation

Links to other databases

Structural information and viewers

Groups similar signatures together

Page 38:

InterPro Entry

Adds extensive annotation

Links to other databases

Structural information and viewers

Groups similar signatures together

The Gene Ontology project provides a controlled vocabulary of terms for describing gene product characteristics

Page 39:

InterPro Entry

Adds extensive annotation

Links to other databases

Structural information and viewers

Groups similar signatures together

UniProt

KEGG ... Reactome ... IntAct ...

UniProt taxonomy

PANDIT ... MEROPS ... Pfam clans ...

PubMed

Page 40:

InterPro Entry

Adds extensive annotation

Links to other databases

Structural information and viewers

Groups similar signatures together

PDB 3-D Structures

SCOP Structural domains

CATH Structural domain classification

Page 41:

Understanding signatures:

Page 42:

Non-overlapping signatures can be describing the same thing

Not always possible to use signature overlap to determine how family signatures are related

PF03157: 336 protein hits

PR00210: 331 protein hits

Two very different signatures both describing the same thing!

e.g. High molecular weight glutenins
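When two signatures do not overlap in sequence position, one way to ask whether they describe the same thing is to compare the sets of proteins they hit. The Python sketch below uses the Jaccard index for this; it is an illustrative assumption, not a description of how InterPro curators actually decide. The slide gives only the hit counts (336 and 331), not the shared proteins, so the second function computes just the highest overlap those counts would allow.

```python
def jaccard(hits_a, hits_b):
    """Shared proteins divided by all proteins matched by either signature."""
    a, b = set(hits_a), set(hits_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def max_jaccard_from_counts(n_a, n_b):
    """Upper bound on the Jaccard index when only the two hit counts are known.

    The bound is reached when the smaller hit set lies entirely inside the larger.
    """
    if max(n_a, n_b) == 0:
        return 0.0
    return min(n_a, n_b) / max(n_a, n_b)


# Toy accessions (hypothetical): two of four proteins are shared.
print(jaccard({"A", "B", "C"}, {"B", "C", "D"}))

# Counts from the slide: PF03157 has 336 protein hits, PR00210 has 331.
print(round(max_jaccard_from_counts(336, 331), 3))
```

A value close to 1 would be consistent with both signatures describing the same family, even though their matched regions differ.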

Page 43:

Some signatures give us similar, but complementary information

PFAM shows domain is composed of two types of repeated sequence motifs

SUPERFAMILY shows the potential domain boundaries

www.ebi.ac.uk/interpro

Page 44:

Discontinuous Signatures Require Interpretation

1) Signature method

2) Duplicated domains

3) Repeated elements

4) Non-contiguous domains

Page 45:

Discontinuous Signatures Require Interpretation

1) Signature method

• e.g. PRINTS – discrete motifs

2) Duplicated domains

3) Repeated elements

4) Non-contiguous domains

Page 46:

Discontinuous Signatures Require Interpretation

1) Signature method

2) Duplicated domains

• e.g. SSF - duplication consisting of 2 domains with same fold

3) Repeated elements

4) Non-contiguous domains

Page 47:

Discontinuous Signatures Require Interpretation

1) Signature method

2) Duplicated domains

3) Repeated elements

• e.g. Kringle, WD40

4) Non-contiguous domains

Page 48:

Discontinuous Signatures Require Interpretation

1) Signature method

2) Duplicated domains

3) Repeats

4) Non-contiguous domains

• Structural domains can consist of non-contiguous sequence

Page 49:

Discontinuous Signatures Require Interpretation

1) Signature method

2) Duplicated domains

3) Repeats

4) Non-contiguous domains

Page 50:

Searching InterPro:

Page 51:

WHEN TO USE INTERPRO

Use InterPro to predict family, domain or active site information for a given protein or amino acid sequence.

You can search InterPro if you have

• a protein sequence

• a UniProtKB protein identifier

• a Gene Ontology term

• a protein structure code

• a general search term (keyword or short phrase)

and require further information regarding your protein of interest.

Page 52:

http://www.ebi.ac.uk/interpro/

Search tools include:

• Text Search

• InterProScan (sequence search)

• BioMart (builds queries)

Beta version: http://wwwdev.ebi.ac.uk/interpro/

Page 53:

InterPro Search

wwwdev.ebi.ac.uk/interpro

Search using:

• text

• protein ID

• InterPro ID

• GO term (ID: GO:0006915, Name: apoptosis)

Page 54:

InterPro Search

Search results for GO:0006915 (apoptosis)

Page 55:

InterPro Search

wwwdev.ebi.ac.uk/interpro

protein ID

Page 56:

InterPro Search Results

Structural data

Link to PDBe

Unintegrated signatures

Domains and sites

Family

Page 57:

Structural information

CATH and SCOP divide PDB structures into domains

Swiss-Model and ModBase can predict structure for regions not covered by PDB

Note that one domain is discontiguous

Page 58:

Searching InterPro:

InterProScan

Page 59:

InterProScan – Searching New Sequence

wwwdev.ebi.ac.uk/interpro

Paste in unknown sequence

Additional options

Page 60:

InterProScan New Search Results

Links to signature databases

Link to InterPro entry

Page 61:

Searching InterPro:

BioMart

Page 62:

BioMart Search

BioMart allows more powerful and flexible queries

• Large volumes of data can be queried efficiently

• The interface is shared with many other bioinformatics resources

• It allows federation with other databases

– PRIDE (mass spectrometry-derived proteins and peptides)

– REACTOME (biological pathways)

Page 63:

BioMart Search

1) Choose Dataset

a. Choose InterPro BioMart

Page 64:

BioMart Search

1) Choose Dataset

a. Choose InterPro BioMart

b. Choose InterPro entries or protein matches

Page 65:

BioMart Search

2) Choose Filters: search specific entries, signatures or proteins

Page 66:

BioMart Search

2) Choose Filters, e.g. filter by specific proteins

Page 67:

BioMart Search

3) Choose Attributes: what results you want

Page 68:

BioMart Search

4) Choose additional Dataset (optional): this is where you link results to PRIDE and Reactome

Page 69:

BioMart Search Results

User manual

HTML = web-formatted table

CSV = comma-separated values

TSV = tab-separated values

XLS = Excel spreadsheet

Click to view results

Page 70:

InterPro – the numbers

Our member databases all have their particular niche or focus... but InterPro is a combination of all their areas of expertise!

• InterPro 32.0: 21516 entries

101175 signatures covering 85.5% of UniProtKB

• Frequent releases – both protein and method updates

• 45 000 unique visitors per month

• The database has grown almost 10-fold in ~11 years
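As a rough worked example of the last figure, "almost 10-fold in ~11 years" can be turned into an implied average yearly growth rate. The Python sketch below assumes steady compound growth, which the slide does not claim; the function name is illustrative and the inputs are the rounded numbers from the bullet.

```python
def annual_growth_factor(fold_change: float, years: float) -> float:
    """Constant yearly growth factor that yields the given fold change."""
    return fold_change ** (1.0 / years)


factor = annual_growth_factor(10, 11)
print(f"about {(factor - 1) * 100:.0f}% growth per year")
```

Under that assumption the database would have grown by roughly 23% per year on average.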

Page 71:

Caveats

InterPro is a predictive protein signature database. Small changes with a large impact may not be well represented.

• for example, inactive peptidases, such as Q8N3Z0, Q9W3H0

InterPro entries are based on signatures supplied to us by our member databases

• ...this means no signature, no entry!

We need your feedback!

• missing/additional references

• reporting problems

• requests

EBI support page.

Page 72:

Acknowledgements

InterPro Team:

Amaia Sangrador

David Lonsdale

Craig McAnulla

Matthew Fraser

Anthony Quinn

Maxim Scheremetjew

Phil Jones

Siew-Yit Yong

Alex Mitchell

Sebastien Pesseat

Prudence Mutowo

Sarah Hunter

Christopher Hunter