EBI patent related services Non-redundant patent sequences enriched. Sequence archives ENA SVA &...

Date posted: 21-May-2020

  • EBI is an Outstation of the European Molecular Biology Laboratory.

    EBI patent related services

    Jennifer McDowall Senior Scientist, EMBL-EBI

    4th Annual Forum for SMEs

    October 18-19th 2010

  • Overview

    • Patent sequence data

    • Sequence archives

    • Sequence searches

    www.ebi.ac.uk

  • Sequence data from patent literature

    Diagram labels: GenBank, ENA, DDBJ (sequence databases); EPO, USPTO, JPO (patent offices)

    September 2010: nucleotide > 17.5m sequences; protein > 4.9m sequences

    EPO policy: data are released to the public (and to EMBL) 18 months after the patent application date, regardless of whether the patent has been granted.
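
As a rough illustration of the 18-month rule on this slide, the sketch below computes the earliest date on which a sequence could become public. The helper names `add_months` and `earliest_public_date` are hypothetical, and clamping to the last day of a shorter month is an assumption made here, not something the slides specify.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Return d shifted forward by a number of calendar months.

    Assumption: if the target month is shorter, the day is clamped
    to that month's last day (e.g. 31 Aug + 18 months -> 28 Feb).
    """
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def earliest_public_date(application_date: date) -> date:
    """Apply the 18-month release rule stated on the slide."""
    return add_months(application_date, 18)


# Illustrative application date, not taken from any real patent.
example = earliest_public_date(date(2009, 4, 15))
```

Here `example` is 15 October 2010, whether or not the patent has been granted by then.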

  • Patent sequence records

    • European Nucleotide Archive (ENA, formerly EMBL-Bank)

    • Universal Protein Resource (UniProt)

    • Non-redundant patent sequence databases

  • ENA

    • ENA = ENA-Annotation (the old EMBL-Bank) + raw data archives (Trace Archive, Sequence Read Archive)

    • ENA-Annotation: >124m sequences

    • Includes patent class (PAT): EPO, USPTO, JPO, KIPO

    • Dates include: date sequence went public, date of last revision

  • Patent sequence record in ENA

    Graphical viewer

    Sequence

    Patent reference

    Navigate to related data, e.g. Version archive

    Navigate to external data sources, e.g. UniProt

    Download data

    DNA source

    Dates (first public and last updated)

    Sequence version

  • UniProt

    • Composed of 4 sections

      • UniParc: non-redundant archive

      • UniProtKB: Swiss-Prot / TrEMBL

      • UniMES: metagenomic

      • UniRef: sequence clusters

    • UniParc: >23m sequences

    • Includes patent class (PRT): EPO, USPTO, JPO, KIPO

    • Dates include: date sequence went public, date of last revision

  • Patent sequence record in UniProt

    Sequence

    Navigate to individual entries

    Download data

    REMTREMBL (deprecated database)

    Accession

    List of databases containing sequence

  • Non-redundant patent databases

    ENA (redundant) → remove sequence redundancy → Level-1 NR

    Level-1 NR → remove patent family redundancy → Level-2 NR

    Additional annotation, including priority dates for patent family
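
The two redundancy-removal steps on this slide can be sketched roughly as follows. This is a minimal illustration, not the EBI pipeline: the `PatentSeq` record, its `family_id` field, the accessions, and the rule "identical sequence after upper-casing" are all assumptions introduced for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PatentSeq:
    accession: str      # hypothetical record identifier
    patent_number: str  # placeholder publication number
    family_id: str      # hypothetical patent family key
    sequence: str


def level1_nr(records):
    """Level 1: group records that carry an identical sequence."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec.sequence.upper()].append(rec)
    return dict(groups)


def level2_nr(level1):
    """Level 2: within each sequence group, keep the first record
    seen for each patent family."""
    result = {}
    for seq, recs in level1.items():
        first_per_family = {}
        for rec in recs:
            first_per_family.setdefault(rec.family_id, rec)
        result[seq] = list(first_per_family.values())
    return result


# Made-up records: three share a sequence, two of those share a family.
records = [
    PatentSeq("A1", "EP0000001", "F1", "acgt"),
    PatentSeq("A2", "US0000002", "F1", "ACGT"),
    PatentSeq("A3", "JP0000003", "F2", "ACGT"),
    PatentSeq("A4", "EP0000004", "F3", "TTTT"),
]
l1 = level1_nr(records)
l2 = level2_nr(l1)
```

With these records, level 1 collapses four entries into two distinct sequences, and level 2 reduces the `ACGT` group from three records to two, one per patent family.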

  • Bulk Downloads

    http://www.ebi.ac.uk/patentdata/

    Patent proteins

    Patent nucleotides

    Non-redundant sequences

  • Sequence archives

    • ENA nucleotide sequence version archive (SVA): www.ebi.ac.uk/embl/sva

    • UniSave, the UniProt sequence/annotation version archive: www.ebi.ac.uk/uniprot/unisave

    Search by date → get specific record

    Search by accession only → get all records
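
The two lookup modes on this slide can be sketched with a small in-memory version history. This is an assumption-laden illustration rather than the SVA or UniSave interface: the `ARCHIVE` structure, the accession `X00001`, and the dates are made up for the example.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical archive: accession -> [(date version went public, version)],
# sorted by date. Real archives store full entry text per version.
ARCHIVE = {
    "X00001": [
        (date(2005, 1, 10), 1),
        (date(2007, 6, 2), 2),
        (date(2010, 3, 15), 3),
    ],
}


def all_versions(accession):
    """Search by accession only: return every archived version."""
    return [version for _, version in ARCHIVE.get(accession, [])]


def version_on(accession, when):
    """Search by date: return the version that was current on that
    date, or None if the entry was not yet public."""
    history = ARCHIVE.get(accession, [])
    dates = [d for d, _ in history]
    i = bisect_right(dates, when)
    return history[i - 1][1] if i else None
```

For instance, `version_on("X00001", date(2008, 1, 1))` picks version 2, the record that was current on that day, while `all_versions("X00001")` lists all three.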

  • Provides complete version list

    View old entries

    Compare different versions

  • View old entries

  • Compare different versions

  • EB-eye: text search

    Fast, easy to use

    Search for patent WO0146262

    Lists sequences associated with WO0146262

    Lists all entries associated with WO0146262

  • SRS: advanced text search

    For more complex searches

    http://srs.ebi.ac.uk/

    Select resources to search (Patent literature, Patent DNA, Patent proteins), then create query

  • Sequence Similarity & Analysis

    Search for patent sequence

    BLAST

    FASTA

    Iterative searches

    Fragment searches

  • FASTA nucleotide patent search

    Search ENA patent class or non-redundant patent datasets

  • FASTA protein patent search

    Search individual patent offices or non-redundant patent datasets

  • Results: patent protein v UniProt

    Provide UniProt records

    Provide additional annotation

  • Additional annotation (protein searches)

    Nucleotide sequences

    Structures

    GO mapping

    Literature

    Genome information

    Domain/family classification

    Reactions & pathways

    Chemical information

    Gene expression

    Enzyme data

    Molecular interactions

  • Functional predictions (protein searches)

    • Visual comparison

    • InterPro classification

    • Helps identify mis- or partial matches

  • Functional predictions (protein searches)

    Hits by sequence identity and signature matches:

    • 34% ID: family signature, 3 domain signatures

    • 28% ID: 1 domain signature

    • 24% ID: no signatures

    • 100% ID: family signature, 4 domain signatures

    Prioritize results

    Extract information

    Presenter notes: Compare InterPro coverage to find mis- or partial matches
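
One way to picture the prioritization described on this slide is to rank hits by how many of the query's signatures they share. The sketch below is illustrative only: the `Hit` class, the `coverage` measure, and the signature labels (`FAM`, `D1` to `D4`) are hypothetical placeholders, and it assumes the query carries the same signatures as the 100% identity hit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    name: str
    identity: float        # percent identity from the similarity search
    signatures: frozenset  # placeholder signature labels


def coverage(query_sigs, hit):
    """Fraction of the query's signatures that the hit also carries."""
    if not query_sigs:
        return 0.0
    return len(query_sigs & hit.signatures) / len(query_sigs)


def prioritize(query_sigs, hits):
    """Rank by signature coverage first, then by percent identity."""
    return sorted(
        hits,
        key=lambda h: (coverage(query_sigs, h), h.identity),
        reverse=True,
    )


# Assumed query signatures: one family and four domain signatures.
query = frozenset({"FAM", "D1", "D2", "D3", "D4"})
hits = [
    Hit("hit_34", 34.0, frozenset({"FAM", "D1", "D2", "D3"})),
    Hit("hit_28", 28.0, frozenset({"D1"})),
    Hit("hit_24", 24.0, frozenset()),
    Hit("hit_100", 100.0, query),
]
ranked = [h.name for h in prioritize(query, hits)]
```

A hit with low coverage, such as the 28% identity hit sharing a single domain signature, is a candidate partial match, and one with no shared signatures is a candidate mis-match.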

  • Summary

    • Comprehensive sequence databases

      • ENA & UniParc (PAT / PRT class data)

      • Non-redundant patent sequences: enriched

    • Sequence archives

      • ENA SVA & UniSave: track changes

    • Broad patent sequence coverage

      • Protein/nucleotides: EPO, USPTO, JPO, KIPO

    • Multiple search engines

      • EB-eye text search: >40 databases

      • SRS advanced text searching: >100 databases

      • Multiple sequence search tools: annotation-enhanced

  • Contacts: http://www.ebi.ac.uk/support/

    04.09.08

    QUESTIONS?
