EBRCN General Meeting, Paris, 28-29/11/2002 1
WP4: Analysis of non-EBRCN databases and
network services of interest to BRCs
Current status
Paolo Romano
WP4: databases of interest
Short delay: approx. 1 month
· Definition of a list of databases and services that could be of interest to BRCs: done
· Selection of a subset of those databases and services: done
WP4: identifiers and methods
· Selection of information of interest to BRCs within selected databases: ongoing, done for Medline & EMBL
· Analysis of identifiers and information and of methods for linking: ongoing, done for Medline
WP4: PubMed IDs
· CABRI catalogue production guidelines update: ongoing, done for Literature in animal and human cells
· Retrieval of needed PubMed IDs for linking: ongoing, done for ICLC, BCCM/LMBP, NCCB plasmids; support from DSMZ (Kracht) and BCCM (Guissart)
WP4: structure and syntax
· Catalogue structures update: ongoing, done for Literature in animal and human cells
· SRS structure and syntax files: ongoing, depending on deadlines for submission of catalogues, done for ICLC
WP4: catalogue updates
Catalogue updates:
ICLC: November 2002 (done)
Plasmids and cell lines: January 2003
“Other catalogues”: February 2003
Bacteria: March 2003
Fungi and Yeasts: May 2003
WP4: EMBL links
• EMBL Data Library is the European database for DNA sequences
• It is updated daily, and coordination with NCBI and DDBJ ensures its completeness
• It is offered at EBI by means of SRS
WP4: EMBL links
• Tests have been conducted to identify how to link to the EMBL Data Library through SRS, without IDs
• Tests performed on:
  • Bacteria and Archaea
  • Animal and Human Cell Lines
  • Fungi and Yeasts
  • Plasmids
  • Viruses
WP4: EMBL links variability
• Links are different for different materials
• Links can use various EMBL fields:
  • All-text (not very useful)
  • Organism (for micro-organisms)
  • Division (useful for viruses and plasmids)
  • Feature Table data (allows for a correct definition of a source through Key, Qualifier, Description)
WP4: EMBL links variability
• Example search: CBS 100.20 in CBS_FIL
• Fields and values:
  • Organism: fungi
  • Ft-Key: source
  • Ft-Qualifier: strain
  • Ft-Description: "cbs 100.20"
WP4: EMBL links variability
• Annotation problems:
  • CBS 100.20 can be annotated as CBS 100.20 or CBS100.20
  • CBS 112345 can be annotated as CBS12345
• Indexing problems:
  • CBS 100.20 is indexed as CBS, 100 and 20
  • The dot is not included and is used as a space
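The indexing behaviour described above can be illustrated with a small Python sketch. It is not the SRS indexer: the function names are hypothetical, and the rule that every non-alphanumeric character acts as a separator is an assumption generalised from the statement that the dot is used as a space.

```python
import re


def index_tokens(designation: str) -> list[str]:
    """Split a strain designation into lowercase index tokens.

    Assumption: any non-alphanumeric character (space, dot, ...) separates
    tokens, which mimics the behaviour reported for the EMBL index.
    """
    return [t for t in re.split(r"[^0-9a-z]+", designation.lower()) if t]


def annotation_variants(acronym: str, number: str) -> list[str]:
    """Return the two spellings seen in annotations: with and without a space."""
    return [f"{acronym} {number}", f"{acronym}{number}"]


for spelling in annotation_variants("CBS", "100.20"):
    print(spelling, "->", index_tokens(spelling))
# CBS 100.20 -> ['cbs', '100', '20']
# CBS100.20 -> ['cbs100', '20']
```

The two spellings yield different token lists, so a query that should match both annotations has to cover both forms.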
WP4: EMBL links variability
Examples of searches:
• Query: Bacteria & source & cip*
  ( ([emblrelease-FtKey:source] & [emblrelease-FtQualifier:strain] & [emblrelease-FtDescription:cip*]) < [emblrelease-Organism:bacteria*] )
• Query: Cell line & source & dsm*
  ( ([emblrelease-FtKey:source] & [emblrelease-FtQualifier:cell_line] & [emblrelease-FtDescription:dsm*]) < [emblrelease-Organism:mammalia*] )
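Both example queries share the same shape, so they can be generated from three values. The following Python sketch only assembles the query string in the syntax shown on this slide; the function name and parameters are hypothetical, and nothing is sent to an SRS server.

```python
def source_query(qualifier: str, description: str, organism: str,
                 databank: str = "emblrelease") -> str:
    """Assemble a query string for a 'source' feature, limited by organism.

    The field names (FtKey, FtQualifier, FtDescription, Organism) and the
    operators (&, <) are copied from the example queries; the helper itself
    is illustrative only.
    """
    def term(field: str, value: str) -> str:
        return f"[{databank}-{field}:{value}]"

    feature = " & ".join([
        term("FtKey", "source"),
        term("FtQualifier", qualifier),
        term("FtDescription", description),
    ])
    return f"( ({feature}) < {term('Organism', organism)} )"


print(source_query("strain", "cip*", "bacteria*"))
print(source_query("cell_line", "dsm*", "mammalia*"))
```

Keeping the databank name as a parameter is a design choice of this sketch, so that the same template could be reused for a different SRS library name.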
WP4: EMBL links variability
Example of search:
• Query: Fungi & source & cbs 100.20
  ( ( ([emblrelease-FtKey:source] & [emblrelease-FtQualifier:strain] & ( ( [emblrelease-FtDescription:cbs] & [emblrelease-FtDescription:100] ) | [emblrelease-FtDescription:cbs100] ) & [emblrelease-FtDescription:20]) ) < [emblrelease-Organism:fungi*] )
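The FtDescription part of this query covers both annotation spellings (acronym and number separated, or fused). A minimal Python sketch of how that clause could be derived from an acronym and a number follows; the function name is hypothetical, and splitting the number on every non-alphanumeric character is an assumption based on the indexing behaviour reported earlier.

```python
import re


def description_clause(acronym: str, number: str,
                       databank: str = "emblrelease") -> str:
    """Build the FtDescription clause for one strain number.

    The acronym and the first part of the number may be indexed as two
    tokens ("cbs", "100") or as one fused token ("cbs100"); any further
    parts of the number (after a dot, for example) are separate tokens.
    """
    def term(value: str) -> str:
        return f"[{databank}-FtDescription:{value}]"

    acro = acronym.lower()
    parts = [p for p in re.split(r"[^0-9a-z]+", number.lower()) if p]
    if not parts:
        raise ValueError("number must contain at least one alphanumeric part")

    spaced = f"( {term(acro)} & {term(parts[0])} )"
    fused = term(acro + parts[0])
    clause = f"( {spaced} | {fused} )"
    for rest in parts[1:]:
        clause += f" & {term(rest)}"
    return clause


print(description_clause("CBS", "100.20"))
```

For CBS 100.20 this reproduces the description clause of the query above; for a number without a dot, only the spaced and fused alternatives remain.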
WP4: extracted databases
Extracted databases
• Selection of a meaningful subset of information (strain identification) for each material, including links to external databases/services: ongoing, proposal to be sent to collections next month