Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely...
![Page 1: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/1.jpg)
Chemical Database Projects Delivered by RSC eScience
at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”
Antony Williams
![Page 2: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/2.jpg)
RSC eScience
What was once just ChemSpider is much more…
• ChemSpider Reactions
• Chemical Validation and Standardization Platform (CVSP)
• Learn Chemistry Wiki
• National Chemical Database Service
• Open PHACTS
• PharmaSea
• Global Chemistry Hub
![Page 3: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/3.jpg)
We are known for ChemSpider…
The Free Chemical Database
A central hub for chemists to source information
• >28 million unique chemical records
• Aggregated from >400 data sources
• Chemicals, spectra, CIF files, movies, images, podcasts, links to patents, publications, predictions
A central hub for chemists to deposit & curate data
![Page 4: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/4.jpg)
We Want to Answer Questions
Questions a chemist might ask…
• What is the melting point of n-heptanol?
• What is the chemical structure of Xanax?
• Chemically, what is phenolphthalein?
• What are the stereocenters of cholesterol?
• Where can I find publications about xylene?
• What are the different trade names for Ketoconazole?
• What is the NMR spectrum of Aspirin?
• What are the safety handling issues for Thymol Blue?
![Page 5: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/5.jpg)
I want to know about “Vincristine”
![Page 6: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/6.jpg)
Vincristine: Identifiers and Properties
![Page 7: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/7.jpg)
Vincristine: Vendors and Sources
![Page 8: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/8.jpg)
Vincristine: Articles
![Page 9: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/9.jpg)
How did we build it?
• We deal in Molfiles or SDF files – with coordinates
• Deposit anything that has an InChI – we support what InChI can handle, good and bad
• Standardization based on “InChI standardization”
• InChIs aggregate (certain) tautomers
![Page 10: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/10.jpg)
The InChI Identifier
![Page 11: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/11.jpg)
Downsides of InChI
Good for small molecules – but no polymers, issues with inorganics, organometallics, imperfect stereochemistry. ChemSpider is “small molecules”
InChI used as the “deduplicator” – FIRST version of a compound into the database becomes THE structure to deduplicate against…
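The "first version wins" behaviour described above can be sketched in a few lines. This is a minimal illustration, not ChemSpider's actual code: the `Deposit` class, the source names and the `deduplicate` function are hypothetical, and the InChI strings are assumed to have been computed already from the deposited structures.

```python
from dataclasses import dataclass


@dataclass
class Deposit:
    source: str    # hypothetical depositor name
    inchi: str     # InChI computed from the deposited structure
    molfile: str   # the depositor's own drawing (placeholder text here)


def deduplicate(deposits):
    """Keep the first deposit seen for each InChI; later ones only add a source."""
    registry = {}   # InChI -> the Deposit that became the reference structure
    sources = {}    # InChI -> every source that deposited the same InChI
    for dep in deposits:
        if dep.inchi not in registry:
            registry[dep.inchi] = dep   # first version into the database wins
        sources.setdefault(dep.inchi, []).append(dep.source)
    return registry, sources


deposits = [
    Deposit("vendor_a", "InChI=1S/CH4/h1H4", "drawing A"),
    Deposit("vendor_b", "InChI=1S/CH4/h1H4", "drawing B"),
    Deposit("vendor_b", "InChI=1S/H2O/h1H2", "drawing C"),
]
registry, sources = deduplicate(deposits)
```

The consequence is the one the slide warns about: if the first drawing is poor, every later deposit with the same InChI is matched against it.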
![Page 12: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/12.jpg)
Side Effects of InChI Usage
![Page 13: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/13.jpg)
SMILES by comparison…
![Page 14: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/14.jpg)
Side Effects of InChI Usage
![Page 15: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/15.jpg)
Searches: The INTERNET
![Page 16: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/16.jpg)
Search by InChI
![Page 17: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/17.jpg)
ChemSpider Google Search
http://www.chemspider.com/google/
![Page 18: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/18.jpg)
How did we build it?
• We deal in Molfiles or SDF files – with coordinates
• Deposit anything that has an InChI – we support what InChI can handle, good and bad
• Standardization based on “InChI standardization”
• InChIs aggregate (certain) tautomers
• We deal with “various forms” of data
![Page 19: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/19.jpg)
Crowdsourced “Annotations”
Users can add
• Descriptions/Syntheses/Commentaries
• Links to PubMed articles
• Links to articles via DOIs
• Add spectral data
• Add Crystallographic Information Files
• Add photos
• Add MP3 files
• Add Videos
![Page 20: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/20.jpg)
![Page 21: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/21.jpg)
ChemSpider : Spectra Linked
![Page 22: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/22.jpg)
ChemSpider ID 24528095 1H NMR
![Page 23: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/23.jpg)
ChemSpider ID 24528095 H-H COSY
![Page 24: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/24.jpg)
How did we build it?
• We deal in Molfiles or SDF files – with coordinates
• Deposit anything that has an InChI – we support what InChI can handle, good and bad
• Standardization based on “InChI standardization”
• InChIs aggregate (certain) tautomers
• We deal with “various forms” of data
• We are challenged with the complexities of chemical names
![Page 25: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/25.jpg)
Antony Williams vs Identifiers
Passport ID
Dad, Tony, others
SSN
Green Card
License
5 email addresses
ChemSpiderman (blog, Twitter account, Facebook, Friendfeed)
OpenID….
![Page 26: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/26.jpg)
Aspirin names and synonyms
• Text searches depend on correct association
• >300 suggested identifiers for Aspirin just on PubChem
• Disambiguation dictionaries are necessary, not just for authors!
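A disambiguation dictionary of the kind these bullets call for can be sketched as a mapping from normalized names to one structure identifier. This is an illustrative sketch only: `normalize` and `build_dictionary` are hypothetical helpers, the `PLACEHOLDER-KEY-*` values are invented stand-ins, and only the aspirin InChIKey is a real identifier.

```python
ASPIRIN_KEY = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"  # standard InChIKey for aspirin


def normalize(name):
    """Lower-case a name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def build_dictionary(pairs):
    """Map each normalized name to one InChIKey; set aside names that clash."""
    lookup = {}
    ambiguous = set()
    for name, key in pairs:
        norm = normalize(name)
        if norm in lookup and lookup[norm] != key:
            ambiguous.add(norm)      # same name claimed by two structures
        else:
            lookup[norm] = key
    for norm in ambiguous:
        lookup.pop(norm, None)       # ambiguous names need a human curator
    return lookup, ambiguous


pairs = [
    ("Aspirin", ASPIRIN_KEY),
    ("acetylsalicylic acid", ASPIRIN_KEY),
    ("2-Acetoxybenzoic acid", ASPIRIN_KEY),
    ("compound X", "PLACEHOLDER-KEY-1"),    # invented, for illustration
    ("Compound  X", "PLACEHOLDER-KEY-2"),   # invented, clashes with the above
]
lookup, ambiguous = build_dictionary(pairs)
```

A text search then resolves `lookup[normalize("ASPIRIN")]` to a single structure, while names in `ambiguous` are held back until someone validates them.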
![Page 27: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/27.jpg)
![Page 28: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/28.jpg)
The Final Search Strategy
![Page 29: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/29.jpg)
All Those Names, One Structure
![Page 30: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/30.jpg)
Curated Dictionaries Matter
![Page 31: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/31.jpg)
Crowdsourcing ChemSpider
ChemSpider is crowdsourced
Community deposition, annotation and curation
Anyone can “Leave Feedback”
Registered users can add data
![Page 32: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/32.jpg)
“Curate” Identifiers
![Page 33: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/33.jpg)
“Curate” Identifiers
![Page 34: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/34.jpg)
“Curate” Identifiers
![Page 35: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/35.jpg)
Success Depends on Dictionaries
![Page 36: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/36.jpg)
Vincristine: Identifiers and Properties
![Page 37: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/37.jpg)
Vincristine: Patents Linked by Name
![Page 38: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/38.jpg)
Validated Names for Searching…
![Page 39: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/39.jpg)
And yes… there are challenges
http://www.cas.org/legal/infopolicy
![Page 40: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/40.jpg)
Licensing Data is Tough…
![Page 41: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/41.jpg)
Data Licensing, Open Data
The use of CAS data in third party Data Mining Tools is permitted as long as CAS Records are downloaded via STN® AnaVist™. All of these new "freedoms" are aimed at further enabling the dissemination of scientific information and the advancement of scientific research.
CAS does not permit the building of Databases that have wide and general availability and no longer fulfill the purpose of individual or team research that CAS permits but instead serve as a substitute for the use of CAS Databases.
![Page 42: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/42.jpg)
A Comment on Quality
For >28 million chemical compounds there are some errors:
• “Incorrect” structure representations
• Mismatched name-structure relationships
• Experimental properties (the values, the units)
• Real vs. virtual compounds – text-mining and conversion
We have deprecated a LOT of data…
![Page 43: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/43.jpg)
Identifier Dictionaries
Reciprocal curation processes…share curation with each other.
If a database already has a compound, use InChIKeys to match “suggested” validation against the compound.
A series of “added” and “removed” synonyms against InChIKeys for matching.
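One way to picture such a shared feed: each entry is an InChIKey, an action ("added" or "removed") and a synonym, and the receiving database turns matching entries into suggestions for its own curators. The feed layout and the `suggest_changes` function below are assumptions made for illustration, not a published RSC format.

```python
ASPIRIN_KEY = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"  # standard InChIKey for aspirin


def suggest_changes(local, feed):
    """local: dict of InChIKey -> set of synonyms held by this database.
    feed: list of (inchikey, action, synonym) shared by a partner database.
    Returns suggested edits; nothing is applied automatically."""
    suggestions = []
    for key, action, synonym in feed:
        if key not in local:
            continue                       # we do not hold this compound
        present = synonym in local[key]
        if action == "added" and not present:
            suggestions.append((key, "add", synonym))
        elif action == "removed" and present:
            suggestions.append((key, "remove", synonym))
    return suggestions


local = {ASPIRIN_KEY: {"aspirin", "asprin"}}
feed = [
    (ASPIRIN_KEY, "added", "acetylsalicylic acid"),
    (ASPIRIN_KEY, "removed", "asprin"),          # partner deprecated a misspelling
    (ASPIRIN_KEY, "added", "aspirin"),           # already present, no suggestion
    ("PLACEHOLDER-KEY-1", "added", "anything"),  # invented key, not in local
]
suggestions = suggest_changes(local, feed)
```

Keeping the output as suggestions rather than direct edits matches the slide's wording: the partner's curation is "suggested" validation, to be accepted or rejected locally.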
![Page 44: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/44.jpg)
Federated Data Curation Sharing
Who wants to work with us?
![Page 45: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/45.jpg)
Structure Validation using feed
Look for approved synonyms
Compare feed InChIKey with database InChIKey
If different, flag for inspection
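The three steps above can be sketched as a single pass over the feed. The data layout (a name-to-InChIKey dictionary) and the `flag_mismatches` function are assumptions for illustration, and the `PLACEHOLDER-KEY-*` values are invented.

```python
ASPIRIN_KEY = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"  # standard InChIKey for aspirin


def flag_mismatches(feed, database):
    """feed: list of (approved_synonym, inchikey) from a curated source.
    database: dict of synonym -> inchikey held locally.
    Returns (synonym, local_key, feed_key) for every disagreement."""
    flagged = []
    for synonym, feed_key in feed:
        local_key = database.get(synonym)
        if local_key is None:
            continue                      # name not in our database
        if local_key != feed_key:
            flagged.append((synonym, local_key, feed_key))  # needs inspection
    return flagged


database = {
    "aspirin": ASPIRIN_KEY,
    "compound x": "PLACEHOLDER-KEY-1",    # invented local record
}
feed = [
    ("aspirin", ASPIRIN_KEY),             # agrees, nothing to do
    ("compound x", "PLACEHOLDER-KEY-2"),  # disagrees, flag it
    ("compound y", "PLACEHOLDER-KEY-3"),  # unknown name, skipped
]
flagged = flag_mismatches(feed, database)
```

A mismatch does not say which side is wrong, only that the same approved name points at two different structures, which is why the slide says "flag for inspection" rather than "overwrite".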
![Page 46: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/46.jpg)
Many Problems Can be Solved…
Clean up databases – structure validation, structure standardization
• Warn about valency, charge balance, depiction issues, bond types, absent stereo, and another 100 rules (or so…)
• Standardize: agree community rules to “Standardize”
![Page 47: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/47.jpg)
Structure Validation
![Page 48: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/48.jpg)
Structure Validation - Fixed
![Page 49: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/49.jpg)
What needs to happen?
If we could validate
• Catch errors in databases (and clean)
• Proactively catch errors in publications/patents
• Reduce junk in the ether – improve QUALITY!
If we standardized
• Interlinking should improve
![Page 50: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/50.jpg)
CVSP: result of processing
![Page 51: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/51.jpg)
NCATS Dataset
![Page 52: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/52.jpg)
![Page 53: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/53.jpg)
DrugBank dataset (6516 records)
Marked as Errors (arbitrary)
• 2 records with query bonds
• 3 records with invalid atoms (asterisk in polymers)
• Unusual valence: ~70 (oxygen 3, sulfur 3 and 5, Mg 4, B 5, etc.)
Warnings
• InChI not matching structure (100+)
• SMILES not matching structure (100+)
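An "unusual valence" rule of the kind counted above can be sketched as a table lookup. The table below is a deliberately small assumption for illustration, not the CVSP rule set; charged atoms are skipped rather than handled, and the atom tuples are a hypothetical input format.

```python
# Illustrative table of accepted valences for neutral atoms (an assumption).
ALLOWED = {"B": {3}, "C": {4}, "N": {3}, "O": {2}, "Mg": {2}, "S": {2, 4, 6}}


def unusual_valences(atoms):
    """atoms: list of (element, formal_charge, total_bond_order).
    Returns (index, element, valence) for neutral atoms outside the table."""
    warnings = []
    for index, (element, charge, valence) in enumerate(atoms):
        if charge != 0 or element not in ALLOWED:
            continue                      # out of scope for this sketch
        if valence not in ALLOWED[element]:
            warnings.append((index, element, valence))
    return warnings


atoms = [
    ("O", 0, 2),   # ordinary oxygen
    ("O", 0, 3),   # trivalent neutral oxygen, as reported on the slide
    ("S", 0, 5),   # pentavalent sulfur, as reported on the slide
    ("S", 0, 6),   # hexavalent sulfur, accepted by the table
    ("N", 1, 4),   # charged atom, skipped
]
warnings = unusual_valences(atoms)
```

With this input, the trivalent oxygen and pentavalent sulfur are reported and the rest pass.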
![Page 54: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/54.jpg)
DrugBank ID: DB00755
InChI=1S/C20H28O2/c1-15(8-6-9-16(2)14-19(21)22)11-12-18-17(3)10-7-13-20(18,4)5/h6,8-9,11-12,14H,7,10,13H2,1-5H3,(H,21,22)/b9-6+,12-11+,15-8+,16-14+
DrugBank ID: DB00614
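To make the DB00755 identifier above easier to read, the sketch below splits an InChI string into its slash-separated layers. The `split_inchi` helper is hypothetical and simplified: it assumes each layer prefix letter occurs once, which holds for this string but not for every InChI.

```python
def split_inchi(inchi):
    """Split an InChI string into version, formula and prefixed layers."""
    prefix, _, rest = inchi.partition("/")
    if not prefix.startswith("InChI="):
        raise ValueError("not an InChI string")
    parts = rest.split("/")
    layers = {"version": prefix[len("InChI="):], "formula": parts[0]}
    for part in parts[1:]:
        layers[part[0]] = part[1:]   # e.g. 'c' connections, 'h' hydrogens, 'b' double-bond stereo
    return layers


# The DB00755 InChI from the slide, joined onto one line.
DB00755 = (
    "InChI=1S/C20H28O2/c1-15(8-6-9-16(2)14-19(21)22)11-12-18-"
    "17(3)10-7-13-20(18,4)5/h6,8-9,11-12,14H,7,10,13H2,1-5H3,(H,21,22)"
    "/b9-6+,12-11+,15-8+,16-14+"
)
layers = split_inchi(DB00755)
double_bond_stereo = layers["b"].split(",")
```

Here the `b` layer carries four double-bond stereo descriptors. A validation warning such as "InChI not matching structure" amounts to regenerating this string from the drawn structure and finding that one or more layers differ.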
![Page 55: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/55.jpg)
Connecting Chemistry across the web
So much of what is seen on ChemSpider is retrieved in real time using services
![Page 56: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/56.jpg)
Connecting Chemistry across the web
![Page 57: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/57.jpg)
Online Predictions
![Page 58: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/58.jpg)
Web Services Open Up Collaboration
Agilent, Bruker, Waters and Thermo all use our web-based services for compound lookup
Many academic sites integrating directly – metabonomics, name lookup, semantic markup
Mobile app integration
Commercial structure drawing packages
![Page 59: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/59.jpg)
Web Services
![Page 60: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/60.jpg)
ChemSpider Everywhere: Spectral Game
![Page 61: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/61.jpg)
ChemSpider Everywhere
Crowdsourced Curation of Spectra
![Page 62: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/62.jpg)
Web Services Integrate INTERNAL Projects
Integration between ChemSpider and…
• Our publishing platform for structure display
• ChemSpider SyntheticPages
• LearnChemistry Wiki
• National Chemical Database Service
• And….a growing list….
![Page 63: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/63.jpg)
What ChemSpider Does Not Handle
• Polymers
• Markush structures
• Organometallics
• Many Inorganics
• Materials
• Reactions…but….
![Page 64: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/64.jpg)
ChemSpider Reactions
![Page 65: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/65.jpg)
![Page 66: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/66.jpg)
DERA
Digitally Enabling the RSC Archive…back to 1841
Extracting data and making available via appropriate platform
• Chemicals
• Reactions
• Analytical Data
• Figures
![Page 67: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/67.jpg)
![Page 68: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/68.jpg)
Chemical Database Service
![Page 69: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/69.jpg)
Data for life sciences
• What’s the structure?
• Are they in our file?
• What’s similar?
• What’s the target?
• Pharmacology data?
• Known Pathways?
• Working On Now?
• Connections to disease?
• Expressed in right cell type?
• Competitors?
• IP?
![Page 70: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/70.jpg)
Open PHACTS
Open PHACTS is a 3-year Innovative Medicines Initiative (IMI) project
To reduce the barriers to drug discovery in industry, academia and for small businesses
To build an open platform, integrating chemistry and biology data from public domain resources
Semantic web platform
Open Standards, Open Data and Open Source
![Page 71: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/71.jpg)
Crowdsourcing across drug discovery
• Open PHACTS: partnership between European Community and European Pharma Companies
• 22 partners, 8 pharmaceutical companies, 3 biotechs working together for 3 years
• Freely accessible for knowledge discovery and verification
• Data on chemistry and biology
• Pharmacological profiles
• Proprietary and public data sources
![Page 72: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/72.jpg)
![Page 73: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/73.jpg)
![Page 74: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/74.jpg)
PharmaSea
• FP7 Initiative. PharmaSea: increasing value and flow in the marine biodiscovery pipeline (2012-2017)
• Improve the quality, volume and value of active agents discovered in the marine environment and increase the speed at which they can be delivered
• RSC: Providing dereplication via ChemSpider, analytical data algorithms, integration with computer-assisted structure elucidation algorithms
![Page 75: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/75.jpg)
Conclusions
• RSC eScience supporting increasing number of grant-based projects
• ChemSpider grows daily – community depositions and data from RSC Content with a focus on expanding data while improving quality
• ChemSpider is an integration platform for MANY projects through web services
• CVSP processing is available to use and provide feedback – will be available as a service also
• We believe in curation sharing - who wants to collaborate?
![Page 76: Chemical Database Projects Delivered by RSC eScience at the FDA Meeting “Development of a Freely Distributable Data System for the Registration of Substances”](https://reader036.fdocuments.net/reader036/viewer/2022081520/56649ea35503460f94ba719d/html5/thumbnails/76.jpg)
Thank you
Email: [email protected]
Twitter: ChemConnector
Blog: www.chemspider.com/blog
Personal Blog: www.chemconnector.com
SLIDES: www.slideshare.net/AntonyWilliams