ChemSpider: Connecting Chemistry & Mass Spectrometry on the Internet
Navigating an Internet of Chemistry via ChemSpider
Antony Williams
University of Arkansas, Little Rock, October 2011
UALR Chemistry Seminar Guest Lecture
Overview
What type of chemistry is available on the internet?
Representative flavors of chemistry
How can the internet be searched by chemical?
Quality on the Internet
Contributing to the chemistry internet
Where is chemistry online?
Encyclopedic articles (Wikipedia)
Chemical vendor databases
Metabolic pathway databases
Property databases
Patents with chemical structures
Drug Discovery data
Scientific publications
Compound aggregators
Blogs/Wikis and Open Notebook Science
Representative Flavors of Chemistry
Molfiles
Molfiles are the primary exchange format between structure drawing packages
Can be different between different drawing packages
Most commonly carry X,Y coordinates for layout
Can support polymers, organometallics, etc.
Can carry 3D coordinates
Molfiles
 10  9  0  0  1  0  0  0  0  0  1 V2000
   31.2937   -9.0366    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   26.6526   -9.0366    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   31.2937   -7.7066    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
   30.1161   -9.6877    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   25.5096   -9.6877    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
   28.9731   -9.0366    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   27.8163   -9.7016    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   26.6664   -7.7066    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   32.4367   -9.6877    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
   30.1161  -11.0177    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
  3  1  2  0  0  0  0
  4  1  1  0  0  0  0
  9  1  1  0  0  0  0
  7  2  1  0  0  0  0
  5  2  2  0  0  0  0
  8  2  1  0  0  0  0
  6  4  1  0  0  0  0
  4 10  1  6  0  0  0
  7  6  1  0  0  0  0
M  END
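The layout of the block above (a counts line, then one line per atom, then one line per bond) can be read with a few lines of code. The following is a minimal sketch, not a full molfile reader: the function name `parse_ctab` is hypothetical, the three header lines of a complete molfile are absent because the slide does not show them, and fields are split on whitespace as a simplifying assumption (real V2000 files use fixed-width columns, so whitespace splitting can fail once atom or bond counts reach three digits).

```python
from collections import Counter

# Counts line, atom block and bond block as shown on the slide.
MOL_BLOCK = """\
 10  9  0  0  1  0  0  0  0  0  1 V2000
   31.2937   -9.0366    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   26.6526   -9.0366    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   31.2937   -7.7066    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
   30.1161   -9.6877    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   25.5096   -9.6877    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
   28.9731   -9.0366    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   27.8163   -9.7016    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   26.6664   -7.7066    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   32.4367   -9.6877    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
   30.1161  -11.0177    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
  3  1  2  0  0  0  0
  4  1  1  0  0  0  0
  9  1  1  0  0  0  0
  7  2  1  0  0  0  0
  5  2  2  0  0  0  0
  8  2  1  0  0  0  0
  6  4  1  0  0  0  0
  4 10  1  6  0  0  0
  7  6  1  0  0  0  0
M  END
"""


def parse_ctab(text):
    """Return (atoms, bonds) from a header-less V2000 connection table."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    counts = lines[0].split()
    n_atoms, n_bonds = int(counts[0]), int(counts[1])

    atoms = []
    for ln in lines[1:1 + n_atoms]:
        x, y, z, symbol = ln.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))

    bonds = []
    for ln in lines[1 + n_atoms:1 + n_atoms + n_bonds]:
        # first atom index, second atom index, bond order, stereo flag
        a, b, order, stereo = (int(v) for v in ln.split()[:4])
        bonds.append((a, b, order, stereo))
    return atoms, bonds


atoms, bonds = parse_ctab(MOL_BLOCK)
heavy_atoms = Counter(symbol for symbol, *_ in atoms)
double_bonds = [b for b in bonds if b[2] == 2]
stereo_bonds = [b for b in bonds if b[3] != 0]
```

Here every Z coordinate is 0.0000, so the block carries a 2D layout only, and the single bond with a non-zero stereo flag is where the drawing encodes stereochemistry.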
SMILES (http://en.wikipedia.org/wiki/SMILES)
SMILES is a common format
Can support polymers, organometallics, etc.
Does NOT carry X, Y or Z coordinates for layout so requires layout algorithms – can be problematic!
Generally different between drawing packages
Stereo
Tautomeric forms
Vendor-dependent SMILES
ACD/Labs: CC(C)CCC[C@@H](C)CCC[C@@H](C)CCCC(\C)=C\CC2=C(C)C(=O)c1ccccc1C2=O
OpenEye: CC1=C(C(=O)c2ccccc2C1=O)C/C=C(\C)/CCC[C@H](C)CCC[C@H](C)CCCC(C)C
ChEMBL: CC(C)CCC[C@@H](C)CCC[C@@H](C)CCC\C(=C\CC1=C(C)C(=O)c2ccccc2C1=O)\C
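The three strings above look nothing alike as text, which is why plain string comparison is a poor way to match SMILES from different sources. As a rough sanity check, and only that, the sketch below counts heavy atoms in each string with a small regular expression. The helper `heavy_atom_counts` is hypothetical, it handles only the common organic-subset and bracket atoms, and matching atom counts are a necessary but not sufficient condition for two SMILES to describe the same structure; a cheminformatics toolkit with proper canonicalization would be needed for a real comparison.

```python
import re
from collections import Counter

# Bracket atoms such as [C@@H], or bare organic-subset atoms (aromatic in lower case).
ATOM = re.compile(
    r"\[\d*(?P<bracket>[A-Z][a-z]?|[bcnops])[^\]]*\]"
    r"|(?P<bare>Cl|Br|[BCNOPSFI]|[bcnops])"
)


def heavy_atom_counts(smiles):
    """Count non-hydrogen atoms per element in a SMILES string (simplified)."""
    counts = Counter()
    for match in ATOM.finditer(smiles):
        symbol = match.group("bracket") or match.group("bare")
        counts[symbol.capitalize()] += 1
    return counts


# Raw strings, because SMILES uses backslashes for double-bond geometry.
vendor_smiles = {
    "ACD/Labs": r"CC(C)CCC[C@@H](C)CCC[C@@H](C)CCCC(\C)=C\CC2=C(C)C(=O)c1ccccc1C2=O",
    "OpenEye": r"CC1=C(C(=O)c2ccccc2C1=O)C/C=C(\C)/CCC[C@H](C)CCC[C@H](C)CCCC(C)C",
    "ChEMBL": r"CC(C)CCC[C@@H](C)CCC[C@@H](C)CCC\C(=C\CC1=C(C)C(=O)c2ccccc2C1=O)\C",
}

counts_by_vendor = {name: heavy_atom_counts(s) for name, s in vendor_smiles.items()}
all_strings_differ = len(set(vendor_smiles.values())) == len(vendor_smiles)
```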
The InChI Identifier
InChI
SINGLE code base managed by IUPAC – integrated into drawing packages. No variability as with SMILES
InChI Strings can be reversed to structures – same problem as with SMILES – no layout
Adopted by the community (databases, blogs, Wikipedia) – good for searching the internet
Multiple Layers
Tautomers – “Mobile H Perception”
Stereo
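The layered structure can be seen by splitting an InChI string on its `/` separators. The sketch below is illustrative only: `split_inchi_layers` is a hypothetical helper, it keeps one value per layer prefix (so strings that repeat a prefix, as non-standard InChIs can, would need more care), and the two example strings are the commonly published standard InChIs for ethanol and L-alanine, which should be checked against a trusted source before being relied on.

```python
def split_inchi_layers(inchi):
    """Split an InChI string into a dict keyed by layer prefix (simplified)."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    version, *layers = inchi[len(prefix):].split("/")
    result = {"version": version}
    # The formula layer, when present, is the only one without a lower-case prefix.
    if layers and layers[0] and not layers[0][0].islower():
        result["formula"] = layers.pop(0)
    for layer in layers:
        if layer:
            result[layer[0]] = layer[1:]
    return result


ethanol = split_inchi_layers("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3")
l_alanine = split_inchi_layers(
    "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
)

# A parenthesised group in the hydrogen layer marks a mobile hydrogen,
# here shared between atoms 5 and 6 (the two carboxyl oxygens).
has_mobile_h = "(H," in l_alanine["h"]
# Stereo information lives in its own layers (t, m, s), absent for ethanol.
stereo_layers = {k: v for k, v in l_alanine.items() if k in ("t", "m", "s")}
```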
Checking for Stereochemistry
Use your drawing package!
Databases and Standardization
InChI Strings Hash to InChIKeys
Vancomycin
Search Molecular SKELETON
Search Full Molecule
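One way to read the skeleton versus full-molecule distinction, assuming the skeleton search uses only the first block of the InChIKey, is sketched below. A standard InChIKey has three hyphen-separated blocks: 14 characters hashed from the connectivity, 10 characters covering the remaining layers plus a standard flag and version character, and one protonation character. The helper names are hypothetical, and the example keys are the commonly published ones for L-alanine, racemic alanine and aspirin rather than values taken from the slides.

```python
from collections import defaultdict


def split_inchikey(key):
    """Break an InChIKey into its documented blocks."""
    parts = key.split("-")
    if len(parts) != 3 or [len(p) for p in parts] != [14, 10, 1]:
        raise ValueError("unexpected InChIKey layout: " + key)
    skeleton, second, protonation = parts
    return {
        "skeleton": skeleton,           # hash of the connectivity
        "other_layers": second[:8],     # hash of stereo, isotopes, etc.
        "standard": second[8] == "S",   # 'S' standard, 'N' non-standard
        "version": second[9],
        "protonation": protonation,
    }


def group_by_skeleton(keys):
    """Group keys that share a first block, i.e. the same molecular skeleton."""
    groups = defaultdict(list)
    for key in keys:
        groups[split_inchikey(key)["skeleton"]].append(key)
    return dict(groups)


keys = [
    "QNAYBMKLOCPYGJ-REOHAHJXSA-N",  # L-alanine
    "QNAYBMKLOCPYGJ-UHFFFAOYSA-N",  # alanine without stereo
    "BSYNRYMUTXBXSQ-UHFFFAOYSA-N",  # aspirin
]
groups = group_by_skeleton(keys)
# A "full molecule" query uses the whole key; a "skeleton" query uses only block one.
skeleton_query = split_inchikey(keys[0])["skeleton"]
```

Searching on `skeleton_query` would match both alanine entries, while searching on the full first key would match only the L form, which is consistent with a skeleton search returning more hits than a full-molecule search.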
Searching Chemistry on the Internet
Searching Vincristine
Name searching Google
Name searching Wikipedia
Name searching Wolfram Alpha
Name, name, name, name… searching
Structure searching DOZENS of websites, each with different information
or…
Search ONE website integrating the others!
www.chemspider.com
I want to know about “Vincristine”
Vincristine: Identifiers and Properties
Vincristine: Vendors and Sources
Vincristine: Patents
Vincristine: Articles
Vancomycin
Search Molecular SKELETON
Search Full Molecule
Full Skeleton Search: 104 Hits
Full Molecule Search: 4 Hits
Quality on the Internet
Trust everything on the web???
What’s said on the web is true…
“We then established a collaboration with professor Sum Ting Wong, a fugitive from the North Korean University Hu Yu Hai Ding, currently in Rome (Italy).”
“This was identified as the new protein Wai So Dim (WSD).”
Contributing Chemistry to the Web
If it was not just about me
We might have a community-built encyclopedia
I might know where the best restaurants are
I might get good advice on books to read
I might know which movies to watch
I might know which plumber to call
Data might just be Open
Contributing Chemistry to the Web
ChemSpider as a host for community contributions
Curation and validation input
Structures
Movies
Images
Analytical data – especially spectra
Contributing Chemistry to the Web
Sites allow direct feedback – leave it!
Sites allow deposition of data
Text – chemical names, properties
Structures
Spectra
Curation of existing data
Spectra
ChemSpider SyntheticPages
Submission Process
Simple template-based submission process
Submissions reviewed by editorial board. Published as is or comments sent to author
Online Peer Review process
Data supported include web movies, images, live spectra etc.
DOI issued to author
Conclusion
Diverse types of chemistry are available on the web
Searching of the internet is possible based on
Text
Structure searching
Substructure searching
The InChI has enabled linking on the internet
Quality on the Internet is diverse – separating the wheat from the chaff is not always easy!
It is possible to contribute to the chemistry internet!
Thank you
Email: [email protected]
Twitter: ChemConnector
Blog: www.chemspider.com/blog
Personal Blog: www.chemconnector.com
SLIDES: www.slideshare.net/AntonyWilliams