UniProtKB/Swiss-Prot: Why SPARQL?
![Page 1: UniProtKB/Swiss-Prot:Why sparql?](https://reader031.fdocuments.net/reader031/viewer/2022030311/58ef04631a28ab7c358b4649/html5/thumbnails/1.jpg)
SPARQL: UniProtKB/Swiss-Prot why do it?
Jerven Bolleman Developer Swiss-Prot Group
![Page 2: UniProtKB/Swiss-Prot:Why sparql?](https://reader031.fdocuments.net/reader031/viewer/2022030311/58ef04631a28ab7c358b4649/html5/thumbnails/2.jpg)
What is UniProtKB/Swiss-Prot
• Central database in the Life Sciences
– Proteins -> you are made out of them
– Summarises current scientific knowledge
– Links 150+ databases together
• Swiss-Prot & Vital-IT group activities are funded by the Swiss Confederation through the SERI (State Secretariat for Education, Research and Innovation)
![Page 3: UniProtKB/Swiss-Prot:Why sparql?](https://reader031.fdocuments.net/reader031/viewer/2022030311/58ef04631a28ab7c358b4649/html5/thumbnails/3.jpg)
Our Goals
• Provide core Bioinformatics resources
– UniProtKB/
–
– …
• Provide services and infrastructure
– Vital-IT: HPC for the life sciences
– …
![Page 4: UniProtKB/Swiss-Prot:Why sparql?](https://reader031.fdocuments.net/reader031/viewer/2022030311/58ef04631a28ab7c358b4649/html5/thumbnails/4.jpg)
Why provide a public SPARQL endpoint
• A 10-person wet laboratory cannot afford:
– to host its own in-house database holding all, or even a fraction, of all life science data.
– to go without access to, and use of, existing life science information.
![Page 5: UniProtKB/Swiss-Prot:Why sparql?](https://reader031.fdocuments.net/reader031/viewer/2022030311/58ef04631a28ab7c358b4649/html5/thumbnails/5.jpg)
Not CPU time... but brain time
The right kind of optimisation
![Page 6: UniProtKB/Swiss-Prot:Why sparql?](https://reader031.fdocuments.net/reader031/viewer/2022030311/58ef04631a28ab7c358b4649/html5/thumbnails/6.jpg)
Why provide a public SPARQL endpoint
• Classical SQL can be provided on the web
– Is not practical
– No federation
– Poor standards conformance
• Local SQL is expensive
• Local JSON is no better
• Nor is local XML
![Page 7: UniProtKB/Swiss-Prot:Why sparql?](https://reader031.fdocuments.net/reader031/viewer/2022030311/58ef04631a28ab7c358b4649/html5/thumbnails/7.jpg)
Data Integration Traditional
Diagram labels: Pathway.txt, UniProt.txt; Pathway Parser, UniProt Parser; Pathway Schema, UniProt Schema; Own Lab Data; Data warehouse; SQL queries.
Cost markers on the diagram: six "$".
![Page 8: UniProtKB/Swiss-Prot:Why sparql?](https://reader031.fdocuments.net/reader031/viewer/2022030311/58ef04631a28ab7c358b4649/html5/thumbnails/8.jpg)
Data Integration RDF/SPARQL
Diagram labels: Pathway.rdf, UniProt.rdf; Own Lab Data; Triple Store; SPARQL Queries.
Cost markers on the diagram: one "$" and one "$?".
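The RDF/SPARQL integration picture can be sketched in a few lines. This is a toy illustration only, not how UniProt or any real triple store is implemented: a Python set stands in for the triple store, and every identifier and predicate below (`protein:P1`, `ex:hasParticipant`, `lab:measuredIn`, and so on) is a hypothetical placeholder. The point it tries to show is that sources already expressed as triples can be merged without a per-source parser or schema, and then queried together.

```python
from typing import Iterable, Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical triples standing in for UniProt.rdf, Pathway.rdf and own lab data.
UNIPROT: Set[Triple] = {
    ("protein:P1", "rdf:type", "up:Protein"),
    ("protein:P1", "up:mnemonic", "EXAMPLE_HUMAN"),
}
PATHWAY: Set[Triple] = {
    ("pathway:W1", "ex:hasParticipant", "protein:P1"),
}
LAB: Set[Triple] = {
    ("protein:P1", "lab:measuredIn", "lab:experiment42"),
}


def merge(*sources: Iterable[Triple]) -> Set[Triple]:
    """Load every source into one store; merging triples is a plain set union."""
    store: Set[Triple] = set()
    for source in sources:
        store |= set(source)
    return store


def match(store: Set[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield triples matching a pattern; None acts as a wildcard."""
    for ts, tp, to in store:
        if (s is None or ts == s) and (p is None or tp == p) and (o is None or to == o):
            yield (ts, tp, to)


def pathways_with_lab_data(store: Set[Triple]) -> List[str]:
    """Join across sources: pathways with a participant measured in the lab."""
    measured = {subj for subj, _, _ in match(store, p="lab:measuredIn")}
    return sorted(subj for subj, _, obj in match(store, p="ex:hasParticipant")
                  if obj in measured)


store = merge(UNIPROT, PATHWAY, LAB)
print(pathways_with_lab_data(store))
```

A real setup would load the RDF files into an actual triple store and express the join as a SPARQL query; the sketch only mirrors the shape of that workflow.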
![Page 9: UniProtKB/Swiss-Prot:Why sparql?](https://reader031.fdocuments.net/reader031/viewer/2022030311/58ef04631a28ab7c358b4649/html5/thumbnails/9.jpg)
Why provide a public SPARQL endpoint
• Document-centric REST is not enough
– Swiss-Prot available as REST
– (over e-mail!!) since 1986
– expasy.ch since 1993
– www.uniprot.org since 2002
• Most users use a GUI, not a CLI
• Developers build GUIs on a CLI
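To contrast with fetching one document per entry, here is a minimal sketch of what a developer's request to a public SPARQL endpoint might look like. It relies on assumptions not stated in the slides: the endpoint URL `https://sparql.uniprot.org/sparql`, the `up:` core namespace, and the example query itself. The request is only built, not sent, so the snippet runs without network access.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request

# Assumption: the public endpoint address; it is not given in the slides.
ENDPOINT = "https://sparql.uniprot.org/sparql"

# Assumption: an illustrative query listing a handful of protein entries.
QUERY = """
PREFIX up: <http://purl.uniprot.org/core/>
SELECT ?protein
WHERE { ?protein a up:Protein }
LIMIT 5
"""


def build_request(endpoint: str, query: str) -> Request:
    """Build a SPARQL protocol GET request: the query travels in the
    'query' URL parameter and the Accept header asks for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(ENDPOINT, QUERY)
print(request.get_method(), request.full_url)

# Sending it would need network access, for example:
#   import json
#   from urllib.request import urlopen
#   with urlopen(request) as response:
#       results = json.load(response)
```

The same request shape works from a GUI backend or a command-line script, which is the sense in which a GUI can be built on top of a query interface.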
![Page 10: UniProtKB/Swiss-Prot:Why sparql?](https://reader031.fdocuments.net/reader031/viewer/2022030311/58ef04631a28ab7c358b4649/html5/thumbnails/10.jpg)
© 2015 SIB
![Page 11: UniProtKB/Swiss-Prot:Why sparql?](https://reader031.fdocuments.net/reader031/viewer/2022030311/58ef04631a28ab7c358b4649/html5/thumbnails/11.jpg)
![Page 13: UniProtKB/Swiss-Prot:Why sparql?](https://reader031.fdocuments.net/reader031/viewer/2022030311/58ef04631a28ab7c358b4649/html5/thumbnails/13.jpg)
Chart: queries per month, 2015-01 to 2015-08, broken down by query type (ask, select, construct, describe); y-axis ticks at 100, 10'000 and 1'000'000.
Queries per month in 2015; peak: 4 million per month
![Page 14: UniProtKB/Swiss-Prot:Why sparql?](https://reader031.fdocuments.net/reader031/viewer/2022030311/58ef04631a28ab7c358b4649/html5/thumbnails/14.jpg)
Real users
Mix between hard analytics and super-specific queries
Estimate: somewhere between 300 and 1000 real humans per month
We know they are real because they take holidays ;)
![Page 15: UniProtKB/Swiss-Prot:Why sparql?](https://reader031.fdocuments.net/reader031/viewer/2022030311/58ef04631a28ab7c358b4649/html5/thumbnails/15.jpg)
Using the Semantic Web for faster (Bio-) Research http://edu.isb-sib.ch/course/view.php?id=212