Create RDF using R - PhUSE Wiki
1
Create RDF using R
• R with rrdf, rrdflibs
https://github.com/egonw/rrdf
• R Data frame to RDF
– Excel -> data frame -> RDF
– SAS dataset -> data frame -> RDF
rrdf, rrdflibs
2
Create RDF using R
Packages: rrdf, rrdflibs
• add.triple()
– Add a triple: object is a URI
• add.data.triple()
– Add a triple: object is a literal
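The difference between the two calls is only the kind of object the triple carries. The sketch below illustrates that distinction in Python; it is not the rrdf API. The helper names `add_triple` and `add_data_triple`, the `http://example.org/` namespace, and the sample rows are all assumptions made for illustration. The output is N-Triples, which is also valid Turtle.

```python
# Illustrative sketch only: mimics the URI-object vs literal-object
# distinction. These helpers are hypothetical and are not the rrdf API.

EX = "http://example.org/"                  # assumed example namespace
XSD = "http://www.w3.org/2001/XMLSchema#"   # standard XML Schema namespace


def escape_literal(value):
    """Escape characters that may not appear raw inside a quoted literal."""
    return (value.replace("\\", "\\\\").replace('"', '\\"')
                 .replace("\n", "\\n").replace("\r", "\\r"))


def add_triple(store, subject, predicate, obj):
    """Append a triple whose object is a URI (a link to another resource)."""
    store.append(f"<{subject}> <{predicate}> <{obj}> .")


def add_data_triple(store, subject, predicate, value, datatype=None):
    """Append a triple whose object is a literal, optionally typed."""
    literal = f'"{escape_literal(str(value))}"'
    if datatype is not None:
        literal += f"^^<{datatype}>"
    store.append(f"<{subject}> <{predicate}> {literal} .")


def rows_to_triples(rows):
    """Turn table rows (dicts, as from a data frame) into N-Triples lines."""
    store = []
    for row in rows:
        person = EX + "person/" + row["id"]
        # Object is a URI: the treatment arm is itself a resource.
        add_triple(store, person, EX + "treatment", EX + "arm/" + row["arm"])
        # Object is a literal: the age is a typed value, not a resource.
        add_data_triple(store, person, EX + "age", row["age"], XSD + "integer")
    return store


# Made-up rows standing in for a data frame read from Excel or SAS.
rows = [
    {"id": "P1", "arm": "Placebo", "age": 54},
    {"id": "P2", "arm": "LowDose", "age": 61},
]
triples = rows_to_triples(rows)
print("\n".join(triples))
```

A URI object can later be the subject of further triples, which is what makes the data linkable; a literal object is an end point.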
3
Create RDF using R
Try or follow along
File: createTTLFromR.R
Output File: createTTLFromR.TTL
4
Create RDF using SAS
• SAS accessing SPARQL service using PROC HTTP
• All functions provided by the service, see SPARQL 1.1 Protocol (https://www.w3.org/TR/sparql11-protocol/)
• Implemented as SAS macros https://github.com/MarcJAndersen/SAS-SPARQLwrapper
• SAS generating text files with
• RDF in Turtle
• SPARQL INSERT statements
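Both routes come down to producing text and, for the service route, sending it over HTTP as the SPARQL 1.1 Protocol describes. The following is a minimal sketch in Python rather than SAS, and it is not the SAS-SPARQLwrapper macros. The endpoint base `http://localhost:3030/ds`, its `/update` and `/query` paths, and the sample triple are assumptions. The requests are built but never sent, so no server is needed to run it.

```python
import urllib.parse
import urllib.request

ENDPOINT = "http://localhost:3030/ds"   # hypothetical endpoint base URL


def insert_data_statement(triples):
    """Wrap already-serialized triples in a SPARQL 1.1 INSERT DATA update."""
    body = "\n".join("  " + triple for triple in triples)
    return "INSERT DATA {\n" + body + "\n}"


def build_update_request(update_url, update):
    """Update via direct POST: the update string is the request body."""
    return urllib.request.Request(
        update_url,
        data=update.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update"},
        method="POST",
    )


def build_query_request(query_url, query):
    """Query via GET: the query string travels in the 'query' parameter."""
    url = query_url + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url,
        headers={"Accept": "application/sparql-results+json"},
        method="GET",
    )


# A made-up triple, written in N-Triples syntax.
sample = ['<http://example.org/person/P1> <http://example.org/age> "54" .']
update = insert_data_statement(sample)
update_request = build_update_request(ENDPOINT + "/update", update)
query_request = build_query_request(ENDPOINT + "/query", "ASK { ?s ?p ?o }")

print(update)
print(update_request.get_method(), update_request.full_url)
print(query_request.get_method(), query_request.full_url)
```

Writing the `INSERT DATA` text to a file instead of posting it gives the second route on the slide: a text file that can be loaded into a triple store later.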
5
Create RDF using SAS
Try or follow along
File: createTTLFromSAS.SAS
Output File: createTTLFromSAS.TTL
6
7
Putting it all together
Extract, Transform
RDF Linked Data
Knowledgebase
Load
Federated Queries
• Analysis • Visualization • Submission • Publication
8
Outline
• Introduction to Semantic Web & Linked Data
• Resource Description Framework (RDF)
• Pharma: Use and Impact
• Visualizing Linked Data
• PhUSE CSS Projects
• Conclusion
9
“The choice for all major content providers is not whether to adopt a linked data approach but when. Asking ‘do we need linked data?’ today may be analogous to asking ‘do we really need a website?’ 15 years ago.” - Cochrane Technical Report (2013)
10
11
12
Traceability: THE Linked Data Super Power
Analysis Results & Displays
Study Protocol CSR Submission Data
Pharma Company, Biotech …
Regulatory Agency, Reviewer…
• Publications • Registries • Data Gathering
• Meta-analyses • Competitive Intelligence
PRODUCE
CONSUME
Collected Data
Clinical Program Design in RDF
Study Design and Protocol in RDF
CDISC Foundational Standards in RDF
Analysis Results & Metadata
Regulations in RDF
Data Lifecycle
CS WG Linked Data Projects & Deliverables
Clinical Development Plan
Analysis Results & Displays
Study Protocol
Study Design
CSR Submission
Tabulated Data
Analysis Data
Use Cases
13
14
Web Standards meet Pharma Standards
• CDASH 1.1
• SDTM 1.2 + Impl. Guide 3.1.2
• SDTM 1.3 + Impl. Guide 3.1.3
• SEND Impl. Guide 3.0
• ADaM 2.1 + Impl. Guide 1.0
• Controlled Terminology
• CDISC e-share! (October 2014)
CDISC RDF - http://www.cdisc.org/rdf, https://github.com/phuse-org/rdf.cdisc.org
GetSDTM-DM-definition.rq
queryLocalCDISC.sas
15
16
"I want to take the clinical trials results..."
"..and put them in an RDF Data Cube!"
              Placebo      LowDose      HighDose
Baseline      N=28         N=30         N=29
-------------------------------------------------
Sex
  F           12 (42.9)    14 (46.7)    16 (55.2)
  M           16 (57.1)    16 (53.3)    13 (44.8)
ds:obs1 a qb:Observation ;
    dim:treat code:trt-Placebo ;
    dim:sex code:sex-F ;
    dim:procedure code:procedure-count ;
    meas:measure "12"^^xsd:int ;
    qb:dataSet ds:dataset-DM .
ds:obs2 a qb:Observation ; ...
Analysis Results & Metadata (WG)
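Each cell of the table becomes one observation, identified by its dimensions (treatment, sex, procedure) and carrying a single measure. The sketch below regenerates observations in the style of the slide from the counts in the table above. It is an illustration in Python, not the working group's tooling. The numbering of observations after `ds:obs1` is an assumption, and prefix declarations are left out because the slide does not give them.

```python
# Sketch only: regenerates qb:Observation blocks in the style of the slide.
# Prefix declarations are omitted because the slide does not give them.

ARMS = ["Placebo", "LowDose", "HighDose"]

# Counts by sex and treatment arm, copied from the demographics table.
COUNTS = {
    "F": {"Placebo": 12, "LowDose": 14, "HighDose": 16},
    "M": {"Placebo": 16, "LowDose": 16, "HighDose": 13},
}


def observation(number, arm, sex, count):
    """Format one observation with the property names used on the slide."""
    return (
        f"ds:obs{number} a qb:Observation ;\n"
        f"    dim:treat code:trt-{arm} ;\n"
        f"    dim:sex code:sex-{sex} ;\n"
        f"    dim:procedure code:procedure-count ;\n"
        f'    meas:measure "{count}"^^xsd:int ;\n'
        f"    qb:dataSet ds:dataset-DM ."
    )


def build_observations():
    """One observation per table cell; the numbering order is an assumption."""
    result = []
    for sex, by_arm in COUNTS.items():
        for arm in ARMS:
            result.append(observation(len(result) + 1, arm, sex, by_arm[arm]))
    return result


observations = build_observations()

# Sanity check: F + M per arm should reproduce the N= header row.
arm_totals = {arm: sum(COUNTS[sex][arm] for sex in COUNTS) for arm in ARMS}

print("\n\n".join(observations))
print(arm_totals)
```

The percentages shown in the table would be further observations with a different `dim:procedure` value; they are left out here because the slide does not show how they are coded.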
17
RDF Data Cube for Clinical Trials Results
• Details at the Analysis Results & Metadata Working Group Breakout sessions.
18
19
Outline
• Introduction to Semantic Web & Linked Data
• Resource Description Framework (RDF)
• Pharma: Use and Impact
• Visualizing Linked Data
• Summary & Conclusion
20
Visualizing Linked Data
• Interactive tables and text
• Visualization as Entry Point
• D3js
• Examples
21
Interactive Tables, Text
SimpleDM Table
22
Visualization as Entry Point
• Linked Data = data and metadata traversal at no extra cost!
23
D3js
25
Trials: Interactive Gantt LINK
26
Studies to Pool
LINK
28
Outline
• Introduction to Semantic Web & Linked Data
• Resource Description Framework (RDF)
• Pharma: Use and Impact
• Visualizing Linked Data
• Summary & Conclusion
29
Why Linked Data
Traceability & Trust
Across the data lifecycle
Reuse
Exploration
Flexible content presentation
Reduce vendor Lock-in
Improve interoperability and data integration
30
Why Linked Data
• Integrated metadata
“Metadata is a love note to the people and machines after you.” - Jason Scott via Twitter, 2011
• Data + Metadata = Context
• Semantic Search
• Find related drugs, targets, studies, adverse events
• Predictions
• Feed into risk-based methods for study design and conduct
Build a Knowledge Base, not another data store!
31
Learning Resources
• PhUSE Wiki “Semantic Technology Working Groups” http://www.phusewiki.org/wiki/index.php?title=Semantic_Technology
• PhUSE Wiki “Semantic Technology Curriculum” http://www.phusewiki.org/wiki/index.php?title=Semantic_Technology_Curriculum
• White papers, publications, presentations.
• “Learning SPARQL” by Bob DuCharme http://www.learningsparql.com/index.html (examples for download)
• Semantic University by Cambridge Semantics http://www.cambridgesemantics.com/semantic-university
• RDF Primer http://www.w3.org/TR/2014/NOTE-rdf11-primer-20140624/
• Knowledge Engineering with Semantic Web Technologies 2015 https://open.hpi.de/courses/semanticweb2015
32
33
Thank you
and Enjoy the Conference!
34
Who We Are
Marc works as a statistician and often ends up reproducing or extending earlier reporting, so he appreciates traceability from results back to the data generation. With a strong urge to build and appreciate (complex) systems, he finds linked data an interesting avenue. Marc has been co-chair of the AR&M group. For this workshop, Marc discussed the overall structure and provided a few (technical) slides.
Tim is a Statistical Systems Analyst at UCB Biosciences. His interest in Linked Data was piqued at the PhUSE 2013 conference in Brussels. He currently co-chairs the AR&M project with Marc and serves as a co-lead of the PhUSE Semantic Technology Working Group. For this workshop Tim produced the slide deck and D3.js visualizations.
@NovasTaylor
@MarcJuAndersen