Create RDF using R - PhUSE Wiki


Page 1:

Create RDF using R

• R with rrdf, rrdflibs

https://github.com/egonw/rrdf

• R Data frame to RDF

– Excel->data frame-> to RDF

– SAS dataset -> data frame -> RDF

rrdf, rrdflibs
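The data frame to RDF step on this slide can be sketched in a language-neutral way. The following is a minimal illustration in Python rather than the R/rrdf code the workshop uses: it treats a small CSV text as a stand-in for a data frame and emits one N-Triples line per cell. The namespace `http://example.org/`, the column names, the sample values, and the helper `rows_to_ntriples` are all assumptions made for this sketch.

```python
import csv
import io

# Hypothetical namespace; the slides do not specify one.
BASE = "http://example.org/"

def rows_to_ntriples(rows, id_column):
    """Turn table rows (dicts) into N-Triples lines, one triple per cell."""
    lines = []
    for row in rows:
        # The identifier column becomes the subject URI of every triple in the row.
        subject = f"<{BASE}subject/{row[id_column]}>"
        for column, value in row.items():
            if column == id_column:
                continue
            predicate = f"<{BASE}variable/{column}>"
            # Escape backslashes and double quotes inside the literal.
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            lines.append(f'{subject} {predicate} "{escaped}" .')
    return lines

# Illustrative data standing in for a data frame read from Excel or SAS.
TABLE = "usubjid,sex,arm\n001,F,Placebo\n002,M,HighDose\n"
rows = list(csv.DictReader(io.StringIO(TABLE)))
triples = rows_to_ntriples(rows, "usubjid")
print("\n".join(triples))
```

Each row yields one subject and each remaining column one predicate, which is the same row-and-column mapping regardless of whether the table came from Excel or a SAS dataset.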

Page 2:

Create RDF using R

Packages: rrdf, rrdflibs

• add.triple()

– Add a triple: object is a URI

• add.data.triple()

– Add a triple: object is a literal
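The difference between the two calls is only in how the object of the triple is written. A minimal Python sketch of that distinction follows; it is not the rrdf API. The functions `add_triple` and `add_data_triple`, the list used as a store, the namespace, and the sample values are hypothetical and chosen only to mirror the slide.

```python
# Python analogues of the two rrdf calls named on the slide; the function
# names and the list-based store are hypothetical, not the rrdf API.
XSD = "http://www.w3.org/2001/XMLSchema#"

def add_triple(store, subject, predicate, obj):
    """Object is a URI, so it is written in angle brackets."""
    store.append(f"<{subject}> <{predicate}> <{obj}> .")

def add_data_triple(store, subject, predicate, data, datatype=None):
    """Object is a literal: quoted text, optionally with an XSD datatype."""
    text = str(data).replace("\\", "\\\\").replace('"', '\\"')
    literal = f'"{text}"'
    if datatype is not None:
        literal += f"^^<{XSD}{datatype}>"
    store.append(f"<{subject}> <{predicate}> {literal} .")

store = []
EX = "http://example.org/"  # hypothetical namespace
# Illustrative values, not taken from the workshop data.
add_triple(store, EX + "subject/001", EX + "arm", EX + "arm/Placebo")
add_data_triple(store, EX + "subject/001", EX + "age", 63, datatype="integer")
print("\n".join(store))
```

A URI object can itself be the subject of further triples, while a literal is a terminal value; that is why the two cases are kept apart.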

Page 3:

Create RDF using R

Try or follow along

File: createTTLFromR.R

Output File: createTTLFromR.TTL

Page 4:

Create RDF using SAS

• SAS accessing SPARQL service using PROC HTTP

– All functions provided by the service, see SPARQL 1.1 Protocol (https://www.w3.org/TR/sparql11-protocol/)

– Implemented as SAS macros https://github.com/MarcJAndersen/SAS-SPARQLwrapper

• SAS generating text files with

– RDF in Turtle

– SPARQL INSERT statements
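Both routes on this slide come down to producing text and, optionally, posting it to a service. The sketch below shows the idea in Python rather than SAS: it wraps triples in a SPARQL 1.1 `INSERT DATA` statement and prepares, without sending, a direct POST with the `application/sparql-update` content type described in the SPARQL 1.1 Protocol. The endpoint URL, the namespace, and the helper names are assumptions for illustration; the SAS macros in the repository above may work differently.

```python
import urllib.request

# Hypothetical endpoint and namespace; neither comes from the slides.
ENDPOINT = "http://localhost:3030/ds/update"
EX = "http://example.org/"

def build_insert(triples):
    """Wrap N-Triples-style lines in a SPARQL 1.1 INSERT DATA statement."""
    body = "\n".join(f"  {t}" for t in triples)
    return f"INSERT DATA {{\n{body}\n}}"

def build_update_request(endpoint, update):
    """Prepare (but do not send) a direct POST as in the SPARQL 1.1 Protocol."""
    return urllib.request.Request(
        endpoint,
        data=update.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update"},
        method="POST",
    )

update = build_insert([f'<{EX}subject/001> <{EX}sex> "F" .'])
request = build_update_request(ENDPOINT, update)
print(update)
# Sending would be: urllib.request.urlopen(request)  (needs a running service)
```

Writing the same `update` string to a file instead of posting it corresponds to the second bullet, generating text files with SPARQL INSERT statements.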

Page 5:

Create RDF using SAS

Try or follow along

File: createTTLFromSAS.SAS

Output File: createTTLFromSAS.TTL

Page 7:

Putting it all together

Extract, Transform, Load

RDF Linked Data Knowledgebase

Federated Queries

• Analysis

• Visualization

• Submission

• Publication

Page 8:

Outline

• Introduction to Semantic Web & Linked Data

• Resource Description Framework (RDF)

• Pharma: Use and Impact

• Visualizing Linked Data

• PhUSE CSS Projects

• Conclusion

Page 9:

“The choice for all major content providers is not whether to adopt a linked data approach but when. Asking ‘do we need linked data?’ today may be analogous to asking ‘do we really need a website?’ 15 years ago.” - Cochrane Technical Report (2013)

Page 12:

Traceability: THE Linked Data Super Power

Analysis Results & Displays

Study Protocol CSR Submission Data

Pharma Company, Biotech …

Regulatory Agency, Reviewer…

• Publications

• Registries

• Data Gathering

• Meta-analyses

• Competitive Intelligence

PRODUCE

CONSUME

Page 13:

Collected Data

Clinical Program Design in RDF

Study Design and Protocol in RDF

CDISC Foundational Standards in RDF

Analysis Results & Metadata

Regulations in RDF

Data Lifecycle

CS WG Linked Data Projects & Deliverables

Clinical Development Plan

Analysis Results & Displays

Study Protocol

Study Design

CSR Submission

Tabulated Data

Analysis Data

Use Cases

Page 14:

Web Standards meet Pharma Standards

• CDASH 1.1

• SDTM 1.2 + Impl. Guide 3.1.2

• SDTM 1.3 + Impl. Guide 3.1.3

• SEND Impl. Guide 3.0

• ADaM 2.1 + Impl. Guide 1.0

• Controlled Terminology

• CDISC e-share! (October 2014)

Page 15:

CDISC RDF - http://www.cdisc.org/rdf, https://github.com/phuse-org/rdf.cdisc.org

GetSDTM-DM-definition.rq

queryLocalCDISC.sas


Page 16:

"I want to take the clinical trials results..."

"...and put them in an RDF Data Cube!"

                Placebo      LowDose      HighDose
Baseline        N=28         N=30         N=29
---------------------------------------------------
Sex
  F             12 (42.9)    14 (46.7)    16 (55.2)
  M             16 (57.1)    16 (53.3)    13 (44.8)

ds:obs1 a qb:Observation ;
    dim:treat code:trt-Placebo ;
    dim:sex code:sex-F ;
    dim:procedure code:procedure-count ;
    meas:measure "12"^^xsd:int ;
    qb:dataSet ds:dataset-DM .

ds:obs2 a qb:Observation ;
    ...
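The step from table cell to observation can be sketched as follows. This is a minimal Python illustration, not the working group's implementation: it takes the six counts from the table on this slide and formats each as a `qb:Observation` in the shape of the `ds:obs1` example. The function name `observation` and the numbering of observations after `obs1` are assumptions; the slide elides everything beyond the first observation.

```python
# Counts copied from the demographics table on this slide; prefixes as in the
# slide's Turtle snippet. The function name and obs numbering are illustrative.
COUNTS = {
    ("Placebo", "F"): 12, ("LowDose", "F"): 14, ("HighDose", "F"): 16,
    ("Placebo", "M"): 16, ("LowDose", "M"): 16, ("HighDose", "M"): 13,
}

def observation(index, treat, sex, count):
    """Format one qb:Observation in the shape shown on the slide."""
    return (
        f"ds:obs{index} a qb:Observation ;\n"
        f"    dim:treat code:trt-{treat} ;\n"
        f"    dim:sex code:sex-{sex} ;\n"
        f"    dim:procedure code:procedure-count ;\n"
        f'    meas:measure "{count}"^^xsd:int ;\n'
        f"    qb:dataSet ds:dataset-DM ."
    )

observations = [
    observation(i, treat, sex, count)
    for i, ((treat, sex), count) in enumerate(COUNTS.items(), start=1)
]
print("\n\n".join(observations))

# Column totals recover the N shown in the table header.
totals = {}
for (treat, _sex), count in COUNTS.items():
    totals[treat] = totals.get(treat, 0) + count
```

Each cell becomes one observation identified by its dimension values (treatment, sex, procedure), so the layout of the printed table no longer matters once the data is in the cube.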

Page 17:

Analysis Results & Metadata (WG)


Page 18:

RDF Data Cube for Clinical Trials Results

• Details at the Analysis Results & Metadata Working Group Breakout sessions.

Page 19:

Outline

• Introduction to Semantic Web & Linked Data

• Resource Description Framework (RDF)

• Pharma: Use and Impact

• Visualizing Linked Data

• Summary & Conclusion

Page 20:

Visualizing Linked Data

• Interactive tables and text

• Visualization as Entry Point

• D3.js

• Examples

Page 21:

Interactive Tables, Text

SimpleDM Table

Page 22:

Visualization as Entry Point

• Linked Data = data and metadata traversal at no extra cost!

Page 23:

D3.js

Page 24:

Data Cube Structure

Page 28:

Outline

• Introduction to Semantic Web & Linked Data

• Resource Description Framework (RDF)

• Pharma: Use and Impact

• Visualizing Linked Data

• Summary & Conclusion

Page 29:

Why Linked Data

Traceability & Trust

Across the data lifecycle

Reuse

Exploration

Flexible content presentation

Reduce vendor lock-in

Improve interoperability and data integration

Page 30:

Why Linked Data

• Integrated metadata

“Metadata is a love note to the people and machines after you.” - Jason Scott via Twitter, 2011

• Data + Metadata = Context

• Semantic Search

– Find related drugs, targets, studies, adverse events

• Predictions

– Feed into risk-based methods for study design and conduct

Build a Knowledge Base, not another data store!

Page 31:

Learning Resources

• PhUSE Wiki “Semantic Technology Working Groups” http://www.phusewiki.org/wiki/index.php?title=Semantic_Technology

• PhUSE Wiki “Semantic Technology Curriculum” http://www.phusewiki.org/wiki/index.php?title=Semantic_Technology_Curriculum

• White papers, publications, presentations.

• “Learning SPARQL” by Bob DuCharme http://www.learningsparql.com/index.html - examples for download

• Semantic University by Cambridge Semantics http://www.cambridgesemantics.com/semantic-university

• RDF Primer http://www.w3.org/TR/2014/NOTE-rdf11-primer-20140624/

• Knowledge Engineering with Semantic Web Technologies 2015 https://open.hpi.de/courses/semanticweb2015

Page 33:

Thank you

and Enjoy the Conference!

Page 34:

Who We Are

Marc works as a statistician and often ends up reproducing or extending earlier reporting, and thereby appreciates traceability from results to the data generation. With a strong urge to make and appreciate (complex) systems, he finds linked data an interesting avenue. Marc has been co-chair of the AR&M group. For this workshop, Marc discussed the overall structure and provided a few (technical) slides.

Tim is a Statistical Systems Analyst at UCB Biosciences. His interest in Linked Data was piqued at the PhUSE 2013 conference in Brussels. He currently co-chairs the AR&M project with Marc and serves as a co-lead of the PhUSE Semantic Technology Working Group. For this workshop Tim produced the slide deck and D3.js visualizations.

@NovasTaylor

@MarcJuAndersen
