Vu210610futurejournal

Post on 10-May-2015

The Future of the Journal

Anita de Waard, a.dewaard@elsevier.com

Disruptive Technologies Director, Elsevier Labs

June 21, 2010

Science is made of information...

...that gets created... and destroyed.

What is the problem?

1. Researchers can’t keep track of their data.

2. Data is not stored in a way that is easy for authors.

3. For readers, article text is not linked to the underlying data.

The Vision

Work done with Ed Hovy, Phil Bourne, Gully Burns and Cartic Ramakrishnan

1. Research: Each item in the system has metadata (including provenance) and relations to other data items added to it.

2. Workflow: All data items created in the lab are added to a (lab-owned) workflow system.

3. Authoring: A paper is written in an authoring tool which can pull data with provenance from the workflow tool in the appropriate representation into the document.

[Sample paper text on the slide: "Rats were subjected to two grueling tests (click on fig 2 to see underlying data). These results suggest that the neurological pain pro-"]

4. Editing and review: Once the co-authors agree, the paper is ‘exposed’ to the editors, who in turn expose it to reviewers. Reports are stored in the authoring/editing system, the paper gets updated, until it is validated.

5. Publishing and distribution: When a paper is published, a collection of validated information is exposed to the world. It remains connected to its related data item, and its heritage can be traced.

6. User applications: distributed applications run on this ‘exposed data’ universe.

[Diagram labels: metadata (on each data item), Review, Edit, Revise, Some other publisher]
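The steps of the vision rest on one idea: every data item carries metadata and provenance links, so that published text can be traced back to the lab data behind it. The following is a minimal Python sketch of that idea only. The names `DataItem`, `derive` and `trace_heritage`, and the sample items, are hypothetical and do not come from the talk or from any real workflow system.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataItem:
    """A lab data item with metadata and provenance links (hypothetical model)."""
    item_id: str
    metadata: Dict[str, str] = field(default_factory=dict)
    derived_from: List["DataItem"] = field(default_factory=list)

    def derive(self, item_id: str, **metadata: str) -> "DataItem":
        """Create a new item that records this item as its source."""
        return DataItem(item_id, dict(metadata), [self])


def trace_heritage(item: DataItem) -> List[str]:
    """Return the ids of all ancestors of an item, nearest first."""
    seen: List[str] = []
    queue = list(item.derived_from)
    while queue:
        parent = queue.pop(0)
        if parent.item_id not in seen:
            seen.append(parent.item_id)
            queue.extend(parent.derived_from)
    return seen


# Illustrative data only: a raw measurement, a figure built from it,
# and a paper section that embeds the figure.
raw = DataItem("raw-measurements", {"creator": "lab member", "kind": "raw data"})
figure = raw.derive("fig-2", kind="figure")
section = figure.derive("paper-section-results", kind="text")

print(trace_heritage(section))  # ['fig-2', 'raw-measurements']
```

In this sketch a published section never copies the data; it only points at it, which is what lets a reader click through from a figure to the underlying measurements.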

What is needed to get there?

A. Workflow tools: Linked-data-based workflow tools for all sciences: scalable, safe, and user-friendly (who: tool builders)

B. Authoring and reviewing tools that enable use of rich and provenance-tracked elements (who: tool builders)

C. Metadata standards: Standards that allow exchange of information on any knowledge item created in a lab, including provenance/privacy/IPR rights (who: standards bodies)

D. Social change: Scientists who store, track and annotate their work (who: institutes, funding bodies, individuals)

E. Semantic/Linked Data XML repositories (who: publishers)

F. Publishing systems that run application servers (who: publishers)

A. Workflow tools are emerging

http://wings.isi.edu/

http://MyExperiment.org

http://VisTrails.org

B. Authoring ‘ecosystems’: SWAN

[Diagram: SWAN Semantic Relationships. Node types: person, group, hypothesis, claim, publication, gene, concept, comment, and attached files (PDFs, MS Word file, Excel file), split into Public and Private areas. Relationship types: makes, hasEvidence, describes, annotates, discussedIn, authoredBy, authorOf, shareWith.]

Slide by Tim Clark
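The relationship names on this slide (makes, hasEvidence, annotates, authoredBy and so on) can be read as subject-predicate-object statements. The sketch below stores a few such statements as plain Python tuples and follows them from a hypothesis to its evidence. The identifiers (`hypothesis:1`, `claim:1`, ...) are invented, and the choice of which node types each relationship connects is an assumption made for illustration, not something the slide confirms.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

# Illustrative statements only; identifiers are invented and the
# endpoints chosen for each relationship are assumptions.
triples: List[Triple] = [
    ("hypothesis:1", "makes", "claim:1"),
    ("claim:1", "hasEvidence", "publication:1"),
    ("claim:1", "hasEvidence", "publication:2"),
    ("publication:1", "authoredBy", "person:1"),
    ("comment:1", "annotates", "claim:1"),
]


def objects(subject: str, predicate: str) -> List[str]:
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def evidence_for_hypothesis(hypothesis: str) -> Set[str]:
    """Collect the evidence attached to every claim a hypothesis makes."""
    found: Set[str] = set()
    for claim in objects(hypothesis, "makes"):
        found.update(objects(claim, "hasEvidence"))
    return found


print(sorted(evidence_for_hypothesis("hypothesis:1")))
# ['publication:1', 'publication:2']
```

A real deployment would presumably use an RDF store rather than a Python list; tuples are used here only to keep the example self-contained.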

C. Metadata: HCLS SIG Scientific Discourse

http://esw.w3.org/HCLSIG/SWANSIOC

Project Description

Provide a Semantic Web platform for biomedical discourse which can be evolved over time into a more general facility for many types of scientific discourse, and which is linked to key biological categories specified by ontologies.

Discourse categories should include research questions, scientific assertions or claims, hypotheses, comments and discussion, experiments, data, publications, citations, and evidence.

Our primary scientific use cases will be derived from problems in digital scientific communications and web-based research collaboratories supporting research in neurological disorders and therapies.

The scientific use cases will motivate a series of informatics use cases which can later be generalized across wider areas of biology and medicine.

C. Metadata: SWAN

[Diagram: The Knowledge Ecosystem: Interlocking Cycles of Research. Labels: Gather info, Create/modify hypothesis, Perform experiment, Collect data, Draw conclusions, Synthesize, Communicate, SWAN.]

Slide by Tim Clark

C. Metadata: Annotation Ontology

[Diagram: an example annotation. Terms shown: rdf:type, rdfs:subClassOf, foaf:Person, pav:createdBy, pav:createdOn, ann:annotates, ann:context, onDocument, hasTag, hasTopic. Values shown: type Atomic; created by http://www.ht.org/foaf.rdf#me on June 1, 2010; annotates http://anyurl.com/sf_pat01.html; selector types InitEndCornerSelector and ImageSelector, with init (304, 507) and end (380, 618); Tag “Linear skull fracture”; topic FMA:skull.]

Other annotations on the same document:

1. Atomic annotation on image (tag: “hematoma”)

2. General annotation (tag: “injury”)

Other annotations on similar documents:

1. General annotation (tag: “skull fracture”)

Slide by Tim Clark
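To make the annotation example concrete, here is a small Python sketch of an annotation that targets a rectangular region of an image, using the values shown on the slide. The class and function names (`CornerSelector`, `Annotation`, `annotations_at`) are hypothetical, and treating init and end as opposite corners of a rectangle is an assumption based on the selector's name; this is not the Annotation Ontology's own serialization.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass(frozen=True)
class CornerSelector:
    """An image region given by two corners (assumed to be opposite corners)."""
    init: Tuple[int, int]
    end: Tuple[int, int]

    def contains(self, x: int, y: int) -> bool:
        (x0, y0), (x1, y1) = self.init, self.end
        return min(x0, x1) <= x <= max(x0, x1) and min(y0, y1) <= y <= max(y0, y1)


@dataclass(frozen=True)
class Annotation:
    annotates: str       # URL of the annotated document
    created_by: str      # URI identifying the creator
    created_on: date
    tag: str
    context: CornerSelector


# Values taken from the slide.
example = Annotation(
    annotates="http://anyurl.com/sf_pat01.html",
    created_by="http://www.ht.org/foaf.rdf#me",
    created_on=date(2010, 6, 1),
    tag="Linear skull fracture",
    context=CornerSelector(init=(304, 507), end=(380, 618)),
)


def annotations_at(annotations: List[Annotation], document: str,
                   x: int, y: int) -> List[Annotation]:
    """Find annotations on `document` whose region covers the point (x, y)."""
    return [a for a in annotations
            if a.annotates == document and a.context.contains(x, y)]


hits = annotations_at([example], "http://anyurl.com/sf_pat01.html", 350, 550)
print([a.tag for a in hits])  # ['Linear skull fracture']
```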

D. Linked Data: E.g. for Elsevier

<ce:section id=#123> this says: mice like cheese

said @anita on May 31 2010

but we all know she was jetlagged then

[Slide labels: “immutable, $$, proprietary” and “dynamic, personal, task-driven, - open?”]
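One way to read this slide is that the section itself stays fixed while statements about it, and remarks about those statements, live in a separate layer that points at it by id. The Python sketch below illustrates that reading only. The `Statement` class and `thread` function are hypothetical names, and the slide does not say who made the second remark, so that field is left empty.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Statement:
    statement_id: str
    about: str                     # id of a document section or of another statement
    text: str
    said_by: Optional[str] = None
    said_on: Optional[str] = None


# Content taken from the slide; the ids "s1" and "s2" are invented.
statements = [
    Statement("s1", "#123", "mice like cheese", "@anita", "May 31 2010"),
    Statement("s2", "s1", "but we all know she was jetlagged then"),
]


def thread(target: str, pool: List[Statement]) -> List[Statement]:
    """Return statements about `target`, each followed by statements about it."""
    result: List[Statement] = []
    for st in pool:
        if st.about == target:
            result.append(st)
            result.extend(thread(st.statement_id, pool))
    return result


print([s.statement_id for s in thread("#123", statements)])  # ['s1', 's2']
```

Because the statements only reference the section id, they can be added, shared, or withdrawn without touching the published XML.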

D. What to link? Semantic annotation grid

[Diagram: a grid with three axes.

Granularity: entity, triple, claim, document, collection.

Moment: author/editor, typesetter/production, reader/data mining.

Means: manual, semi-automated, automated.

Other label: measure. Examples placed in the grid: Automated Copy Editing, Reflect.]

D. A start: XMP RDF in all our PDFs: DC + PRISM
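As a rough illustration of what XMP RDF with Dublin Core (DC) and PRISM properties looks like, the Python sketch below builds a small metadata packet with the standard library. The function `build_xmp` and the sample values are hypothetical, and this is not Elsevier's actual production format. Embedding the packet in a PDF, and the surrounding `xpacket` processing instructions, are left out.

```python
import xml.etree.ElementTree as ET

NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "prism": "http://prismstandard.org/namespaces/basic/2.0/",
}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Ask ElementTree to serialize with the conventional prefixes.
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix: str, local: str) -> str:
    """Build a namespace-qualified tag name in ElementTree's {uri}local form."""
    return "{%s}%s" % (NS[prefix], local)


def build_xmp(title: str, publication_name: str) -> str:
    """Return an XMP metadata element carrying one DC and one PRISM property."""
    xmpmeta = ET.Element(q("x", "xmpmeta"))
    rdf = ET.SubElement(xmpmeta, q("rdf", "RDF"))
    desc = ET.SubElement(rdf, q("rdf", "Description"), {q("rdf", "about"): ""})

    # In XMP, dc:title is a language alternative (rdf:Alt of rdf:li items).
    title_el = ET.SubElement(desc, q("dc", "title"))
    alt = ET.SubElement(title_el, q("rdf", "Alt"))
    li = ET.SubElement(alt, q("rdf", "li"), {XML_LANG: "x-default"})
    li.text = title

    ET.SubElement(desc, q("prism", "publicationName")).text = publication_name
    return ET.tostring(xmpmeta, encoding="unicode")


# "Example Journal" is a placeholder value, not real metadata.
packet = build_xmp("The Future of the Journal", "Example Journal")
print(packet)
```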

E. Publishing on an Application server

E. SD (ScienceDirect) as application server: an example

Next Steps:

• Fall 2010: ‘Beyond the PDF’: Workshop organized by Phil Bourne @UCSD:

  – Take one paper from his group

  – And all data that went into making that paper

  – Including all correspondence, raw data, etc.

  – Challenge: how better to represent that?

• 2010 - 2011: Try to gather resources, current efforts, etc. on virtual platform

• August 2011: FoRC: Future of Research Communications

  – Dagstuhl Workshop

  – Involve key people (include funding bodies, libraries, institutions) to see where bottlenecks are

• Start using these tools and writing this way!