The consequences for STM publishing ASA – London –22 February 2011 Jan Velterop – ACKnowledge Ltd.

Page 1:

The consequences for STM publishing

ASA – London – 22 February 2011
Jan Velterop – ACKnowledge Ltd.

Page 2:

What does Nanopublication mean for STM publishing?

Page 3:

GAME OVER!!!

Page 4:

Publishing ≠ knowledge transfer

Publishing = knowledge transfer

Page 5:

Why this change?

Page 6:

Record Keeping

Knowledge Transfer

Different interests

Page 7:

Why do scientists publish?

Page 8:

[Image text: The Record of Science]

1. The record*: “keeping the minutes of science”

*Picture inspired by Geoffrey Bilder of CrossRef

Page 9:

2. Credit in the ego-system; the acknowledge economy*

*‘Acknowledge economy’ coined by Geoffrey Bilder of CrossRef

Page 10:

1 + 2 = the interface with officialdom

Page 11:

3. Transfer of information and knowledge

Page 12:

3 = the interface with science

Page 13:

Are the requirements for all three the same?

Page 14:

What we have may be good for the record and for credit...

Page 15:

...but is it satisfactory for the transfer of knowledge?

Page 16:

There is just too much to read

Page 17:

“Information consumes the attention of its recipients...

Page 18:

Datarrhoea?

Publicatarrh?

Page 19:

...hence a wealth of information creates a poverty of attention”

Herbert Simon

Page 20:

Should we have to make choices what to read and what not?

Page 21:

Can we, truly?

Page 22:

How would we choose anyway?

Photo by: flickr - Rainbirder

Page 23:

Shouldn’t we take in ALL the knowledge in our area?

Page 24:

As well as satisfy the academic desire to avoid reading?

Page 25:

A new article in PubMed every 36 seconds

Scientists are struggling to make sense of the expanding scientific literature. Corie Lok asks whether computational tools can do the hard work for them.

In 2002, when he began to make the transition from basic cell biology to research into Alzheimer’s disease, Virgil Muresan found himself all but overwhelmed by the sheer volume of literature on the disease. He and his wife, Zoia, both now at the University of Medicine and Dentistry of New Jersey in Newark, were hoping to test an idea that they had developed about the formation of the protein plaques in the brains of people with Alzheimer’s disease. But, as newcomers to the field, they were finding it almost impossible to figure out whether their hypothesis was consistent with existing publications.

“It’s really difficult to be up to date with so much being published,” says Virgil Muresan. And it’s a challenge that is increasingly facing researchers in every field. The 19 million citations and abstracts covered by the US National Library of Medicine’s PubMed search engine include nearly 830,000 articles published in 2009, up from some 814,000 in 2008 and around 772,000 in 2007. That growth rate shows no signs of abating, especially as emerging countries such as China and Brazil continue to ratchet up their research.

The Muresans, however, were able to make use of Semantic Web Applications in Neuromedicine (SWAN), one of a new generation of online tools designed to help researchers zero in on the papers most relevant to their interests, uncover connections and gaps that might not otherwise be obvious, and test and generate new hypotheses. “If you think about how much effort and money we put into just Alzheimer’s disease research, it is surprising that people don’t put more effort into harvesting the published knowledge,” says Elizabeth Wu, SWAN’s project manager.

SWAN attempts to help researchers harvest that knowledge by providing a curated, browseable online repository of hypotheses in Alzheimer’s disease research. The hypothesis that the Muresans put into SWAN, for example, was that plaque formation begins when amyloid-β, the major component of brain plaques, forms seeds in the terminal regions of cells in the brainstem that then nucleate the plaques in the other parts of the brain into which the terminals reach. SWAN provides a visual, colour-coded display of the relationships between the hypotheses, as derived from the published literature, and shows where they may agree or conflict.

The connections revealed by SWAN led the Muresans to new mouse-model experiments designed to strengthen their hypothesis. “SWAN has advanced our research, and focused it in a certain direction but also broadened it to other directions,” says Virgil Muresan.

The use of computers to help researchers drink from the literature firehose dates back to the early 1960s and the first experiments with techniques such as keyword searching. More recent efforts include the striking ‘maps of science’ that cluster papers together on the basis of how often they cite one another, or by similarities in the frequencies of certain keywords.

As fascinating as these maps can be, however, they don’t get at the semantics of the papers – the fact that they are talking about specific entities such as genes and proteins, and making assertions about those entities (such as gene X regulates gene Y). The extraction of this kind of information is much harder to automate, because computers are notoriously poor at understanding what they are reading. Even so, informaticians and biologists are working together more and making considerable progress, says Maryann Martone, the chairwoman of the Society for Neuroscience’s neuroinformatics committee. Recently, a number of companies and academic researchers have begun to create tools that are useful for scientists, using various mixtures of automated analysis and manual curation (see ‘Power tools’, page 418).

Deeper meaning

The goal of these tools is to help researchers analyse and integrate the literature more efficiently than they can do through their own reading, to hone in on the most fruitful experiments to do and to make new predictions of gene functions, say, or drug side effects.

The first step towards that goal is for the text- or semantic-mining tool to recognize key terms, or entities, such as genes and proteins. For example, academic publisher Elsevier, headquartered in Amsterdam, has piloted Reflect in two recent online issues of its journal Cell. The technology was developed at the European Molecular Biology Laboratory in Heidelberg, Germany, and won Elsevier’s Grand Challenge 2009 competition for new tools that improve the communication and use of scientific information. Reflect automatically recognizes and highlights the names of genes, proteins and small molecules in the Cell articles. Users clicking on a highlighted term will see a pop-up box containing information related to that term, such as sequence data and molecular structures, along with links to the sources of the data. Reflect obtains this information from its dictionary of millions of proteins and small molecules.

Such ‘entity recognition’ can be done fairly accurately by many mining tools today. But other tools take on the tougher challenge of recognizing relationships between the entities. Researchers from Leiden University and Erasmus University in Rotterdam, both in the Netherlands, have developed software called Peregrine, and used it to predict an undocumented interaction between two proteins:

Page 26:

“A new entrant in the field would need 19 years and 202 days to read all the relevant literature” (Assumption: reading 5 articles an hour, 8 hours a day, 5 days a week, 50 weeks a year)

Ergo: nobody can have a comprehensive overview

• Shared knowledge between scientists is an illusion
• The chance that a specialist reading one paper a day will read a particular paper is only 1 in 8.9
• The chance that a colleague elsewhere will read the same paper in a given year is only 1 in 79

Fraser, A.G. and Dunstan, F.D. (2010), ‘On the impossibility of being expert’. BMJ 2010; 341:c6815, doi: 10.1136/bmj.c6815.
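The reading-pace assumption quoted from Fraser and Dunstan can be turned into a quick calculation. The sketch below is illustrative only: the pace constants come from the slide, the 2009 PubMed count comes from the Nature excerpt on Page 25, and the function names are made up for this example. At the stated pace a reader covers 10,000 articles a year, so 19 years and 202 days corresponds to a little under 200,000 articles (an inference from the stated pace, not a figure given on the slide).

```python
# Reading pace assumed on the slide (Fraser and Dunstan, 2010).
ARTICLES_PER_HOUR = 5
HOURS_PER_DAY = 8
DAYS_PER_WEEK = 5
WEEKS_PER_YEAR = 50

# PubMed article count for 2009, as quoted in the Nature excerpt on Page 25.
PUBMED_ARTICLES_2009 = 830_000


def articles_read_per_year() -> int:
    """Number of articles one full-time reader gets through in a year."""
    return ARTICLES_PER_HOUR * HOURS_PER_DAY * DAYS_PER_WEEK * WEEKS_PER_YEAR


def years_to_read(article_count: int) -> float:
    """Years of full-time reading needed to cover article_count articles."""
    return article_count / articles_read_per_year()


if __name__ == "__main__":
    print(articles_read_per_year())              # 10000
    print(years_to_read(PUBMED_ARTICLES_2009))   # 83.0
```

Under these assumptions, reading just the articles PubMed added in 2009 would take one person 83 years of full-time reading.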

Page 27:

Current publishing: needle transport

Photo by Stewsnews

Page 28:

There is no way we can read everything – not even when it’s highly relevant

Page 29:

Create an overview first, perhaps?

Page 30:

[No text on this slide]

Page 31:

And then home in on detail?

Page 32:

[No text on this slide]

Page 33:

Back to the question: what does it all mean for publishing?

Page 34:

Content is King

Page 35:

Or “was” perhaps?

Page 36:

Though the emphasis is more and more on service, content still counts

Page 37:

Science literature is full of assertions

Page 38:

Most ‘concrete’ ones have the form of ‘triples’:

Subject | Predicate | Object

Concepts

Page 39:

Some examples of ‘triples’:

CAPN3_00265 | has pathogenicity | LGMD2A
Dystrophin | interacts with | SNT3
Calpain-3 | has GO annotation | proteasome endopeptidase activity
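As a rough illustration of how such triples could be held as data, here is a minimal Python sketch. The `Triple` type and the `objects_of` helper are hypothetical names introduced for this example. The pairing of subjects with objects is inferred from the order of the labels on the slide; only "Dystrophin interacts with SNT3" is confirmed elsewhere in the deck (Page 41).

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One assertion: a subject concept, a predicate, and an object concept."""
    subject: str
    predicate: str
    obj: str


# The three examples from the slide (subject/object pairing partly inferred).
TRIPLES = [
    Triple("CAPN3_00265", "has pathogenicity", "LGMD2A"),
    Triple("Dystrophin", "interacts with", "SNT3"),
    Triple("Calpain-3", "has GO annotation", "proteasome endopeptidase activity"),
]


def objects_of(subject: str, predicate: str, triples=TRIPLES) -> list[str]:
    """Return every object asserted for the given subject and predicate."""
    return [t.obj for t in triples if t.subject == subject and t.predicate == predicate]


if __name__ == "__main__":
    print(objects_of("Dystrophin", "interacts with"))  # ['SNT3']
```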

Page 40:

Identifier: d63dd9a2-5c8c-11df-b0cb-001517ac506c

Page 41:

‘Assertions’

Dystrophin interacts with SNT3

d0e6b292-5c61-11df-b0cb-001517ac506c dcf809bb-1a01-468a-a316-3b08de22dd46 b2437eb4-5ec4-11df-b0cb-001517ac506c

Concepts: unambiguously identified

Page 42:

Adding Attributes to Assertions

Examples of attributes:

assertedBy - states which entity asserted (i.e. created) the statement
curatedBy - states that a specified entity has curated the statement
isPeerReviewed - states that this statement has been peer reviewed
isPublished - states where this statement was first published
isEvidencedBy - states that another statement, Y, should be considered evidence for this statement, X
createdOn - states the date/time that the statement was created
hasAuthor - states who claims authorship of the statement
isApprovedBy - states who approves of the statement
isDeprecatedBy - states that the statement is no longer in use by the entity in question

Let’s call them ‘Nanopublications’
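One way to picture an assertion with its attributes is as a small record type. The sketch below is an assumption-laden illustration, not the data model from the nanopublication paper: the class name, field types, and defaults are choices made for this example, and only the attribute names (converted to Python style) come from the slide.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Nanopublication:
    """An assertion (triple) plus the attributes listed on the slide."""
    subject: str
    predicate: str
    obj: str
    asserted_by: Optional[str] = None       # assertedBy
    curated_by: Optional[str] = None        # curatedBy
    is_peer_reviewed: bool = False          # isPeerReviewed
    is_published: Optional[str] = None      # isPublished: where first published
    is_evidenced_by: list[str] = field(default_factory=list)   # isEvidencedBy
    created_on: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)     # createdOn
    )
    has_author: Optional[str] = None        # hasAuthor
    is_approved_by: list[str] = field(default_factory=list)    # isApprovedBy
    is_deprecated_by: list[str] = field(default_factory=list)  # isDeprecatedBy

    def is_deprecated(self) -> bool:
        """True if at least one entity has deprecated the statement."""
        return bool(self.is_deprecated_by)


if __name__ == "__main__":
    example = Nanopublication(
        subject="Dystrophin",
        predicate="interacts with",
        obj="SNT3",
        asserted_by="Example Lab (hypothetical)",
        is_peer_reviewed=True,
    )
    print(example.is_deprecated())  # False
```

Keeping the attributes on the record itself means provenance travels with each assertion rather than with the article it was extracted from.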

Page 43:

In: The anatomy of a nanopublication
Paul Groth, Andrew Gibson, Jan Velterop

http://iospress.metapress.com/content/ftkh21q50t521wm2/fulltext.html

Page 44:

[Architecture diagram; component labels:]

Data store 1, Data store 2, ... Data store n
Import, Integration, Notification API
Identity mapping service (IRS service layer)
Concept identifier store (IRS data layer). Also associating with semantic type. Not more!
Reasoning / Integration
Linked Data Cache (Cardinal Assertions)
Evidence Store
Provenance Store
Evidence Calculator

Page 45:

[Architecture diagram; component labels:]

Data store 1, Data store 2, ... Data store n
Import, Integration, Notification API
Identity mapping service (IRS service layer)
Concept identifier store (IRS data layer). Also associating with semantic type. Not more!
Reasoning / Integration
Linked Data Cache (Cardinal Assertions)
Evidence Store
Provenance Store
Evidence Calculator
User Interfaces / Web services
Curation Interfaces

Page 46:

Publishers:

Incorporate nanopublications in your content

Help disseminate the interfaces (HTML as well as PDF)

Page 47:

Nanopublications are not only assertions

Page 48:

Nanopublications are also references

Page 49:

i.e. they can be cited: good for impact & acknowledgement

Page 50:

And... aren’t references open and free?

Page 51:

<rdf:Description rdf:about="http://www.nbic.nl/cwa/relation/#26277419#13817745">
  <cwa:typeRelation rdf:resource="http://predicate.conceptwiki.org/index.php/#2121378"/>
  <cwa:direction>1,2</cwa:direction>
  <cwa:strength>1.0</cwa:strength>
  <cwa:author rdf:resource="http://people.conceptwiki.org/index.php/#85094810"/>
  <cwa:provenance rdf:resource="http://article.conceptwiki.org/index.php/#121646370"/>
  <cwa:timestamp>1240641052059</cwa:timestamp>
  <cwa:annotated_by rdf:resource="http://people.conceptwiki.org/index.php/#43065817"/>
  <cwa:annotation rdf:resource="http://www.virusdb.org/viruses/av/Heliothis_virescens_insect"/>
</rdf:Description>

Page 52:

[No text on this slide]

Page 53:

The whole picture

Page 54:

Even if you don’t have all the detail

Page 55:

detail

Page 56:

Nanopublications (i.e. references) can also be used to reason

protein A

protein X

‘Publications’ yield ‘data’

Exposing the ‘unknown knowns’
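The slide does not say what kind of reasoning is meant. As one possible reading, the sketch below assumes the simplest case: two concepts that are never linked directly but share an intermediate in separate assertions, which makes them a candidate "unknown known". The intermediate "protein B", the example assertions, and the function name are all hypothetical.

```python
from collections import defaultdict


def candidate_links(triples):
    """Return (a, c, via) for concept pairs linked only through an intermediate.

    Assertions are treated as undirected links between subject and object.
    """
    neighbours = defaultdict(set)
    for subject, _predicate, obj in triples:
        neighbours[subject].add(obj)
        neighbours[obj].add(subject)

    candidates = set()
    for via, linked in neighbours.items():
        for a in linked:
            for c in linked:
                # Keep each unordered pair once, and only if no direct link exists.
                if a < c and c not in neighbours[a]:
                    candidates.add((a, c, via))
    return sorted(candidates)


# Hypothetical assertions, each imagined as coming from a different paper.
ASSERTIONS = [
    ("protein A", "interacts with", "protein B"),
    ("protein B", "interacts with", "protein X"),
]

if __name__ == "__main__":
    print(candidate_links(ASSERTIONS))
    # [('protein A', 'protein X', 'protein B')]
```

A candidate produced this way is only a hypothesis worth checking, not an established interaction.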

Page 57:

What is the use of water?

Page 58:

[No text on this slide]

Page 59:

What is the use of information?

Page 60:

[No text on this slide]

Page 61:

[Chart labels: °Celsius, Ice, Water]

Jan Velterop

Page 62:

Jan Velterop

The climate in the scholarly communication world is clearly hotting up

Page 63:

Thank you

jan.velterop at acknowledgeconnect com
velterop at conceptweballiance org