Biomedical literature mining (and why we really need open access)

Lars Juhl Jensen, EMBL Heidelberg

The 28th IATUL annual conference: Global Access to Science - Scientific Publishing for the Future, Royal Institute of Technology (KTH), Stockholm, Sweden, June 11-14, 2007

Transcript

Biomedical literature mining (and why we really need Open Access)

Lars Juhl Jensen, EMBL Heidelberg

why biomedicine?

why literature mining?

why open access?

MEDLINE

17 million citations

Jensen et al., Nature Reviews Genetics, 2006

too much to read

literature mining

open access

information retrieval

finding the papers

ad hoc retrieval

user-specified query

“yeast AND cell cycle”
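
A user-specified Boolean query such as the one above can be illustrated with a minimal sketch. The three documents and the function name `search_and` are invented for illustration; real systems such as PubMed use inverted indexes and far richer query parsing.

```python
def search_and(docs, terms):
    """Return sorted ids of documents whose text contains every term (case-insensitive)."""
    hits = []
    for doc_id, text in docs.items():
        lowered = text.lower()
        # AND semantics: every query term must occur somewhere in the text.
        if all(term.lower() in lowered for term in terms):
            hits.append(doc_id)
    return sorted(hits)

# Toy corpus (invented titles, not real MEDLINE records).
DOCS = {
    1: "Regulation of the cell cycle in budding yeast.",
    2: "Yeast metabolism under glucose limitation.",
    3: "Cell cycle checkpoints in human cells.",
}

RESULT = search_and(DOCS, ["yeast", "cell cycle"])
print(RESULT)
```

Only the first toy document mentions both terms, so it is the only hit.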

stemming

yeast / yeasts
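
As a rough illustration of stemming, the sketch below maps singular and plural forms to one key. `toy_stem` is a hypothetical, deliberately crude suffix stripper and not the Porter algorithm or whatever any particular search engine uses.

```python
def toy_stem(word):
    """Strip a plural suffix; a crude stand-in for a real stemmer."""
    w = word.lower()
    if w.endswith("ies") and len(w) > 4:
        return w[:-3] + "y"          # e.g. "studies" -> "study"
    if w.endswith("s") and not w.endswith("ss") and len(w) > 3:
        return w[:-1]                # e.g. "yeasts" -> "yeast"
    return w

print(toy_stem("yeasts"), toy_stem("yeast"))
```

Because both forms reduce to the same stem, a query for "yeast" would also match documents that only say "yeasts".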

dynamic query expansion

yeast / S. cerevisiae
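
Query expansion of this kind can be sketched as a lookup in a synonym table before the query is run. The table `SYNONYMS` and the function `expand_query` are assumptions made for this example; a production system would draw on a curated thesaurus.

```python
# Illustrative synonym table; a real one would come from a curated resource.
SYNONYMS = {
    "yeast": ["S. cerevisiae", "Saccharomyces cerevisiae"],
}

def expand_query(terms, synonyms=SYNONYMS):
    """Turn each term into an OR-group of the term plus its known synonyms."""
    groups = []
    for term in terms:
        variants = [term] + synonyms.get(term.lower(), [])
        # Quote multi-word variants so they are treated as phrases.
        quoted = ['"%s"' % v if " " in v else v for v in variants]
        if len(quoted) > 1:
            groups.append("(" + " OR ".join(quoted) + ")")
        else:
            groups.append(quoted[0])
    return " AND ".join(groups)

EXPANDED = expand_query(["yeast", "cell cycle"])
print(EXPANDED)
```

The user still types the short query; the expansion happens behind the scenes at search time.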

MEDLINE

abstracts

complete papers

Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1 and this modification served as a priming step to promote subsequent Cdc5-dependent Swe1 hyperphosphorylation and degradation

yeast?

cell cycle?

entity recognition

identifying the substance(s)

Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1 and this modification served as a priming step to promote subsequent Cdc5-dependent Swe1 hyperphosphorylation and degradation

Cdc28 yeast

Cdc28 cell cycle

good synonyms list

manual curation

orthographic variation

CDC28

Cdc28p
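
One way to cope with orthographic variants like these is to normalize names to a common key before looking them up in the synonyms list. The sketch below is an assumption about how that might be done, not the speaker's implementation; `normalize_gene_name` is a hypothetical helper, and the trailing "p" rule reflects the yeast convention of writing the protein product as Cdc28p.

```python
import re

def normalize_gene_name(name):
    """Collapse case, hyphens/spaces, and the yeast protein suffix 'p' into one key."""
    key = re.sub(r"[\s\-]", "", name).lower()
    # Letters followed by digits and a final "p" (e.g. "cdc28p") -> drop the "p".
    match = re.fullmatch(r"([a-z]+\d+)p", key)
    return match.group(1) if match else key

KEYS = {normalize_gene_name(n) for n in ["Cdc28", "CDC28", "Cdc28p", "Cdc-28"]}
print(KEYS)
```

All four spellings collapse to the same key, so a single dictionary entry can match each of them. Rules this aggressive can also merge names that should stay distinct, which is why disambiguation is still needed afterwards.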

disambiguation

hairy

SDS

Cdc2

abstracts

complete papers

information extraction

formalizing the facts

co-mentioning

Page 50: Biomedical literature mining (and why we really need open access)

statistical methods
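
Co-mentioning with a statistical score can be sketched as counting how often two entities appear in the same document and comparing that with what chance would predict. The example uses pointwise mutual information as one possible statistic; this choice, the function `comention_scores`, and the four toy "documents" are assumptions for illustration, not the method used in the talk.

```python
import math
from collections import Counter
from itertools import combinations

def comention_scores(doc_entities):
    """Score entity pairs by pointwise mutual information over documents."""
    n_docs = len(doc_entities)
    single = Counter()   # number of documents mentioning each entity
    pair = Counter()     # number of documents mentioning both entities
    for entities in doc_entities:
        unique = sorted(set(entities))
        single.update(unique)
        pair.update(combinations(unique, 2))
    scores = {}
    for (a, b), n_ab in pair.items():
        # Co-mention count expected if a and b were mentioned independently.
        expected = single[a] * single[b] / n_docs
        scores[(a, b)] = math.log2(n_ab / expected)
    return scores

# Invented toy data: each list holds the entities recognized in one abstract.
DOCS = [
    ["Cdc28", "Clb2"],
    ["Cdc28", "Clb2", "Swe1"],
    ["Swe1", "Cdc5"],
    ["Cdc28"],
]

SCORES = comention_scores(DOCS)
for names, score in sorted(SCORES.items(), key=lambda kv: -kv[1]):
    print(names, round(score, 3))
```

Positive scores mark pairs that are mentioned together more often than expected by chance. Co-mentioning alone says nothing about the type or direction of the relation, which is what NLP-based extraction adds.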

NLP (Natural Language Processing)

Gene and protein names

Cue words for entity recognition

Verbs for relation extraction

[nxexpr The expression of [nxgene the cytochrome genes [nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1]

Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1 and this modification served as a priming step to promote subsequent Cdc5-dependent Swe1 hyperphosphorylation and degradation

Jensen et al., Nature Reviews Genetics, 2006

new discoveries

text mining

Jensen et al., Nature Reviews Genetics, 2006

abstracts

complete papers

temporal trends

Jensen et al., Nature Reviews Genetics, 2006

buzzwords

Jensen et al., Nature Reviews Genetics, 2006

grant applications

integration of text and data

Genomic neighborhood

Species co-occurrence

Gene fusions

Database imports

Experimental interaction data

Microarray expression data

Literature mining

genotype to phenotype

Korbel et al., PLoS Biology, 2005

Korbel et al., PLoS Biology, 2005

Korbel et al., PLoS Biology, 2005

where are we now?

Jensen et al., Nature Reviews Genetics, 2006

abstracts

complete papers

restricted access

open access

the tools are there

now we need the text!

Acknowledgments

Jasmin Saric
Rossitza Ouzounova
Michael Kuhn
Jan Korbel
Tobias Doerks
Isabel Rojas
Miguel Andrade
Peer Bork