Data is knowledge, knowledge data - National Academies of ...

www.ci.anl.gov www.ci.uchicago.edu Data is knowledge, knowledge data … Ian Foster



• Proteomics
• Genomics
• Transcriptomics
• Protein sequence prediction
• Phenotypic studies
• Phylogeny
• Sequence analysis
• Protein structure prediction
• Protein-protein interaction
• Metabolomics
• Model organism collections
• Systems biology
• Health epidemiology
• Organisms
• Disease
• …

1330 molecular biology databases (Nucleic Acids Research, Jan 2011), up from 96 in 2001 and 1070 in 2010

Slide: Carole Goble


Knowledge results from data integration

ID   MURA_BACSU     STANDARD;      PRT;   429 AA.
DE   PROBABLE UDP-N-ACETYLGLUCOSAMINE 1-CARBOXYVINYLTRANSFERASE
DE   (EC 2.5.1.7) (ENOYLPYRUVATE TRANSFERASE) (UDP-N-ACETYLGLUCOSAMINE
DE   ENOLPYRUVYL TRANSFERASE) (EPT).
GN   MURA OR MURZ.
OS   BACILLUS SUBTILIS.
OC   BACTERIA; FIRMICUTES; BACILLUS/CLOSTRIDIUM GROUP; BACILLACEAE;
OC   BACILLUS.
KW   PEPTIDOGLYCAN SYNTHESIS; CELL WALL; TRANSFERASE.
FT   ACT_SITE    116    116       BINDS PEP (BY SIMILARITY).
FT   CONFLICT    374    374       S -> A (IN REF. 3).
SQ   SEQUENCE   429 AA;  46016 MW;  02018C5C CRC32;
     MEKLNIAGGD SLNGTVHISG AKNSAVALIP ATILANSEVT IEGLPEISDI ETLRDLLKEI
     GGNVHFENGE MVVDPTSMIS MPLPNGKVKK LRASYYLMGA MLGRFKQAVI GLPGGCHLGP
     RPIDQHIKGF EALGAEVTNE QGAIYLRAER LRGARIYLDV VSVGATINIM LAAVLAEGKT
     IIENAAKEPE IIDVATLLTS MGAKIKGAGT NVIRIDGVKE LHGCKHTIIP DRIEAGTFMI

[source: GlaxoSmithKline]
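
As a rough illustration of why a record like the one above lends itself to integration, the following Python sketch groups its lines by their two-letter line codes (ID, DE, OS, KW, ...) and pulls out fields that could serve as join keys against other databases, such as the EC number. The function name `parse_record` and the choice of fields are assumptions made for this example, not part of the original slide; the sequence and feature lines are omitted for brevity.

```python
import re
from collections import defaultdict

# Annotation lines copied from the slide's flat-file record
# (FT and SQ lines left out to keep the sketch short).
RECORD = """\
ID   MURA_BACSU     STANDARD;      PRT;   429 AA.
DE   PROBABLE UDP-N-ACETYLGLUCOSAMINE 1-CARBOXYVINYLTRANSFERASE
DE   (EC 2.5.1.7) (ENOYLPYRUVATE TRANSFERASE) (UDP-N-ACETYLGLUCOSAMINE
DE   ENOLPYRUVYL TRANSFERASE) (EPT).
GN   MURA OR MURZ.
OS   BACILLUS SUBTILIS.
KW   PEPTIDOGLYCAN SYNTHESIS; CELL WALL; TRANSFERASE.
"""


def parse_record(text):
    """Group the lines of one entry by their leading two-letter line code.

    Hypothetical helper: it handles only the simple 'CODE   value' layout
    shown on the slide, not the full flat-file format.
    """
    fields = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        code, _, value = line.partition(" ")
        fields[code].append(value.strip())
    return dict(fields)


entry = parse_record(RECORD)

# Candidate join keys for linking this entry to other resources.
entry_name = entry["ID"][0].split()[0]
ec_numbers = re.findall(r"\(EC ([\d.]+)\)", " ".join(entry["DE"]))
keywords = [k.strip() for k in " ".join(entry["KW"]).rstrip(".").split(";")]

print(entry_name, ec_numbers, keywords)
```

Once the entry name, EC number, and keywords are available as structured values, they can be matched against records from other sources, which is the sense in which integration turns such data into knowledge.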

Wei Tan, Ravi Madduri. Services Computing in

The workflow-service relation [Wei Tan]

workflow → service: workflow calls service
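
The "workflow calls service" relation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a service is modeled as a plain function from string to string, and the names `strip_whitespace`, `to_upper`, and `run_workflow` are hypothetical, not taken from the slide or from any workflow system.

```python
from typing import Callable, List

# Assumption for this sketch: a "service" is any callable str -> str.
Service = Callable[[str], str]


def strip_whitespace(seq: str) -> str:
    """Hypothetical service: remove all whitespace from a sequence string."""
    return "".join(seq.split())


def to_upper(seq: str) -> str:
    """Hypothetical service: normalize a sequence string to upper case."""
    return seq.upper()


def run_workflow(steps: List[Service], data: str) -> str:
    """The workflow calls each service in order, passing results along."""
    for service in steps:
        data = service(data)
    return data


result = run_workflow([strip_whitespace, to_upper], "meklniaggd slngtvhisg")
print(result)
```

In a real deployment the services would be remote endpoints rather than local functions; the sketch only shows the direction of the call relation.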


Schematic overview of possibilities

[Diagram labels: Researchers; Institutions/PIs; Agencies; Research activities and outputs platform; Auto-filled candidate; Revision, approval; Biosketch; Progress report; Final report; Research Performance Progress Report; Tenure review; Other uses]


Aim for dual use in outcomes data

[Diagram entities: People, Products, Grants, Institutions, Firms. Relationship labels: produce; support; awarded to; pay for; train; co-author; employ; use; found, advise, employed by; used by]


J A Evans, J G Foster, Science 2011;331:721-725

Knowledge is data (“metaknowledge”)


24/5/2007 | myExperiment |

Source: David de Roure