21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era...

128
21st Century Biology: Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101 Seattle, Washington 98104 [email protected] (206) 667 4778

Transcript of 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era...

Page 1: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

21st Century Biology:Informatics in the Post-Genome Era

Robert J. RobbinsFred Hutchinson Cancer Research Center

1124 Columbia Street, LV-101Seattle, Washington 98104

[email protected](206) 667 4778

Page 2: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

2

Topics

• Biotechnology will be the “magic”technology of the 21st Century.

Page 3: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

3

Topics

• Biotechnology will be the “magic”technology of the 21st Century.

• Information Technology (IT) has aspecial relationship with biology.

Page 4: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

4

Topics

• Biotechnology will be the “magic”technology of the 21st Century.

• Information Technology (IT) has aspecial relationship with biology.

• Moore’s Law constantly transforms IT(and everything else).

Page 5: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

5

Topics

• Biotechnology will be the “magic”technology of the 21st Century.

• Information Technology (IT) has aspecial relationship with biology.

• Moore’s Law constantly transforms IT(and everything else).

• Current funding mechanisms for bio-information infrastructure are hopelesslyinadequate to meet future needs and mustbe radically reformed.

Page 6: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

Introduction

MagicalTechnology

Page 7: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

7

Magic

To a person from 1897, much currenttechnology would seem like magic.

Page 8: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

8

Magic

To a person from 1897, much currenttechnology would seem like magic.

What technology of 2097 would seemmagical to a person from 1997?

Page 9: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

9

Magic

To a person from 1897, much currenttechnology would seem like magic.

What technology of 2097 would seemmagical to a person from 1997?

Candidate: Biotechnology so advancedthat the distinction between living andnon-living is blurred.

Page 10: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

IT-BiologySynergism

Page 11: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

11

IT is Special

Information Technology:

• affects the performance and themanagement of tasks

Page 12: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

12

IT is Special

Information Technology:

• affects the performance and themanagement of tasks

• allows the manipulation of hugeamounts of highly complex data

Page 13: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

13

IT is Special

Information Technology:

• affects the performance and themanagement of tasks

• allows the manipulation of hugeamounts of highly complex data

• is incredibly plastic(programming and poetry are both exercises in pure thought)

Page 14: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

14

IT is Special

Information Technology:

• affects the performance and themanagement of tasks

• allows the manipulation of hugeamounts of highly complex data

• is incredibly plastic(programming and poetry are both exercises in pure thought)

• improves exponentially(Moore’s Law)

Page 15: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

15

Biology is Special

Life is Characterized by:

• individuality

Page 16: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

16

Biology is Special

Life is Characterized by:

• individuality

• historicity

Page 17: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

17

Biology is Special

Life is Characterized by:

• individuality

• historicity

• contingency

Page 18: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

18

Biology is Special

Life is Characterized by:

• individuality

• historicity

• contingency

• high (digital) information content

Page 19: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

19

Biology is Special

Life is Characterized by:

• individuality

• historicity

• contingency

• high (digital) information contentNo law of large numbers...

Page 20: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

20

Biology is Special

Life is Characterized by:

• individuality

• historicity

• contingency

• high (digital) information contentNo law of large numbers, since everyliving thing is genuinely unique.

Page 21: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

21

IT-Biology Synergism

• Physics needs calculus, the method formanipulating information aboutstatistically large numbers of vanishinglysmall, independent, equivalent things.

Page 22: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

22

IT-Biology Synergism

• Physics needs calculus, the method formanipulating information aboutstatistically large numbers of vanishinglysmall, independent, equivalent things.

• Biology needs information technology, themethod for manipulating informationabout large numbers of dependent,historically contingent, individual things.

Page 23: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

23

Biology is Special

For it is in relation to the statistical point of viewthat the structure of the vital parts of livingorganisms differs so entirely from that of anypiece of matter that we physicists and chemistshave ever handled in our laboratories ormentally at our writing desks.

Erwin Schrödinger. 1944. What is Life.

Page 24: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

24

Genetics as Code

[The] chromosomes ... contain in some kind of code-script the entire pattern of the individual's futuredevelopment and of its functioning in the mature state.... [By] code-script we mean that the all-penetratingmind, once conceived by Laplace, to which everycausal connection lay immediately open, could tellfrom their structure whether [an egg carrying them]would develop, under suitable conditions, into a blackcock or into a speckled hen, into a fly or a maize plant,a rhodo-dendron, a beetle, a mouse, or a woman.

Erwin Schrödinger. 1944. What is Life.

Page 25: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

25

One Human SequenceWe now know thatSchrödinger’s mysterioushuman “code-script”consists of 3.3 billionbase pairs of DNA.

Page 26: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

26

One Human Sequence

Typed in 10-pitch font, one human sequence would stretch for morethan 5,000 miles. Digitally formatted, it could be stored on one CD-ROM. Biologically encoded, it fits easily within a single cell.

We now know thatSchrödinger’s mysterioushuman “code-script”consists of 3.3 billionbase pairs of DNA.

Page 27: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

27

Bio-digital Information

DNA is a highly efficient digital storage device:

• There is more mass-storage capacity in theDNA of a side of beef than in all the hard drivesof all the world’s computers.

Page 28: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

28

Bio-digital Information

DNA is a highly efficient digital storage device:

• There is more mass-storage capacity in theDNA of a side of beef than in all the hard drivesof all the world’s computers.

• Storing all of the (redundant) information in allof the world’s DNA on computer hard diskswould require that the entire surface of the Earthbe covered to a depth of three miles in Conner1.0 gB drives.

Page 29: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

Genomics:An Example

Page 30: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

30

Computers as Instruments

Computers are not just tools for catalogingexisting knowledge. They are instruments thatchange the way we can see the biologicalworld. Computers allow us to see genomes,just as radio telescopes let us see quasars andmicroscopes let us see cells.

Page 31: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

31

Human Genome Project - Goals– construction of a high-resolution genetic map of the human

genome;

USDOE. 1990. Understanding Our Genetic Inheritance.The U.S. Human Genome Project: The First Five Years.

Page 32: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

32

Human Genome Project - Goals– construction of a high-resolution genetic map of the human

genome;

– production of a variety of physical maps of all humanchromosomes and of the DNA of selected modelorganisms;

USDOE. 1990. Understanding Our Genetic Inheritance.The U.S. Human Genome Project: The First Five Years.

Page 33: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

33

Human Genome Project - Goals– construction of a high-resolution genetic map of the human

genome;

– production of a variety of physical maps of all humanchromosomes and of the DNA of selected modelorganisms;

– determination of the complete sequence of human DNA andof the DNA of selected model organisms;

USDOE. 1990. Understanding Our Genetic Inheritance.The U.S. Human Genome Project: The First Five Years.

Page 34: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

34

Human Genome Project - Goals– construction of a high-resolution genetic map of the human

genome;

– production of a variety of physical maps of all humanchromosomes and of the DNA of selected modelorganisms;

– determination of the complete sequence of human DNA andof the DNA of selected model organisms;

– development of capabilities for collecting, storing,distributing, and analyzing the data produced;

USDOE. 1990. Understanding Our Genetic Inheritance.The U.S. Human Genome Project: The First Five Years.

Page 35: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

35

Human Genome Project - Goals– construction of a high-resolution genetic map of the human

genome;

– production of a variety of physical maps of all humanchromosomes and of the DNA of selected modelorganisms;

– determination of the complete sequence of human DNA andof the DNA of selected model organisms;

– development of capabilities for collecting, storing,distributing, and analyzing the data produced;

– creation of appropriate technologies necessary to achievethese objectives.

USDOE. 1990. Understanding Our Genetic Inheritance.The U.S. Human Genome Project: The First Five Years.

Page 36: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

36

Base Pairs in GenBank

0

10 0 ,00 0 ,00 0

20 0 ,00 0 ,00 0

30 0 ,00 0 ,00 0

40 0 ,00 0 ,00 0

50 0 ,00 0 ,00 0

60 0 ,00 0 ,00 0

70 0 ,00 0 ,00 0

80 0 ,00 0 ,00 0

90 0 ,00 0 ,00 0

0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100

GenBank Release Numbers

9493929190898887 95 96 97

Page 37: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

37

0

10 0 ,00 0 ,00 0

20 0 ,00 0 ,00 0

30 0 ,00 0 ,00 0

40 0 ,00 0 ,00 0

50 0 ,00 0 ,00 0

60 0 ,00 0 ,00 0

70 0 ,00 0 ,00 0

80 0 ,00 0 ,00 0

90 0 ,00 0 ,00 0

0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100

GenBank Release Numbers

9493929190898887 95 96 97

Base Pairs in GenBank

Growth in GenBank is exponential.More data were added in the last 10weeks than were added in the first 10years of the project.

Page 38: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

38

Infrastructure and the HGP

Progress towards all of the [Genome Project]goals will require the establishment of well-funded centralized facilities, including a stockcenter for the cloned DNA fragmentsgenerated in the mapping and sequencingeffort and a data center for the computer-basedcollection and distribution of large amounts ofDNA sequence information.

National Research Council. 1988. Mapping and Sequencing theHuman Genome. Washington, DC: National Academy Press. p. 3

Page 39: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

39

Databases and the Genome Project

[The] database developer should provide, insome real sense, an intellectual focus for theinterpretation of genomic data.

NIH-DOE Ad Hoc Committee on Genome Databases

Page 40: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

21st CenturyBiology

The Approach

Page 41: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

41

Paradigm Shift in Biology

The new paradigm, now emerging, is that all the‘genes’ will be known (in the sense of beingresident in databases available electronically),and that the starting point of a biologicalinvestigation will be theoretical. An individualscientist will begin with a theoretical conjecture,only then turning to experiment to follow or testthat hypothesis.

Walter Gilbert. 1991. Towards a paradigm shift in biology. Nature, 349:99.

Page 42: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

42

Paradigm Shift in Biology

To use [the] flood of knowledge, which will pouracross the computer networks of the world,biologists not only must become computerliterate, but also change their approach to theproblem of understanding life.

Walter Gilbert. 1991. Towards a paradigm shift in biology. Nature, 349:99.

Page 43: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

21st CenturyBiology

The People

Page 44: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

44

Human Resources Issues

• Reduction in need for non-IT staff

Page 45: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

45

Human Resources Issues

• Reduction in need for non-IT staff

• Increase in need for IT staff, especially“information engineers”

Page 46: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

46

Human Resources Issues

• Reduction in need for non-IT staff

• Increase in need for IT staff, especially“information engineers”

In modern biology, a general trend is toconvert expert work into staff work andfinally into computation. New expertise isrequired to design, carry out, and interpretcontinuing work.

Page 47: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

47

Human Resources Issues

Elbert Branscomb: “You must recognize thatsome day you may need as many computerscientists as biologists in your labs.”

Page 48: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

48

Human Resources Issues

Elbert Branscomb: “You must recognize thatsome day you may need as many computerscientists as biologists in your labs.”

Craig Venter: “At TIGR, we already havetwice as many computer scientists on ourstaff.”

Exchange at DOE workshop on high-throughput sequencing.

Page 49: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

21st CenturyBiology

The Science

Page 50: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

50

Fundamental Dogma

The fundamental dogma ofmolecular biology is that genes act tocreate phenotypes through a flow ofinformation from DNA to RNA toproteins, to interactions amongproteins, and ultimately tophenotypes.

Collections of individual phenotypes,of course, constitute a population.

DNA

RNA

Proteins

Circuits

Phenotypes

Populations

Page 51: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

51

Fundamental DogmaDNA

RNA

Proteins

Circuits

Phenotypes

Populations

GenBankEMBLDDBJ

MapDatabases

SwissPROTPIR

PDB

Although a few databases already existto distribute molecular information,

Page 52: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

52

Fundamental DogmaDNA

RNA

Proteins

Circuits

Phenotypes

Populations

GenBankEMBLDDBJ

MapDatabases

SwissPROTPIR

PDB

Gene Expression?

Clinical Data ?

Regulatory Pathways?Metabolism?

Biodiversity?

Neuroanatomy?

Development ?

Molecular Epidemiology?

Comparative Genomics?

the post-genomic era will need manymore to collect, manage, and publishthe coming flood of new findings.

Although a few databases already existto distribute molecular information,

Page 53: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

21st CenturyBiology

The Literature

Page 54: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

54

Electronic Data Publishing

G D B -- Beta Hemoglobin---------------------------------------------- ** Locus Detail View **-------------------------------------------------------- Symbol: HBB Name: hemoglobin, beta MIM Num: 141900 Location: 11p15.5 Created: 01 Jan 86 00:00-------------------------------------------------------- ** Polymorphism Table **--------------------------------------------------------Probe Enzyme=================================beta-globin cDNA RsaIbeta-globin cDNA,JW10+ AvaIIPstbeta,JW102,BD23,pB+ BamHIpRK29,Unknown HindIIbeta-IVS2 probe HphIIVS-2 normal HphIUnknown AvrIIbeta-IVS2 probe AsuI

Page 55: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

55

Electronic Data Publishing

O M I M -- Beta Hemoglobin----------------------------------------------Title*141900 HEMOGLOBIN--BETA LOCUS[HBB; SICKLE CELL ANEMIA, INCLUDED;BETA-THALASSEMIAS, INCLUDED;HEINZ BODY ANEMIAS, BETA-GLOBINTYPE, ...]

The alpha and beta loci determine thestructure of the 2 types of polypeptidechains in adult hemoglobin, Hb A. Byautoradiography using heavy-labeledhemoglobin-specific messenger RNA,Price et al. (1972) found labeling of achromosome 2 and a group B chromo-some. They concluded, incorrectly as itturned out, that the beta-gamma-deltalinkage group was on a group Bchromosome since the zone of labelingwas longer on that chromosome than onchromosome 2 (which by this reasoning

G D B -- Beta Hemoglobin---------------------------------------------- ** Locus Detail View **-------------------------------------------------------- Symbol: HBB Name: hemoglobin, beta MIM Num: 141900 Location: 11p15.5 Created: 01 Jan 86 00:00-------------------------------------------------------- ** Polymorphism Table **--------------------------------------------------------Probe Enzyme=================================beta-globin cDNA RsaIbeta-globin cDNA,JW10+ AvaIIPstbeta,JW102,BD23,pB+ BamHIpRK29,Unknown HindIIbeta-IVS2 probe HphIIVS-2 normal HphIUnknown AvrIIbeta-IVS2 probe AsuI

Page 56: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

56

Electronic Data Publishing

GenBank -- Beta Hemoglobin----------------------------------------------DEFINITION [DEF] [HUMHBB] Human beta globin regionLOCUS [LOC] HUMHBBACCESSION NO. [ACC]J00179 J00093 J00094 J00096J00158 J00159 J00160 J00161

KEYWORDS [KEY]Alu repetitive element; HPFH;KpnI repetitive sequence; RNApolymerase III; allelicvariation; alternate cap site;

SEQUENCEgaattctaatctccctctcactactgtctagtatccctcaaggagtggtggctcatgtcttgagctcaagagtttgatataaaaaaaaattagccaggcaaatgggaggatcccttgagcgcactccagcct

GenBank -- Beta Hemoglobin----------------------------------------------DEFINITION [DEF] [HUMHBB] Human beta globin regionLOCUS [LOC] HUMHBBACCESSION NO. [ACC]J00179 J00093 J00094 J00096J00158 J00159 J00160 J00161

KEYWORDS [KEY]Alu repetitive element; HPFH;KpnI repetitive sequence; RNApolymerase III; allelicvariation; alternate cap site;

SEQUENCEgaattctaatctccctctcactactgtctagtatccctcaaggagtggtggctcatgtcttgagctcaagagtttgatataaaaaaaaattagccaggcaaatgggaggatcccttgagcgcactccagcct

O M I M -- Beta Hemoglobin----------------------------------------------Title*141900 HEMOGLOBIN--BETA LOCUS[HBB; SICKLE CELL ANEMIA, INCLUDED;BETA-THALASSEMIAS, INCLUDED;HEINZ BODY ANEMIAS, BETA-GLOBINTYPE, ...]

The alpha and beta loci determine thestructure of the 2 types of polypeptidechains in adult hemoglobin, Hb A. Byautoradiography using heavy-labeledhemoglobin-specific messenger RNA,Price et al. (1972) found labeling of achromosome 2 and a group B chromo-some. They concluded, incorrectly as itturned out, that the beta-gamma-deltalinkage group was on a group Bchromosome since the zone of labelingwas longer on that chromosome than onchromosome 2 (which by this reasoning

G D B -- Beta Hemoglobin---------------------------------------------- ** Locus Detail View **-------------------------------------------------------- Symbol: HBB Name: hemoglobin, beta MIM Num: 141900 Location: 11p15.5 Created: 01 Jan 86 00:00-------------------------------------------------------- ** Polymorphism Table **--------------------------------------------------------Probe Enzyme=================================beta-globin cDNA RsaIbeta-globin cDNA,JW10+ AvaIIPstbeta,JW102,BD23,pB+ BamHIpRK29,Unknown HindIIbeta-IVS2 probe HphIIVS-2 normal HphIUnknown AvrIIbeta-IVS2 probe AsuI

Page 57: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

57

Electronic Data PublishingP I R -- Beta Hemoglobin----------------------------------------------DEFINITION

[HBHU] Hemoglobin beta chain- Human, chimpanzee, pygmy chimpanzee, and gorilla

SUMMARY [SUM] #Type Protein #Molecular-weight 15867 #Length 146 #Checksum 1242

SEQUENCEV H L T P E E K S A V T A L W GK V N V D E V G G E A L G R L LV V Y P W T Q R F F E S F G D LS T P D A V M G N P K V K A H GK K V L G A F S D G L A H L D NL K G T F A T L S E L H C D K LH V D P E N F R L L G N V L V CV L A H H F G

P I R -- Beta Hemoglobin----------------------------------------------DEFINITION

[HBHU] Hemoglobin beta chain- Human, chimpanzee, pygmy chimpanzee, and gorilla

SUMMARY [SUM] #Type Protein #Molecular-weight 15867 #Length 146 #Checksum 1242

SEQUENCEV H L T P E E K S A V T A L W GK V N V D E V G G E A L G R L LV V Y P W T Q R F F E S F G D LS T P D A V M G N P K V K A H GK K V L G A F S D G L A H L D NL K G T F A T L S E L H C D K LH V D P E N F R L L G N V L V CV L A H H F G

GenBank -- Beta Hemoglobin----------------------------------------------DEFINITION [DEF] [HUMHBB] Human beta globin regionLOCUS [LOC] HUMHBBACCESSION NO. [ACC]J00179 J00093 J00094 J00096J00158 J00159 J00160 J00161

KEYWORDS [KEY]Alu repetitive element; HPFH;KpnI repetitive sequence; RNApolymerase III; allelicvariation; alternate cap site;

SEQUENCEgaattctaatctccctctcactactgtctagtatccctcaaggagtggtggctcatgtcttgagctcaagagtttgatataaaaaaaaattagccaggcaaatgggaggatcccttgagcgcactccagcct

GenBank -- Beta Hemoglobin----------------------------------------------DEFINITION [DEF] [HUMHBB] Human beta globin regionLOCUS [LOC] HUMHBBACCESSION NO. [ACC]J00179 J00093 J00094 J00096J00158 J00159 J00160 J00161

KEYWORDS [KEY]Alu repetitive element; HPFH;KpnI repetitive sequence; RNApolymerase III; allelicvariation; alternate cap site;

SEQUENCEgaattctaatctccctctcactactgtctagtatccctcaaggagtggtggctcatgtcttgagctcaagagtttgatataaaaaaaaattagccaggcaaatgggaggatcccttgagcgcactccagcct

O M I M -- Beta Hemoglobin----------------------------------------------Title*141900 HEMOGLOBIN--BETA LOCUS[HBB; SICKLE CELL ANEMIA, INCLUDED;BETA-THALASSEMIAS, INCLUDED;HEINZ BODY ANEMIAS, BETA-GLOBINTYPE, ...]

The alpha and beta loci determine thestructure of the 2 types of polypeptidechains in adult hemoglobin, Hb A. Byautoradiography using heavy-labeledhemoglobin-specific messenger RNA,Price et al. (1972) found labeling of achromosome 2 and a group B chromo-some. They concluded, incorrectly as itturned out, that the beta-gamma-deltalinkage group was on a group Bchromosome since the zone of labelingwas longer on that chromosome than onchromosome 2 (which by this reasoning

G D B -- Beta Hemoglobin---------------------------------------------- ** Locus Detail View **-------------------------------------------------------- Symbol: HBB Name: hemoglobin, beta MIM Num: 141900 Location: 11p15.5 Created: 01 Jan 86 00:00-------------------------------------------------------- ** Polymorphism Table **--------------------------------------------------------Probe Enzyme=================================beta-globin cDNA RsaIbeta-globin cDNA,JW10+ AvaIIPstbeta,JW102,BD23,pB+ BamHIpRK29,Unknown HindIIbeta-IVS2 probe HphIIVS-2 normal HphIUnknown AvrIIbeta-IVS2 probe AsuI

Page 58: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

58

Electronic Scholarly Publishing

HTTP://WWW.ESP.ORGThe ESP site is dedicated to the electronic publishing ofscientific and other scholarly materials. Of particularinterest are the history of science, genetics,computational biology, and genome research.

Page 59: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

59

Electronic Scholarly Publishing

The Classical Genetics:Foundations seriesprovides ready accessto typeset-quality,electronic editions ofimportant publicationsthat can otherwise bevery difficult to find.

Page 60: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

60

Electronic Scholarly Publishing

“Hardy” (of Hardy-Weinberg) is a namewell known to moststudents of biology.

Page 61: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

61

Electronic Scholarly Publishing

But how many haveread, or even seen,all of Hardy’sbiological writings?

This is it: A single,one-page letter to theeditor of Science.

Page 62: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

62

Electronic Scholarly Publishing

http://www.blocks.fhcrc.org/~kinesinElectronic publishing is especially appropriate for somekinds of dynamic review papers.

Page 63: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

63

Electronic Scholarly Publishing

Reviews may be revisedand maintained in realtime...

Page 64: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

64

Electronic Scholarly Publishing

and it is easy to providelarge amounts of in-depth supporting andrelated data.

Page 65: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

65

Electronic Scholarly Publishing

Page 66: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

66

Electronic Scholarly Publishing

http://www.esp.org/books/darwin/beagleEntire monographs can be made instantly available to readers world-wide..

Page 67: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

67

Electronic Scholarly Publishing

Today’s computertechnology was nearlyunimaginable just tenyears ago. The technol-ogy of ten years fromnow will also bringmany surprises.

How is it that IT canmaintain such anamazing rate ofsustained change?

And what, if any, arethe implications of thatrate of change forbiology?

Page 68: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

Moore’s Law

Transforms InfoTech(and everything else)

Page 69: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

69

Moore’s Law: The Statement

Every eighteen months, thenumber of transistors that canbe placed on a chip doubles.

Gordon Moore, co-founder of Intel...

Page 70: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

70

Moore’s Law: The EffectP

erfo

rman

ce(c

on

stan

t co

st)

Time

100,000

10,000

1,000

100

10

Page 71: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

71

Moore’s Law: The EffectP

erfo

rman

ce(c

on

stan

t co

st)

Cos

t(c

on

stan

t p

erfo

rma

nce

)

Time

100,000 100,000

10,000

1,000

100

10

10,000

1,000

100

10

Page 72: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

72

Moore’s Law: The Effect

Three Phases of Novel IT Applications

• It’s Impossible

Page 73: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

73

Moore’s Law: The Effect

Three Phases of Novel IT Applications

• It’s Impossible

• It’s Impractical

Page 74: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

74

Moore’s Law: The Effect

Three Phases of Novel IT Applications

• It’s Impossible

• It’s Impractical

• It’s Overdue

Page 75: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

75

Moore’s Law: The EffectP

erfo

rman

ce(c

on

stan

t co

st)

Cos

t(c

on

stan

t p

erfo

rma

nce

)

Time

100,000 100,000

10,000

1,000

100

10

10,000

1,000

100

10

D

P

Page 76: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

76

Moore’s Law: The EffectP

erfo

rman

ce(c

on

stan

t co

st)

Cos

t(c

on

stan

t p

erfo

rma

nce

)

Time

100,000 100,000

10,000

1,000

100

10

10,000

1,000

100

10

D

P

Page 77: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

77

Moore’s Law: The EffectP

erfo

rman

ce(c

on

stan

t co

st)

Cos

t(c

on

stan

t p

erfo

rma

nce

)

Time

100,000 100,000

10,000

1,000

100

10

10,000

1,000

100

10

D

P

Page 78: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

78

Moore’s Law: The EffectP

erfo

rman

ce(c

on

stan

t co

st)

Cos

t(c

on

stan

t p

erfo

rma

nce

)

Time

100,000 100,000

10,000

1,000

100

10

10,000

1,000

100

10

D

P

C

Page 79: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

79

Moore’s Law: The EffectP

erfo

rman

ce(c

on

stan

t co

st)

Cos

t(c

on

stan

t p

erfo

rma

nce

)

Time

100,000 100,000

10,000

1,000

100

10

10,000

1,000

100

10

D

P

C

Page 80: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

80

Moore’s Law: The EffectP

erfo

rman

ce(c

on

stan

t co

st)

Cos

t(c

on

stan

t p

erfo

rma

nce

)

Time

100,000 100,000

10,000

1,000

100

10

10,000

1,000

100

10

D

P

AA

C

Page 81: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

81

Moore’s Law: The EffectP

erfo

rman

ce(c

on

stan

t co

st)

Cos

t(c

on

stan

t p

erfo

rma

nce

)

Time

100,000 100,000

10,000

1,000

100

10

10,000

1,000

100

10

D

P

A

C

Page 82: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

82

Moore’s Law: The EffectP

erfo

rman

ce(c

on

stan

t co

st)

Cos

t(c

on

stan

t p

erfo

rma

nce

)

Time

100,000 100,000

10,000

1,000

100

10

10,000

1,000

100

10

D

P

A

C

Relevance for biology?

Page 83: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

83

Cost (constant performance)

1,000

10,000

100,000

1,000,000

10,000,000

1975 1980 1985 1990 1995 2000 2005

UniversityPurchase

Page 84: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

84

Cost (constant performance)

1,000

10,000

100,000

1,000,000

10,000,000

1975 1980 1985 1990 1995 2000 2005

UniversityPurchase

DepartmentPurchase

Page 85: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

85

Cost (constant performance)

1,000

10,000

100,000

1,000,000

10,000,000

1975 1980 1985 1990 1995 2000 2005

RO1 GrantPurchase

UniversityPurchase

DepartmentPurchase

Page 86: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

86

Cost (constant performance)

1,000

10,000

100,000

1,000,000

10,000,000

1975 1980 1985 1990 1995 2000 2005

PersonalPurchase

RO1 GrantPurchase

UniversityPurchase

DepartmentPurchase

Page 87: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

Funding forInformation

Infrastructure

Page 88: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

88

The Problem

• IT moves at “Internet Speed” and respondsrapidly to market forces.

Page 89: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

89

The Problem

• IT moves at “Internet Speed” and respondsrapidly to market forces.

• IT will play a central role in 21st Centurybiology.

Page 90: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

90

The Problem

• IT moves at “Internet Speed” and respondsrapidly to market forces.

• IT will play a central role in 21st Centurybiology.

• Current levels of support for public bio-information infrastructure are too low.

Page 91: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

91

The Problem

• IT moves at “Internet Speed” and respondsrapidly to market forces.

• IT will play a central role in 21st Centurybiology.

• Current levels of support for public bio-information infrastructure are too low.

• Reallocation of federal funding is difficult,and subject to political pressures.

Page 92: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

92

The Problem

• IT moves at “Internet Speed” and respondsrapidly to market forces.

• IT will play a central role in 21st Centurybiology.

• Current levels of support for public bio-information infrastructure are too low.

• Reallocation of federal funding is difficult,and subject to political pressures.

• Federal-funding decision processes areponderously slow and inefficient.

Page 93: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

93

Federal Funding of Bio-Databases

The challenges:

Page 94: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

94

Federal Funding of Bio-Databases

The challenges:

• providing adequate funding levels

Page 95: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

95

Federal Funding of Bio-Databases

The challenges:

• providing adequate funding levels

• making timely, efficient decisions

Page 96: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

96

Federal Funding of Bio-Databases

Appropriate funding level:

Page 97: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

97

Federal Funding of Bio-Databases

Appropriate funding level:

• approx. 10% of research funding

Page 98: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

98

Federal Funding of Bio-Databases

Appropriate funding level:

• approx. 10% of research funding

• i.e., 1 - 2 billion dollars per year

Page 99: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

99

Federal Funding of Bio-Databases

Appropriate funding level:

• approx. 10% of research funding

• i.e., 1 - 2 billion dollars per year

Source of estimate:

- Experience of IT-transformed industries.

- Current support for IT-rich biological research.

Page 100: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

100

Market Forces

Vendors

productsservices

Buyers

In a simple market economy, vendors try to anticipatethe needs of buyers and offer products and services tomeet those needs.

Page 101: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

101

Market Forces

Vendors

productsservices

Buyers

$

purchases

In a simple market economy, vendors try to anticipatethe needs of buyers and offer products and services tomeet those needs.

Real users decide whether or not to buy a product orservice, depending upon whether or not it meets a realneed at a reasonable price.

Page 102: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

102

Market Forces

Vendors

productsservices

Buyers

$

purchases

In a simple market economy, vendors try to anticipatethe needs of buyers and offer products and services tomeet those needs.

Real users decide whether or not to buy a product orservice, depending upon whether or not it meets a realneed at a reasonable price.

Business 101 Insight:

Successful vendors target aniche and excel at meeting theneeds of that niche.

Page 103: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

103

Market Forces

VentureCapital

Vendors$

Buyers

$Stock

Offerings

Funding to initiate the developmentof products and services come frominvestors, not from buyers.

$

VendorInvestment

productsservices

$

purchases

Page 104: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

104

Market Forces

VentureCapital

Vendors$

Buyers

$Stock

Offerings

Funding to initiate the developmentof products and services come frominvestors, not from buyers.

Investors decide whether or not toprovide start-up funding based uponthe estimated ability of the vendor tocreate products and services that willmeet real needs at competitive prices.

$

VendorInvestment

productsservices

$

purchases

Page 105: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

105

Federal Funding

Database

Users

productsservices

$

purchases

If biological databases were drivenby market forces, individual userswould choose what services theyneed and individual databaseproviders would choose whatservices to make available.

Page 106: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

106

Federal Funding

Investors

Database

$

Users

productsservices

$

purchases

If biological databases were drivenby market forces, individual userswould choose what services theyneed and individual databaseproviders would choose whatservices to make available.

Investors would provide start-upmoney on the likelihood ofsuccessful products and servicesbeing developed.

Page 107: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

107

Federal Funding

Investors

Database

$

Users

productsservices

$

purchases

If biological databases were drivenby market forces, individual userswould choose what services theyneed and individual databaseproviders would choose whatservices to make available.

Investors would provide start-upmoney on the likelihood ofsuccessful products and servicesbeing developed.

Ultimate success would depend onmeeting the needs of real users.Decisions could be made rapidly, inresponse to changing needs andemerging opportunities.

Page 108: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

108

Federal Funding

Agency

Database

Reviewers

OtherAgencies

AgencyAdvisors

Congress

productsservices

OMB $

$ DatabaseAdvisors

Users

Instead, funding decisions forbiological databases can follow aponderously slow course, withalmost no opportunity for input fromreal users.

Those most knowledgeable about aparticular database are oftenexcluded from participating in thereview process because of a possible“conflict of interest” status with thedatabase provider.

Page 109: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

109

Federal Funding of Bio-Databases

Possible solution - create market forces:

Page 110: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

110

Federal Funding of Bio-Databases

Possible solution - create market forces:

• stop supporting the supply side of biodatabasesthrough slow, inefficient processes.

Page 111: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

111

Federal Funding of Bio-Databases

Possible solution - create market forces:

• stop supporting the supply side of biodatabasesthrough slow, inefficient processes.

• start supporting the demand side through fast,efficient processes.

Page 112: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

112

Federal Funding of Bio-Databases

Possible solution - create market forces:

• stop supporting the supply side of biodatabasesthrough slow, inefficient processes.

• start supporting the demand side through fast,efficient processes.

• provide guaranteed supplementary funding,redeemable only for access to bio-databases.

Page 113: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

113

Federal Funding of Bio-Databases

Possible solution - create market forces:

• stop supporting the supply side of biodatabasesthrough slow, inefficient processes.

• start supporting the demand side through fast,efficient processes.

• provide guaranteed supplementary funding,redeemable only for access to bio-databases.

• data stamps

Page 114: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

114

Federal Funding of Bio-Databases

Possible solution - create market forces:

• stop supporting the supply side of biodatabasesthrough slow, inefficient processes.

• start supporting the demand side through fast,efficient processes.

• provide guaranteed supplementary funding,redeemable only for access to bio-databases.

• data stamps, AKA food (for-thought) stamps ?!

Page 115: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

115

Food (for thought) Stamps

Funding Agencies could:

• provide a 10% supplement to every researchgrant in the form of “stamps” redeemable only atdatabase providers.

Page 116: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

116

Food (for thought) Stamps

Funding Agencies could:

• provide a 10% supplement to every researchgrant in the form of “stamps” redeemable only atdatabase providers.

• allow the “stamps” to be transferable amongscientists, so that a market for them couldemerge.

Page 117: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

117

Food (for thought) Stamps

Funding Agencies could:

• provide a 10% supplement to every researchgrant in the form of “stamps” redeemable only atdatabase providers.

• allow the “stamps” to be transferable amongscientists, so that a market for them couldemerge.

• provide funding only after the stamps have beenredeemed at a database provider.

Page 118: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

118

Food (for thought) Stamps

Problems:

• how to estimate the amount of FFT stamps thatwould actually be redeemed (and thus therequired budget set-aside).

Page 119: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

119

Food (for thought) Stamps

Problems:

• how to estimate the amount of FFT stamps thatwould actually be redeemed (and thus therequired budget set-aside).

• how to identify “approved” database providers.

Page 120: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

120

Food (for thought) Stamps

Problems:

• how to estimate the amount of FFT stamps thatwould actually be redeemed (and thus therequired budget set-aside).

• how to identify “approved” database providers.

• how to initiate the FFT system.

Page 121: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

121

Food (for thought) Stamps

Problems:

• how to estimate the amount of FFT stamps thatwould actually be redeemed (and thus therequired budget set-aside).

• how to identify “approved” database providers.

• how to initiate the FFT system.

• etc etc

Page 122: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

122

Food (for thought) Stamps

Alternatives (if no solution emerges):• increasingly inefficient research activities (abject

failure will occur when it becomes easier to redoresearch than to obtain the results of prior work).

Page 123: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

123

Food (for thought) Stamps

Alternatives (if no solution emerges):• increasingly inefficient research activities (abject

failure will occur when it becomes easier to redoresearch than to obtain the results of prior work).

• loss of access to bio-databases for public-sectorresearch.

Page 124: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

124

Food (for thought) Stamps

Alternatives (if no solution emerges):• increasingly inefficient research activities (abject

failure will occur when it becomes easier to redoresearch than to obtain the results of prior work).

• loss of access to bio-databases for public-sectorresearch.

• movement of majority of “important” biologicalresearch into the private sector.

Page 125: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

125

Food (for thought) Stamps

Alternatives (if no solution emerges):• increasingly inefficient research activities (abject

failure will occur when it becomes easier to redoresearch than to obtain the results of prior work).

• loss of access to bio-databases for public-sectorresearch.

• movement of majority of “important” biologicalresearch into the private sector.

• loss of American pre-eminence (if othercountries solve the problems first).

Page 126: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

126

Food (for thought) Stamps

Bottom Line:

• This might not be the answer, but

Page 127: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

127

Food (for thought) Stamps

Bottom Line:

• This might not be the answer, but

• it’s FOOD FOR THOUGHT

Page 128: 21st Century Biology: Informatics in the Post-Genome Era · Informatics in the Post-Genome Era Robert J. Robbins Fred Hutchinson Cancer Research Center 1124 Columbia Street, LV-101

128

Slides:

http://www.esp.org/rjr/beckman.pdf