Bradly Alicea - Amazon S3...What I learned last summer.... or Physics is a “hard” science,...

Page 1:

What I learned last summer....

or

Physics is a “hard” science, biology is a “difficult” one.....

or

Hard-to-Define-Events 2013.1

Bradly Alicea

http://www.msu.edu/~aliceabr http://syntheticdaisies.blogspot.com

If your results are unpredictable, does it make them any less true?

Page 2:

Artificial Life XIII Conference, East Lansing, MI, July 2012

Recursive me! Giving this talk at HTDE 2012.

Residuals of the workshop hosted at Synthetic Daisies and Vimeo (videos).

Page 3:

It’s not the phenomenon, it’s you………

Page 6:

Pashler, H. and Harris, C.R. Is the Replicability Crisis Overblown? Three Arguments Examined. Perspectives on Psychological Science, 7, 531 (2012).

1) Direct vs. conceptual replication:

* successful direct replication can validate findings, but often conceptual replication (between research groups) is a more attainable goal.

2) Science is ultimately self-correcting:

* given enough time, the consensus of the scientific community will prevail (e.g. wisdom of the crowd, swarm intelligence).

* in the absence of information, people will flock to ideas that sound good (e.g. popular fads, internet memes).

Conceptual replication: same type of experiment without replicating exact conditions.

MY INTERPRETATION

TRADEOFF: generalization vs. accuracy. General tendencies (THEORY) vs. accurate repetition (EMPIRICISM).

Page 7:

Is Science Ultimately Self-correcting? A historical view of scientific consensus

Heliocentrism 17th century Astronomy

Rates of change differ, all geometry is qualitative

MY INTERPRETATION

CONSENSUS THOUGHT

Page 10:

Is Science Ultimately Self-correcting? A historical model of scientific consensus

Heliocentrism 17th century Astronomy

One gene, One protein 20th century Biology

Cultural Relativism 20th century Anthropology

Curved Spacetime 20th century Physics

Phrenology 19th century Psychology

Plate Tectonics 20th century Geology

Lamarckism 18th century Zoology

Rates of change differ, all geometry is qualitative

“The Half-life of Facts” Samuel Arbesman

MY INTERPRETATION

Facts (scientific and otherwise) decay at a certain rate (see book):

* overturned (lose consensus status).

* hard-to-kill ideas (useless but still popular).
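
The idea that facts "decay at a certain rate" can be sketched as simple exponential decay. This is only an illustration: the exponential form and the 45-year half-life below are assumptions made for the example, not figures taken from the slides or the book.

```python
def surviving_fraction(years: float, half_life_years: float) -> float:
    """Fraction of facts still holding consensus after `years`,
    assuming (hypothetically) simple exponential decay."""
    if half_life_years <= 0:
        raise ValueError("half_life_years must be positive")
    return 0.5 ** (years / half_life_years)


# Hypothetical half-life, chosen only for illustration.
HALF_LIFE_YEARS = 45.0

for elapsed in (0, 45, 90):
    print(elapsed, surviving_fraction(elapsed, HALF_LIFE_YEARS))
```

Under this assumption, "hard-to-kill ideas" would correspond to items with a much longer half-life than the rest of their field.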

Page 12:

Are most “true” results wrong? And why?

False positive report probability, H_SIGNIFICANT : H_0

Level of significance (e.g. 0.05)

Statistical power (vs. TYPE II error rate)

High rate of nonreplication: formal analysis of all statistically significant hypotheses (supported null hypotheses are excluded).

Biases include: experimental design, data analysis, and presentation factors (technical variation).

* this list assumes technical variation is likely always bad (e.g. particle physics envy).

Ioannidis, J.P.A. Why Most Published Research Findings Are False. PLoS Medicine, 2(8), e124 (2005).
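
The relationship between significance level, statistical power, and the chance that a "significant" finding is true can be sketched with the positive predictive value (PPV) expression from the Ioannidis paper, in its bias-free form: PPV = (1 − β)R / (R − βR + α). Here R (the pre-study odds that a tested relationship is real) does not appear on the slide, and the numeric inputs are illustrative assumptions.

```python
def ppv(alpha: float, power: float, prior_odds: float) -> float:
    """Positive predictive value without bias:
    PPV = (1 - beta) * R / (R - beta * R + alpha), where power = 1 - beta."""
    beta = 1.0 - power          # Type II error rate
    r = prior_odds              # pre-study odds of a true relationship
    return (power * r) / (r - beta * r + alpha)


# Illustrative inputs only (not taken from the slides).
well_powered = ppv(alpha=0.05, power=0.8, prior_odds=1.0)
under_powered = ppv(alpha=0.05, power=0.2, prior_odds=0.1)
print(round(well_powered, 3), round(under_powered, 3))
```

With low power and long-shot hypotheses, the PPV drops well below one half, which is the sense in which most "true" results can be wrong.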

Page 14:

Findings are less likely to be true (for a given field), i.e. have a low positive predictive value (PPV), the:

1) smaller the studies conducted (low N, sparse data).

2) smaller the effect size (small 1 – β, low sensitivity).

3) greater the number of potential relationships.

4) greater the flexibility of designs and analytical modes.

SOLUTION: large studies with low levels of bias (e.g. post-modernist envy).

* easy to suggest, harder to do.

What are biases that affect the practice of science?

Ioannidis, J.P.A. Why Most Published Research Findings Are False. PLoS Medicine, 2(8), e124 (2005).

“Incestuous Amplification” Effect: a small set of ideas is perpetually circulated among people in a certain field or social group without external feedback.

Inattentional Blindness: biases lead to an inability to notice features of data or theory which otherwise would be obvious.

“When objective information is low, follow the herd”: absent informed dissent or debate, argumentum ad populum (popular but incorrect ideas) tends to predominate.

Page 17:

Let’s be random…….

Number 8, Jackson Pollock

Random Walk algorithm, 1000 iterations
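
A random walk of 1000 iterations, as in the caption above, might look like the sketch below. The slide does not say which variant was used, so the 2D lattice with four unit moves, the function name, and the seed are all assumptions.

```python
import random


def random_walk_2d(steps: int = 1000, seed=None):
    """Return the list of (x, y) positions visited by a simple 2D lattice walk."""
    rng = random.Random(seed)
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # east, west, north, south
    x, y = 0, 0
    path = [(x, y)]
    for _ in range(steps):
        dx, dy = rng.choice(moves)
        x, y = x + dx, y + dy
        path.append((x, y))
    return path


path = random_walk_2d(steps=1000, seed=42)
print(len(path), path[-1])
```

Plotting `path` with any line-drawing tool gives the kind of tangled trace that invites comparison with the painting.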

Now let’s be quasi-periodic….

Figure 1, Journal of Sound and Vibration, 330(11), 2565–2579 (2011)

Quasi-periodic Crystals

COURTESY: Wired Science

Replicates Have Information Content, H(x)

1) Low variance between replicates, low H(x).

2) High variance between replicates, high H(x).

* other meaningful, useful patterns beyond tests of the null hypothesis.
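
One way to make the link between replicate variance and H(x) concrete is to bin the replicate values and compute Shannon entropy over the bins. The slides do not define H(x) further, so the choice of Shannon entropy, the binning scheme, the bin width, and the replicate values below are all assumptions.

```python
import math
from collections import Counter


def shannon_entropy(values, bin_width: float = 1.0) -> float:
    """Shannon entropy (bits) of replicate measurements after binning."""
    if not values:
        raise ValueError("need at least one replicate")
    bins = Counter(math.floor(v / bin_width) for v in values)
    n = len(values)
    return sum((c / n) * math.log2(n / c) for c in bins.values())


# Hypothetical replicate measurements, invented for illustration.
tight_replicates = [10.1, 10.2, 10.3, 10.4]    # low variance
spread_replicates = [2.5, 7.1, 10.3, 14.8]     # high variance

print(shannon_entropy(tight_replicates))   # all values share one bin
print(shannon_entropy(spread_replicates))  # each value in its own bin
```

The result depends on `bin_width`, so comparisons between experiments only make sense when the same binning is used for both.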

Page 18:

Hmmmm…….it turns out I’m pretty good at science. Perhaps it’s the phenomenon after all!

Page 19:

WHY? Humans and Mice are phylogenetically related, and share much of the same genomic content.

Kolata, G. Mice Fall Short as Test Subjects for Humans’ Deadly Ills. NYT, February 11 (2013).

See paper: "Genomic responses in mouse models poorly mimic human inflammatory diseases". PNAS, doi:10.1073/pnas.1222878110.

Page 20:

Gene expression correlations are highest within humans, low within mice, and lowest when comparing humans with mice.

MICROARRAY: Human “burn” and “trauma” are more closely related than mouse “burn” and “trauma”.

FROM: "Genomic responses in mouse models poorly mimic human inflammatory diseases". PNAS, doi:10.1073/pnas.1222878110.
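
The kind of comparison described above can be sketched as a Pearson correlation between expression-change vectors for matched genes. The numbers below are invented purely to show the calculation and are not data from the PNAS paper; the variable names are hypothetical.

```python
import math


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented log fold-changes for five matched genes (illustration only).
human_burn = [2.0, -1.0, 0.5, 1.5, -0.5]
human_trauma = [1.8, -0.9, 0.6, 1.2, -0.4]
mouse_burn = [0.3, 1.0, -0.8, 0.2, 0.9]

print("human burn vs. human trauma:", pearson(human_burn, human_trauma))
print("human burn vs. mouse burn:  ", pearson(human_burn, mouse_burn))
```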

Page 21:

Why would there be such a massive difference between humans and mice (shared evolutionary history, highly homologous genomes)?

BLACK BOX ARGUMENT: Multiple layers of physiological regulation explain much of the variance.

1) Epigenetics has many subtle effects.

2) Endless “–omes”.

Satisfying?

COMPLEXITY ARGUMENT: Generative nature of gene expression explains much of the variance.

1) One gene, many products.

2) Genes of large effect and number of genes involved.

Satisfying?

NOISE ARGUMENT: Noise (variation) in gene expression explains much of the variance.

1) Useful information can be generated from (or lost to) fluctuations.

2) Synchronized noise is good, white noise is bad.

Satisfying?

Page 22:

Ramsden, E. Model Organisms and Model Environments: A Rodent Laboratory in Science, Medicine and Society. Medical History, 55, 365–368 (2011).

Surprising and unexpected elements of model organisms:

* does standardization of environment (e.g. social settings, cages, diet) = a standard result (e.g. replication)?

Wikgren, J. et al. Selective breeding for endurance running capacity affects cognitive but not motor learning in rats. Physiology and Behavior, 106, 95–100 (2012).

Does artificial selection (e.g. selective breeding) affect the response of laboratory animals?

Francis, G. The Psychology of Replication and Replication in Psychology. Perspectives on Psychological Science, 7(6), 585–594 (2012).

Is there a “psychology” of replication (e.g. a bias for results and settings that make results more replicable but less generalizable and informative)?

Page 23:

WHY? Humans and Mice are phylogenetically related, and share much of the same genomic content.

Kolata, G. Mice Fall Short as Test Subjects for Humans’ Deadly Ills. NYT, February 11 (2013).

See paper: "Genomic responses in mouse models poorly mimic human inflammatory diseases". PNAS, doi:10.1073/pnas.1222878110.

My Interpretation

[Figure: mode of function in Homo sapiens and mode of function in Mus musculus, with pathways labeled EXACT and VARIATIONAL.]

Page 24:

EXACT: single path to activating physiological response. System not robust to perturbation.

VARIATIONAL: several alternate pathways, all are effective. System can be robust to perturbation (one route blocked, take the alternate with minimal cost).

SINGLE ROUTE (TOP): 4,373 km.

VARIATIONAL ROUTES (BOTTOM): 4,373 km (left); 4,551 km (right).

Choice of route depends on weather, topography, etc.
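
The EXACT vs. VARIATIONAL distinction can be sketched as route selection under perturbation, using the distances from the map example. The function name, the route labels, and the idea of modeling a perturbation as a set of blocked routes are assumptions made for illustration.

```python
def best_route(routes: dict, blocked: set):
    """Return (name, km) of the shortest unblocked route, or None if all are blocked."""
    open_routes = {name: km for name, km in routes.items() if name not in blocked}
    if not open_routes:
        return None  # system fails: no alternative pathway is available
    return min(open_routes.items(), key=lambda item: item[1])


# Distances taken from the slide; route names are hypothetical labels.
exact_system = {"single": 4373}
variational_system = {"left": 4373, "right": 4551}

print(best_route(exact_system, blocked={"single"}))      # no route survives
print(best_route(variational_system, blocked={"left"}))  # falls back to the alternate
```

In this toy version, the cost of robustness is the extra 178 km on the alternate route, which matches the "minimal cost" framing of the variational case.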

Page 25:

[Figure: pathways from A to B in four panels, organized by EXACT vs. VARIATIONAL and by CONSERVED (HIGHLY HOMOLOGOUS) vs. DIVERGENT (NOT HIGHLY HOMOLOGOUS). Panels: EXACT CONSERVED, EXACT DIVERGENT, VARIATIONAL DIVERGENT, VARIATIONAL CONSERVED.]

Interaction between evolution of pathways and function of pathways:

* degeneracy: signals and receptors become more promiscuous over evolutionary time (enables further complexity). Synthetic Daisies post.

* what is the role of diversity within species? Unknown (explains within species, between function gene expression outcomes).

Page 26:

TWO OTHER EXAMPLES (Role of Evolutionary Conservation in Model Organisms)

Page 27:

COURTESY: Figure 2, Longo and Fabrizio.

REGULATION OF STRESS AND LONGEVITY

Longo and Fabrizio, CMLS: Cellular and Molecular Life Sciences, 59, 903–908 (2002).

YEAST

WORMS

HUMANS

Similar set of proteins regulated by growth factors

General downregulation of IGF-1 pathway

Stress resistance pathways, switch from reproductive to non-reproductive phase, evolved to induce longevity, cellular maintenance.

Unknown if same factors and pathways are involved, or if they have a common ancestry.

Aging modulated by a simple, coarse-grained intervention: caloric restriction.

Conserved genes modulate longevity in fruit flies, but may not translate into a conserved mechanism.

Page 28:

Howe et al. The zebrafish reference genome sequence and its relationship to the human genome. Nature, doi:10.1038/nature12111 (2013).

Figure 3, Howe et al. (2013)

A: Overlap between species = number of orthologues (copies of gene, not genes themselves) at the time of their phylogenetic split.

B: Relationship among “ohnologues”: TSD (teleost-specific genome duplication)-related genes.

How can we better assess the parallels and differences between Zebrafish (an NIH-approved model organism) and Humans?

Page 29:

What is experimental replication (the big picture)?

What is experimental replication (the bigger picture)?

Replication as generative model:

TREATMENTS (combinatorial input) → BLACK BOX (incompletely-known mechanism) → Range of Outcomes

Experiments populate a prior probability distribution:

* distribution can never truly be known (only estimated).

* may be highly complex (non-Gaussian, multimodal).

* algorithmic techniques might help find best approximations (or priors).
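
The generative view of replication can be sketched as follows: enumerate combinatorial treatments, pass each through a stand-in black box, and estimate the outcome distribution from repeated runs. Everything here is hypothetical (the treatment factors, the `black_box` mechanism, the noise level, and the histogram bin width); it only illustrates that the distribution is estimated from samples and can be multimodal.

```python
import itertools
import random
from collections import Counter


def black_box(treatment, rng):
    """Stand-in for an incompletely known mechanism (purely hypothetical)."""
    factor, dose = treatment
    center = 10.0 if factor == "A" else 20.0   # two factors give two modes
    return center + dose + rng.gauss(0.0, 1.0)  # replicate-to-replicate noise


def empirical_distribution(outcomes, bin_width=5.0):
    """Histogram estimate: maps each bin's lower edge to its relative frequency."""
    counts = Counter(int(o // bin_width) * bin_width for o in outcomes)
    n = len(outcomes)
    return {edge: count / n for edge, count in sorted(counts.items())}


rng = random.Random(0)
treatments = list(itertools.product(["A", "B"], [0.0, 1.0, 2.0]))  # combinatorial input
outcomes = [black_box(t, rng) for t in treatments for _ in range(50)]
estimate = empirical_distribution(outcomes)

for edge, freq in estimate.items():
    print(f"{edge:6.1f}  {freq:.3f}")
```

A histogram is the crudest possible estimator; the "algorithmic techniques" mentioned above would replace `empirical_distribution` with something better suited to non-Gaussian shapes.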