Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith)...

61
Causality Assessment with Mul1ple Time Series Data Cur1s A. Bagne, Ph.D. – DataSpeaks, Inc. Brian D. Athey, Ph.D., Chair, Department of Computa1onal Medicine and Bioinforma1cs, University of Michigan Edward Barbour, The Rockefeller University Walter Meixner, UM Alex Ade, UM 6/18/12 1 UM & DataSpeaks

Transcript of Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith)...

Page 1: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Causality  Assessment  with  Mul1ple  Time  Series  Data  

Cur1s  A.  Bagne,  Ph.D.  –  DataSpeaks,  Inc.  Brian  D.  Athey,  Ph.D.,  Chair,  Department  of  Computa1onal  

Medicine  and  Bioinforma1cs,  University  of  Michigan  Edward  Barbour,  The  Rockefeller  University  

Walter  Meixner,  U-­‐M  Alex  Ade,  U-­‐M  

6/18/12   1  U-­‐M  &  DataSpeaks  

Page 2: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Intent  

•  DataSpeaks,  Inc.,  an  early  stage  growth  company,  offers  a  uniquely  digital  measurement-­‐by-­‐computa5on  tool  embodied  in  soRware  

•  It  offers  a  disrup1ve  measurement  science  solu1on  •  Collaborate  (research,  grants,  publica1ons,  etc.)  to:    

–  Help  you  –  DCM&B,  U-­‐M,  etc.  –  Advance  DataSpeaks  –  Help  solve  the  (i)  bench  to  bedside  and  (ii)  clinical  science  to  clinical  prac1ce  transla1on  problems  

–  Advance  P4  Medicine  

6/18/12   2  U-­‐M  &  DataSpeaks  

Page 3: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Causality  Assessment  •  Central  to  scien1fic  understanding  •  Focus  on  Complex  Dynamic  and  Adap1ve  Systems  (CDAS)  

•  Essen1al  for  basic  and  applied  science  (e.g.,  medicine)  •  Two  contexts  for  DataSpeaks’  computa1onal  method  

1.  Without  randomized  experimental  control  •  HeLa  cell  cycle  control  –  a  human  cancer  cell  line  since  1951  •  Cancer  –  a  failure  of  cell  cycle  control  

2.  With  randomized  experimental  control  •  Frequen1st  and  Bayesian  sta1s1cs  not  sufficient    •  Need  Ultra  RCTs  (Randomized  Controlled  Trials)  with  

DataSpeaks  and  sta1s1cs  

6/18/12   3  U-­‐M  &  DataSpeaks  

Page 4: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Time  Series  Data  •  Here  a  shorthand  for  “periodic  1me-­‐ordered  data”  •  Inclusive  of:  

–  Repeated  measurements  (<  20  repeats)  –  Time  series  (20  or  more  repeats)  –  Two  or  more  repeats,  hopefully  many  more  

•  Contrasts  with  cross-­‐sec1onal  data  (including  change  scores)  

•  Time  series  can  provide  orders  of  magnitude  more  informa1on  to  understand  individuals  scien5fically  

•  Periodicity  helpful  for  understanding  temporal  dynamics  

6/18/12   4  U-­‐M  &  DataSpeaks  

Page 5: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

The  Great  Bohleneck:  We  Need  Beher  Computa1onal  Tools  

Outputs:  •   P4  Medicine  •   Safer  medicine  •   Beher  and  more  affordable  health  and  healthcare  •   Beher  drugs  to  market  faster  •   New  drug  indica1ons  

Inputs:  •   Omic  sciences  •   Data  collec1on        technologies  •   IT  infrastructure  • EHR  

6/18/12   5  U-­‐M  &  DataSpeaks  

Page 6: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

What  problem  could  not  be  done  beher    with  mul1ple  1me  series?  

6/18/12   6  U-­‐M  &  DataSpeaks  

Page 7: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

-2

-1.5

-1

-0.5

0

0.5

1

1.5

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

Series1Series2Series3Series4Series5Series6Series7Series8Series9CCNA2  

CDC2  CDKN3  CCNE1  PCNA  CCNB1  RFC4  E2F1  CDC6  

HeLa  Cell  Data  

•  47  repeated  hourly  measurements  •  About  three  cell  cycles  •  Synchrony  deteriorates  over  1me  •  Goal:  Elucidate  temporal  causal  networks  6/18/12   7  U-­‐M  &  DataSpeaks  

Time (hours)

Norm

alized Log2 (C

y5/Cy3) ratio

Page 8: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

HeLa  Cell  Data:  Source  

We  acknowledge  George  Michailidis  for  providing  these  HeLa  cell  data.  

6/18/12   8  U-­‐M  &  DataSpeaks  

Page 9: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Granger  Causality  •  If  “a  signal  X1  "Granger-­‐causes"  (or  "G-­‐causes")  a  signal  X2,  then  past  values  of  X1  should  contain  informa1on  that  helps  predict  X2  above  and  beyond  the  informa1on  contained  in  past  values  of  X2  alone.”  

•  “Its  mathema1cal  formula1on  is  based  on  linear  regression  modeling  of  stochas1c  processes.”  

•  This  won  Clive  Granger  the  2003  Nobel  Prize  in  Economics  

•  One  applica1on  of  DataSpeaks  is  a  measurement  alterna1ve  to  Granger  causality,  not  a  varia1on  

6/18/12   9  U-­‐M  &  DataSpeaks  

Page 10: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

HeLa  Cell  Data:  History  

•  Whinield,  M.  et  al  uses  clustering  methods  •  Same  HeLa  cell  data  subject  of  other  publica1ons  with  various  methods  in  

addi1on  to  Shojaie/Michailidis  –  Lozano  et  al  -­‐  2009  (Grouped  Graphical  Granger  Modeling)  –  Sambo  et  al  –  2008  (CNET)  –  Sacchi  et  al  –  2007  (Precedence  Temporal  Networks)  

6/18/12   10  U-­‐M  &  DataSpeaks  

Page 11: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

DataSpeaks  

•  Offers  uniquely  digital  measurement-­‐by-­‐computa1on  algorithm  embodied  in  soRware  

•  MQALA  (Method  for  the  Quan1ta1ve  Analysis  of  Longitudinal  Associa1ons)  

•  Applies  to  mul1ple  1me  series  data  •  Measures  the  (i)  amount,  (ii)  posi1ve  or  nega1ve  direc1on  and  (iii)  strength  of  evidence  for  coordina1on  of  ac1on  

6/18/12   11  U-­‐M  &  DataSpeaks  

Page 12: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

DataSpeaks,  con1nued  •  Coordina1on  scores  (CS)  describe  and  help  predict  how  individual  CDAS  

work  over  1me  –  Func1on  internally  (e.g.,  HeLa  cell  cycle  control)  –  Respond  to  environments  including  treatments  (e.g.,  Ultra  RCTs)  –  Act  as  agents  on  their  environments  (neglected  stepchild  of  science)  

•  Bagne  holds  two  issued  U.S.  soRware  patents  –  6,317,700  –  Method  and  System  to  Perform  Empirical  Induc1on  –  6,516,288  –  Method  and  System  to  Construct  Ac1on  Coordina1on  

Profiles  •  SoRware  available  as  a  func1onal  prototype,  open  to  addi1onal  

collabora1on  and  project  discussions  •  Dennis  Nash  heads  business  development  [email protected]    •  Collabora1ng  with  U-­‐M  through  Brian  Athey  

6/18/12   12  U-­‐M  &  DataSpeaks  

Page 13: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

DataSpeaks’  Approach:  Causality  Assessment  with  HeLa  Cell  Cycle  Control  Data  

•  Nine  genes,  cancer  ac1vators  •  72  pair-­‐wise  direc1onal  interac1ons  over  1me            (e.g.,  CDC6èCDC2,  PCNAèE2F1)                [è  operates  on]            [independent  variable  (IV)  è  dependent  variable  (DV)]  •  36  pairs  (e.g.,  PCNAèCDC2,  CDC2èPCNA;  CDKN3èCDC2,  

CDC2èCDKN3)  •  Measure  temporal  asymmetries  within  each  pair  using  

DataSpeaks’  Coordina1on  Scores  (CS)  •  Causes  must  come  before  effects  –  the  temporal  criterion  of  

causal  rela1onships  

6/18/12   13  U-­‐M  &  DataSpeaks  

Page 14: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

“Known”  Network  

•  “Known”  network  from  Sambo,  et  al  2008  •  Extracted  from  www.thebiogrid.org    •  Cited  by  Shojaie/Michailidis  as  “Known  Regulatory  Network”  

6/18/12   14  U-­‐M  &  DataSpeaks  

Page 15: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

E2F1èCDC6,  CDC6èE2F1  Results    

6/18/12   15  U-­‐M  &  DataSpeaks  

Time (hours)

Norm

alized Log2 (C

y5/cy3)

Page 16: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Meaning  of  Coordina1on  Scores  •  Coordina1on  -­‐  “harmonious  func1oning  of  parts  for  effec1ve  results”    •  The  CS  at  Delay  =  0  in  the  previous  slide  is  25.123    •  CSs  are  in  standard  devia1on  units  •  25.123  is  one  score  in  the  following  distribu1on  of  poten1al  scores  

Observed CS

•       We  will  introduce  you  to  compu1ng  these  scores  aRer  several  more  results  

Poten1al  Coordina1on  Score  (standard  devia1on  units)  

Chance  Probability    Used  for  Standardiza1on  

6/18/12   16  U-­‐M  &  DataSpeaks  

Page 17: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Delay = 0

Delay = 5 Delay = 4

Delay = 3 Delay = 2 Delay = 1

Delay = 6

E2F1èCDC6,  Results    

6/18/12   17  U-­‐M  &  DataSpeaks  

Page 18: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

CDC6èE2F1  Results    

Delay = 0

Delay = 5

Delay = 2 Delay = 1

Delay = 6 Delay = 4

Delay = 3

6/18/12   18  U-­‐M  &  DataSpeaks  

Page 19: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

E2F1èCDC6,  CDC6èE2F1  Results    

6/18/12   19  U-­‐M  &  DataSpeaks  

•  Not a “known” interaction •  Apt to be clustered together but not causal •  DataSpeaks - almost no evidence of causality (green line) •  Not identified by Shojaie/Michailidis

Time (hours)

Norm

alized Log2 (C

y5/cy3)

Page 20: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

PCNAèCDKN3,  CDKN3èPCNA  Results    

6/18/12   20  U-­‐M  &  DataSpeaks  

•  Not a “known” interaction •  DataSpeaks - almost no evidence of causality (green line) •  Not identified by Shojaie/Michailidis

Time (hours)

Norm

alized Log2 (C

y5/cy3)

Page 21: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

PCNAèCDC2,  CDC2èPCNA  Results  

6/18/12   21  U-­‐M  &  DataSpeaks  

Time (hours)

Norm

alized Log2 (C

y5/cy3)

Page 22: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

6/18/12   22  U-­‐M  &  DataSpeaks  

Delay = 0

Delay = 5

Delay = 2 Delay = 1

Delay = 6 Delay = 4

Delay = 3

Page 23: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

PCNAèCDC2,  CDC2èPCNA  Results  

6/18/12   23  U-­‐M  &  DataSpeaks  

•  This is a “known” interaction •  DataSpeaks identifies strong evidence for causality •  Not detected by Granger causality as reported by Shojaie/Michailidis

Time (hours)

Norm

alized Log2 (C

y5/cy3)

Page 24: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Results  Summary:    Comparison  by  Method  

•  This  is  summarized  in  a  handout  

6/18/12   24  U-­‐M  &  DataSpeaks  

Page 25: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

-40

-20

0

20

40

60

80

100

GnRH®LHLH®GnRHP. C. Index

0 1 2 3 4 6 5

Hormone  Data  Example  

6/18/12   25  U-­‐M  &  DataSpeaks  

Page 26: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Major  Steps  to  Compute  and  Use  DataSpeaks’  CS  Scores  

1.  Op1onal  data  pre-­‐processing  2.  Convert  each  dimensional  series  into  a  set  of  digital  series  3.  Form  addi1onal  digital  series  to  account  for  delay,  

persistence,  episodes,  Boolean  events  4.  Cross-­‐classify  each  digital  series  for  an  IV  or  predictor  

variable  with  each  digital  series  for  for  an  DV  or  predicted  variable  to  form  arrays  of  2x2  tables  

5.  Compute  a  raw  CS  for  each  2x2  table  6.  Compute  a  standardized  CS  for  each  2x2  table  7.  Summarize  standardized  CS  scores  8.  Analyze  results  sta1s1cally  when  there  are  groups  of  two  

or  more  individuals  

6/18/12   26  U-­‐M  &  DataSpeaks  

Page 27: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Step  2:  Digi1za1on  

•  One  key  to  DataSpeaks’  capabili1es  •  Required  when  a  1me  series  has  more  than  two  levels  

•  Digi5za5on  has  poten5al  to  be  as  valuable  for  understanding  individuals  scien5fically  with  5me  series  as  digi5za5on  has  been  for  photography  and  communica5ons.  

•  Individuals  can  be  cells,  brains,  people,  whole  popula1ons,  economies,  Earth’s  biosphere,  etc.  

6/18/12   27  U-­‐M  &  DataSpeaks  

Page 28: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Digi1za1on  Summary  for  CDC2  

Temporal  Resolu-on  

CDC2

-2

-1.5

-1

-0.5

0

0.5

1

1.5

1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 CDC2

CutPoints  

0.7 1000000111100000000000110000000000000000000000000 0.6 1000001111100000000000111000000000110000000000000 0.5 1000001111100000000011111000000000110000000000000 0.3 1100111111100000000011111000000000111110000000000 0.2 1110111111100000000011111100000000111110000000000 0.1 1111111111100000000111111100000001111111000000000 -0.1 1111111111100000001111111100000011111111010000001 -0.3 1111111111110000001111111100000011111111110000111 -0.5 1111111111110000001111111110000011111111111010111 -0.6 1111111111110000011111111110001111111111111110111 -0.7 1111111111110000011111111110011111111111111111111

Dimen

sion

al  Resolu-

on  

-0.723 111111111110000011111111110011111111111111111111 -0.592 111111111110000011111111110001111111111111110111 -0.461 111111111110000001111111110000011111111111010111 -0.331 111111111110000001111111100000011111111110000111 -0.200 111111111100000001111111100000011111111110000101 -0.070 111111111100000001111111100000011111111010000001 0.061 111111111100000000111111100000001111111000000000 0.192 110111111100000000011111100000000111110000000000

0.322 100111111100000000011111000000000111110000000000 0.453 000001111100000000011111000000000110000000000000 0.584 000001111100000000000111000000000110000000000000 0.714 000000111100000000000110000000000000000000000000

“Action Digigram”

6/18/12   28  U-­‐M  &  DataSpeaks  

Page 29: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Value  of  High  Resolu1on  –    Temporal  and  Dimensional  

•  Beher  resolu1on  of  DataSpeaks  for  understanding  dynamic  temporal  phenomena  is  related  to  beher  resolu1on  in  digital  photography.  

•  An  aspect  of  DataSpeaks’  data  microscope  –  see  paherns  that  have  never  been  seen  before  

•  Biggest  gains  apt  to  come  from  improving  temporal  resolu1on.  

•  Sta5s5cs  gains  power  with  more  subjects.  •  DataSpeaks  gains  power  with  more  repeated  measurements.  

•  Big  Data  –  Really  gain  power  with  more  subjects  AND  more  repeated  measurements  AND  higher  resolu1on.  

6/18/12   29  U-­‐M  &  DataSpeaks  

Page 30: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Step  3:  Form  Addi1onal  Digital  Series  –    e.g.,  Delay  (D)  and  Persistence  (P)  

6/18/12   30  U-­‐M  &  DataSpeaks  

Page 31: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

RFC4èCDC2,  CDC2èRFC4  Delay  and  Persistence    

6/18/12   31  U-­‐M  &  DataSpeaks  

RFC4 w P 2, 3 CDC2 with P 2, 3

PC Index w P 2, 3

Page 32: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Capabili1es  to  Define  Digital  Events:  DataSpeaks’  SoRware  Currently  

•  Independent  events  –  12  levels  of  dimensional  resolu1on  –  7  levels  of  Delay,  0  –  6  –  5  levels  of  Persistence,  1  –  5  –  36  combina1ons  of  Episode  Length  and  Episode  Criterion  –  15,120  (12x7x5x36)  total  combina1ons  

•  Dependent  events  –  12  levels  of  dimensional  resolu1on  –  36  combina1ons  of  Episode  Length  and  Episode  Criterion  –  432  (12x36)  total  combina1ons  

•  Superb  for  pahern  finding  

6/18/12   32  U-­‐M  &  DataSpeaks  

Page 33: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Boolean  Independent  Events    for  Complexity  

•  Assess  mul5ple  causes,  e.g.  mul1ple  levels  of:  –  Gene  ac1vity,  proteins,  lipids,  carbohydrates,  metabolites  –  Electrophysiological  variables,  brain  ac1vity  –  Drugs  –  an  alterna1ve  to  usual  way  of  inves1ga1ng  drug/drug  interac1ons  

•  Also  can  do  Boolean  dependent  events  for  syndromes  –  Mul1ple  signs  and  symptoms  of  disease  and  disorder  

•  An  an1dote  for  reduc1onism?  

6/18/12   33  U-­‐M  &  DataSpeaks  

Page 34: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Example:  Boolean  Independent  Events  

•  PCNA_0.2_1_1_0_1                                  111110000000000011110000000000011100000000000111                    

•  CCNA2_0.1_1_1_0_1                              000001111111000000110111110000000111111101000001  

•  PCNA_0.2_1_1_0_1  AND  CCNA2_0.1_1_1_0_1  000000000000000000110000000000000100000000000001                    

•  PCNA_0.2_1_1_0_1  OR  CCNA2_0.1_1_1_0_1    111111111111000011110111110000011111111101000111                    

 

6/18/12   34  U-­‐M  &  DataSpeaks  

Page 35: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Step  4:  Cross-­‐Classify  Digital  Series  to    Form  an  Array  of  2x2  Tables  

•  Cross-­‐classify  each  digital  series  for  an  IV  with  each  digital  series  for  a  DV    

     •  Example:  •  PCNA_-­‐0.4_1_1_0_1  111111100000111111111111000111111111111111111111                  •  CDC2_0.1_1_1                        110111111100000000111111100000011111111010000101    •                                                                               aacaaaabbbddccccccaaaaaabddccccaaaaaaaacaccccaca    •  Yields  3-­‐D  array  of  12x12x7=1008  2x2  tables  for  each  gene  pair  for  these  

examples  

6/18/12   35  U-­‐M  &  DataSpeaks  

Page 36: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Step  5:  Compute    Raw  Coordina1on  Scores  

•  Start  with  observed  2x2  table  •  Compute  chi  square  •  Set  sign  

– Compute  expected  value  of  a,  E(a)  –  If  observed  value  of  a,  O(a),  is  <  E(a),  then  raw  CS  is  nega1ve  chi  square  to  indicate  nega1ve  CS  

–  If  O(a)  is  >  E(a),  then  raw  CS  is  posi1ve  chi  square  to  indicate  posi1ve  CS  

6/18/12   36  U-­‐M  &  DataSpeaks  

Page 37: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Step  6:  Compute      Coordina1on  Scores  

•  CSs  must  be  standardized  •  DataSpeaks  standardizes  these  scores  to  have  an  expected  value  of  0  and  a  standard  devia1on  of  1.  

•  How?  What  is  the  trick?  

6/18/12   37  U-­‐M  &  DataSpeaks  

Page 38: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Standardiza1on  Example        Raw  Score  P(Raw)    Score            Standardized  Score  

19  8    -­‐7.466667  0.00588335    -­‐4.363133  21  0  20  7    -­‐3.809524  0.0494201    -­‐2.231321  20  1  21  6    -­‐1.371429  0.164734    -­‐0.810112  19  2  22  5    -­‐0.152381  0.284540    -­‐0.099508  18  3  23  4    0.152381  0.278354    0.078143  17  4  24  3    1.371429  0.157734    0.788747  16  5  25  2    3.809524  0.0504749    2.209955  15  6  26  1    7.466667  0.00832004    4.341767  14  7  27  0    12.342857  0.000539262  7.184184  13  8  

6/18/12   38  U-­‐M  &  DataSpeaks  

Page 39: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Step  7:  Summarize  Standardized    Coordina1on  Scores  

•  Select  CS  score  with  the  highest  absolute  value  to  summarize  – Each  whole  array  – Any  array  dimension  (e.g.,  IV  level,  DV  level,  delay,  persistence,  episode  length,  episode  criterion)  

– Any  combina1on  of  dimensions  

•  Loca1ons  of  summary  scores  iden1fy  condi1ons  that  yield  most  evidence  for  coordina1on  

6/18/12   39  U-­‐M  &  DataSpeaks  

Page 40: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

CSs  Summarized  as    Func1on  of  IV  Level  at  Delay  =  0  

•  Does  DataSpeaks  address  non-­‐linearity?  

0

5

10

15

20

25

30

-0.6 -0.4 -0.2 0 0.2 0.4 0.6

Independent Variable Level: E2F1

Scor

es

CDC6

6/18/12   40  U-­‐M  &  DataSpeaks  

Page 41: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Step  8:  Analyze  Coordina1on  Scores  Sta1s1cally  

•  Applies  when  there  is  more  than  one  individual  •  Describe  and  compare  groups  •  Make  inferences  from  samples  of  individuals  to  popula1ons  

•  Iden1fy  gene1c  and  other  predictors  of  disorder,  treatment  response,  and  differen1al  dose  requirements  

•  Iden1fy  CS  factors  to  reduce  dimensionality  •  Inform  the  development  of  mathema1cal  and  sta1s1cal  models  

6/18/12   41  U-­‐M  &  DataSpeaks  

Page 42: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

DataSpeaks  Elucidates  Mechanisms  

•  For  example,  grow  any  of  over  100  cell  lines  under  different  environmental  condi1ons  – Temperatures  – Different  cell  culture  media  reagents  and  supplements    

– Actual  or  poten1al  an1-­‐cancer  drugs  in  media  

•  Visualize  detailed  informa1on  about  how  coordina1on  might  be  affected  

6/18/12   42  U-­‐M  &  DataSpeaks  

Page 43: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)
Page 44: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)
Page 45: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)
Page 46: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)
Page 47: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)
Page 48: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)
Page 49: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

DataSpeaks  Will  Advance    Medical  Diagnosis  

•  Many  chronic  health  problems  appear  to  be  disorders  of  coordinated  ac5on  at  various  levels  –  Biological  (proteins,  lipids,  carbohydrates,  metabolites,  electrophysiological,  etc.)  

–  Psychological  (mental  and  physical  behavior)  –  Social  (e.g.,  family  and  work  role  performance)  

•  Coordinated  ac1on  is  an  emergent  system  property  

•  Dream  project:  Apply  DataSpeaks  to  BOLD  fMRI  data  to  visualize  disorders  of  func1onal  connec1vity  between  and  among  brain  regions  

6/18/12   49  U-­‐M  &  DataSpeaks  

Page 50: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Let’s  Rock  the  RCT  Design  Boat  

•     Combine  DataSpeaks  with  randomized  experimental  control  exercised  over  5me  for  individuals  •     Tac1cal  win  now  for  drug  rescue  and  repurposing  •     Poten1al  to  become  new  gold  standard  for  many  RCTs  when  drugs  are  developed  and  used  to  manage  or  control  chronic  health  disorders  •     Use  Ultra  RCT  designs  and  DataSpeaks’  SoRware  as  a  Service  (SaaS)  to  measure  benefit  and  harm  over  1me  and  across  response  variables  for  each  individual  •     Es1mate  10%  of  the  cost  and  50%  faster  and  1000  1mes  safer  !!!  

 6/18/12   50  U-­‐M  &  DataSpeaks  

Page 51: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

“Houston,  we  have  a  problem”  

6/18/12   51  U-­‐M  &  DataSpeaks  

Page 52: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Houston,  we  have  another  problem  

•  Drug  safety  problem  – About  100,000  deaths  and  2,000,000  hospitaliza1ons  per  year  in  U.S.  

– Vioxx,  Bextra  –  product  withdrawals  – Mul1  billion  dollar  legal  liability  due  largely  to  safety  problems  that  derive  from  weak  science  that  leaves  room  for  error  and  bad  behavior  

– PfizerèNCRC    

6/18/12   52  U-­‐M  &  DataSpeaks  

Page 53: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Diagnose  Scien5fic  Causes  of    RCT  Design  Problem  

•  Current  first-­‐genera1on  RCT  designs,  which  are  for  causality  assessment,  date  back  to  the  1940s  –  Streptomycin  –  ENIAC    

•  Four  types  of  confounding  –  Individuality  with  measurement  error  –  True  responders  with  placebo  responders  – Dose  with  type  of  treatment  –  Treatment  effects  with  how  they  are  valued  

•  Too  much  data  is  going  to  waste  

6/18/12   53  U-­‐M  &  DataSpeaks  

Page 54: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Elephants  in  the  RCT  Design  Room  

6/18/12   54  U-­‐M  &  DataSpeaks  

Page 55: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Ultra  RCT  Example,  Mock  Data  

6/18/12   55  U-­‐M  &  DataSpeaks  

Page 56: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Get  Doses  Right  

6/18/12   56  U-­‐M  &  DataSpeaks  

Page 57: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Harmonize  P4  Medicine  with  Public  Health  Approach  to  Medicine  

6/18/12   57  U-­‐M  &  DataSpeaks  

Page 58: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

What  problem  could  not  be  done  beher    with  mul1ple  1me  series?    Forensic  iden1fica1on  with  DNA.    Most  problems  involving  living  systems,    which  are  CDAS  that  manifest  emergent    system  proper1es,  can  be  done  beher    with  1me  series  and  DataSpeaks’  soRware.      

6/18/12   58  U-­‐M  &  DataSpeaks  

Page 59: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Next  Steps,  Collaborate?  

•  Publica1ons  – hhp://dataspeaks.com/resources/APA-­‐JCCP-­‐1992-­‐Vol60-­‐No2-­‐P225-­‐239.pdf    

– Patents  – Need  new,  co-­‐authored,  peer-­‐reviewed  publica1on  with  a  version  of  the  algorithm  that  is  

•   More  mathema1cally  elegant  •   Computa1onally  efficient?  •   Illustrated  in  a  par1cular  scien1fic  context  

•  Faculty  and  Graduate  Student  Projects  6/18/12   59  U-­‐M  &  DataSpeaks  

Page 60: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Next  Steps,  Collaborate?  •  SoRware  development  •  In  silico  simula1ons  as  with  Ultra  RCTs  •  J&J,  tranSMART?  •  Grants,  e.g.,  SBIR  

–  Phenotyping  Pain  Treatment  Responses  (NIH  PhenX  Project)  

–  Drug  Rescue  by  Data  Rescue  from  RCTs    –  Diagnosing  Func1onal  Brain  Disorders  with  a  New  Algorithm  for  Processing  Func1onal  Brain  Imaging  Data    

•  Commercializa1on  

6/18/12   60  U-­‐M  &  DataSpeaks  

Page 61: Causality)Assessmentwith) Mul1ple)Time)Series)Data · Causality)Assessmentwith) Mul1ple)Time)Series)Data Cur1s)A.)Bagne,)Ph.D.)–DataSpeaks,)Inc.) Brian)D.)Athey,)Ph.D.,)Chair,)Departmentof)Computaonal)

Thank  you  

•  Please  contact  Curt  Bagne    – [email protected]  – 248  952-­‐1968  (home  phone)  – 2971  Vineyards  Drive,  Troy,  MI  48098  

6/18/12   61  U-­‐M  &  DataSpeaks