Fixing the leaks in the pipeline from public genomics data to the clinic
Uploaded by jtleek. Category: Health & Medicine
Transcript of Fixing the leaks in the pipeline from public genomics data to the clinic
![Page 1: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/1.jpg)
fixing the leaks in the genomics
![Page 2: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/2.jpg)
http://jhudatascience.org/
![Page 3: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/3.jpg)
https://www.coursera.org/specialization/genomics/41
![Page 4: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/4.jpg)
@simplystats http://simplystatistics.org
![Page 5: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/5.jpg)
@jtleek http://www.jtleek.com
![Page 6: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/6.jpg)
https://www.counsyl.com/
![Page 7: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/7.jpg)
Their basic pitch was “Genomics is a fraud”
http://www.technologyreview.com/news/535771/a-contrarian-in-biotech/
![Page 8: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/8.jpg)
![Page 9: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/9.jpg)
“The explosive growth of next-generation sequencing data submitted into the SRA exceeds the growth rate of storage capacity”
http://www.ncbi.nlm.nih.gov/pubmed/22009675
![Page 10: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/10.jpg)
![Page 11: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/11.jpg)
![Page 12: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/12.jpg)
![Page 13: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/13.jpg)
3: cost, analyst variation, motivation
![Page 14: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/14.jpg)
1 cost
![Page 15: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/15.jpg)
costs: money, interpretability
![Page 16: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/16.jpg)
http://arxiv.org/pdf/math/0606441.pdf
![Page 17: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/17.jpg)
http://www.ncbi.nlm.nih.gov/pubmed/19276151
![Page 18: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/18.jpg)
![Page 19: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/19.jpg)
@leekgroup
![Page 20: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/20.jpg)
![Page 21: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/21.jpg)
http://www.ncbi.nlm.nih.gov/pubmed/25788628
![Page 22: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/22.jpg)
http://www.ncbi.nlm.nih.gov/pubmed/25788628
![Page 23: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/23.jpg)
[Figure: accuracy (0% to 100%) of PAM scaled, PAM unscaled, and TSP predictors for Agilent/Grade 1, Agilent/Grade 3, Illumina/Grade 1, and Illumina/Grade 3]
http://www.ncbi.nlm.nih.gov/pubmed/25788628
![Page 24: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/24.jpg)
algorithm
1. select useful pairs
2. screen pairs for association
3. build a simple CART predictor
![Page 25: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/25.jpg)
http://www.ncbi.nlm.nih.gov/pubmed/19276151
![Page 26: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/26.jpg)
Patil et al. (in prep)
![Page 27: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/27.jpg)
Patil et al. (in prep)
![Page 28: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/28.jpg)
Patil et al. (in prep)
![Page 29: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/29.jpg)
Data:
$x_{ik}$ - value for feature $i$, sample $k$
$y_k$ - group indicator for sample $k$
TSP is the $(i,j)$ pair that maximizes:
$$\left|\widehat{\Pr}(x_{ik} < x_{jk} \mid y_k = 1) - \widehat{\Pr}(x_{ik} < x_{jk} \mid y_k = 0)\right|$$
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1989150/
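The TSP criterion above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names `tsp_score` and `top_scoring_pair` and the toy data are hypothetical, and the probabilities are estimated by simple within-group fractions.

```python
from itertools import combinations

def tsp_score(xi, xj, y):
    """Empirical |Pr(x_i < x_j | y=1) - Pr(x_i < x_j | y=0)| for one feature pair."""
    def frac(group):
        hits = [a < b for a, b, g in zip(xi, xj, y) if g == group]
        return sum(hits) / len(hits)
    return abs(frac(1) - frac(0))

def top_scoring_pair(X, y):
    """X[i][k] is feature i in sample k; returns the (i, j) pair with the largest score."""
    pairs = combinations(range(len(X)), 2)
    return max(pairs, key=lambda ij: tsp_score(X[ij[0]], X[ij[1]], y))

# Toy data (made up for illustration): 3 features, 6 samples, 3 per group.
X = [
    [1, 2, 1, 5, 6, 7],
    [3, 4, 5, 2, 1, 3],
    [2, 2, 2, 2, 2, 2],
]
y = [1, 1, 1, 0, 0, 0]
best = top_scoring_pair(X, y)
```

Here feature 0 is below feature 1 in every group-1 sample and in no group-0 sample, so that pair gets the maximum possible score.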
![Page 30: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/30.jpg)
$$z_{ijk} = 1(x_{ik} < x_{jk})$$
$$E[z_{ijk} \mid y_k] = a_{0ij} + a_{1ij} y_k$$
$$\rightarrow \max_{(i,j)} |a_{1ij}| = \text{TSP}$$
Patil et al. (in prep)
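A small numeric check may help show why the regression view recovers TSP: when the group label is binary, the least-squares slope from regressing the pair indicator on the label equals the difference in group means of the indicator, which is the difference in estimated probabilities. The sketch below assumes ordinary least squares with an intercept; `ols_slope` and the toy values are hypothetical.

```python
def ols_slope(x, y):
    """Least-squares slope from regressing y on x (with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Toy data (made up): one feature pair, 6 samples, binary group label.
xi = [1, 2, 1, 5, 6, 3]
xj = [3, 4, 0, 2, 1, 4]
label = [1, 1, 1, 0, 0, 0]

# Pair indicator: 1 when feature i is below feature j in that sample.
z = [1.0 if a < b else 0.0 for a, b in zip(xi, xj)]

slope = ols_slope(label, z)  # slope on the label in the regression of z on the label
mean1 = sum(v for v, g in zip(z, label) if g == 1) / label.count(1)
mean0 = sum(v for v, g in zip(z, label) if g == 0) / label.count(0)
```

With these values the slope and `mean1 - mean0` are both 1/3, so maximizing the absolute slope over pairs is the same as maximizing the TSP score.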
![Page 31: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/31.jpg)
$$E[y_k \mid z_{ijk}] = u_{0ij} + u_{1ij} z_{ijk}$$
• Not the same as TSP
• But $|\hat{a}/\text{s.e.}(\hat{a})| = |\hat{u}/\text{s.e.}(\hat{u})|$, algebraically
• “Variance regularized” TSP
• $z_{ijk}$ invariant to monotone transformations
• Fix parameters → find features
Patil et al. (in prep)
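One way to see the algebraic identity on this slide, assuming both models are fit by ordinary least squares with an intercept and the usual slope standard error (the slide does not say which standard error is meant): in simple linear regression the slope t-statistic depends only on the sample correlation $r$ between the two variables and the number of samples $n$, so it is the same whichever variable is treated as the outcome.

$$\frac{\hat{a}_{1ij}}{\text{s.e.}(\hat{a}_{1ij})} = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} = \frac{\hat{u}_{1ij}}{\text{s.e.}(\hat{u}_{1ij})}, \qquad r = \text{cor}(z_{ijk}, y_k)$$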
![Page 32: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/32.jpg)
$$E[y_k \mid z_{ijk}] = u_{0ij} + u_{1ij} z_{ijk}$$
1. Calculate t-statistic for all pairs
2. Choose top pair (or covariate)
3. Continue for a fixed number of pairs
Patil et al. (in prep)
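The three steps on this slide might be sketched as a forward selection loop. This is only one possible reading: the slide does not say how later pairs are conditioned on earlier ones, so the sketch assumes each new pair is scored against the residual left after removing the previously chosen pair's fit. The names `t_stat` and `select_pairs` are hypothetical.

```python
import math
from itertools import combinations

def t_stat(z, y):
    """Slope t-statistic for simple regression of y on z, computed from the correlation."""
    n = len(z)
    mz, my = sum(z) / n, sum(y) / n
    szz = sum((a - mz) ** 2 for a in z)
    syy = sum((b - my) ** 2 for b in y)
    if szz == 0 or syy == 0:
        return 0.0  # a constant variable carries no information
    r = sum((a - mz) * (b - my) for a, b in zip(z, y)) / math.sqrt(szz * syy)
    if abs(r) >= 1 - 1e-12:
        return math.copysign(math.inf, r)  # perfect linear fit
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def select_pairs(X, y, n_pairs):
    """X[i][k] is feature i in sample k. Returns up to n_pairs feature pairs."""
    indicators = {
        (i, j): [1.0 if a < b else 0.0 for a, b in zip(X[i], X[j])]
        for i, j in combinations(range(len(X)), 2)
    }
    resid = [float(v) for v in y]
    chosen = []
    for _ in range(n_pairs):
        candidates = [p for p in indicators if p not in chosen]
        if not candidates:
            break
        # Steps 1 and 2: t-statistic for every remaining pair, keep the largest.
        best = max(candidates, key=lambda p: abs(t_stat(indicators[p], resid)))
        chosen.append(best)
        # Step 3 (assumed): remove the chosen pair's linear fit before continuing.
        z = indicators[best]
        n = len(z)
        mz, mr = sum(z) / n, sum(resid) / n
        szz = sum((a - mz) ** 2 for a in z)
        slope = sum((a - mz) * (b - mr) for a, b in zip(z, resid)) / szz if szz else 0.0
        resid = [b - mr - slope * (a - mz) for a, b in zip(z, resid)]
    return chosen
```

Because the pair indicators only depend on the ordering of two features within a sample, the selected pairs do not change under monotone rescaling of the data.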
![Page 33: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/33.jpg)
http://astor.som.jhmi.edu/~marchion//breastTSP.html
![Page 34: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/34.jpg)
[Figure: decision tree on the pairs USP7 < RP11-423C15.3, NM_018610 < MTCH1, and RND1 < LGALS14, with No/Yes branches ending in No Recur or Recur]
![Page 35: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/35.jpg)
![Page 36: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/36.jpg)
Mammaprint
Patil et al. (in prep)
![Page 37: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/37.jpg)
2 analyst variation
![Page 38: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/38.jpg)
![Page 39: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/39.jpg)
![Page 40: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/40.jpg)
![Page 41: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/41.jpg)
what went wrong?
2 things
![Page 42: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/42.jpg)
what went wrong? transparency
The data/code weren’t reproducible
![Page 43: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/43.jpg)
what went wrong? transparency
There was a lack of cooperation
![Page 44: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/44.jpg)
what went wrong? expertise
They used silly prediction rules
(Pr(FEC) = 5/8[Pr(F) + Pr(E) + Pr(C)] – ¼)
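One way to see why this rule is a poor probability model is to plug in extreme inputs: it can return values below 0 or above 1. The short check below uses exact fractions; the function name `combined_rule` is hypothetical and the inputs are chosen only to show the range.

```python
from fractions import Fraction

def combined_rule(pr_f, pr_e, pr_c):
    """The slide's rule: Pr(FEC) = 5/8 * [Pr(F) + Pr(E) + Pr(C)] - 1/4."""
    return Fraction(5, 8) * (pr_f + pr_e + pr_c) - Fraction(1, 4)

high = combined_rule(1, 1, 1)  # all three drugs certain to work
low = combined_rule(0, 0, 0)   # all three drugs certain to fail
```

With all three inputs equal to 1 the rule gives 13/8, and with all three equal to 0 it gives -1/4, so neither result is a valid probability.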
![Page 45: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/45.jpg)
what went wrong? expertise
They had study design problems
(Batch effects)
![Page 46: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/46.jpg)
what went wrong? expertise
Their predictions weren’t locked down
Today: Pr(FEC) = 0.8
Tomorrow: Pr(FEC) = 0.1
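A minimal sketch of what "locking down" a predictor could look like in practice, assuming the fitted rule can be written as a small set of parameters: serialize the parameters once, record a checksum, and refuse to predict if the stored rule no longer matches. The functions `lock_model` and `load_locked` and the parameter names are hypothetical, not something described in the talk.

```python
import hashlib
import json

def lock_model(params):
    """Serialize fitted parameters deterministically and return (blob, checksum)."""
    blob = json.dumps(params, sort_keys=True)
    return blob, hashlib.sha256(blob.encode("utf-8")).hexdigest()

def load_locked(blob, digest):
    """Return the parameters only if they match the checksum recorded at lock time."""
    if hashlib.sha256(blob.encode("utf-8")).hexdigest() != digest:
        raise ValueError("model changed since it was locked")
    return json.loads(blob)

# Hypothetical fitted rule: one feature pair and the class it votes for.
blob, digest = lock_model({"pair": [0, 1], "predicts_if_less": 1})
params = load_locked(blob, digest)
```

The point of the checksum is that the same patient data fed to the same locked rule gives the same prediction on any day.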
![Page 47: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/47.jpg)
At the end of the day the Potti analysis was fully reproducible
The problem is that the analysis was wrong
![Page 48: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/48.jpg)
http://bit.ly/10vS1yt
![Page 49: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/49.jpg)
http://bit.ly/OgW3xv
![Page 50: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/50.jpg)
Drinkel et al. Organometallics 2013
![Page 51: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/51.jpg)
![Page 52: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/52.jpg)
![Page 53: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/53.jpg)
![Page 54: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/54.jpg)
![Page 55: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/55.jpg)
![Page 56: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/56.jpg)
http://simplystatistics.tumblr.com/post/19646774024/laws-of-nature-and-the-law-of-patents-supreme-court
![Page 57: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/57.jpg)
3 motivation
![Page 58: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/58.jpg)
![Page 59: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/59.jpg)
$ (from reducing sample size)
![Page 60: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/60.jpg)
basic idea
randomization isn’t perfect
“rebalance” with baseline covariates
improve estimator precision
![Page 61: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/61.jpg)
![Page 62: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/62.jpg)
![Page 63: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/63.jpg)
Ack Math!!!!
![Page 64: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/64.jpg)
Estimate the probability of being in each arm given baseline covariates
![Page 65: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/65.jpg)
Calculate an initial estimate for each person under each arm's model, using propensity-score-weighted logistic regression (WLR)
![Page 66: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/66.jpg)
Define a new covariate as the residual from fitting the arm-level models minus the arm-level means, then fit new propensity models
![Page 67: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/67.jpg)
Use these propensities to re-fit the weighted logistic regression (WLR) from (2), then average the predictions to get the covariate-adjusted treatment effect
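The flow of these steps can be sketched in plain Python under loud simplifications. The sketch covers only the propensity model, the propensity-weighted outcome regression in each arm, and the averaging of predictions; it leaves out the residual-covariate step and the refit. The helper names, the gradient-ascent fitter, and the simulated trial are all hypothetical and are not taken from the talk or from the sig2trial code.

```python
import math
import random

def sigmoid(t):
    """Numerically safe logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def fit_logistic(X, y, weights=None, steps=300, lr=1.0):
    """Weighted logistic regression by plain gradient ascent (rows include an intercept)."""
    n, p = len(X), len(X[0])
    weights = weights if weights is not None else [1.0] * n
    total = sum(weights)
    beta = [0.0] * p
    for _ in range(steps):
        grad = [0.0] * p
        for row, yk, wk in zip(X, y, weights):
            resid = wk * (yk - sigmoid(sum(b * v for b, v in zip(beta, row))))
            for j in range(p):
                grad[j] += resid * row[j]
        beta = [b + lr * g / total for b, g in zip(beta, grad)]
    return beta

def predict(beta, row):
    return sigmoid(sum(b * v for b, v in zip(beta, row)))

def adjusted_effect(covs, arm, outcome):
    """Difference in average predicted outcome between arm 1 and arm 0."""
    X = [[1.0] + list(c) for c in covs]
    # Propensity: probability of being in arm 1 given baseline covariates.
    prop = [predict(fit_logistic(X, arm), row) for row in X]
    means = {}
    for a in (0, 1):
        idx = [k for k in range(len(X)) if arm[k] == a]
        # Outcome model within arm a, weighted by inverse propensity of that arm.
        w = [1.0 / (prop[k] if a == 1 else 1.0 - prop[k]) for k in idx]
        beta = fit_logistic([X[k] for k in idx], [outcome[k] for k in idx], w)
        # Average the arm-a predictions over everyone in the trial.
        means[a] = sum(predict(beta, row) for row in X) / len(X)
    return means[1] - means[0]

def simulate_trial(n=400, seed=0):
    """Made-up randomized trial: one baseline covariate, binary outcome."""
    rng = random.Random(seed)
    covs, arm, outcome = [], [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        a = 1 if rng.random() < 0.5 else 0
        p = sigmoid(-0.5 + x + 1.5 * a)
        covs.append([x])
        arm.append(a)
        outcome.append(1 if rng.random() < p else 0)
    return covs, arm, outcome

effect = adjusted_effect(*simulate_trial())
```

In the simulated trial the treatment raises the outcome probability, so the adjusted estimate should come out positive; the baseline covariate is what the adjustment uses to tighten the estimate.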
![Page 68: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/68.jpg)
http://astor.som.jhmi.edu/~marchion//breastTSP.html
![Page 69: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/69.jpg)
Age, Tumor Size, Grade: 5.1%
Age, Tumor Size, Grade, ER Status: 4.9%
Mammaprint Risk Category (MRC): 5.4%
Age, Tumor Size, Grade, ER Status, MRC: 7.8%
![Page 70: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/70.jpg)
Age, Tumor Size, Grade: 5.1%
Age, Tumor Size, Grade, ER Status: 4.9%
Mammaprint Risk Category (MRC): 5.4%
Age, Tumor Size, Grade, ER Status, MRC: 7.8%
Age, Tumor Size, Grade, ER Status, TSP: 6.2%
![Page 71: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/71.jpg)
3: cost, analyst variation, motivation
![Page 72: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/72.jpg)
acknowledgements
Leek group: Prasad Patil, Leo Collado Torres, Abhi Nellore, Claire Ruberman, Jack Fu, Kai Kammers
Collaborators: Michael Rosenblum, Benjamin Haibe-Kains, P.O. Bachant-Winner, Roger Peng
![Page 73: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/73.jpg)
Prasad Patil http://www.biostat.jhsph.edu/~prpatil/
![Page 74: Fixing the leaks in the pipeline from public genomics data to the clinic](https://reader031.fdocuments.net/reader031/viewer/2022032502/55b902c9bb61eb83658b4691/html5/thumbnails/74.jpg)
Links
https://github.com/leekgroup/sig2trial
http://jtleek.com/talks/