Data Integration: What I Haven't Yet Achieved
![Page 1: Data Integration: What I Haven't Yet Achieved](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c367944a7959ed3c8b4577/html5/thumbnails/1.jpg)
Data Integration: what I haven’t yet achieved
Neil Saunders
MATHEMATICS, INFORMATICS AND STATISTICS
www.csiro.au
![Page 2: Data Integration: What I Haven't Yet Achieved](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c367944a7959ed3c8b4577/html5/thumbnails/2.jpg)
Data integration 2 of 21
My main project
Ludwig colorectal cancer study
![Page 3: Data Integration: What I Haven't Yet Achieved](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c367944a7959ed3c8b4577/html5/thumbnails/3.jpg)
Multiple “omics” platforms
exon expression methylation copy number
![Page 4: Data Integration: What I Haven't Yet Achieved](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c367944a7959ed3c8b4577/html5/thumbnails/4.jpg)
We want to “integrate” these data
but what does that mean?
![Page 5: Data Integration: What I Haven't Yet Achieved](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c367944a7959ed3c8b4577/html5/thumbnails/5.jpg)
Integration can mean “portals”
![Page 6: Data Integration: What I Haven't Yet Achieved](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c367944a7959ed3c8b4577/html5/thumbnails/6.jpg)
Integration can mean “visualization”
![Page 7: Data Integration: What I Haven't Yet Achieved](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c367944a7959ed3c8b4577/html5/thumbnails/7.jpg)
Integration can mean “correlation”
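As a minimal sketch of what "correlation" can mean here: for each gene, correlate the values from two platforms across the same samples. The gene names, values and the `per_gene_correlation` helper below are hypothetical and purely illustrative; this is not the method of any particular tool.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # a constant profile has no defined correlation
    return sxy / sqrt(sxx * syy)


def per_gene_correlation(cn, ex):
    """Correlate two platforms gene by gene, using only genes present in both."""
    return {g: pearson(cn[g], ex[g]) for g in sorted(cn.keys() & ex.keys())}


# Hypothetical toy data: gene -> values for the same four samples, in the same order.
copy_number = {"GENE_A": [2.0, 3.0, 4.0, 5.0], "GENE_B": [2.0, 2.0, 3.0, 1.0]}
expression = {"GENE_A": [1.0, 2.0, 3.0, 4.0], "GENE_B": [5.0, 5.1, 4.0, 6.2]}

correlations = per_gene_correlation(copy_number, expression)
for gene, r in correlations.items():
    print(gene, round(r, 3))
```

The sketch assumes both platforms have already been summarised to one value per gene per sample, which is itself a non-trivial step.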
![Page 8: Data Integration: What I Haven't Yet Achieved](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c367944a7959ed3c8b4577/html5/thumbnails/8.jpg)
What do we think integration means?
A + B + C
More information when combined than when separate
![Page 9: Data Integration: What I Haven't Yet Achieved](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c367944a7959ed3c8b4577/html5/thumbnails/9.jpg)
What’s already “out there”? PubMed
[Figure: PubMed search for "data integration", articles per 100,000 by year, 2002 to 2010]
![Page 10: Data Integration: What I Haven't Yet Achieved](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c367944a7959ed3c8b4577/html5/thumbnails/10.jpg)
What’s already “out there”? CiteULike
http://www.citeulike.org/user/neils/tag/integration
![Page 11: Data Integration: What I Haven't Yet Achieved](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c367944a7959ed3c8b4577/html5/thumbnails/11.jpg)
Buzz-word compliant
![Page 12: Data Integration: What I Haven't Yet Achieved](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c367944a7959ed3c8b4577/html5/thumbnails/12.jpg)
Quote from integIRTy paper
"These methods can be roughly grouped into four categories: stepwise, regression-based, correlation-based and latent variable models"
integIRTy: a method to identify genes altered in cancer by accounting for multiple mechanisms of regulation using item response theory. Bioinformatics, Vol. 28, No. 22 (15 November 2012), pp. 2861-2869
![Page 13: Data Integration: What I Haven't Yet Achieved](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c367944a7959ed3c8b4577/html5/thumbnails/13.jpg)
Regression: SIM
Integrated analysis of DNA copy number and gene expression microarray data using gene sets. BMC Bioinformatics 2009, 10:203
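To make the regression idea concrete, here is a deliberately simple sketch: an ordinary least-squares line relating one gene's expression to its copy number across samples. This is far simpler than the model in the cited paper and is not a reimplementation of SIM; the data values and the `fit_line` helper are hypothetical.

```python
def fit_line(x, y):
    """Least-squares fit of y = slope * x + intercept for one predictor."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("predictor is constant, slope is undefined")
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    return slope, intercept


# Hypothetical values for one gene across four samples.
copy_number = [1.0, 2.0, 3.0, 4.0]
expression = [2.1, 3.9, 6.2, 7.8]

slope, intercept = fit_line(copy_number, expression)
print(f"expression ~ {slope:.2f} * copy_number + {intercept:.2f}")
```

A positive slope in this toy case would suggest that expression rises with copy number for that gene; a real analysis would also need a significance test and multiple-testing correction.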
![Page 14: Data Integration: What I Haven't Yet Achieved](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c367944a7959ed3c8b4577/html5/thumbnails/14.jpg)
Correlation: DR-Integrator
[Figure: correlation (scale 0 to 1) by chromosome (Chr 1 to 22), with numbered sample labels]
![Page 15: Data Integration: What I Haven't Yet Achieved](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c367944a7959ed3c8b4577/html5/thumbnails/15.jpg)
Latent variable: iCluster
(file under impractical)
![Page 16: Data Integration: What I Haven't Yet Achieved](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c367944a7959ed3c8b4577/html5/thumbnails/16.jpg)
Basics that are never explained 1/2
Integration across groups or description of samples?
![Page 17: Data Integration: What I Haven't Yet Achieved](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c367944a7959ed3c8b4577/html5/thumbnails/17.jpg)
Basics that are never explained 2/2
Genes x Samples
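One practical reading of "Genes x Samples": before any integration, each platform's matrix has to be reduced to the same genes (rows) and samples (columns), in the same order. The sketch below shows that alignment step under assumed inputs; the `Matrix` type, the `align` helper and all labels and values are hypothetical.

```python
from typing import NamedTuple


class Matrix(NamedTuple):
    genes: list    # row labels
    samples: list  # column labels
    values: list   # values[i][j] belongs to genes[i], samples[j]


def align(matrices):
    """Subset every platform to the genes and samples shared by all of them."""
    genes = sorted(set.intersection(*(set(m.genes) for m in matrices.values())))
    samples = sorted(set.intersection(*(set(m.samples) for m in matrices.values())))
    aligned = {}
    for name, m in matrices.items():
        row = {g: i for i, g in enumerate(m.genes)}
        col = {s: j for j, s in enumerate(m.samples)}
        aligned[name] = Matrix(
            genes,
            samples,
            [[m.values[row[g]][col[s]] for s in samples] for g in genes],
        )
    return aligned


# Hypothetical platforms with partly overlapping genes and samples.
platforms = {
    "expression": Matrix(
        ["G1", "G2", "G3"], ["S1", "S2", "S3"],
        [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    ),
    "methylation": Matrix(
        ["G2", "G3", "G4"], ["S3", "S2"],
        [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    ),
}

aligned = align(platforms)
print(aligned["expression"].genes, aligned["expression"].samples)
```

Taking the intersection is only one choice; it discards genes or samples missing from any platform, which may or may not be acceptable for a given study.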
![Page 18: Data Integration: What I Haven't Yet Achieved](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c367944a7959ed3c8b4577/html5/thumbnails/18.jpg)
Conclusions 1/3
We’re not the first people doing this...
...but it’s becoming a “hot topic”
![Page 19: Data Integration: What I Haven't Yet Achieved](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c367944a7959ed3c8b4577/html5/thumbnails/19.jpg)
Conclusions 2/3
Room for improvement in software, much of which is:
• Poorly written
• Poorly documented
• Difficult to implement
![Page 20: Data Integration: What I Haven't Yet Achieved](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c367944a7959ed3c8b4577/html5/thumbnails/20.jpg)
Conclusions 3/3
Too much for one individual!
![Page 21: Data Integration: What I Haven't Yet Achieved](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c367944a7959ed3c8b4577/html5/thumbnails/21.jpg)
CSIRO Mathematics, Informatics and Statistics
Neil Saunders
t +61 2 9325 3144
e [email protected]
Mathematics, Informatics and Statistics web