Identification of Candidate Drugs for Heart Failure using Tensor Decomposition-Based Unsupervised Feature Extraction Applied to Integrated Analysis of Gene Expression between Heart Failure and Drug Matrix Datasets
Y-h. Taguchi, Department of Physics, Chuo University, Tokyo 112-8551, Japan
DOI: 10.1101/117465; DOI: 10.1007/978-3-319-63312-1_45
Introduction
Drug discovery (DD) is an experimentally time-consuming and expensive process.
In silico drug discovery enables researchers to reduce the cost and time required for this complicated process.
Two major in silico DD strategies: ligand-based vs structure-based.
Pros and cons of ligand-based and structure-based DD
Ligand-based:
(Relatively) easy to perform.
(Relatively) high accuracy.
But unable to identify compounds that lack similarity with known drug compounds.
Structure-based:
No need for similarity with known drug compounds.
But needs (inferred) protein structures.
Computationally expensive (cannot be applied to libraries of more than a million compounds).
The third in silico DD strategy: gene expression-based.
Instead of (structural) similarity between ligands, it uses similarity between gene expression profiles (of drug-treated cell lines/model animals).
(Relatively) easy to perform.
Needs a massive (labeled) data set for training.
The purpose of this research:
To propose gene expression-based in silico DD without training data sets (via unsupervised learning).
Method: Tensor decomposition (TD) based unsupervised feature extraction (FE)
[Figure: an N × M matrix X (numerical values) of N features by M samples, with categorical multiclass sample labels, is decomposed by SVD; samples and genes are each plotted on the 1st and 2nd singular vectors. No distinction between classes.]
Example: singular value decomposition (SVD) applied to a matrix: samples and features are embedded into a Q-dimensional space.
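Synthetic example
[Figure: synthetic data set with two classes of 10 samples each, 90 features and 10 features; entries are drawn from N(0, ½), N(μ, ½), and [N(μ, ½) + N(0, ½)]/2, where N(μ, ½) is the normal distribution with mean μ and SD ½. Features are plotted on the 1st and 2nd singular vectors; +: top 10 outliers.]
Thus, extracting outliers selects features distinct between the two classes in an unsupervised way.
Accuracy (100 trials): 89.5% (…), 52.6% (…)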
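The outlier-selection idea in the synthetic example can be sketched as follows. This is only an illustration under assumptions that are not in the slides: the value of `mu`, the layout of the informative block (the first 10 features shifted in the first 10 samples), and the helper name `select_outlier_features` are all hypothetical, and it does not reproduce the accuracy figures quoted above.

```python
import numpy as np

def select_outlier_features(X, component=0, top_k=10):
    """Return indices of the top_k features with the largest absolute
    score on one left singular vector of X (features x samples)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    scores = U[:, component]              # one score per feature
    return np.argsort(-np.abs(scores))[:top_k]

rng = np.random.default_rng(0)
n_features, n_samples, n_informative = 100, 20, 10
mu = 2.0                                  # assumed class shift, not from the slides

# All entries start as N(0, 1/2); class labels are never passed to the SVD.
X = rng.normal(0.0, 0.5, size=(n_features, n_samples))
# Assumed layout: the first 10 features are shifted by mu in the first 10 samples.
X[:n_informative, :10] += mu

selected = select_outlier_features(X, component=0, top_k=10)
print(sorted(selected.tolist()))
```

With a shift this large the informative features dominate the first singular vector; with a smaller `mu` the selection becomes noisier, which is the regime where an accuracy below 100% would be expected.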
In this study, the tensor is generated from the mathematical product of gene expression between human diseases and drug-treated model animals.
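[Figure: the gene expression of drug-treated model animals (genes × compounds × time points) and of human samples (genes × human samples) is multiplied gene by gene into a tensor, to which TD is applied.]
$$x_{j_1 j_2 j_3 i} = x_{i j_1 j_2}\, x_{i j_3} = \sum_{l_1, l_2, l_3, l_4} G(l_1, l_2, l_3, l_4)\, x_{l_1 j_1}\, x_{l_2 j_2}\, x_{l_3 j_3}\, x_{l_4 i}$$
j_1: animal samples (compounds)
j_2: animal samples (time points)
j_3: human samples (patients vs normal control)
i: gene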
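A minimal sketch of the tensor construction and its decomposition, assuming higher-order SVD (HOSVD) as the TD; the slides only say "TD", so this choice is an assumption. The array sizes, the random data, and the helper names `unfold`, `mode_product`, and `hosvd` are hypothetical and for illustration only.

```python
import numpy as np

def unfold(tensor, mode):
    """Matricize: rows index `mode`, columns index all remaining modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply `matrix` (rows x tensor.shape[mode]) into `tensor` along `mode`."""
    out = np.tensordot(matrix, tensor, axes=(1, mode))
    return np.moveaxis(out, 0, mode)

def hosvd(tensor):
    """Return the core tensor G and one singular value matrix per mode."""
    factors = [np.linalg.svd(unfold(tensor, m), full_matrices=False)[0]
               for m in range(tensor.ndim)]
    core = tensor
    for m, U in enumerate(factors):
        core = mode_product(core, U.T, m)
    return core, factors

rng = np.random.default_rng(0)
n_genes, n_compounds, n_times, n_humans = 50, 4, 3, 5   # hypothetical sizes
animal = rng.normal(size=(n_genes, n_compounds, n_times))  # x_{i j1 j2}
human = rng.normal(size=(n_genes, n_humans))               # x_{i j3}

# x_{j1 j2 j3 i} = x_{i j1 j2} * x_{i j3}: product taken gene by gene, no sum over i
tensor = np.einsum("iab,ic->abci", animal, human)

core, factors = hosvd(tensor)

# Rebuild the tensor from G and the four singular value matrices
recon = core
for m, U in enumerate(factors):
    recon = mode_product(recon, U, m)
print(tensor.shape, np.allclose(recon, tensor))
```

Here `factors[0]` to `factors[3]` play the roles of the compound, time-point, human-sample, and gene singular value vectors, stored as columns, and `core` plays the role of G(l1, l2, l3, l4).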
[Figure: time-point singular value vectors (SVV): first, second, third, fourth.]
[Figure: human-sample singular value vectors (SVV): two heart failure vs normal.]
[Figure: histogram of drug singular value vectors (SVV); 43 compounds.]
281 genes affected by drug treatments were also selected, using the 21st, 25th, 27th, 28th, 33rd, 36th, 37th, 38th, 41st, and 42nd gene SVVs.
Although I performed extensive biological evaluations of the genes and compounds, I cannot discuss them here for lack of time. See my paper in the proceedings for more details.
Summary
I have developed an unsupervised method for in silico DD using gene expression.
I have successfully applied this method to the combination of drug-treated model animals (rats) and a human disease (heart failure).
Biological evaluation indicates that the identified genes are reasonable.
More extensive applications together with drug target genes are under consideration (already submitted).