Biotech Data Science @ GUGC in Korea: Deep Learning for Prediction of Drug-Target Interaction and...


Transcript of Biotech Data Science @ GUGC in Korea: Deep Learning for Prediction of Drug-Target Interaction and...

Page 1: Biotech Data Science @ GUGC in Korea: Deep Learning for Prediction of Drug-Target Interaction and DNA Analysis

http://datasciencelab.ugent.be/

Ghent University – iMinds, ELIS Department/Data Science Lab

Ghent University Global Campus – Center for Biotech Data Science

Mijung Kim, Jasper Zuallaert, Wesley De Neve, and Rik Van de Walle

BIG N2N Annual Symposium

{mijung.kim, jasper.zuallaert, wesley.deneve, rik.vandewalle}@ugent.be

BIOTECH DATA SCIENCE @ GUGC IN KOREA: DEEP LEARNING FOR PREDICTION OF DRUG-TARGET INTERACTION AND DNA ANALYSIS

May 19, 2016 | Ghent | Belgium

Overview of Deep Machine Learning

Prediction of Drug-Target Interaction

Input layer Hidden layers Output layer

Automatic end-to-end learning of hierarchical features through multi-layered neural networks
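The input layer, hidden layers, and output layer named above can be illustrated with a minimal forward pass. This is only a sketch in plain Python, not the TensorFlow model from the poster: the layer sizes, the ReLU and sigmoid activations, the random untrained weights, and the reading of the single output as an interaction score are all assumptions made for illustration.

```python
import math
import random


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def sigmoid(x):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(x, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Random, untrained parameters (hypothetical initialisation)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(x, layers):
    """Input layer -> hidden layers (ReLU) -> output layer (sigmoid)."""
    h = x
    for weights, biases in layers[:-1]:
        h = relu(dense(h, weights, biases))
    weights, biases = layers[-1]
    return sigmoid(dense(h, weights, biases)[0])


rng = random.Random(0)
sizes = [8, 16, 16, 1]  # assumed: 8 input features, two hidden layers, 1 output
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
x = [rng.random() for _ in range(sizes[0])]  # placeholder feature vector
score = forward(x, layers)
print(f"untrained output score: {score:.3f}")
```

In a trained model, each hidden layer would learn features built on those of the layer before it, which is the "hierarchical features" idea; here the weights are random, so the score carries no meaning.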

Training Data: ChEMBL Dataset by EMBL-EBI

Model using Deep Learning (TensorFlow)

New Drug

Potential Target

DNA Analysis using Natural Language Processing Techniques

Word2Vec on DNA sequences

→ Represents every 𝑛-gram by a vector

e.g., ACG = [0.359, …, -0.129]

Vectors are calculated based on surrounding 𝑛-grams

e.g., ... TTA CGA ACG TGG CAT ...
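A minimal sketch of how the "surrounding 𝑛-grams" could be gathered before training Word2Vec-style vectors. It splits a sequence into 3-grams and lists (center, context) pairs, as in skip-gram training. The function names, the stride of 3 (chosen so the output matches the example above), and the window size of 2 are assumptions; the poster does not state these settings.

```python
def ngrams(seq, n=3, stride=1):
    """Split a DNA string into n-grams, moving `stride` bases at a time."""
    return [seq[i:i + n] for i in range(0, len(seq) - n + 1, stride)]


def context_pairs(tokens, window=2):
    """List (center, context) pairs for every token within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


tokens = ngrams("TTACGAACGTGGCAT", n=3, stride=3)
pairs = context_pairs(tokens, window=2)
print(tokens)
print([context for center, context in pairs if center == "ACG"])
```

With these settings, the context of ACG is the two 3-grams on each side of it. A Word2Vec implementation would then adjust the vector of ACG so that it predicts those neighbours well.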

Convolutional Neural Networks

ACG GCT CTA TAA AAG AGA GAC ACC CCT CTA …

… c_{i-4} c_{i-3} c_{i-2} c_{i-1} c_i c_{i+1} c_{i+2} …
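As a rough illustration of what a convolutional layer does with a sequence of 𝑛-gram vectors, the sketch below slides one filter over neighbouring positions and applies a ReLU. The poster gives no architecture details, so the filter width, the two-dimensional toy vectors, and the weights are hypothetical.

```python
def conv1d(embedded, kernel, bias=0.0):
    """Valid 1D convolution (cross-correlation) followed by ReLU.

    embedded: list of L vectors, one per n-gram position.
    kernel:   list of k weight vectors of the same dimension.
    Returns L - k + 1 feature values, one per window position.
    """
    k = len(kernel)
    features = []
    for i in range(len(embedded) - k + 1):
        total = bias
        for offset in range(k):
            total += sum(w * x for w, x in zip(kernel[offset], embedded[i + offset]))
        features.append(max(0.0, total))
    return features


# Toy input: four positions, each a 2-dimensional vector (hypothetical values).
embedded = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
kernel = [[1.0, -1.0], [0.5, 0.5]]  # one filter spanning two adjacent positions
features = conv1d(embedded, kernel)
print(features)
```

Each output value depends only on a small window of adjacent positions, so the same filter can detect a local pattern wherever it occurs in the sequence.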

Long Short-Term Memory Networks

ACG GCT CTA TAA AAG AGA GAC ACC CCT CTA …

One-hot representation

Sequence: A C C A T A …

A: 1 0 0 1 0 1
C: 0 1 1 0 0 0
G: 0 0 0 0 0 0
T: 0 0 0 0 1 0
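The one-hot representation can be reproduced in a few lines. In this sketch, the row order A, C, G, T is an assumption inferred from the positions of the ones in the table, and `one_hot` is an illustrative helper name.

```python
ALPHABET = "ACGT"  # assumed row order


def one_hot(seq, alphabet=ALPHABET):
    """Return one row per nucleotide and one column per sequence position."""
    return [[1 if base == symbol else 0 for base in seq] for symbol in alphabet]


matrix = one_hot("ACCATA")
for symbol, row in zip(ALPHABET, matrix):
    print(symbol, *row)
```

Each column contains exactly one 1, marking which nucleotide occupies that position.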

ACC CCA CAT ATA
0.241 0.124 -0.549 0.421
-0.853 0.513 0.185 -0.129
… … … …
-0.252 -0.884 0.112 0.466

Training

Prediction

Prediction