Kernel Methods in Bioinformatics
Sudarsan Padhy, IIIT Bhubaneswar (spadhy07@gmail.com)
Lecture transcript · 2016-12-01
Empirical Inference

• Drawing conclusions from observations (includes learning)
• The focus in studying it is not primarily on the conclusions, but on automatic methods
• For high-dimensional noisy data, inference becomes nontrivial
• Empirical inference methods are appropriate whenever:
– little model knowledge is available
– many variables are involved
– mechanistic models are infeasible
– (relatively) large datasets are available
Motivation behind kernel methods

• Linear learning typically has nice properties:
– unique optimal solutions
– fast learning algorithms
– better statistical analysis
• But one big problem: insufficient capacity
Problems of high dimensions

• Capacity may easily become too large and lead to over-fitting: a hypothesis class able to realise every classifier is unlikely to generalise well
• Computational costs involved in dealing with large vectors
Historical perspective

• Minsky and Papert (1969) highlighted this weakness in their book Perceptrons
• Multilayer perceptrons with back-propagation learning (1985) overcame the problem by gluing together many linear units with non-linear activation functions
– Solved the capacity problem and led to a very impressive extension of the applicability of learning
– But ran into training problems of speed and multiple local minima
MODEL OF A NEURON

[Figure: a neuron with input signals x1, x2, …, xm, synaptic weights wj1, wj2, …, wjm, a bias bj (weight wj0), a linear combiner producing vj, and an activation function φ(·) giving the output yj = φ(vj).]

A neuron is an information processing unit that is fundamental to the operation of a neural network. The three basic elements of the neuronal model are:
1. A set of synapses, each characterized by a weight (wji: the weight on the link connecting input i to neuron j).
2. A linear combiner: vj = ∑i wji xi
3. An activation function φ for limiting the amplitude of the neuron's output.

It also includes a bias, denoted bj, for increasing or decreasing the net input of the activation function.

Mathematically, a neuron with inputs xi is described by:

vj = ∑i=1..m wji xi + bj ,   yj = φ(vj)
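These three elements can be sketched numerically as follows; the function name neuron_output and the sample values are illustrative, not from the lecture.

```python
import numpy as np

def neuron_output(x, w, b, phi):
    # v_j = sum_i w_ji * x_i + b_j, then y_j = phi(v_j)
    v = np.dot(w, x) + b
    return phi(v)

x = np.array([1.0, 2.0, 3.0])    # input signals x_1 .. x_m
w = np.array([0.5, -1.0, 0.25])  # synaptic weights w_j1 .. w_jm
b = 0.75                         # bias b_j

print(neuron_output(x, w, b, np.tanh))  # v = 0.0 here, so the output is tanh(0) = 0.0
```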
Examples of Activation Functions

1. Threshold function: φ(v) = 1 if v ≥ 0, 0 otherwise
2. Sigmoid function: φ(v) = 1 / (1 + exp(−a v)), with slope parameter a
3. Hyperbolic tangent function: φ(v) = tanh(v)
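A quick numeric sketch of the three functions (the slope parameter a in the sigmoid is an assumption of this sketch):

```python
import numpy as np

def threshold(v):
    # hard limiter: 1 if v >= 0, else 0
    return np.where(v >= 0, 1.0, 0.0)

def sigmoid(v, a=1.0):
    # squashes v smoothly into (0, 1); a controls the slope at the origin
    return 1.0 / (1.0 + np.exp(-a * v))

v = np.array([-2.0, 0.0, 2.0])
print(threshold(v))   # [0. 1. 1.]
print(sigmoid(v))     # roughly [0.119, 0.5, 0.881]
print(np.tanh(v))     # values in (-1, 1), with tanh(0) = 0
```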
MULTILAYER PERCEPTRON (MLP)

[Figure: the input signal X is fed through the input layer, a 1st hidden layer, a 2nd hidden layer and the output layer to produce the output signal; b denotes the bias inputs.]
The basic idea of a learning algorithm for the MLP is to minimize the squared error E (the error between the desired output and the actual output), which is a non-linear function of the weights:

E = ½ ∑j (yj − oj)² ,   oj = φ(∑i wji oi)
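As a minimal sketch of this idea, one gradient-descent step on E for a single linear output unit (φ = identity) looks like the following; the data and learning rate are made up for illustration.

```python
import numpy as np

def grad_step(w, x, y, lr):
    # E = 0.5 * (y - o)^2 with o = w . x, so dE/dw = -(y - o) * x
    o = np.dot(w, x)
    return w + lr * (y - o) * x

w = np.zeros(2)
x, y = np.array([1.0, 1.0]), 1.0
for _ in range(100):
    w = grad_step(w, x, y, lr=0.1)

print(np.dot(w, x))  # the unit's output approaches the desired output 1.0
```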
Back-propagation neural network (BPNN) models are used for many applications. However, BPNN suffers from the following weaknesses:

• Need for a large number of controlling parameters.
• Danger of over-fitting: for large problem sizes it captures not only the useful information contained in the training data but also unwanted characteristics.
• Getting stuck in local minima, since the error function to be minimised is non-linear.
• Slow convergence rate for large problems.
Hence, Support Vector Machines (SVMs), developed by Vapnik and his co-workers (1992–95), have been used for supervised learning due to:

• Better generalization performance than other NN models
• The SVM solution is unique, optimal and free from local minima, since training reduces to a linearly constrained quadratic programming problem
• Applicability to non-vectorial data (strings and graphs)
• Few parameters are required for tuning the learning machine
• Kernel methods are a set of algorithms from statistical learning which include the SVM for classification and regression, kernel PCA, kernel-based clustering, feature selection, and dimensionality reduction.
• They have been popular methods in bioinformatics over the last decade: PubMed (the search engine for biomedical literature) lists 3070 hits for 'SVM' and 2626 hits for 'kernel methods'.
SUPPORT VECTOR MACHINE for classification

[Fig-1: classes C1 and C2 separated by one of many possible hyperplanes. Fig-2: the optimal hyperplane, with the support vectors marked.]

Which solution will generalize better to unseen examples? The second solution is better, because there is a larger margin between the separating hyperplane and the closest data points.

Objective: find the hyperplane that maximizes the margin of separation.

Support vectors are the data points which lie closest to the decision surface (the optimal hyperplane).
Consider a binary classification problem: the input vectors are xi ∈ R^m and di ∈ {−1, +1} are the targets or labels. The index i labels the pattern pairs (xi, di), i = 1, …, N. The pairs (xi, di) define a space of labelled points called the input space.
[Figure: a separating hyperplane and its margin.]
In an arbitrary m-dimensional space a separating hyperplane can be written:

wT x + b = 0

where b is the bias and w = (w1, …, wm) are the weights. Thus we will consider a decision function of the form:

D(x) = sign(wT x + b)
We note that the argument of D(x) is invariant under a rescaling w → λw, b → λb. We will implicitly fix a scale with:

di (wT xi + b) = 1

for the support vectors (canonical hyperplanes).
Thus:

wT (x1 − x2) = 2

for two support vectors x1 and x2 on either side of the separating hyperplane.
The margin is given by the projection of the vector (x1 − x2) onto the unit normal to the hyperplane, w/‖w‖, i.e. (wT/‖w‖)(x1 − x2) = 2/‖w‖, from which we deduce that the margin is given by 2/‖w‖.
[Figure: the separating hyperplane with the two canonical hyperplanes at distance 1/‖w‖ on either side, giving a total margin of 2/‖w‖.]
Maximisation of the margin is thus equivalent to minimisation of the functional:

Φ(w) = ½ wT w

subject to the constraints:

di (wT xi + b) ≥ 1 ,  i = 1, …, N.
SVM FOR CLASSIFICATION

Given the training sample {(xi, di)}, find the optimal values of the weight vector w and the bias b such that w minimizes the function Φ(w) = ½ wT w subject to the constraints:

di (wT xi + b) ≥ 1 for i = 1, 2, …, N.

This problem leads to the dual problem: determine the multipliers α = (α1, …, αN) that maximize the objective function

Q(α) = ∑i αi − ½ ∑i ∑j αi αj di dj xiT xj

subject to the constraints:

(1) ∑i αi di = 0   (2) αi ≥ 0 for i = 1, …, N.
Then the optimum weight vector is

w0 = ∑i α0,i di xi

and the optimum bias is

b0 = 1 − w0T x(s)

where x(s) is a support vector with label +1 and s is an index such that α0,s > 0. The decision function is defined by:

D(x) = w0T x + b0

If D(x) > 0 then x belongs to the class labeled +1; otherwise it belongs to the class labeled −1.
Soft Margin Separation of Classes

• Some data point (xi, di) may violate di (wT xi + b) ≥ 1: it falls inside the region of separation but on the right side of the decision surface, or it falls on the wrong side.
• Introduce slack variables ξi ≥ 0 (each measures the deviation from ideal separability):

di (wT xi + b) ≥ 1 − ξi ,  i = 1, 2, …, N

ξi > 1 means the data point falls on the wrong side (misclassification). Goal: find the separating hyperplane that minimises the misclassification error.
Soft Margin (continued)

• Minimise ½ wT w + C ∑i ξi subject to the above constraints. (The second term is an upper bound on the number of training errors; C controls the tradeoff between the complexity of the machine and the number of non-separable points.)
• This leads to the same dual problem except that 0 ≤ αi ≤ C, i = 1, …, N, and hence the same algorithm with obvious modifications.
Support Vector Machines (SVM)
Using kernels

• The SVM algorithm for linearly separable data is modified by replacing x with φ(x) in the feature space. The critical observation is that only inner products of the φ(x)'s are used.
• Suppose that we now have a shortcut method of computing K(x, z) = φ(x)·φ(z).
• Then we do not need to explicitly compute the feature vectors, either in training or in testing.
The kernel K(x, z) = (x·z + c)^d corresponds to a feature map whose coordinates are all monomials of degree up to d.

Then dim(feature space) = C(n+d, d) = O(n^d), yet computing K(x, z) takes only O(n) time.
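A quick numeric check of this claim for d = 2: an explicit feature map phi2 (a name invented here) listing all degree ≤ 2 monomials, with the usual √(2c) scaling, reproduces K(x, z) = (x·z + c)² exactly.

```python
import numpy as np

def K_poly(x, z, c=1.0, d=2):
    # direct evaluation: O(n) time
    return (np.dot(x, z) + c) ** d

def phi2(x, c=1.0):
    # explicit feature map for d = 2: monomials x_i * x_j, sqrt(2c) * x_i, and c
    quad = np.outer(x, x).ravel()
    lin = np.sqrt(2 * c) * x
    return np.concatenate([quad, lin, [c]])

x = np.array([1.0, 2.0, 3.0])
z = np.array([0.5, -1.0, 2.0])
print(K_poly(x, z))               # 30.25
print(np.dot(phi2(x), phi2(z)))   # 30.25 as well, but phi2 has n^2 + n + 1 coordinates
```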
Dual form of SVM

• The dual form of the SVM can also be derived by taking the dual optimisation problem. This gives:

Q(α) = ∑i αi − ½ ∑i ∑j αi αj di dj K(xi, xj)
KERNEL

A kernel is a function K : X × X → R such that for all x, x' ∈ X (the input space) it satisfies:

(i) K(x, x') = ⟨Φ(x), Φ(x')⟩, where Φ is a mapping from the input space X to an inner product space F (the feature space);

(ii) K is symmetric and positive definite, that is, K(x, x') = K(x', x) for any two objects x, x' ∈ X, and

∑i=1..n ∑j=1..n ci cj K(xi, xj) ≥ 0

for any n > 0, any choice of objects x1, …, xn ∈ X, and any choice of real numbers c1, …, cn ∈ R.

Thus, using any kernel, the SVM method in the feature space can be stated in the form of the following algorithm.
Algorithm for General SVM

Step 1: Find α = (α1, …, αn) that minimizes

W(α) = ½ ∑i=1..n ∑j=1..n αi αj yi yj K(xi, xj) − ∑i=1..n αi

under the constraints:
(i) ∑i=1..n yi αi = 0, and
(ii) 0 ≤ αi ≤ C for i = 1, …, n.

Step 2: Find an index i with 0 < αi < C, and set the bias as:

b = yi − ∑j=1..n yj αj K(xj, xi)

Step 3: The classification of a new object x ∈ X is then based on the sign of the function

f(x) = ∑i=1..n yi αi K(xi, x) + b
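Steps 2 and 3 can be sketched as follows. Finding α in Step 1 needs a quadratic-programming solver, so this sketch assumes the multipliers are already known; for the tiny symmetric two-point example below, the hard-margin solution α = (0.5, 0.5) can be verified by hand.

```python
import numpy as np

def linear_kernel(x, z):
    return float(np.dot(x, z))

def svm_bias(alpha, y, X, K, C):
    # Step 2: pick an index with 0 < alpha_i < C and solve for b
    i = next(i for i, a in enumerate(alpha) if 0 < a < C)
    return y[i] - sum(y[j] * alpha[j] * K(X[j], X[i]) for j in range(len(X)))

def svm_decision(x, alpha, y, X, b, K):
    # Step 3: f(x) = sum_i y_i alpha_i K(x_i, x) + b
    return sum(y[i] * alpha[i] * K(X[i], x) for i in range(len(X))) + b

X = [np.array([-1.0]), np.array([1.0])]   # one point per class
y = [-1.0, 1.0]
alpha = [0.5, 0.5]                        # known dual solution for this pair
C = 10.0

b = svm_bias(alpha, y, X, linear_kernel, C)
print(b)                                                             # 0.0
print(svm_decision(np.array([2.0]), alpha, y, X, b, linear_kernel))  # 2.0 > 0, so class +1
```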
Positive Definite Kernels (RKHS)
For any coefficients c1, …, cn and points x1, …, xn for which we form the expansion ∑i ci Φ(xi), it must be the case that:

‖∑i ci Φ(xi)‖² = ∑i ∑j ci cj K(xi, xj) ≥ 0

A simple criterion is that the kernel (equivalently, every Gram matrix it induces) should be positive semi-definite.
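This criterion is easy to check numerically for a given kernel on a given sample: build the Gram matrix and inspect its smallest eigenvalue. The RBF kernel and the random sample here are just an illustration.

```python
import numpy as np

def rbf(x, z, sigma=1.0):
    return np.exp(-np.linalg.norm(x - z) ** 2 / (2 * sigma ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
G = np.array([[rbf(a, b) for b in X] for a in X])  # Gram matrix K(x_i, x_j)

# All eigenvalues of a valid kernel's Gram matrix are >= 0 (up to round-off)
print(np.linalg.eigvalsh(G).min() >= -1e-10)  # True
```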
The bias is also found as before, from any support vector index s with 0 < αs < C:

b = ys − ∑j yj αj K(xj, xs)
Alternative approach (ν-SVM): the solutions for an L1 error norm are the same as those obtained from maximising:

W(α) = −½ ∑i ∑j αi αj yi yj K(xi, xj)

subject to:

∑i αi yi = 0 ,  0 ≤ αi ≤ 1/n ,  ∑i αi ≥ ν

where ν lies in the range 0 to 1.
In this formulation the conceptual meaning of the soft margin parameter is more transparent: the fraction of training errors is upper bounded by ν, and ν also provides a lower bound on the fraction of points which are support vectors.
RBF Kernel and String Kernels

• RBF kernel: K(x, z) = exp(−‖x − z‖² / (2σ²))
• Spectrum kernel (Leslie et al., 2002):

K(s, t) = ∑q #(q < s) #(q < t) ,  q ∈ A^n, where #(q < s) is the number of substrings of s of length n equal to q.

Feature map: Φ(s) = (ϕu(s)), u ∈ A^n, where ϕu(s) = number of occurrences of u in s.
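The spectrum kernel is easy to implement directly from this feature map; the short sketch below (function names invented here) counts length-n substrings with a Counter.

```python
from collections import Counter

def spectrum_features(s, n):
    # phi_u(s): number of occurrences of each length-n substring u of s
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))

def spectrum_kernel(s, t, n):
    # K(s, t) = sum over q of #(q < s) * #(q < t)
    fs, ft = spectrum_features(s, n), spectrum_features(t, n)
    return sum(c * ft[q] for q, c in fs.items())

# These two DNA strings share the 2-mers GA, TA, AC and CA once each
print(spectrum_kernel("GATTACA", "TACAGA", 2))  # 4
```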
Closure properties of kernels

• If k1 and k2 are kernels, then k1 + k2, k1 · k2 and a·k1 (for a > 0) are kernels.
• Example: to compare two proteins, one can define a kernel on their sequences and another on their 3D structures, and combine them into a sequence–structure kernel for proteins (Lewis et al., 2006).
• For protein function prediction, one can combine kernels on genome-wide data sets, gene expression data and protein–protein interaction data.
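These closure properties can be spot-checked numerically: the sum, the elementwise (Schur) product, and a positive scaling of two Gram matrices all stay positive semi-definite. The linear and RBF kernels on random data below are just an illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(15, 2))

K1 = X @ X.T                                            # linear-kernel Gram matrix
sq = np.sum(X ** 2, axis=1)
K2 = np.exp(-(sq[:, None] + sq[None, :] - 2 * K1) / 2)  # RBF-kernel Gram matrix

for K in (K1 + K2, K1 * K2, 3.0 * K1):                  # k1 + k2, k1 * k2, a * k1
    print(np.linalg.eigvalsh(K).min() >= -1e-8)         # True each time
```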
SVM in Bioinformatics

• Secondary structure prediction from DNA sequence using the RBF kernel (Hua & Sun, 2001)
• Detection of remote protein homology using the Fisher kernel (Jaakkola et al., 1999)
• Protein structure prediction (Qiu et al., 2007)
• SVM-based gene finding in nematode genomes (Schweikert et al., 2009)
• Protein interaction prediction (Ben-Hur et al., 2005)
• Feature selection: gene selection from microarray data with multiple kernels (Borgwardt et al., 2005)