Shogun 2.0 @ PyData NYC 2012


Transcript of Shogun 2.0 @ PyData NYC 2012

Page 1: Shogun 2.0 @ PyData NYC 2012

Introduction Machine Learning Dry is all theory: Live Demo SVMs and Kernels Beyond Binary Classification Python integration Summary

The SHOGUN Machine Learning Toolbox 2.0 (and its Python interface)

Sören Sonnenburg, Gunnar Rätsch, Sebastian Henschel, Christian Widmer, Jonas Behr, Alexander Zien, Fabio De Bona, Alexander Binder, Christian Gehl, and Vojtech Franc

GSoC students: Sergey Lisitsyn, Heiko Strathmann, many more...

Page 2: Shogun 2.0 @ PyData NYC 2012

What is Shogun?

Machine Learning Toolkit

Broad range of ML algorithms (600 classes)

Large-scale algorithms (up to 50 million examples)

Core written in C++ (> 190,000 lines of code)

SWIG bindings (support for 8 target languages)

Used in many projects

Gene starts: ARTS [7]

Splice sites: mSplicer [5]

Sensor fusion (private sector)

...many more (see Google Scholar)!

Page 3: Shogun 2.0 @ PyData NYC 2012

Architecture

SWIG - Simple Wrapper Interface Generator

Bindings to a growing number of languages!

Typemaps!!

Page 4: Shogun 2.0 @ PyData NYC 2012

Shogun’s history

Project started 1999

Early focus on large-scale SVMs and Kernels

GSoC significantly pushed project forward

Page 5: Shogun 2.0 @ PyData NYC 2012

Machine Learning - Learning from Data

What is Machine Learning and what can it do for you?

What is ML?

AIM: Learning from empirical data!

Applications

speech and handwriting recognition

medical diagnosis, bioinformatics

computer vision, object recognition

stock market analysis

network security, intrusion detection . . .

Page 7: Shogun 2.0 @ PyData NYC 2012

Support Vector Machines

Support Vector Machines (SVMs)

SVM primal

$$
\min_{w} \; \underbrace{\frac{1}{2}\|w\|_2^2}_{\text{regularizer = robustness}} \; + \; \underbrace{C \sum_{i=1}^{n} \max\left(1 - y_i w^\top x_i,\, 0\right)}_{\text{loss = error on train data}}
$$

Training: Solve optimization problem
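To make "solve optimization problem" concrete, here is a minimal NumPy sketch that evaluates the primal objective above and minimizes it with plain subgradient descent. This is not how Shogun's solvers work; the function names, the toy data, the step size and the iteration count are all assumptions chosen for illustration.

```python
import numpy as np

def svm_primal_objective(w, X, y, C):
    """0.5 * ||w||^2 + C * sum_i max(1 - y_i * w^T x_i, 0); rows of X are examples."""
    margins = y * (X @ w)
    hinge = np.maximum(1.0 - margins, 0.0)
    return 0.5 * np.dot(w, w) + C * hinge.sum()

def train_subgradient(X, y, C=1.0, steps=200, lr=0.01):
    """Plain subgradient descent on the primal (illustrative, not a production solver)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        margins = y * (X @ w)
        active = margins < 1.0  # examples that currently contribute to the hinge loss
        # subgradient: w from the regularizer, -C * y_i * x_i for each active example
        grad = w - C * (y[active][:, None] * X[active]).sum(axis=0)
        w -= lr * grad
    return w

# Hypothetical toy data: two linearly separable groups in 2D
X = np.array([[2.0, 2.0], [3.0, 1.0], [-2.0, -2.0], [-1.0, -3.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

w = train_subgradient(X, y)
predictions = np.sign(X @ w)
```

The regularizer pulls `w` towards zero while the hinge term pushes every example to a margin of at least 1; `C` sets the trade-off between the two.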

Page 9: Shogun 2.0 @ PyData NYC 2012

Support Vector Machines

SVM with Kernels

SVM dual

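$$
\max_{\alpha} \; -\frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j \overbrace{x_i^\top x_j}^{k(x_i, x_j)} + \sum_{i=1}^{n} \alpha_i, \quad \text{s.t. } 0 \le \alpha_i \le C \;\; \forall i \in \{1, \dots, n\}
$$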

Kernel: Similarity measure; generalization of dot product

Corresponds to dot product in higher dimensional space
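A small NumPy check can illustrate the last point. It is a sketch under assumptions: the degree-2 polynomial kernel and its explicit feature map are a textbook example picked for illustration, and the Gaussian kernel uses one common parametrisation, exp(-||x - z||^2 / width), which may differ from the one a given library uses.

```python
import numpy as np

def poly2_kernel(x, z):
    """Homogeneous polynomial kernel of degree 2: k(x, z) = (x^T z)^2."""
    return np.dot(x, z) ** 2

def poly2_features(x):
    """Explicit feature map for 2D inputs such that phi(x)^T phi(z) = (x^T z)^2."""
    return np.array([x[0] ** 2, np.sqrt(2.0) * x[0] * x[1], x[1] ** 2])

def gaussian_kernel(x, z, width):
    """Gaussian kernel; the exact scaling of `width` is an assumption."""
    diff = x - z
    return np.exp(-np.dot(diff, diff) / width)

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

k_implicit = poly2_kernel(x, z)                               # computed in the 2D input space
k_explicit = np.dot(poly2_features(x), poly2_features(z))     # dot product in the 3D feature space
```

Both values agree, so the kernel computes a dot product in a higher dimensional space without ever constructing that space. For the Gaussian kernel the corresponding space is infinite dimensional, which is why only the kernel form is practical.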

Page 10: Shogun 2.0 @ PyData NYC 2012

Demo:

Support Vector Classification

Task: separate 2 clouds of points in 2D

Simple code example: SVM Training

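lab = BinaryLabels(labels)                       # training labels
train_xt = RealFeatures(features)                # training examples
gk = GaussianKernel(train_xt, train_xt, width)   # Gaussian kernel on the training set
svm = LibSVM(10.0, gk, lab)                      # C = 10.0
svm.train()
test_examples = RealFeatures(test_features)
out = svm.apply(test_examples)                   # predictions for the test examples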
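The slide does not show how `features` and `labels` are built. One way to produce two clouds of points in 2D with NumPy is sketched below; the cloud centres, the sample size `n_per_class` and the random seed are made up, and the layout (one example per column, labels in {+1, -1}, float64) is an assumption about what the toolbox expects.

```python
import numpy as np

rng = np.random.RandomState(0)   # fixed seed so the toy data is reproducible
n_per_class = 50                 # hypothetical sample size per cloud

# Each column is one 2D example; the clouds are centred at (+2, +2) and (-2, -2)
positive_cloud = rng.randn(2, n_per_class) + 2.0
negative_cloud = rng.randn(2, n_per_class) - 2.0

features = np.hstack([positive_cloud, negative_cloud])              # shape (2, 100)
labels = np.hstack([np.ones(n_per_class), -np.ones(n_per_class)])   # shape (100,)
```

Test data for `test_features` can be drawn the same way with a different seed.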

Page 11: Shogun 2.0 @ PyData NYC 2012

SVMs and Kernels

Provides generic interface to 11 SVM solvers

Established implementations for solving SVMs with kernels

More recent developments: Fast linear SVM solvers

Kernels for Real-valued Data (in demo)

Linear Kernel, Polynomial Kernel, Gaussian Kernel

String Kernels

Applications in Bioinformatics [4, 8, 10]

Intrusion Detection

Heterogeneous Data Sources

Combined kernel: $K(x, z) = \sum_{i=1}^{M} \beta_i \cdot K_i(x, z)$

$\beta_i$ can be learned using Multiple Kernel Learning [6, 2]
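The combined kernel is just a weighted sum of kernel matrices. The NumPy sketch below shows this with fixed weights; it does not learn the weights as Multiple Kernel Learning would, and the helper names, the data and the weight values are assumptions for illustration (rows of `X` are examples here).

```python
import numpy as np

def linear_kernel(X, Z):
    """K[i, j] = x_i^T z_j, with one example per row."""
    return X @ Z.T

def gaussian_kernel(X, Z, width):
    """K[i, j] = exp(-||x_i - z_j||^2 / width); the scaling of `width` is an assumption."""
    sq_dists = (X ** 2).sum(axis=1)[:, None] + (Z ** 2).sum(axis=1)[None, :] - 2.0 * (X @ Z.T)
    return np.exp(-np.maximum(sq_dists, 0.0) / width)

def combined_kernel(kernels, betas):
    """K = sum_i beta_i * K_i for kernel matrices of identical shape."""
    return sum(beta * K for beta, K in zip(betas, kernels))

X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
K_lin = linear_kernel(X, X)
K_rbf = gaussian_kernel(X, X, width=2.0)

betas = [0.3, 0.7]   # fixed, hypothetical weights
K = combined_kernel([K_lin, K_rbf], betas)
```

With non-negative weights the sum of valid kernels is again a valid kernel, which is what lets each data source contribute its own similarity measure.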

Page 12: Shogun 2.0 @ PyData NYC 2012

Beyond Classification

(a) GP regression (b) Structured Output (c) Multitask Learning

Regression: Labels are real values (think least squares)

Structured Output Learning: Predict complex structures

Multitask Learning: Solve several related problems simultaneously

Page 13: Shogun 2.0 @ PyData NYC 2012

Multitask Learning

Example: Learn movie user preferences

Multitask Learning: Jointly learn models for different countries

Couple related models more strongly

Page 17: Shogun 2.0 @ PyData NYC 2012

Regularization-based MTL

Multitask Learning is often implemented using regularization:

Graph-regularizer: $\sum_{s=1}^{T} \sum_{t=1}^{T} \|w_s - w_t\|^2 A_{s,t}$

Keeps model parameters similar

Based on given similarity matrix $A$

L2,1-regularizer: $\|W\|_{2,1} = \sum_{i=1}^{n} \|w_i\|$

Selects common sub-space

Allows any $w_t$ in that sub-space

Clustered MTL:

Unknown task relationship

Identifies similar tasks
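The first two regularizers are easy to evaluate directly. The NumPy sketch below assumes that the columns of `W` are the per-task weight vectors and that, in the L2,1 norm, `w_i` denotes the i-th row of `W` (one row per feature), which is the usual convention when the norm is used to select a common subspace. The function names and the example matrices are made up.

```python
import numpy as np

def graph_regularizer(W, A):
    """sum_{s,t} A[s, t] * ||w_s - w_t||^2, where column s of W is the model of task s."""
    num_tasks = W.shape[1]
    total = 0.0
    for s in range(num_tasks):
        for t in range(num_tasks):
            diff = W[:, s] - W[:, t]
            total += A[s, t] * np.dot(diff, diff)
    return total

def l21_norm(W):
    """Sum over rows (features) of the Euclidean norm of each row."""
    return np.linalg.norm(W, axis=1).sum()

# Hypothetical models: 3 features, 2 tasks
W = np.array([[1.0, 1.0],
              [0.0, 0.0],
              [3.0, 4.0]])

# Hypothetical task similarity matrix: the two tasks are related
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])

graph_penalty = graph_regularizer(W, A)
l21_penalty = l21_norm(W)
```

The graph penalty is zero when related tasks share the same weights, and the L2,1 penalty stays small when whole rows are zero, i.e. when all tasks ignore the same features.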

Page 18: Shogun 2.0 @ PyData NYC 2012

Multitask Learning:

MTL Training

feat, labels = ...  # Shogun Data objects
task_one = Task(0,10)
task_two = Task(10,20)
group = TaskGroup()
group.append_task(task_one)
group.append_task(task_two)
mtlr = MultitaskL12(0.1,0.1,feat,labels,group)
mtlr.train()

Efficient LibLinear-style solver for graph-regularized SVMs [9]

10 other MTL methods (based on SLEP[3]/MALSAR[1])

Page 19: Shogun 2.0 @ PyData NYC 2012

Structured Output Learning

Complex outputs

Similar framework, different loss function

Bundle-methods: state of the art solvers!

Page 20: Shogun 2.0 @ PyData NYC 2012

Other methods

(d) Sparse/L1 methods (e) Gaussian processes (f) Dim-reduct

...and much more I can’t talk about!

Page 21: Shogun 2.0 @ PyData NYC 2012

Python integration

Python integration

Serialization

Matrix integration

No-copy data wrapping

Rapid prototyping with directors

Page 22: Shogun 2.0 @ PyData NYC 2012

Python integration

pythonic interaction with shogun objects

from numpy import array, float64

m_real = array(in_data, dtype=float64, order='F')
f_real = RealFeatures(m_real)

# slicing
print(f_real[0:3, 1])

# operators
f_real += f_real
f_real *= f_real
f_real -= f_real

# no copy
a = RealFeatures()
a.frombuffer(feats, False)
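The `order='F'` argument and the "no copy" note both concern memory layout. The NumPy-only sketch below shows what column-major storage and copy-free views look like on the Python side; it does not call Shogun, and the idea that the toolbox wants column-major float64 data is an assumption suggested by the snippet above. The data values are made up.

```python
import numpy as np

in_data = [[1.0, 2.0, 3.0],
           [4.0, 5.0, 6.0]]   # hypothetical 2 x 3 matrix

# Column-major (Fortran-order) float64 array, as in the snippet above
m_real = np.array(in_data, dtype=np.float64, order='F')
is_column_major = bool(m_real.flags['F_CONTIGUOUS'])

# A basic slice is a view: it shares memory with the original, nothing is copied
column = m_real[:, 1]
column[0] = 20.0                 # writes through to m_real[0, 1]
shares_buffer = np.shares_memory(m_real, column)

# A row-major array has to be copied to become column-major
c_order = np.array(in_data, dtype=np.float64)
f_order = np.asfortranarray(c_order)
needed_copy = not np.shares_memory(c_order, f_order)
```

Creating the array in the expected layout from the start avoids that conversion copy, which matters once the data no longer fits in memory twice.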

Page 23: Shogun 2.0 @ PyData NYC 2012

Python integration: Directors

Simple code example: SVM Training

import numpy

class ExampleLinearKernel(DirectorKernel):
    def __init__(self):
        DirectorKernel.__init__(self, True)

    def kernel_function(self, idx_a, idx_b):
        # dot product of the two feature vectors, computed in Python
        seq1 = self.get_lhs().get_feature_vector(idx_a)
        seq2 = self.get_rhs().get_feature_vector(idx_b)
        return numpy.dot(seq1, seq2)

k = ExampleLinearKernel()
svm = SVMLight()
svm.set_kernel(k)
svm.train(train_data)

Page 24: Shogun 2.0 @ PyData NYC 2012

How to get started

Dive into Shogun

Visit our website

Source on github (fork-me!)

Documentation available

Many python examples (> 200)

Debian Package, MacPorts

Active Mailing-List

Page 25: Shogun 2.0 @ PyData NYC 2012

When is SHOGUN for you?

You want to work with SVMs (11 solvers to choose from)

You want to work with kernels (35 different kernels)

⇒ Esp.: string kernels / combinations of kernels

You’re interested in recent ML developments (MTL, Structured Output)

You have large-scale computations to do (up to 50 million examples)

You use one of the following languages: Python, Octave/MATLAB, R, Java, C#, Ruby, Lua, C++

Page 26: Shogun 2.0 @ PyData NYC 2012

Contributors

Original authors: Gunnar Raetsch, Soeren Sonnenburg, Christian Widmer, Alexander Binder, Alexander Zien, Marius Kloft, Sebastian Henschel, Christian Gehl, Jonas Behr.

Integrated code: Alex Smola (pr_loqo), Antoine Bordes (LaRank), Thorsten Joachims (SVMLight), Chih-Chung Chang and Chih-Jen Lin (LIBSVM), Chih-Jen Lin (LibLinear), Vojtech Franc (LibOCAS), Leon Bottou (SGD SVM), Vikas Sindhwani (SVMLin), Jieping Ye and Jun Liu (SLEP), Jiayu Zhou and Jieping Ye (MALSAR)

GSoC alumni: Heiko Strathmann (both 2011 and 2012), Sergey Lisitsyn (both 2011 and 2012), Chiyuan Zhang (2012), Fernando Iglesias (2012), Viktor Gal (2012), Michal Uricar (2012), Jacob Walker (2012), Evgeniy Andreev (2012), Baozeng Ding (2011), Alesis Novik (2011), Shashwat Lal Das (2011)

Page 27: Shogun 2.0 @ PyData NYC 2012

Thank you!

Thank you for your attention!!

For more information, visit:

Implementation http://www.shogun-toolbox.org

More machine learning software http://mloss.org

Machine Learning Data http://mldata.org

Page 28: Shogun 2.0 @ PyData NYC 2012

References I

[1] Jiayu Zhou, Jianhui Chen, and Jieping Ye. User Manual MALSAR: Multi-tAsk Learning via Structural Regularization. Technical report, Arizona State University, 2012.

[2] M. Kloft, U. Brefeld, S. Sonnenburg, P. Laskov, K.R. Müller, and A. Zien. Efficient and accurate lp-norm multiple kernel learning. Advances in Neural Information Processing Systems, 22(22):997–1005, 2009.

[3] Jun Liu, Shuiwang Ji, and Jieping Ye. SLEP: Sparse Learning with Efficient Projections. 2011.

Page 29: Shogun 2.0 @ PyData NYC 2012

References II

[4] G. Schweikert, A. Zien, G. Zeller, J. Behr, C. Dieterich, C.S. Ong, P. Philips, F. De Bona, L. Hartmann, A. Bohlen, et al. mGene: Accurate SVM-based gene finding with an application to nematode genomes. Genome Research, 19(11):2133, 2009.

[5] Gabriele Schweikert, Alexander Zien, Georg Zeller, Jonas Behr, Christoph Dieterich, Cheng Soon Ong, Petra Philips, Fabio De Bona, Lisa Hartmann, Anja Bohlen, Nina Krüger, Sören Sonnenburg, and Gunnar Rätsch. mGene: accurate SVM-based gene finding with an application to nematode genomes. Genome Research, 19(11):2133–43, November 2009.

[6] S. Sonnenburg, G. Rätsch, C. Schäfer, and B. Schölkopf. Large scale multiple kernel learning. The Journal of Machine Learning Research, 7:1565, 2006.

Page 30: Shogun 2.0 @ PyData NYC 2012

References III

[7] S. Sonnenburg, A. Zien, and G. Rätsch. ARTS: accurate recognition of transcription starts in human. Bioinformatics, 2006.

[8] S. Sonnenburg, A. Zien, and G. Rätsch. ARTS: accurate recognition of transcription starts in human. Bioinformatics, 22(14):e472, 2006.

[9] C. Widmer, M. Kloft, N. Görnitz, and G. Rätsch. Efficient Training of Graph-Regularized Multitask SVMs. In ECML 2012, 2012.

[10] C. Widmer, J. Leiva, Y. Altun, and G. Rätsch. Leveraging Sequence Classification by Taxonomy-based Multitask Learning. In Research in Computational Molecular Biology, pages 522–534. Springer, 2010.