Icml2017 overview

33
ICML2017 Overview & Some Topics September 18th, 2017 Tatsuya Shirakawa

Transcript of Icml2017 overview

ICML2017 Overview & Some Topics

September 18th, 2017 Tatsuya Shirakawa

ABEJA, Inc. (Researcher) - Deep Learning

- Computer Vision - Natural Language Processing - Graph Convolution / Graph Embedding

- Mathematical Optimization - https://github.com/TatsuyaShiraka

tech blog → http://tech-blog.abeja.asia/

Poincaré Embeddings Graph Convolution

We are hiring! → https://www.abeja.asia/recruit/

→ https://six.abejainc.com/

1. ICML Intro & Stats

2. Trends and Topics

Table of Contents

3

1. ICML Intro & Stats

2. Trends and Topics

Table of Contents

4

International Conference on Machine Learning

• Top ML Conference

• 434 orals in 3 days

• 9 parallel tracks

• Submitted 1629 papers

• 4 talks from invited speakers

• 9 tutorial talks

• 9(parallel)x3(sessions)x3(days)=81 sessions in main conference

ICML 2017 at Sydney

5

Demos

15

Schedule

16

8/6 Tutorial Session 9 tutorials (3 parallel)

8/7 Main Conference Day 1 27 sessions (9 parallel)

8/8 Main Conference Day 2 27 sessions (9 parallel)

8/9 Main Conference Day 3 27 sessions (9 parallel)

8/10 Workshop Conference Day 1 11 sessions (11 parallel)

8/11 Workshop Conference Day 2 11 sessions (11 parallel)

1/3

max attend

1/9

1/9

1/9

1/11

1/11

1. ICML Intro & Stats

2. Trends and Topics

Table of Contents

17

• Deep learning is still the biggest trend

• Autonomous vehicles

• Health care / computational biology

• Human interpretability and visualization

• Multitask learning for small data or hard tasks

• Reinforcement learning

• Imitation learning (inverse reinforcement learning)

• Language and speech processing

• GANs / CNNs / RNNs / LSTMs are default options

• RNNs and their variant

• Optimizations

• Online learning / bandit

• Time series modeling

• Applications Session

Some Trends (highly biased)

18

• Gluon is a new deep learning wrapper framework, which integrates dynamic dl frameworks (chainer, pytorch) and static dl frameworks (keras, mxnet) and get the best of the both worlds (hybridize)

• Great resources including many latest modelshttps://github.com/apache/incubator-mxnet/tree/master/example

• Looks easy to write

• Alex Smola was the presenter

• … not so fast yet ? ←

[Tutorial] Distributed Deep Learning with MxNet Gluon

19

http://www-bcf.usc.edu/~liu32/icml_tutorial.pdf

• RNN works well

• + pretraining (combine other clinics’ data)

• + expert defined features

• + new models for missing data

• CNN works well on image data and achieved super-human accuracy

• Some Features of Health Care Data

• Small sample size

• Missing values

• Medical domain knowledge

• Interpretation

• Use gradient boosting trees to mimic deep learning models (cool idea!)

• Hard to annotate even for experts

• Big Small Data

• Limited amount of data available to train age-specific or disease-specific models

[Tutorial] Deep Learning Models for Health Care: Challenges and Solutions

20

Future Directions: - Modeling heterogeneous data sources - Model interpretation - More complex output

“Interpretable Deep Models for ICU Outcome Prediction”, 2016

• Deep Neural Networks are “black boxes”.

• Sensitive analyses methods can be applied

• ex: Grad-CAM

[Tutorial] Interpretable Machine Learning

21

• Generating periodic patterns with GANs

• Local/Global/Periodic vectors

“Learning Texture Manifolds with the Periodic Spatial GAN”

22Example for many texture and many periodicity.

Local vectors

Global vectors

Periodic vectors

• Sequence revising with generative/Inference models

• Generative model P(x, y, z)=P(x, y|z)P(z)

• x : input seq., y: goodness of x, z: hidden var.

• Inference model P(z|x) , P(y|z)

• Input x0 -> infer z0 -> search better z (better F(z)) -> reconstruct x

“Sequence to better sequence: Continuous Revision of Combinatorial Structures”

23

• Generating a new step chart from a raw audio track

“Dance Dance Convolution”

24

• Gave a new algorithm and theoretical analysis for sum of norms (SON) clustering

• SON (2011)

• Assigning center to each data point and applied some regularization which magnetize centers

• Convex problem!

“Clustering by Sum of Norms: Stochastic Incremental Algorithm, Convergence and Cluster Recovery”

25

Image Compression using Deep Learning

• VAE(almost reconstruction) + GAN(refinement)

• Faster than jpeg on gpu, but several secs on cpu

“Real-Time Adaptive Image Compression”

26

• Subgoals

• Breaking up the problem Into Subgoals

• Learn sub-policies to achieve them

• StreetLearn

• Transfer Learning

• Progressive Neural Networks

• Distral: Robust Multitask Reinforcement Learning

[Invited Talk] “Towards Reinforcement Learning in the Complex World” - Raia Hadsell (Google Deep Mind)

27

• GANs are approximated by discrete distribution on some finite samples (with high probability)

• Sample size =

• P = discriminator size, ε = error

• “The birthday paradox” test

• Sample m images from generator

• See if there are duplicate images

• Estimate the sample size

“Generalization and Equilibrium in Generative Adversarial Nets” & “Do GANs actually learn the distribution? Some theory and empirics”

28

˜O(p log(p/✏)/✏2)

• Deterministic Rounding vs. Stochastic Rounding

• Theoretical explanation that SGD with stochastic rounding does not converge well

• Every updates are too noisy

• Won the Google Best Student Paper Award

“Towards a Deeper Understanding of Training Quantized Networks”

29

• RL produces much better sequence than log-likelihood based methods

• Why RL is so effective? (Beam Search Issues?)

“Sequence-Level Training of Neural Models for Visual Dialog”

30

• Google’s Expander which enhances broad range of tasks using graph structure

• smart reply, personal assistant

• image recognition

• Integrated framework for

• zero-shot/one-shot learning

• multi-modal learning

• semi-supervised learning

• multi-task learning

• “Neural Graph Machines”

• introduces graph regularization into DL

• Adjacent nodes (data) are constrained to have near vector representations

Neural Graph Learning

31

2019 ICML + CVPR !

2021 Asia/Pac!

Future ICMLs

32

Any Questions?