2010-09-20, Prof. Edward Chang (張智威): Machine Learning to Human Innovation (機器學習到人類創新)

2010-9-20 @ NCCU 1

Machine Learning to Human Innovation (機器學習到人類創新)

Edward Chang (张智威), Director of Research


Learning and Innovation (學習與創新)

• Learning (學習)

Mencius (孟子)


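Mencius's mother moved house three times (孟母三遷)

Mencius, "Teng Wen Gong" chapter: the way of the sage (孟子滕文公章聖人之道)

Mencius said to Dai Busheng: "Do you wish your king to be good? Let me tell you plainly. Suppose a great officer of Chu wanted his son to learn the speech of Qi. Would he have a man of Qi tutor him, or a man of Chu?"

He replied: "A man of Qi."

Mencius said: "If one man of Qi tutors him while a crowd of Chu men keep clamoring around him, then even if you beat him every day to make him speak Qi, it cannot be done. But take him and place him for several years in Zhuang and Yue (districts of Qi), and even if you beat him every day to make him speak Chu, that cannot be done either."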


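Learning and Innovation (學習與創新)

• Learning (學習)

• Stay near vermilion and you turn red, stay near ink and you turn black; when the form is upright, the shadow is straight (近朱者赤，近墨者黑，形正则影直)

• If the upper beam is crooked, the lower beams will be crooked too (上樑不正下樑歪)

• Innovation (創新)

• Must we stay near vermilion and avoid ink? (近朱避墨吗？)

• Must the upper and lower beams both be straight? (下樑上樑必须正吗？)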


Outline (演講提綱)

• Machine learning

• Machine learning → human learning

• Human learning → human innovation

• Conclusion


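Definition of Machine Learning (機器學習的定義)

Program the computers to learn!

Computers improve performance with experience at some task

Example #1:
Task: playing Go (下圍棋)
Performance: win rate
Experience: learning from experts

Example #2:
Task: gene classification
Performance: disease-prediction accuracy
Experience: genetic case records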

Gene classification: D = 4026 genes, L = 3, N = 59 cases


Supervised Learning (監督學習)

X: Samples
U: Unlabeled data
L: Labeled data
Φ: Learning algorithm
f = Φ(L): Implied hypothesis

Minimize some error function
Regularize parameters to prevent overfitting (過擬合)

ŷ = f(u ∈ U)
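The recipe on this slide (learn f = Φ(L) by minimizing an error function, regularize, then predict ŷ = f(u) on unlabeled data) can be sketched as follows. This is only an illustration that assumes Φ is ridge regression; the slides do not name a specific algorithm here, and the function name `fit_ridge`, the penalty `lam`, and the synthetic data are hypothetical.

```python
import numpy as np

def fit_ridge(X, y, lam=1.0):
    """Phi: map labeled data L = (X, y) to a hypothesis f by minimizing
    squared error plus an L2 penalty lam * ||w||^2 (the regularizer)."""
    n, d = X.shape
    Xb = np.hstack([X, np.ones((n, 1))])       # append a bias column
    reg = lam * np.eye(d + 1)
    reg[-1, -1] = 0.0                          # leave the bias unpenalized
    w = np.linalg.solve(Xb.T @ Xb + reg, Xb.T @ y)

    def f(U):
        """Predict y_hat for each row u of the unlabeled data U."""
        return np.hstack([U, np.ones((len(U), 1))]) @ w

    return f

# Synthetic labeled data L (assumed for illustration only).
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0, 0.0, 0.0, 0.5])
X = rng.normal(size=(59, 5))
y = X @ true_w + 0.1 * rng.normal(size=59)

f = fit_ridge(X, y, lam=0.1)                   # f = Phi(L)
U = rng.normal(size=(10, 5))                   # unlabeled data
y_hat = f(U)                                   # y_hat = f(u in U)
```

Raising `lam` shrinks the weights toward zero, which trades a little training error for less sensitivity to noise in the labeled data.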


Classic Learning Algorithms Φ (经典學習算法)

Linear Model (线性模型)
Nearest Neighbors (近鄰法)
Neural Networks (神經網路)
Decision Trees (決策樹)
Kernel Methods (核方法)
Support Vector Machines (支持向量機)
etc.


Scalability Issue (延展问题)

f = Φ(L), with D = 4026 genes, L = 3, N = 59 cases

Scarce labeled data: too little training data (訓練數據太少)

f = Φ(L* + U)
L*: Collect the most useful labeled data
U: Use unlabeled data


Human Learning (人類學習)

Supervised Learning (監督學習): being taught, e.g., by teachers and parents

Unsupervised Learning (無監督學習): surfing the Web, watching TV

Active Learning (主動學習): asking questions

Reinforcement Learning (强化學習): taking exams


Unified Learning Machines (统一學習机器) (KDD06)

Semi-supervised learning (半監督學習)

Reinforcement learning (强化學習)

Active learning (主動學習)


Machine Learning Problems in Information Retrieval (信息檢索機器學習問題)

Presentation notes: This figure shows a 200 × 200 checkerboard divided into four quadrants. The top-left and bottom-right quadrants are occupied by majority instances, shown as red crosses. The remaining two quadrants are occupied by minority instances, shown as blue circles. Within each quadrant, instances are uniformly distributed. We define the ratio of a checkerboard dataset as the number of majority instances divided by the number of minority instances; in this figure, the ratio is 10 to 1. At each step, we keep the minority instances unchanged and uniformly add new majority instances to the top-left and bottom-right quadrants, producing another checkerboard dataset with a different ratio.
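The checkerboard dataset described in the notes can be generated with a few lines of code. The sketch below is an assumption-laden reconstruction: the function name `make_checkerboard`, the label encoding (+1 for majority, -1 for minority), and the convention that y grows upward are all choices made here for illustration, not details from the talk.

```python
import numpy as np

def make_checkerboard(n_minority, ratio, size=200, seed=0):
    """Return points X in a size x size square and labels y
    (+1 = majority, -1 = minority), with ratio = #majority / #minority."""
    rng = np.random.default_rng(seed)
    half = size / 2

    def sample(n, quadrants):
        # Pick one of the given quadrants per point, then a uniform
        # position inside that quadrant.
        q = rng.integers(0, len(quadrants), size=n)
        offsets = np.array(quadrants)[q] * half
        return offsets + rng.uniform(0, half, size=(n, 2))

    # Quadrant offsets are (x, y) in units of `half`, with y growing upward.
    majority = sample(n_minority * ratio, [(0, 1), (1, 0)])  # top-left, bottom-right
    minority = sample(n_minority, [(1, 1), (0, 0)])          # top-right, bottom-left
    X = np.vstack([majority, minority])
    y = np.concatenate([np.ones(len(majority), dtype=int),
                        -np.ones(len(minority), dtype=int)])
    return X, y

# A 10-to-1 dataset like the one in the figure; the count of 50 minority
# points is an arbitrary choice.
X, y = make_checkerboard(n_minority=50, ratio=10)
```

Calling the function again with the same `n_minority` and a larger `ratio` gives a more imbalanced dataset. Note that this sketch redraws the minority points each time unless the seed and sampling order are controlled, whereas the notes keep them fixed.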


Text-based image search limitations. . .


VIMA Visual Search


Step #1 Acquire Labels


Step #2 Compute Boundary


Step #3 Identify Useful Samples


Step #4 Acquire Labels


Step #5 Refine Boundary


Step #6 Return Results
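The six steps above describe an active-learning loop: label a few samples, fit a boundary, query the samples nearest the boundary, and repeat. A minimal sketch follows, assuming a least-squares linear boundary and a labeling function `oracle` that stands in for the user. All names and parameters (`active_search`, `n_seed`, `batch`, `top_k`) are hypothetical and not taken from the VIMA system.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares linear boundary: sign(w . [x, 1]) separates the classes."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w

def score(w, X):
    """Signed score per row of X; |score| is small near the boundary."""
    return np.hstack([X, np.ones((len(X), 1))]) @ w

def active_search(pool, oracle, n_seed=4, n_rounds=3, batch=4, top_k=10, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: acquire labels for a few seed samples.
    seeds = rng.choice(len(pool), size=n_seed, replace=False)
    labels = {int(i): oracle(pool[i]) for i in seeds}
    for _ in range(n_rounds):
        keys = sorted(labels)
        # Steps 2 and 5: compute (then refine) the boundary from all labels so far.
        w = fit_linear(pool[keys], np.array([labels[i] for i in keys]))
        s = score(w, pool)
        # Step 3: the most useful samples are the unlabeled ones nearest the boundary.
        unlabeled = [i for i in range(len(pool)) if i not in labels]
        queries = sorted(unlabeled, key=lambda i: abs(s[i]))[:batch]
        # Step 4: acquire labels for those samples.
        for i in queries:
            labels[i] = oracle(pool[i])
    keys = sorted(labels)
    w = fit_linear(pool[keys], np.array([labels[i] for i in keys]))
    # Step 6: return the samples scored most confidently positive.
    ranked = np.argsort(-score(w, pool))[:top_k]
    return ranked, labels

# Hypothetical pool and oracle: a point is relevant when x0 + x1 > 0.
rng = np.random.default_rng(1)
pool = rng.normal(size=(200, 2))
oracle = lambda x: 1.0 if x[0] + x[1] > 0 else -1.0
ranked, labels = active_search(pool, oracle)
```

Querying near the boundary is one common notion of "useful"; it spends the user's labeling effort where the current model is least certain.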


Midpoint Observations

Find good training instances

Find diversified training instances

Is a linear model sufficient? Is it too simple-minded (頭腦簡單)?


Machine Learning Problems in Information Retrieval (信息檢索機器學習問題)


Nonlinear Boundary


Linear Model Fits All Data?

Connecting Dots (连接点)


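Connecting-Dots Algorithm: Nearest Neighbors (NN, 近鄰法)

$Y(x) = \frac{1}{k} \sum_{x_i \in N_k(x)} y_i$

where $N_k(x)$ is the set of the $k$ training points nearest to $x$; here $k = 1$.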
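The nearest-neighbor rule Y(x) = (1/k) Σ y_i over x_i ∈ N_k(x) translates almost directly into code. The sketch below assumes Euclidean distance as the nearness measure; the function name `knn_predict` and the toy data are illustrative only.

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=1):
    """Y(x): average the labels y_i of the k training points nearest to x."""
    dist = np.linalg.norm(X_train - x, axis=1)   # Euclidean distance to x
    nearest = np.argsort(dist)[:k]               # indices of N_k(x)
    return y_train[nearest].mean()

# Toy training set (assumed for illustration).
X_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
y_train = np.array([0.0, 0.0, 1.0, 1.0])
x = np.array([0.1, 0.1])

y1 = knn_predict(X_train, y_train, x, k=1)   # label of the single nearest point
y3 = knn_predict(X_train, y_train, x, k=3)   # mean label of the three nearest points
```

With k = 1 the prediction simply copies the label of the closest training point, which is what "connecting the dots" amounts to.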

Nearest Neighbors (近鄰法) with k = 1


Nearest Neighbors (近鄰法)

Four Things Make a Nearest Neighbor Model

A distance function to measure nearness?

k: number of neighbors to consider?

A weighting function (optional)?

How to fit with the local points?


NN with k = 1


Problems of k = 1

Fitting noise (overfitting, 過擬合)
Jagged boundaries

Remedy: picking a larger k


NN with k = 15


Linear Model (LM, 線性) vs. Nearest Neighbors (NN, 近鄰法)

Linear model (k = ∞): stable, low accuracy

k = 1: unstable, low accuracy

k moderate: stable, accurate
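The two extremes in this comparison can be checked numerically. The sketch below, on synthetic data with 15% of the labels flipped (an assumption made here for illustration), shows that k = 1 reproduces every noisy training label exactly, while taking k equal to the whole training set collapses every prediction to the global mean. How well a moderate k does on fresh data depends on the dataset, so no accuracy figure is claimed for it.

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k):
    """Average the labels of the k nearest training points for each query row."""
    # Pairwise Euclidean distances, shape (n_query, n_train).
    d = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)

# Synthetic two-class data with noisy labels.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
clean = (X[:, 0] + X[:, 1] > 0).astype(float)
flipped = rng.random(200) < 0.15
y = np.where(flipped, 1 - clean, clean)

def train_accuracy(k):
    pred = knn_predict(X, y, X, k)
    return float(np.mean((pred > 0.5) == (y > 0.5)))

acc_k1 = train_accuracy(1)                     # every point is its own nearest neighbor
pred_all = knn_predict(X, y, X, len(X))        # one constant value for every query
for k in (1, 15, len(X)):
    print(k, round(train_accuracy(k), 3))
```

A perfect training score at k = 1 is exactly the "fitting noise" problem: the flipped labels are memorized rather than averaged away.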


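Outline (演講提綱)

• Machine learning

• Machine learning → human learning

• Human learning → human innovation

• Conclusion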

Machine Learning → Human Learning (机器學習 → 人類學習)

Four Conjectures (四个推理)

Conjecture #1: Find good training instances

Good role models (e.g., good mentors)

Good learning environment

Stay near vermilion and you turn red (近朱者赤)

Conjecture #2: Find diversified training instances

Repetition improves only speed

Repetition does not improve intuition

Near vermilion one turns red; near blue one grows bright; near white one becomes pure (近朱者赤，近青者明，近白者潔)

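Conjecture #3: Do not overfit

Being able to generalize is all that counts

Do not memorize materials that do not contribute to generalization

Some overfitting examples: Which provinces of China produce cotton? In what year was the Kangxi Emperor born? Which year CE was the first year of the First Emperor's reign?

One turns red by staying near vermilion, not by being soaked in it (近朱者赤，非浸朱者赤)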

Conjecture #4: Exploring vs. Exploiting (探索 vs. 開發)

Explore beyond the nearest neighborhood of positive instances

Look at things from different perspectives

Find real boundaries

Stay near red, yet learn to tell blue from red and black from white (近红諳知青红皂白)

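Four Conjectures (四个推理)

• Stay near vermilion and you turn red (近朱者赤)

• Near blue one grows bright; near white one becomes pure (近青者明，近白者潔)

• One turns red by staying near vermilion, not by being soaked in it (近朱者赤，非浸朱者赤)

• Stay near red, yet learn to tell blue from red and black from white (近红諳知青红皂白)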


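Outline (演講提綱)

• Machine learning

• Machine learning → human learning

• Human learning → human innovation

• Conclusion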


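Foundations of Innovation (創新的基础)

Talent (人才)
Excellent learning (學習優良)
Diligent work (工作勤奮)
Team spirit (團隊精神)

Innovation (創新)
Passion (激情)
Prepared mind
Perseverance (堅持不懈)
Revolutionary thinking (革命性的思维)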


Michelangelo


Renaissance


Last Supper


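Linear Perspective

Filippo Brunelleschi, 1377-1446

Experimented with optical effects

Showed that objects look larger up close and smaller at a distance

Confirmed via a pinhole experiment

Put into practice in a flat panel painting of the Baptistery

Published by Alberti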


3D → 2D Projections

Angel, Figure 5.3
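The observation that nearer objects look larger follows from the geometry of a pinhole camera: a point at depth z projects onto the image plane at a position scaled by 1/z. The sketch below uses the standard pinhole model; the function name `project`, the `focal_length` parameter, and the example coordinates are assumptions made for illustration and are not taken from the slides.

```python
def project(point, focal_length=1.0):
    """Project a 3D point (x, y, z) through a pinhole at the origin onto
    an image plane at distance focal_length: (x, y, z) -> (f*x/z, f*y/z)."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the pinhole (z > 0)")
    return (focal_length * x / z, focal_length * y / z)

def apparent_height(height, depth, focal_length=1.0):
    """Image-plane height of a vertical segment of the given height at the given depth."""
    _, top = project((0.0, height, depth), focal_length)
    _, bottom = project((0.0, 0.0, depth), focal_length)
    return top - bottom

near = apparent_height(2.0, depth=2.0)   # the object up close
far = apparent_height(2.0, depth=4.0)    # the same object twice as far away
```

Doubling the distance halves the projected height, which is the regularity a linear-perspective drawing encodes.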


Color Principles: Aristotelian

Color existed as a property of the surfaces of an object

Seven colors: white, black, yellow, red, purple, green & blue

White: water and air

Yellow: fire and the sun

Black: results from elements in transition

Rainbow colors: red, green and purple

Color Principles: Leonardo

Percussions and rebounds (直擊反射) of light! Should be analyzed with mathematical precision

Light: various colors best reveal their beauty at different light levels… yellow (light), blue (saturated)

Shadows: the variety of colors in a shadow must be as great as that of the colors of the objects in that shadow

Colors in motion: light shimmering through moving leaves, water flowing along a spring brook


Color Principles: Rubens

Optical mixture

Colors should not be tormented; no more than two pigments should be mixed

Rather, colors should be applied simply, directly, and separately onto the canvas

Georges Seurat, 1859-91


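Wang Wei (王维)

Su Shi (苏轼) said: "Savor Mojie's poems and there are paintings in them; view Mojie's paintings and there are poems in them." (味摩诘之诗，诗中有画；观摩诘之画，画中有诗)

明月松间照，清泉石上流。
(The bright moon shines between the pines; the clear spring flows over the stones.)

大漠孤烟直，长河落日圆。
(In the vast desert a lone plume of smoke rises straight; over the long river the setting sun hangs round.)

Ink wash with graded, diluted tones (水墨渲淡)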

Wang Wei (王维)

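Dong Yuan (董源)

Three dimensions in ink wash: a solid structure built from "point, line, and plane" (點、線、面)

Line: hemp-fiber texture strokes (披麻皴)

Point: dotted texture strokes (點错皴)

Plane: chopped, stacked texture strokes (斫垛皴)

Uses lightness to convey light and shade, sparseness and density, and distance (陰陽、疏密、遠近)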

Dong Yuan (董源)

Elements of Poetry (诗要素)

Structure and form (结构格式)

Parallelism (对仗)

Rhyme (韵), rhythm (律, 節奏)

Comparison (比喻): simile (明喻), metaphor (暗喻)

"On a Mission to the Frontier" (《使至塞上》), Wang Wei (王維)

單車欲問邊，屬國過居延。

征蓬出漢塞，歸雁入胡天。

大漠孤煙直，長河落日圓。

蕭關逢候騎，都護在燕然。

"Immortal by the River" (《臨江仙》), Su Shi (蘇軾)

夜飲東坡醒复醉，

歸來彷彿三更。

家童鼻息已雷鳴。

敲門都不應，

倚杖聽江聲。

長恨此身非我有，

何時忘卻營營？

夜闌風靜縠紋平。

小舟從此逝，

江海寄馀生。

La Belle Dame Sans Merci
John Keats

Oh what can ail thee, knight-at-arms,
Alone and palely loitering?
The sedge has withered from the lake,
And no birds sing.


等 November 5th, 1993; by Ed

踱過了MJH 前的每一塊石磚等候你出現四角花園裡爭奪地嬌豔駐足不了我游移的視線似倆支箭詰問每一張彷彿的容顏


等 …continue

踱過了MJH 前的每一塊石磚等候你出現四角花園裡爭奪地嬌豔駐足不了我游移的視線似倆支箭詰問每一張彷彿的容顏只可惜不能在那長廊伸盡處轉彎或上屋簷探探像只輕燕否則也不會有這般的懸念


等 Second Stanza

天空的形狀四方分散陣風你在做第幾度方向的改變？背後疑似你走近的聲音回首只見秋寒一片唉這風景的顏色怕已褪的有些疲倦


等 Third Stanza

然後終於覓著了你那熟悉的眉和眼姍姍前來糅和著十月芙蓉的靦腆你說你錯等在教堂後邊的書店一直提著抱歉接了過來我說：“真高興又再相見”那秋風也拂不亂你微笑的臉報答了我這時辰固執的心願

生命四季

第一季年轻

你我似两片云朵

一阵风起各分东西

另一阵风起

又相聚在一起

希望风能应许

将我们吹在一起

降雨时

溶在一地的青草里

第二季苦难

纵使降落在异域

流过陌生的湖泊瀑布

也不至惊怕

我知道你会寻我

在每一个河口溪谷

你会回头望我

与海洋交汇的地方

你会永远在那儿等我


生命四季

第三季成长

鄹雨歇去

剩余的残云掩不住

整个天空几带彩鱼

出港的人们

快快扬起船帆

出发啦

第四季使命

Thank you for guiding me straight and true through the many obstacles in my path. And for keeping me resolute when all around seemed lost.

--- From the book of Eli

to be continued…


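Outline (演講提綱)

• Machine learning

• Machine learning → human learning

• Human learning → human innovation

• Conclusion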

Renaissance artists: striving to introduce mathematical rigor into perfecting art

Lessing, Wieland, Schiller, Herder, to Goethe: setting the aesthetic vision for Germany! What is beauty? What is your ideal beauty?

William Blake: "I will not cease from mental fight, Nor shall my sword sleep in my hand Till we have built Jerusalem In England's green and pleasant land."

You & I?

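Culture of Innovation (創新文化)

我不會停止精神戰鬥，我的劍不會睡在我的手中，直到我們已經建立創新的文化，在台湾的綠色和愉快的土地上。

(I will not cease from mental fight, nor shall my sword sleep in my hand, till we have built a culture of innovation in Taiwan's green and pleasant land.)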

Innovation (創新)

Passion (激情)

Prepared mind

Perseverance (堅持不懈)

Revolutionary thinking (革命性的思维)
