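Bayesian Nonparametrics from the "Basics": The Mathematics of Point Processes and Machine Learning

Issei Sato (佐藤一誠), Assistant Professor, Information Technology Center, The University of Tokyo

数理助教の会 (Mathematics Assistant Professors' Meeting), 2012.12.13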

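• This talk covers Bayesian Nonparametrics (BN) for point processes (i.e., Gaussian Processes are not covered).

• It introduces the foundations of early BN. (This material is not required in applied work, but it does no harm to understand it. Fubini's theorem in particular is important in Bayesian analysis.)

• Sampling methods, variational Bayes, and the like are barely covered. Applications are barely covered either.

(Note) To convey the flavor of the original papers as they are, English and Japanese are mixed. This is by no means a failure to translate the technical terms into Japanese.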

Random Measure

Let (X, B) be a measurable space, where B is the Borel σ-algebra on X.

Let random variables x_i ∈ X (i = 1, 2, …) be defined.

A random measure φ is a B-valued random element defined, for any A ∈ B, by

$$\phi(A) = \sum_{i=1}^{n} \delta_{x_i}(A)$$

(a countably infinite sum is also allowed), and is also called a point process.

Note: "XXX process" refers to the sequence of random variables, while "XXX measure" refers to the individual (B-valued) random variable.

Completely Random Measure (CRM) [Kingman, 1967]

A random measure φ is a completely random measure if, for any finite collection A_1, A_2, …, A_n of disjoint sets, the random variables φ(A_1), φ(A_2), …, φ(A_n) are independent.

Ex. A counting measure N is a completely random measure if, for any finite collection A_1, …, A_n of disjoint sets, N(A_1), …, N(A_n) are independent.

[Figure: points scattered over X and three disjoint sets with N(A_1) = 4, N(A_2) = 3, N(A_3) = 2.]

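Poisson Process (PP)

λ is a measure from the measurable sets of X to R+, called the intensity measure.

N is generated from the PP with λ, i.e., N ~ PP(dN | λ), if, for any measurable set A ⊂ X,

$$N(A) \sim \mathrm{Poisson}(\lambda(A)).$$

N(·) is a completely random measure given by

$$N = \sum_{i=1}^{n} \delta_{x_i}, \quad \text{where } n \sim \mathrm{Poisson}(\lambda(X)).$$

The PP is the representative example of a CRM.

[Figure: points of a Poisson process on X and a set A, with N(A) ~ Poisson(λ(A)).]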
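The definition of the PP can be tried out numerically. The following is a minimal sketch, not part of the original slides, under the assumption that X = [0, 1) and that λ is `total_mass` times the uniform distribution; the function names are hypothetical. It draws n ~ Poisson(λ(X)) points, places them i.i.d. according to λ/λ(X), and compares the average counts on two disjoint intervals with λ(A).

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw from Poisson(mean) with Knuth's multiplication method (fine for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_poisson_process(total_mass, rng):
    """Sample N = sum_i delta_{x_i} on X = [0, 1) with lambda = total_mass * Uniform[0, 1)."""
    n = sample_poisson(total_mass, rng)        # n ~ Poisson(lambda(X))
    return [rng.random() for _ in range(n)]    # x_i i.i.d. from lambda / lambda(X) (assumed uniform)


def count_in(points, lo, hi):
    """Evaluate the counting measure N(A) for the interval A = [lo, hi)."""
    return sum(1 for x in points if lo <= x < hi)


rng = random.Random(0)
total_mass = 10.0
draws = [sample_poisson_process(total_mass, rng) for _ in range(20000)]
mean_a1 = sum(count_in(p, 0.0, 0.3) for p in draws) / len(draws)
mean_a2 = sum(count_in(p, 0.3, 1.0) for p in draws) / len(draws)
print(mean_a1, mean_a2)  # should be close to lambda(A1) = 3 and lambda(A2) = 7
```

The two intervals are disjoint, so under the CRM property their counts are independent Poisson variables; the sketch only checks their means.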

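Laplace Transform of PP

Let N be a Poisson random measure over X, i.e., N ~ PP(dN | λ).

Let K+ be the family of positive functions on X.

For f ∈ K+, the Laplace transform of N is given by

$$L_N[f] = \int \exp\left(-\int f(x)\, N(dx)\right) PP(dN \mid \lambda) = \exp\left(-\int \left(1 - e^{-f(x)}\right) \lambda(dx)\right).$$

Remember this form!

When you want to study a stochastic process, use the Laplace transform! So let us study general CRMs with the Laplace transform as well.

⇒ Levy-Khintchine Representation Theorem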
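As a sanity check on the Laplace transform of the PP, consider the special case f = c·1_A, which is an assumption made here for illustration. Then ∫ f(x) N(dx) = c·N(A) with N(A) ~ Poisson(λ(A)), and the right-hand side becomes exp(−(1 − e^{−c}) λ(A)). The sketch below, with hypothetical function names, sums the Poisson series directly and compares it with that closed form.

```python
import math


def laplace_indicator_series(c, mass, terms=200):
    """E[exp(-c * N(A))] for N(A) ~ Poisson(mass), summed term by term."""
    total, pmf = 0.0, math.exp(-mass)      # pmf starts at P(N(A) = 0)
    for k in range(terms):
        total += math.exp(-c * k) * pmf
        pmf *= mass / (k + 1)              # P(N(A) = k + 1) from P(N(A) = k)
    return total


def laplace_closed_form(c, mass):
    """exp(-int (1 - e^{-f(x)}) lambda(dx)) for f = c * 1_A and lambda(A) = mass."""
    return math.exp(-(1.0 - math.exp(-c)) * mass)


c, mass = 0.7, 4.0
series = laplace_indicator_series(c, mass)
closed = laplace_closed_form(c, mass)
print(series, closed)
```

Outside A the integrand 1 − e^{−f(x)} vanishes, which is why only λ(A) appears in the closed form.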

Levy-Ito Decomposition of CRM [Kingman, 1967]

Let φ be a CRM. There is N ~ PP(dN | ν) on X × [0, ∞) such that

$$\phi(A) = \int_0^\infty t\, N(A, dt).$$

※ Levy-Ito Decomposition of Levy Process

A Levy process is a stochastic process with independent increments and is decomposed into two parts:

1. continuous part: Brownian motion with drift

2. discrete part: positive pure-jump process

A CRM has only the discrete part.

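Levy-Khintchine Representation of CRM [Kingman, 1967]

The Levy-Khintchine representation of a CRM φ is

$$E\left[e^{-z\phi(A)}\right] = \exp\left(-\int_{A \times [0,\infty)} \left(1 - e^{-zt}\right) \nu(dx, dt)\right),$$

where ν(dx, dt) is a Levy measure on X × [0, ∞).

This is, in the end, a Laplace transform: the CRM can be regarded as a PP on X × [0, ∞) whose intensity measure is the Levy measure.

What characterizes each CRM is its Levy measure. So by varying the Levy measure, one can construct new stochastic processes (⇒ and write papers...).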

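Levy measures of typical CRMs (H is the base measure over X):

Gamma process:

$$\nu(dx, dt) = \alpha_0 H(dx)\, t^{-1} e^{-t}\, dt$$

Beta process (with t ∈ [0, 1]):

$$\nu(dx, dt) = \alpha_0 H(dx)\, t^{-1} (1 - t)^{\alpha_0 - 1}\, dt$$

Inverse Gaussian process (up to a constant factor):

$$\nu(dx, dt) = \alpha_0 H(dx)\, t^{-3/2} e^{-t/2}\, dt$$

etc…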
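To see how a Levy measure pins down the marginals, one can plug the gamma-process Levy measure into the Levy-Khintchine representation. The inner integral ∫_0^∞ (1 − e^{−zt}) t^{−1} e^{−t} dt equals log(1 + z) (a Frullani integral), so E[e^{−zφ(A)}] = (1 + z)^{−α_0 H(A)}, the Laplace transform of Gamma(α_0 H(A), 1). The sketch below checks this numerically with a midpoint rule; the truncation point, the step count, and the function name are illustrative assumptions.

```python
import math


def gamma_levy_exponent(z, upper=60.0, steps=200000):
    """Midpoint-rule estimate of int_0^upper (1 - e^{-z t}) t^{-1} e^{-t} dt."""
    width = upper / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * width
        total += (1.0 - math.exp(-z * t)) * math.exp(-t) / t
    return total * width


z, mass = 2.0, 1.5                        # mass plays the role of alpha_0 * H(A)
exponent = gamma_levy_exponent(z)
laplace_from_levy = math.exp(-mass * exponent)
laplace_of_gamma = (1.0 + z) ** (-mass)   # Laplace transform of Gamma(mass, 1)
print(exponent, math.log(1.0 + z))
print(laplace_from_levy, laplace_of_gamma)
```

The integrand stays bounded near t = 0 (it tends to z), so a plain midpoint rule is adequate here.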

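[Figure: map of the processes. The discrete part of a Levy process gives the Completely Random Measure, which rests on the Poisson Process. Choosing the Levy measure yields the Gamma Process, the Beta Process, and the Inverse Gaussian Process. Normalizing the Gamma Process gives the Dirichlet Process, which is represented by the Chinese Restaurant Process (sampling possible) and the Stick-Breaking Process (construction possible). The Pitman-Yor Process adds one parameter when viewed through the CRP and SBP. The Beta Process leads to the Indian Buffet Process.]

In Machine Learning, hierarchical extensions and derived models are added on top of these, and the variants multiply explosively...

※ There are many more.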

Gamma Process (ΓP)

H is a base probability measure over X and α_0 is the concentration parameter.

G is generated from the ΓP with α_0 H, i.e., G ~ ΓP(α_0 H), if, for any measurable set A ⊂ X,

$$G(A) \sim \mathrm{Gamma}(\alpha_0 H(A), 1).$$

G(·) is a completely random measure given by

$$G = \sum_{i=1}^{\infty} w_i \delta_{x_i}, \quad \text{where } \sum_{i=1}^{\infty} w_i \sim \mathrm{Gamma}(\alpha_0 H(X), 1).$$

Levy measure on X × [0, ∞):

$$\nu(dx, dt) = \alpha_0 H(dx)\, t^{-1} e^{-t}\, dt$$

Generating G ~ ΓP(α_0 H) from a PP:

1. Take N ~ PP on X × [0, ∞) whose intensity measure is ν (the Levy measure).

2. Generate countably infinitely many points following this PP on X × [0, ∞).

3. Treat the vertical coordinate of each point as the weight of the corresponding point on the horizontal axis.

[Figure: horizontal axis is the base measure over X, vertical axis is [0, ∞); the total mass follows Gamma(α_0, 1), where α_0 is the concentration parameter.]

The result is

$$G = \sum_{i=1}^{\infty} w_i \delta_{x_i}, \quad G \sim \Gamma P(\alpha_0 H).$$

(Recap) Levy-Ito decomposition:

$$G(A) = \int_0^\infty t\, N(A, dt)$$

Replacing t with w makes this easier to see.

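Review of Bayes' theorem

posterior ∝ likelihood × prior:

$$p(x \mid \{y_i\}) \propto p(\{y_i\} \mid x)\, p(x)$$

A Bayesian naturally wants to find the posterior distribution of the Gamma Process (ΓP).

⇒ It is derived on the basis of Fubini's theorem. Almost the same logic applies to other stochastic processes (i.e., to find the posterior for a new stochastic process, it usually suffices to start from Fubini's theorem).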

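Fubini's Theorem

• A theorem about the order of integration in, for example, expectations over several variables.

• Put simply, it is the theorem that permits exchanging the order of integration (iterated integration).

• There are proofs for various settings, e.g., iterated expectation for non-negative random variables. As an introductory text, 『測度から確率へ』 (佐藤坦) contains several proofs. In other words, it is not a magic theorem that single-handedly yields the posterior for any stochastic process (everyone has to do the work themselves).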

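Fubini-type disintegration for ΓP [Lo+, 1978, 1982, 1989]

Let h be any non-negative function over γ and x, and let γ ~ ΓP(dγ | α), α = α_0 H. Then

$$\int\!\!\int h(x, \gamma)\, \gamma(dx)\, \Gamma P(d\gamma \mid \alpha) = \int\!\!\int h(x, \gamma)\, \Gamma P(d\gamma \mid \alpha + \delta_x)\, \alpha(dx).$$

Here ΓP(dγ | α + δ_x) is the posterior distribution of γ given x.

In particular, for h ≡ 1,

$$E[\gamma(dx)] = \int \gamma(dx)\, \Gamma P(d\gamma \mid \alpha) = \alpha(dx).$$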

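Fubini-type disintegration for ΓP [Lo+, 1978, 1982, 1989]: generalizing h

The case $h(x, \gamma) \prod_{i=1}^{2} \gamma(dx_i)$ (note: a multiple integral over dx_1, dx_2):

$$\begin{aligned}
\int\!\!\int\!\!\int h(x, \gamma)\, \gamma(dx_1)\, \gamma(dx_2)\, \Gamma P(d\gamma \mid \alpha)
&= \int\!\!\int\!\!\int h(x, \gamma)\, \gamma(dx_2)\, \Gamma P(d\gamma \mid \alpha + \delta_{x_1})\, \alpha(dx_1) \\
&= \int\!\!\int\!\!\int h(x, \gamma)\, \Gamma P(d\gamma \mid \alpha + \delta_{x_1} + \delta_{x_2})\, (\alpha + \delta_{x_1})(dx_2)\, \alpha(dx_1)
\end{aligned}$$

[Fubini's theorem] ※ The result does not depend on the order of integration over dx_1, dx_2.

The case $h(x, \gamma) \prod_{i=1}^{n} \gamma(dx_i)$:

$$\int h(x, \gamma) \prod_{i=1}^{n} \gamma(dx_i)\, \Gamma P(d\gamma \mid \alpha) = \int h(x, \gamma)\, \Gamma P\Big(d\gamma \,\Big|\, \alpha + \sum_{i=1}^{n} \delta_{x_i}\Big) \prod_{i=1}^{n} \Big(\alpha + \sum_{j=1}^{i-1} \delta_{x_j}\Big)(dx_i)$$

Here ΓP(dγ | α + Σ_i δ_{x_i}) is the posterior distribution of γ given x_1, …, x_n. Pay attention to the product measure over dx_1, …, dx_n!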

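To approximate the integral over dx_1, …, dx_n,

$$\int f(x_1, \dots, x_n) \prod_{i=1}^{n} \Big(\alpha + \sum_{j=1}^{i-1} \delta_{x_j}\Big)(dx_i),$$

⇒ sample x_1, …, x_n. The product measure expands as

$$\prod_{i=1}^{n} \Big(\alpha + \sum_{j=1}^{i-1} \delta_{x_j}\Big)(dx_i) = \alpha(dx_1)\, (\alpha + \delta_{x_1})(dx_2) \cdots \Big(\alpha + \sum_{i=1}^{n-1} \delta_{x_i}\Big)(dx_n), \quad ※\ \alpha(dx) = \alpha_0 H(dx),$$

so the samples are drawn one after another:

$$x_1 \sim \alpha(dx_1), \qquad x_2 \sim (\alpha + \delta_{x_1})(dx_2), \qquad x_3 \sim \Big(\alpha + \sum_{i=1}^{2} \delta_{x_i}\Big)(dx_3), \ \dots$$

Since α(X) = α_0 and δ_{x_i}(X) = 1, normalizing gives

$$x_n \sim \frac{\alpha_0 H(dx_n) + \sum_{i=1}^{n-1} \delta_{x_i}(dx_n)}{\alpha_0 + n - 1}.$$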

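Restaurant Representation

$$x_n \sim \frac{\alpha_0}{\alpha_0 + n - 1} H(dx_n) + \frac{1}{\alpha_0 + n - 1} \sum_{i=1}^{n-1} \delta_{x_i}(dx_n)$$

The first term is the probability that a new x is sampled; the second is the probability that a previously seen x is sampled.

[Figure: x_1, x_2, x_3 seated at tables; x_4 either joins one of the existing tables or opens a new table with a value drawn from H(dx_4). Each distinct value of x is a table.]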
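The sequential sampling rule behind the restaurant picture can be written in a few lines. The following is a minimal sketch, not taken from the slides: the base measure H is assumed to be a standard normal purely for illustration, and `polya_urn` and `base_sampler` are hypothetical names.

```python
import random


def polya_urn(n, alpha0, base_sampler, rng):
    """Draw x_1..x_n, each from (alpha0*H + sum of past deltas) / (alpha0 + number of past draws)."""
    xs = []
    for _ in range(n):
        # With probability alpha0 / (alpha0 + len(xs)), draw a new value from H ...
        if rng.random() * (alpha0 + len(xs)) < alpha0:
            xs.append(base_sampler(rng))
        else:
            # ... otherwise repeat a past value, chosen uniformly among the past draws.
            xs.append(rng.choice(xs))
    return xs


rng = random.Random(1)
xs = polya_urn(1000, alpha0=2.0, base_sampler=lambda r: r.gauss(0.0, 1.0), rng=rng)

# Group equal values into "tables" and count the customers at each one.
tables = {}
for x in xs:
    tables[x] = tables.get(x, 0) + 1
print(len(tables), sorted(tables.values(), reverse=True)[:5])
```

Choosing uniformly among past draws automatically picks an existing table with probability proportional to its size, which is the rich-get-richer behavior of the restaurant representation.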

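Normalized Gamma Process (NΓP) [Kingman 1975, Lo+ 1989]

Normalize so that $\sum_i w_i = 1$ (the countably infinite sum equals 1). For γ ~ ΓP(α_0 H),

$$\gamma / \gamma(X) \sim DP(\alpha_0 H),$$

where DP is the Dirichlet Process [Ferguson 1973].

That is, when γ ~ ΓP and G ~ DP, G = γ/γ(X) in distribution, i.e., for any integrable function f,

$$\int f(G)\, DP(dG \mid \alpha_0 H) = \int f\big(\gamma / \gamma(X)\big)\, \Gamma P(d\gamma \mid \alpha_0 H).$$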
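On a finite partition, the normalization step can be illustrated directly: independent masses G(A_k) ~ Gamma(α_0 H(A_k), 1), divided by their total, give a vector whose mean should match H(A_k), as expected for Dir(α_0 H(A_1), …, α_0 H(A_K)). The sketch below assumes a three-set partition with made-up base probabilities; the function names are hypothetical.

```python
import random


def gamma_process_on_partition(alpha0, base_probs, rng):
    """Independent G(A_k) ~ Gamma(alpha0 * H(A_k), 1) over a finite partition A_1..A_K."""
    return [rng.gammavariate(alpha0 * h, 1.0) for h in base_probs]


def normalize(masses):
    """Divide by the total mass G(X) so that the weights sum to one."""
    total = sum(masses)
    return [m / total for m in masses]


rng = random.Random(2)
alpha0 = 5.0
base_probs = [0.2, 0.3, 0.5]   # assumed values of H(A_1), H(A_2), H(A_3)
draws = [normalize(gamma_process_on_partition(alpha0, base_probs, rng)) for _ in range(20000)]
means = [sum(d[k] for d in draws) / len(draws) for k in range(3)]
print(means)  # should be close to base_probs
```

This checks only the first moment; it is meant as an illustration of the construction, not a proof of the distributional identity.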

Dirichlet Process (DP) [Ferguson 1973]

H is a base probability measure over X and α_0 is the concentration parameter.

G is generated from the DP with α_0 H, i.e., G ~ DP(α_0 H), if, for any finite measurable partition A_1 ∪ A_2 ∪ … ∪ A_k = X,

$$(G(A_1), G(A_2), \dots, G(A_k)) \sim \mathrm{Dir}(\alpha_0 H(A_1), \alpha_0 H(A_2), \dots, \alpha_0 H(A_k)).$$

G(·) is obtained by

$$G = \sum_{i=1}^{\infty} w_i \delta_{x_i}, \quad \sum_{i=1}^{\infty} w_i = 1.$$

(※) The DP is not a CRM.

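Fubini-type disintegration for DP [Ferguson 1973]

Note that, historically, this is older than the ΓP result.

Let h be any non-negative function over G and x, and let G ~ DP(dG | α), α = α_0 H. Then

$$\int\!\!\int h(x, G)\, G(dx)\, DP(dG \mid \alpha) = \int\!\!\int h(x, G)\, DP(dG \mid \alpha + \delta_x)\, \frac{\alpha(dx)}{\alpha(X)}.$$

Here DP(dG | α + δ_x) is the posterior distribution of G given x.

The case $h(x, G) \prod_{i=1}^{n} G(dx_i)$:

$$\int h(x, G) \prod_{i=1}^{n} G(dx_i)\, DP(dG \mid \alpha) = \int h(x, G)\, DP\Big(dG \,\Big|\, \alpha + \sum_{i=1}^{n} \delta_{x_i}\Big) \prod_{i=1}^{n} \frac{\big(\alpha + \sum_{j=1}^{i-1} \delta_{x_j}\big)(dx_i)}{\alpha_0 + i - 1}$$

Here DP(dG | α + Σ_i δ_{x_i}) is the posterior distribution of G given x_1, …, x_n.

Since α(X) = α_0 and δ_{x_i}(X) = 1, the product measure is normalized from the start. Sampling x_1, …, x_n is the same as for the ΓP (or rather, the ΓP is the same as the DP).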

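Fubini-type disintegration for NΓP [Lo+ 1989]

Let h be any non-negative function over γ and x, and let γ ~ ΓP(dγ | α), α = α_0 H. Then

$$\int\!\!\int h\Big(x, \frac{\gamma}{\gamma(X)}\Big) \frac{\gamma(dx)}{\gamma(X)}\, \Gamma P(d\gamma \mid \alpha) = \int\!\!\int h\Big(x, \frac{\gamma}{\gamma(X)}\Big)\, \Gamma P(d\gamma \mid \alpha + \delta_x)\, \frac{\alpha(dx)}{\alpha(X)}.$$

This has the same form as the DP result. Using it, G = γ/γ(X) in distribution is easy to show.

Proof:

$$\begin{aligned}
\int\!\!\int h\Big(x, \frac{\gamma}{\gamma(X)}\Big) \frac{\gamma(dx)}{\gamma(X)}\, \Gamma P(d\gamma \mid \alpha)
&= \int\!\!\int h\Big(x, \frac{\gamma}{\gamma(X)}\Big) \frac{1}{\gamma(X)}\, \Gamma P(d\gamma \mid \alpha + \delta_x)\, \alpha(dx) \\
&= \int\!\!\int h\Big(x, \frac{\gamma}{\gamma(X)}\Big)\, \Gamma P(d\gamma \mid \alpha + \delta_x)\, \frac{\alpha(dx)}{\alpha(X)}
\end{aligned}$$

The first equality is the Fubini-type disintegration for ΓP. The second uses the independence of γ(X) and γ/γ(X) together with the following fact about the gamma distribution (explained next).

Gamma distribution (in what follows, α and γ denote a scalar random variable and a shape parameter):

$$\mathrm{Gamma}(x; \gamma, 1) = \frac{x^{\gamma - 1} e^{-x}}{\Gamma(\gamma)}, \qquad \text{Laplace transform: } E[e^{-vx}] = (1 + v)^{-\gamma}$$

If α ~ Gamma(γ, 1), then for 0 < β < γ,

$$E[\alpha^{-\beta}] = E\left[\frac{1}{\Gamma(\beta)} \int_0^\infty x^{\beta - 1} e^{-\alpha x}\, dx\right] = \frac{1}{\Gamma(\beta)} \int_0^\infty x^{\beta - 1} E[e^{-\alpha x}]\, dx = \frac{1}{\Gamma(\beta)} \int_0^\infty x^{\beta - 1} (1 + x)^{-\gamma}\, dx = \frac{\Gamma(\gamma - \beta)}{\Gamma(\gamma)}$$

[Laplace transform]

Substituting γ ⇒ γ + n and β ⇒ n: if α is also a gamma random variable with shape parameter γ + n and unit scale, i.e., α ~ Gamma(γ + n, 1), then

$$E[\alpha^{-n}] = \frac{\Gamma(\gamma)}{\Gamma(\gamma + n)}.$$

Note that when n = 1, E[α^{-1}] = 1/γ.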

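Application example: infinite mixture model

$$p(y \mid G) = \int p(y \mid \theta)\, G(d\theta) = \sum_{i=1}^{\infty} p(y \mid \theta_i)\, G(\theta_i)$$

G(dθ) has countably infinitely many atoms, which gives the infinite mixture model.

[Figure: mixture components p(y | θ_1), p(y | θ_2), p(y | θ_3), …]

Bayesian predictive distribution:

$$p(y \mid y_{1:n}) = \int p(y \mid G)\, p(dG \mid y_{1:n})$$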
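A small generative sketch may help make the infinite mixture model concrete. It integrates G out and draws the component parameters with the urn scheme from the restaurant representation. The Gaussian base measure, the Gaussian likelihood, and all names and default values below are assumptions for illustration, not something specified in the slides.

```python
import random


def sample_dp_mixture(n, alpha0, rng, base_std=5.0, noise_std=1.0):
    """Draw y_1..y_n from a DP mixture of Gaussians via the urn scheme for the component means."""
    thetas, ys = [], []
    for _ in range(n):
        if rng.random() * (alpha0 + len(thetas)) < alpha0:
            theta = rng.gauss(0.0, base_std)      # new component: theta ~ H (assumed Gaussian)
        else:
            theta = rng.choice(thetas)            # reuse the parameter of a past observation
        thetas.append(theta)
        ys.append(rng.gauss(theta, noise_std))    # y ~ p(y | theta) (assumed Gaussian)
    return thetas, ys


rng = random.Random(3)
thetas, ys = sample_dp_mixture(500, alpha0=1.0, rng=rng)
print(len(set(thetas)), "components used for", len(ys), "observations")
```

Although the model has countably many components, only a small number are occupied by any finite data set, and that number grows slowly with n.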

[Figure: the map of processes shown earlier, repeated, with the part covered in this talk marked.]
