Unsupervised learning Networks


Page 1: Unsupervised learning  Networks

Unsupervised learning Networks

• Associative Memory Networks

ELE571

Digital Neural Networks

Page 2: Unsupervised learning  Networks

Associative Memory Networks

• feedforward type (one-shot recovery)

• feedback type, e.g. the Hopfield network (iterative recovery)

An associative memory network recalls the original, undistorted pattern from a distorted or partially missing pattern.

Page 3: Unsupervised learning  Networks

Associative Memory Model (feedforward type)

[Figure: input pattern b → weight matrix W → nonlinear unit → output pattern a]

W could be (1) symmetric or not, (2) square or not

nonlinear unit: e.g. threshold

W = Σm b(m)^T a(m),  so that  b(k) W = b(k) Σm b(m)^T a(m) = a(k)
(exact recall when the key vectors b(m) are orthonormal)
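A minimal MATLAB sketch of this one-shot recall, assuming three orthonormal key vectors b(m) and bipolar patterns a(m); the sizes and variable names here are illustrative, not from the slides:

% Outer-product (correlation) associative memory, feedforward recall
B = orth(randn(8,3))';          % 3 orthonormal key row vectors b(m), length 8
A = sign(randn(3,8));           % 3 stored bipolar patterns a(m) (rows)
W = B' * A;                     % W = sum over m of b(m)^T a(m)
k = 2;
recalled = sign(B(k,:) * W);    % threshold unit: b(k) W = a(k)
disp(isequal(recalled, A(k,:))) % prints 1: pattern k is recovered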

Page 4: Unsupervised learning  Networks

Bidirectional Associative Memory

Page 5: Unsupervised learning  Networks

a1 = [ 1  1  1  1 -1  1  1  1  1 ]

a2 = [ 1 -1  1 -1  1 -1  1 -1  1 ]

The weight matrix:

X = [ a1 ; a2 ] = [ 1  1  1  1 -1  1  1  1  1
                    1 -1  1 -1  1 -1  1 -1  1 ]

W = X^T X =

  2  0  2  0  0  0  2  0  2
  0  2  0  2 -2  2  0  2  0
  2  0  2  0  0  0  2  0  2
  0  2  0  2 -2  2  0  2  0
  0 -2  0 -2  2 -2  0 -2  0
  0  2  0  2 -2  2  0  2  0
  2  0  2  0  0  0  2  0  2
  0  2  0  2 -2  2  0  2  0
  2  0  2  0  0  0  2  0  2
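The following MATLAB lines reproduce this weight matrix and show one-shot recall of a1 from a distorted copy (the single flipped element is an illustrative choice, not from the slides):

% Reproduce the 9x9 weight matrix and recall a1 from a distorted input
a1 = [1  1  1  1 -1  1  1  1  1];
a2 = [1 -1  1 -1  1 -1  1 -1  1];
X  = [a1; a2];
W  = X' * X;                           % the weight matrix shown above
a_dist = a1;  a_dist(2) = -a_dist(2);  % flip one element of a1
a_rec  = sign(a_dist * W);             % thresholded recall
disp(isequal(a_rec, a1))               % prints 1: the original pattern is recovered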

Page 6: Unsupervised learning  Networks

Associative Memory Model (feedback type)

[Figure: feedback loop, aold → W → nonlinear unit → anew]

W must be

• normalized

• W = W^H (Hermitian); here W = X+ X

Page 7: Unsupervised learning  Networks

Each iteration in AMM(W) comprises two substeps:

(a) Projection of aold onto W-plane

anet = W aold

(b) Remap the net-vector to closest symbol vector:

anew = T[anet] = arg min over symbol vectors a of  || a - W aold ||
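A minimal MATLAB sketch of these two substeps, assuming a bipolar alphabet {-1, +1} and W = X+ X; the observation matrix X below is synthetic, for illustration only:

% One AMM(W) iteration on a bipolar symbol vector
X = randn(5,50);                 % illustrative observation matrix (5 x 50)
W = X' * inv(X*X') * X;          % W = X+ X, as on the later slides
a_old = sign(randn(50,1));       % current symbol-vector estimate
a_net = W * a_old;               % (a) projection of a_old onto the W-plane
a_new = sign(a_net);             % (b) remap to the closest symbol vector, T[a_net]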

The two substeps in one iteration can be summarized as one procedure:

Page 8: Unsupervised learning  Networks

[Figure: 2 steps in one AMM iteration: a linear projection of the initial vector onto the x-plane (the g-update in DML), followed by resymbolization, a nonlinear mapping to the nearest symbol vector (the s-update in DML), ending at a perfect attractor]

anet is the (least-square-error) projection of aold onto the (column) subspace of W.

Page 9: Unsupervised learning  Networks

Common Assumptions on Signals for Associative Retrieval

Inherent properties of the signals (patterns) to be retrieved:

• Orthogonality

• Higher-order statistics

• Constant modulus property

• FAE-Property

• others

Page 10: Unsupervised learning  Networks

Blind Recovery of MIMO System

[Figure: source S → channel H → observation X = H S + ε; recovery via g:  g X = g H S = v S = ŝ]

Page 11: Unsupervised learning  Networks

Blind Recovery of MIMO System

[Figure: MIMO channel with sources s1 … sp, channel coefficients h11, h21, …, h1p, h2p, …, hq1, …, hqp, and equalizer taps g1, g2, …, gq]

Goal: to find g such that v = g H, and

v S = [ 0 .. 0 1 0 .. 0 ] S = sj

Page 12: Unsupervised learning  Networks

Signal Recoverability

H is PR (perfectly recoverable) if and only if H has full column rank, i.e. a left inverse (the pseudo-inverse H+) exists.

Assumptions on MIMO System

For Deterministic H, ……

Page 13: Unsupervised learning  Networks

Examples for Flat MIMO

non-recoverable:

H = [ 1  2
      1  2
      1  2 ]

recoverable:

H = [ 1  2
      1  3
      1  2 ]
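A quick MATLAB check of the full-column-rank condition for these two examples:

% Recoverability check: H must have full column rank
H1 = [1 2; 1 2; 1 2];           % rank 1 < 2 columns: non-recoverable
H2 = [1 2; 1 3; 1 2];           % rank 2 = 2 columns: recoverable
disp([rank(H1) rank(H2)])       % prints 1 2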

Page 14: Unsupervised learning  Networks

If perfectly recoverable, e.g.

H = [ 1  2
      1  3
      1  2 ]

Parallel Equalizer:  ŝ = H+ X = G X
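A minimal sketch of this parallel equalizer in MATLAB; the bipolar sources and noise-free observations are illustrative assumptions:

% Parallel equalizer with the recoverable example channel
H = [1 2; 1 3; 1 2];
S = sign(randn(2,100));         % two bipolar source streams (illustrative)
X = H * S;                      % noise-free observations
G = pinv(H);                    % G = H+
S_hat = G * X;                  % ŝ = H+ X = G X
disp(max(abs(S_hat(:) - S(:)))) % ~0: the sources are recovered exactly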

Page 15: Unsupervised learning  Networks

Example: Wireless MIMO System

[Figure: source S, drawn from the signal constellation Ci, passes through channel H with noise ε to give observation X]

Signal recovery via g:  g X = g H S = v S = ŝ

Page 16: Unsupervised learning  Networks

FAE-Property: Finite-Alphabet Exclusiveness

Given v s = ŝ, ŝ is always a valid symbol for any valid symbol vector s if and only if

v = [ 0 .. 0 ±1 0 .. 0 ]

Page 17: Unsupervised learning  Networks

Theorem: FAE-Property

Suppose that v W = b. For the output b to always be a valid symbol sequence, for whatever v, the necessary and sufficient condition is that v = E(k).

In other words, it is impossible to produce a valid but different output symbol vector.

Page 18: Unsupervised learning  Networks

FAE-Property: Finite-Alphabet Exclusiveness

S = [ -1 +1 +1 +1 +1 +1 …
      +1 +1 +1 -1 -1 +1 …
      -1 -1 +1 -1 +1 +1 … ]

v S = [valid symbols]  if and only if  v = [ 0 .. 0 ±1 0 .. 0 ]

(compare the case v = [ 0 .. 0 ±1 0 .. 0 ] with the case v ≠ [ 0 .. 0 ±1 0 .. 0 ])
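A numeric illustration in MATLAB; the particular S and the two test vectors v are made up for this example:

% FAE illustration with the bipolar alphabet {-1,+1}
S  = sign(randn(3,8));          % illustrative 3 x 8 symbol matrix
v1 = [0 1 0];                   % of the form [ 0 .. 0 ±1 0 .. 0 ]
v2 = [0.7 0.5 0];               % not of that form
disp(v1*S)                      % always valid symbols (just row 2 of S)
disp(v2*S)                      % entries in {±1.2, ±0.2}: never valid symbols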

Page 19: Unsupervised learning  Networks

Blind-BLAST via "EM":

ŝ ∈ [Finite Alphabet],   g X = g H S = v S = ŝ

E-step:  ŝ = T[ g X ]

M-step:  ĝ = ŝ X+

• The E-step determines the best guess of the membership function zj.

• The M-step determines the best parameters, θn, which maximize the likelihood function.

Combined EM:  š = T[ ĝ X ] = T[ ŝ X+ X ] = T[ ŝ W ]
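A minimal MATLAB sketch of this combined update, under the assumption of bipolar sources and a synthetic channel; the sizes and names are illustrative:

% Combined EM / Blind-BLAST iteration:  s <- T[ s X+ X ] = T[ s W ]
S = sign(randn(4,300));          % illustrative bipolar sources
H = randn(4,4) + eye(4);         % illustrative mixing channel
X = H*S + 0.01*randn(4,300);     % observations
Xp = pinv(X);                    % X+
s_hat = sign(randn(1,300));      % initial symbol-row estimate
for it = 1:20
    g_hat = s_hat * Xp;          % M-step: ĝ = ŝ X+
    s_hat = sign(g_hat * X);     % E-step: ŝ = T[ ĝ X ]
end
disp(max(abs(S * s_hat')) / 300) % close to 1 if one source row is recovered (up to sign)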

Page 20: Unsupervised learning  Networks

Associative Memory Network

[Figure: feedback loop, ŝold → W = X+ X → threshold → ŝnew]

ŝnew = Sign[ ŝold W ]

Page 21: Unsupervised learning  Networks

[Figure: the initial vector ŝ is linearly projected onto the gX-plane (g = ŝold X+), then mapped by the FA nonlinear mapping to the nearest symbol vector]

ŝ' = T[ ŝ W ],   g = ŝold X+

Page 22: Unsupervised learning  Networks

Definition: Perfect Attractor of AMM(W)

A symbol vector a* is a "perfect attractor" of AMM(W) if and only if

• a* is a symbol vector

• a* = W a*

Page 24: Unsupervised learning  Networks

[Figure: network diagram with indices 1 … p and 1 … q]

Let v = [ f(1)≠0  f(2) … f(p) ]

Let v' = [ f(1)≠0  f(2) … f(p-1)  0 ]

Let ûi = [ ui(1) ui(2) … ui(p) ]T

Let ǖi = v ûi  and  ǖ'i = v' ûi

Thus f(p) = 0.

Page 25: Unsupervised learning  Networks

MATLAB Exercise

Compare the two programs below and determine the differences in performance.

Why such a difference?

Page 26: Unsupervised learning  Networks

p = zeros(1,100);
for j = 1:100
  S = sign(randn(5,200));            % 5 bipolar source sequences of length 200
  A = randn(5,5) + eye(5);           % random mixing (channel) matrix
  X = A*S + 0.01*randn(5,200);       % noisy observations
  s = sign(randn(200,1));            % random initial symbol vector
  W = X'*inv(X*X')*X;                % W = X+ X
  for i = 1:20                       % AMM iterations: s = sign(W*s)
    sold = s;
    s = tanh(100*W*s);
    s = sign(s);
  end
  while norm(s - W*s) > 5.0          % restart until s is close to the W-plane
    s = sign(randn(200,1));
    for i = 1:20
      sold = s;
      s = tanh(100*W*s);
      s = sign(s);
    end
  end
  p(j) = max(abs(S*s));              % largest correlation with a true source
end
hist(p)

Page 27: Unsupervised learning  Networks

p = zeros(1,100);
for j = 1:100
  S = sign(randn(5,200));
  A = randn(5,5) + eye(5);
  X = A*S + 0.01*randn(5,200);
  s = sign(randn(200,1));
  W = X'*inv(X*X')*X;
  for i = 1:20
    sold = s;
    s = tanh(100*W*s);
    s = sign(s);
  end
  p(j) = max(abs(S*s));              % no restart: keep whatever s the 20 iterations reach
end
hist(p)