Unsupervised learning Networks
ELE571 Digital Neural Networks
Associative Memory Networks
An associative memory network recalls the original undistorted pattern from a distorted or partially missing pattern. Two types:
•feedforward type (one-shot recovery)
•feedback type, e.g. the Hopfield network (iterative recovery)
Associative Memory Model (feedforward type)
[Figure: feedforward associative memory; input b maps through W and a nonlinear unit to output a.]
W could be (1) symmetric or not, (2) square or not
nonlinear unit: e.g. threshold
W = Σm b(m)T a(m), so that b(k) W = b(k) Σm b(m)T a(m) = a(k)  (assuming the key vectors b(m) are orthonormal).
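As an illustration, a minimal MATLAB sketch of this outer-product construction, assuming orthonormal key vectors b(m) so that the one-shot recall is exact:

% Feedforward associative memory: W = sum_m b(m)'*a(m); recall b(k)*W = a(k)
[Q, ~] = qr(randn(4));        % rows of Q: 4 orthonormal key vectors b(m)
B = Q;
A = sign(randn(4,9));         % 4 stored bipolar pattern vectors a(m) (rows)
W = B' * A;                   % W = sum over m of b(m)'*a(m)
k = 2;
recalled = B(k,:) * W;        % one-shot recall with key b(k)
disp(norm(recalled - A(k,:))) % ~0: recall is (numerically) exact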
Bidirectional Associative Memory
a1 = [ 1  1  1  1 -1  1  1  1  1 ]
a2 = [ 1 -1  1 -1  1 -1  1 -1  1 ]

X = [ a1 ; a2 ] =
[  1  1  1  1 -1  1  1  1  1
   1 -1  1 -1  1 -1  1 -1  1 ]

The weight matrix W = XTX =
[  2  0  2  0  0  0  2  0  2
   0  2  0  2 -2  2  0  2  0
   2  0  2  0  0  0  2  0  2
   0  2  0  2 -2  2  0  2  0
   0 -2  0 -2  2 -2  0 -2  0
   0  2  0  2 -2  2  0  2  0
   2  0  2  0  0  0  2  0  2
   0  2  0  2 -2  2  0  2  0
   2  0  2  0  0  0  2  0  2 ]
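The recall can be checked directly in MATLAB with the two patterns above (a small sketch; the sign threshold stands in for the nonlinear unit):

% BAM example: W = X'*X, recall with a sign threshold
a1 = [1  1 1  1 -1  1 1  1 1];
a2 = [1 -1 1 -1  1 -1 1 -1 1];
X  = [a1; a2];                       % 2 x 9 pattern matrix
W  = X' * X;                         % the 9 x 9 weight matrix above
disp(isequal(sign(a1*W), a1))        % 1: a1 is recalled exactly
a1n = a1;  a1n(1) = -1;              % distort one entry of a1
disp(isequal(sign(a1n*W), a1))       % 1: the stored pattern is still recovered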
Associative Memory Model (feedback type)
[Figure: feedback loop; aold is passed through W to produce anew, which is fed back.]
W must be
•normalized
•Hermitian: W = WH, e.g. W = X+X  (X+ denotes the pseudoinverse of X)
Each iteration in AMM(W) comprises two substeps:
(a) Projection of aold onto W-plane
anet = W aold
(b) Remap the net vector to the closest symbol vector:
anew = T[anet]
Equivalently, anew = arg min over all symbol vectors a of || a - W aold ||.
The two substeps in one iteration can be summarized as one procedure:
[Figure: the 2 steps in one AMM iteration; the current vector is linearly projected onto the x-plane (the g-update in DML), then resymbolized by a nonlinear mapping to the nearest symbol vector (the s-update in DML); repeated iterations drive the initial vector toward a perfect attractor.]
anet is the (least-squares) projection of aold onto the (column) subspace of W.
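A toy MATLAB rendering of one such iteration, assuming W = X+X and a bipolar (±1) alphabet so that the nearest-symbol map T[·] reduces to the sign function:

% One AMM(W) iteration: projection onto the W-plane, then resymbolization
X     = sign(randn(3,50));      % stored +/-1 pattern rows
W     = pinv(X) * X;            % projection onto the row space of X
a_old = sign(randn(50,1));      % arbitrary starting symbol vector
a_net = W * a_old;              % (a) least-squares projection of a_old
a_new = sign(a_net);            % (b) remap to the closest symbol vector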
Common Assumptions on Signals for Associative Retrieval
Inherent properties of the signals (patterns) to be retrieved:
•Orthogonality
•Higher-order statistics
•Constant-modulus property
•FAE-property
•others
Blind Recovery of MIMO System
[Figure: source S passes through channel H, with additive noise, to give the observation X; an equalizer g is applied to X.]
g X = g H S = v S = ŝ
[Figure: q×p MIMO channel with gains h11 … hqp mixing the sources s1 … sp; the q observations are combined by equalizer taps g1 … gq.]
Goal: to find g such that v = g H, and
v S = [ 0 … 0 1 0 … 0 ] S = sj
Signal Recoverability
H is PR (perfectly recoverable) if and only if H has full column rank, i.e. a (left) inverse H+ exists.
Assumptions on MIMO System
For deterministic H, ……
Examples for Flat MIMO:
non-recoverable (rank 1, not full column rank):
H = [ 1 2
      1 2
      1 2 ]
recoverable (rank 2, full column rank):
H = [ 1 2
      1 3
      1 2 ]
If perfectly recoverable, e.g. H = [ 1 2 ; 1 3 ; 1 2 ], a parallel equalizer exists:
ŝ = H+X = G X
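A quick MATLAB check of this parallel equalizer with the recoverable example above (noise-free observations and bipolar sources are assumed for the sketch):

% Parallel equalizer s_hat = H^+ * X = G*X for the recoverable example
H = [1 2; 1 3; 1 2];        % full column rank (rank 2): perfectly recoverable
S = sign(randn(2,100));     % 2 bipolar source sequences
X = H * S;                  % 3 x 100 noise-free observations
G = pinv(H);                % equalizer G = H^+
S_hat = G * X;
disp(rank(H))               % 2: full column rank
disp(norm(S_hat - S))       % ~0: the sources are recovered exactly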
Example: Wireless MIMO System
[Figure: source symbols drawn from a signal constellation Ci pass through channel H, with additive noise, to give the observation X; an equalizer g is applied.]
g X = g H S = v S = ŝ
FAE-Property: Finite-Alphabet Exclusiveness
Signal recovery via g:
Given v s = ŝ, the estimate ŝ is always a valid symbol for any valid symbol vector s if and only if
v = [ 0 … 0 ±1 0 … 0 ]
Theorem: FAE-Property
Suppose that v W = b. For the output b to always be a valid symbol sequence, the necessary and sufficient condition is that
v = [ 0 … 0 ±1 0 … 0 ], i.e. v = ±e(k).
In other words, it is impossible to produce a valid but different output symbol vector.
FAE-Property: Finite-Alphabet Exclusiveness
S = [ -1 +1 +1 +1 +1 +1 …
      +1 +1 +1 -1 -1 +1 …
      -1 -1 +1 -1 +1 +1 … ]
If v = [ 0 … 0 ±1 0 … 0 ]:  v S = [valid symbols], i.e. ŝ ∈ finite alphabet.
If v ≠ [ 0 … 0 ±1 0 … 0 ]:  ŝ ∉ finite alphabet.
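Numerically, the exclusiveness is easy to see (a sketch with a bipolar alphabet): only a signed unit vector v keeps every entry of v S inside the alphabet.

% FAE-property demo: v*S stays in the +/-1 alphabet only for v = +/- e_k
S  = sign(randn(3,1000));           % columns are valid +/-1 symbol vectors
v1 = [0 1 0];                       % a signed unit vector
v2 = [0.5 0.5 0];                   % any other combining vector
disp(all(ismember(v1*S, [-1 1])))   % 1: every output is a valid symbol
disp(all(ismember(v2*S, [-1 1])))   % 0: some outputs leave the alphabet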
"EM": Blind-BLAST
g X = g H S = v S = ŝ
E-step: ŝ = T[ g X ]
M-step: ĝ = ŝ X+
•The E-step determines the best guess of the membership function zj.
•The M-step determines the best parameters, θn, which maximize the likelihood function.
Combined EM: ŝnew = T[ ĝ X ] = T[ ŝold X+ X ] = T[ ŝold W ]
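A minimal MATLAB sketch of this alternation (a noise-free square channel, bipolar sources, and T[·] = sign are assumed; it tracks a single source with one equalizer row, not the full algorithm):

% Blind-BLAST "EM" iteration: alternate s_hat = T[g*X] and g = s_hat*X^+
p = 4;  N = 400;
S = sign(randn(p, N));          % p bipolar sources
H = randn(p, p) + eye(p);       % unknown mixing channel
X = H * S;                      % observations
s_hat = sign(randn(1, N));      % random initial symbol guess
for it = 1:20
    g     = s_hat * pinv(X);    % M-step: re-estimate the equalizer row
    s_hat = sign(g * X);        % E-step: resymbolize, T[.] = sign
end
% Often s_hat converges to one source row (up to a sign flip);
% the value below is then close to 1:
disp(max(abs(S * s_hat')) / N)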
Associative Memory Network
W = X+X
ŝnew = Sign[ ŝold W ]    (T[·]: threshold)
[Figure: one AMM iteration; the initial vector ŝ is linearly projected onto the gX-plane via g = ŝold X+, then the FA nonlinear mapping ŝ' = T[ ŝ W ] returns the nearest symbol vector.]
Definition: Perfect Attractor of AMM(W)
A symbol vector a* is a "perfect attractor" of AMM(W) if and only if
•a* is a symbol vector
•a* = W a*
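Both conditions can be checked directly (a MATLAB sketch assuming a bipolar alphabet; a stored pattern row is used as the candidate a*):

% Check whether a stored symbol vector is a perfect attractor of AMM(W)
X = sign(randn(4,60));                   % stored +/-1 pattern rows
W = pinv(X) * X;                         % W = X^+ X
a = X(1,:)';                             % candidate a*: the first stored pattern
is_symbol = all(ismember(a, [-1 1]));    % condition 1: a* is a symbol vector
is_fixed  = norm(a - W*a) < 1e-9;        % condition 2: a* = W a*
disp(is_symbol && is_fixed)              % 1: a* is a perfect attractor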
Let v = [ f(1) ≠ 0  f(2) … f(p) ].
Thus f(p) = 0.
Let ûi = [ ui(1) ui(2) … ui(p) ]T.
Let v' = [ f(1) ≠ 0  f(2) … f(p-1)  0 ].
Let ǖi = v ûi and ǖ'i = v' ûi.
MATLAB Exercise
Compare the two programs below and determine the differences in performance.
Why such a difference?
% Program 1: restart the AMM iteration until it lands near a perfect attractor
p = zeros(1,100);
for j = 1:100
    S = sign(randn(5,200));            % 5 bipolar source sequences
    A = randn(5,5) + eye(5);           % mixing matrix
    X = A*S + 0.01*randn(5,200);       % noisy observations
    s = sign(randn(200,1));            % random initial symbol vector
    W = X'*inv(X*X')*X;                % W = X^+ X (projection matrix)
    for i = 1:20
        sold = s;
        s = tanh(100*W*s);
        s = sign(s);                   % AMM iteration: s = sign(W*s)
    end
    while norm(s - W*s) > 5.0          % restart unless s is near a fixed point
        s = sign(randn(200,1));
        for i = 1:20
            sold = s;
            s = tanh(100*W*s);
            s = sign(s);
        end
    end
    p(j) = max(abs(S*s));              % ~200 when s matches a source (up to sign)
end
hist(p)
% Program 2: identical to Program 1, but without the restart (while) loop
p = zeros(1,100);
for j = 1:100
    S = sign(randn(5,200));
    A = randn(5,5) + eye(5);
    X = A*S + 0.01*randn(5,200);
    s = sign(randn(200,1));
    W = X'*inv(X*X')*X;
    for i = 1:20
        sold = s;
        s = tanh(100*W*s);
        s = sign(s);
    end
    p(j) = max(abs(S*s));
end
hist(p)