Neural Networks for Solving Systems of Linear Equations
Transcript of Neural Networks for Solving Systems of Linear Equations
Artificial Neural Networks (Spring 2007)
Neural Networks for Solving Systems of Linear Equations
Seyed Jalal Kazemitabar, Reza Sadraei
Instructor: Dr. Saeed Bagheri
Reza Sadraei, Jalal Kazemitabar · Artificial Neural Networks (Spring 2007)
Outline

- Historical Introduction
- Problem Formulation
- Standard Least Squares Solution
- General ANN Solution
- Minimax Solution
- Least Absolute Value Solution
- Conclusion
History
70's: Kohonen solved optimization problems using neural networks.

80's: Hopfield used a Lyapunov function (energy function) to prove the convergence of iterative methods in optimization problems.

Mapping: differential equations → neural networks
History
Many problems in science and engineering involve solving a large system of linear equations:

- Machine learning
- Physics
- Image processing
- Statistics, …

In many applications an on-line solution of a set of linear equations is desired.
History
40's: Kaczmarz introduced a method to solve systems of linear equations.

50's–80's: Various methods based on Kaczmarz's were proposed in different fields, e.g. the conjugate gradient method.

Still, there was no good method for on-line solution of large systems.
1990: Andrzej Cichocki, a mathematician who received his PhD in Electrical Engineering, proposed a neural network for solving systems of linear equations in real time.
Problem Formulation

Linear parameter estimation model:

$$Ax = b + r = b^{true}$$

- $A = [a_{ij}] \in \mathbb{R}^{m\times n}$: model matrix
- $x = [x_1, x_2, \ldots, x_n]^T \in \mathbb{R}^n$: unknown vector of the system parameters to be estimated
- $b \in \mathbb{R}^m$: vector of observations
- $r \in \mathbb{R}^m$: unknown measurement errors
- $b^{true} \in \mathbb{R}^m$: vector of true values (usually unknown)

In expanded form:

$$
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}
=
\begin{bmatrix} b_1\\ b_2\\ \vdots\\ b_m \end{bmatrix}
+
\begin{bmatrix} r_1\\ r_2\\ \vdots\\ r_m \end{bmatrix}
=
\begin{bmatrix} b_1^{true}\\ b_2^{true}\\ \vdots\\ b_m^{true} \end{bmatrix}
$$
Types of Equations

A set of linear equations $Ax = b + r$, $A \in \mathbb{R}^{m\times n}$, is said to be overdetermined if m > n.
- Usually inconsistent due to noise and errors.
- e.g. linear parameter estimation problems arising in signal processing, biology, medicine, and automatic control.

A set of linear equations is said to be underdetermined if m < n (due to a lack of information).
- Inverse and extrapolation problems.
- Involves far fewer problems than the overdetermined case.
Mathematical Solutions
Why not use $x = A^{-1}b$? It is not applicable: since m ≠ n most of the time, A is not invertible.

What if we use the least squares error method?

$$y = (Ax-b)^T(Ax-b), \quad \nabla y = 2A^T(Ax-b) = 0,$$
$$A^TAx = A^Tb, \quad x = (A^TA)^{-1}A^Tb$$

Inverting $A^TA$ is considered too time-consuming for large A in real-time systems.
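The normal-equations solution above can be checked numerically; a minimal NumPy sketch (the 3×2 system here is an arbitrary illustration, not from the slides):

```python
import numpy as np

# Overdetermined system: m = 3 equations, n = 2 unknowns, so A is not
# square and x = A^{-1} b is undefined; use the normal equations instead.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([0.1, 1.0, 2.1])

# Normal-equations solution: x = (A^T A)^{-1} A^T b
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# Library least-squares solver for comparison
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

print(np.allclose(x_normal, x_lstsq))  # True
```

Forming and inverting $A^TA$ explicitly is exactly the step that becomes expensive for large A, which motivates the dynamical-system approach of the following slides.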
Least Squares Error Function

Find the vector $x^* \in \mathbb{R}^n$ that minimizes the least squares function

$$E(x) = \frac{1}{2}(Ax-b)^T(Ax-b) = \frac{1}{2}\sum_{i=1}^{m} r_i^2(x)$$

where

$$r_i(x) = A_i x - b_i = \sum_{j=1}^{n} a_{ij}x_j - b_i$$

represents the i-th component of the residual vector

$$r(x) = [r_1(x), r_2(x), \ldots, r_m(x)]^T = Ax - b$$
Gradient Descent Approach

Basic idea: compute a trajectory $x(t)$ starting at the initial point $x(0)$ that has the solution $x^*$ as a limit point ($x(t) \to x^*$ for $t \to \infty$).

General gradient approach for minimization of a function:

$$\frac{dx}{dt} = -\mu \nabla E(x)$$

i.e.

$$
\begin{bmatrix} \frac{dx_1}{dt}\\ \frac{dx_2}{dt}\\ \vdots\\ \frac{dx_n}{dt} \end{bmatrix}
= -
\begin{bmatrix}
\mu_{11} & \mu_{12} & \cdots & \mu_{1n}\\
\mu_{21} & \mu_{22} & \cdots & \mu_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
\mu_{n1} & \mu_{n2} & \cdots & \mu_{nn}
\end{bmatrix}
\begin{bmatrix} \frac{\partial E}{\partial x_1}\\ \frac{\partial E}{\partial x_2}\\ \vdots\\ \frac{\partial E}{\partial x_n} \end{bmatrix}
$$

$\mu$ is chosen in a way that ensures the stability of the differential equations and an appropriate convergence speed.
Solving LE Using Least Squares Criterion
Gradient of the energy function:

$$\nabla E = \left[\frac{\partial E}{\partial x_1}, \frac{\partial E}{\partial x_2}, \ldots, \frac{\partial E}{\partial x_n}\right]^T = A^T(Ax-b)$$

So

$$\frac{dx}{dt} = -\mu A^T(Ax-b)$$

Scalar representation:

$$\frac{dx_j}{dt} = -\sum_{p=1}^{n}\mu_{jp}\sum_{i=1}^{m}a_{ip}\left(\sum_{k=1}^{n}a_{ik}x_k - b_i\right), \quad x_j(0) = x_j^{(0)}, \quad j = 1,2,\ldots,n
$$
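The dynamical system $dx/dt = -\mu A^T(Ax-b)$ can be simulated with forward-Euler steps; a sketch (a scalar $\mu$, the step size, and the iteration count are illustrative choices, not from the slides):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([0.1, 1.0, 2.1])

mu = 0.1         # scalar learning rate (a diagonal mu matrix in general)
dt = 0.1         # Euler step size
x = np.zeros(2)  # initial point x(0)

# Integrate dx/dt = -mu * A^T (A x - b) until the trajectory settles
for _ in range(2000):
    x = x - dt * mu * A.T @ (A @ x - b)

x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x, x_star, atol=1e-6))  # True
```

The trajectory converges to the least-squares solution $x^*$ without ever forming or inverting $A^TA$, which is the point of the analog implementation.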
$$\frac{dx_j}{dt} = -\sum_{p=1}^{n}\mu_{jp}\sum_{i=1}^{m}a_{ip}\left(\sum_{k=1}^{n}a_{ik}x_k - b_i\right)$$
ANN With Identity Activation Function
General ANN Solution
The key step in designing an algorithm for neural networks:

- Construct an appropriate computational energy function $E(x)$ (a Lyapunov function) whose lowest energy state corresponds to the desired solution $x^*$.
- By differentiation, the energy-function minimization problem is transformed into a set of ordinary differential equations.
General ANN Solution
In general, the optimization problem can be formulated as: find the vector $x^* \in \mathbb{R}^n$ that minimizes the energy function

$$E(x) = \sum_{i=1}^{m}\sigma(r_i(x)) = \sum_{i=1}^{m}\sigma(A_i x - b_i)$$

$\sigma(r_i(x))$ is called the weighting function; its derivative is called the activation function:

$$g(r_i) = \frac{\partial \sigma(r_i)}{\partial r_i} = \frac{\partial E}{\partial r_i}$$
General ANN Solution
Gradient descent approach:

$$\frac{dx}{dt} = -\mu \nabla E(x), \qquad
\begin{bmatrix} \frac{dx_1}{dt}\\ \vdots\\ \frac{dx_n}{dt} \end{bmatrix}
= -
\begin{bmatrix}
\mu_{11} & \cdots & \mu_{1n}\\
\vdots & \ddots & \vdots\\
\mu_{n1} & \cdots & \mu_{nn}
\end{bmatrix}
\begin{bmatrix} \frac{\partial E}{\partial x_1}\\ \vdots\\ \frac{\partial E}{\partial x_n} \end{bmatrix}
$$

The minimization of the energy function leads to the set of differential equations

$$\frac{dx_j}{dt} = -\sum_{p=1}^{n}\mu_{jp}\frac{\partial E}{\partial x_p}
= -\sum_{p=1}^{n}\mu_{jp}\sum_{i=1}^{m}\frac{\partial r_i}{\partial x_p}\frac{\partial E}{\partial r_i}
= -\sum_{p=1}^{n}\mu_{jp}\sum_{i=1}^{m}a_{ip}\, g\!\left(\sum_{k=1}^{n}a_{ik}x_k - b_i\right)$$
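The general system differs from the least-squares flow only in the elementwise activation $g$ applied to the residuals; a sketch with a pluggable activation (scalar $\mu$ for simplicity; the tanh choice is the derivative of the logistic weighting function discussed later, with β = 1):

```python
import numpy as np

def simulate(A, b, g, mu=0.1, dt=0.1, steps=3000):
    """Forward-Euler integration of dx/dt = -mu * A^T g(Ax - b),
    with the activation g applied elementwise to the residuals."""
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        x = x - dt * mu * A.T @ g(A @ x - b)
    return x

A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
b = np.array([0.1, 1.0, 2.1])

# Identity activation g(r) = r recovers the least-squares flow
x_ls = simulate(A, b, g=lambda r: r)

# tanh activation (logistic weighting's derivative, beta = 1):
# it saturates for large residuals, damping the influence of outliers
x_robust = simulate(A, b, g=np.tanh)

print(np.round(x_ls, 3), np.round(x_robust, 3))
```

Swapping `g` is all it takes to move between the criteria of the following slides; the network architecture itself is unchanged.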
General ANN Architecture

$$\frac{dx_j}{dt} = -\sum_{p=1}^{n}\mu_{jp}\sum_{i=1}^{m}a_{ip}\, g\!\left(\sum_{k=1}^{n}a_{ik}x_k - b_i\right)$$

Remember that $g$ is the activation function; the architecture contains m activation units $g_1, g_2, \ldots, g_m$, one per equation.
Drawbacks of Least Square Error Criterion
Why not always use the least squares energy function?
- It behaves poorly in the presence of large outliers.
- It is only optimal for a Gaussian distribution of the error.

The proper choice of criterion depends on
- the specific application;
- the distribution of the errors in the measurement vector b:
  - Gaussian distribution* → least squares criterion
  - Uniform distribution → Chebyshev norm criterion

*However, the assumption that the set of measurements or observations has a Gaussian error distribution is frequently unrealistic due to different sources of errors such as instrument errors, modeling errors, sampling errors, and human errors.
Special Energy Functions

Huber's function:

$$\rho_H(e) = \begin{cases} \dfrac{e^2}{2}, & |e| \le \beta\\[4pt] \beta|e| - \dfrac{\beta^2}{2}, & |e| > \beta \end{cases}$$

(The slide plots the corresponding weighting and activation functions.)
Special Energy Functions
Talwar's function:

$$\rho_T(e) = \begin{cases} \dfrac{e^2}{2}, & |e| \le \beta\\[4pt] \dfrac{\beta^2}{2}, & |e| > \beta \end{cases}$$

This function has a direct implementation. (The slide plots the corresponding weighting and activation functions.)
Special Energy Functions
Logistic function:

$$\rho_L(e) = \beta^2 \ln\!\left(\cosh\!\left(\frac{e}{\beta}\right)\right)$$

The iteratively reweighted method uses this activation function. (The slide plots the corresponding weighting and activation functions.)
Special Energy Functions
$L_p$-normed function:

$$E_p(x) = \frac{1}{p}\sum_{i=1}^{m}|r_i|^p$$

(The slide plots the corresponding activation function.)
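The robust weighting functions above and their activation-function derivatives can be written out directly; a sketch (β = 1 is an illustrative choice):

```python
import numpy as np

beta = 1.0  # threshold parameter (value chosen for illustration)

# Huber: quadratic near zero, linear growth beyond beta
def rho_huber(e):
    return np.where(np.abs(e) <= beta,
                    e**2 / 2, beta * np.abs(e) - beta**2 / 2)

def g_huber(e):   # derivative of rho_huber: clipped identity
    return np.clip(e, -beta, beta)

# Talwar: quadratic near zero, constant beyond beta (outliers ignored)
def rho_talwar(e):
    return np.where(np.abs(e) <= beta, e**2 / 2, beta**2 / 2)

def g_talwar(e):  # derivative of rho_talwar: zero outside the band
    return np.where(np.abs(e) <= beta, e, 0.0)

# Logistic: smooth everywhere, activation saturates at +/- beta
def rho_logistic(e):
    return beta**2 * np.log(np.cosh(e / beta))

def g_logistic(e):  # derivative of rho_logistic
    return beta * np.tanh(e / beta)

e = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
print(g_huber(e))   # values -1, -0.5, 0, 0.5, 1
print(g_talwar(e))  # values  0, -0.5, 0, 0.5, 0
```

All three agree with the quadratic (least-squares) activation for small residuals and differ only in how strongly they suppress large ones.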
Lp-Norm Energy Functions
A well-known criterion is the $L_1$-norm energy function

$$E_1(x) = \sum_{i=1}^{m}|r_i(x)|$$

(The slide plots the corresponding weighting and activation functions.)
Special Energy Functions
Another well-known criterion is the $L_\infty$-norm (Chebyshev) criterion, which can be formulated as the minimax problem:

$$\min_{x\in\mathbb{R}^n}\left\{\max_{1\le i\le m}|r_i(x)|\right\}$$

This criterion is optimal for a uniform distribution of the error.
Minimax (L∞-Norm) Criterion
For the case p = ∞ of the $L_p$-norm problem, the activation function $g[r_i(x)]$ cannot be explicitly expressed through $|r_i(x)|^{p-1}$.

The error function can be defined as

$$E_\infty(x) = \max_{1\le i\le m}|r_i(x)|$$

resulting in the following activation function:

$$g[r_i(x)] = \begin{cases} \operatorname{sign}[r_i(x)], & \text{if } |r_i(x)| = \max_{1\le k\le m}|r_k(x)|\\ 0, & \text{otherwise} \end{cases}$$
Minimax (L∞-Norm) Criterion
Although straightforward, some problems arise in practical implementations of the system of differential equations:

- Exact realization of the signum functions is rather difficult (electrically).
- $E_\infty$ has a derivative discontinuity at x if $|r_i(x)| = |r_k(x)| = E_\infty(x)$ for some i ≠ k.*

*This is often responsible for various anomalous results (e.g. hysteresis phenomena).
Transforming the problem to an equivalent one
Rather than directly implementing the proposed system, we transform the minimax problem

$$\min_{x\in\mathbb{R}^n}\left\{\max_{1\le i\le m}|r_i(x)|\right\}$$

into an equivalent one:

Minimize $\varepsilon$ subject to the constraints $|r_i(x)| \le \varepsilon$, $\varepsilon \ge 0$.

Thus the problem can be viewed as finding the smallest non-negative value $\varepsilon^* = E_\infty(x^*) \ge 0$, where $x^*$ is a vector of the optimal values of the parameters.
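The constrained form (minimize ε subject to |r_i(x)| ≤ ε) is a linear program, so the value ε* can be cross-checked with an off-the-shelf LP solver; a sketch using SciPy (this is a verification of the equivalence, not the neural implementation; the data are illustrative):

```python
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
b = np.array([0.0, 1.0, 1.0])
m, n = A.shape

# Decision variables z = (x_1, ..., x_n, eps); objective: minimize eps
c = np.zeros(n + 1)
c[-1] = 1.0

# |Ax - b| <= eps  <=>  Ax - eps*1 <= b  and  -Ax - eps*1 <= -b
ones = np.ones((m, 1))
A_ub = np.vstack([np.hstack([A, -ones]), np.hstack([-A, -ones])])
b_ub = np.concatenate([b, -b])

res = linprog(c, A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None)] * n + [(0.0, None)])
x_star, eps_star = res.x[:n], res.x[-1]
print(eps_star)  # smallest achievable max |r_i(x)|; 0.25 for this data
```

This is the same equivalence, in the opposite direction, as the L1-norm/linear-programming correspondence noted later in the talk.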
New Energy Function
Applying the standard quadratic penalty function, we can consider the cost function

$$E(x,\varepsilon) = \nu\varepsilon + \frac{\kappa}{2}\sum_{i=1}^{m}\left\{\left([r_i(x)+\varepsilon]^-\right)^2 + \left([\varepsilon - r_i(x)]^-\right)^2\right\}$$

where $\nu > 0,\ \kappa > 0$ are coefficients and $[y]^- = \min\{0, y\}$.
New Energy Function
Applying now the gradient strategy, we obtain the associated system of differential equations

$$\frac{d\varepsilon}{dt} = -\mu_0\left(\nu + \kappa\sum_{i=1}^{m}\left[(r_i(x)+\varepsilon)S_{i1} + (\varepsilon - r_i(x))S_{i2}\right]\right)$$

$$\frac{dx_j}{dt} = -\mu_j\sum_{i=1}^{m}a_{ij}\left[(r_i(x)+\varepsilon)S_{i1} + (r_i(x)-\varepsilon)S_{i2}\right] \quad (j = 1,2,\ldots,n)$$

where

$$S_{i1} = \begin{cases} 0, & r_i(x)+\varepsilon \ge 0\\ 1, & \text{otherwise} \end{cases}
\qquad
S_{i2} = \begin{cases} 0, & \varepsilon - r_i(x) \ge 0\\ 1, & \text{otherwise} \end{cases}$$
Simplifying architecture
It is interesting to note that the system of differential equations can be simplified by introducing:

$$\varphi(r_i(x), \varepsilon) = \begin{cases} r_i + \varepsilon, & r_i < -\varepsilon\\ 0, & -\varepsilon \le r_i \le \varepsilon\\ r_i - \varepsilon, & r_i > \varepsilon \end{cases}$$

This nonlinear function represents a typical dead-zone function.
Simplifying architecture

It is easy to check:

$$(r_i(x)+\varepsilon)S_{i1} + (\varepsilon - r_i(x))S_{i2} = -\left|\varphi(r_i(x),\varepsilon)\right|$$

$$(r_i(x)+\varepsilon)S_{i1} + (r_i(x)-\varepsilon)S_{i2} = \varphi(r_i(x),\varepsilon)$$

Thus the system of differential equations can be simplified to the form:

$$\frac{d\varepsilon}{dt} = -\mu_0\left(\nu - \kappa\sum_{i=1}^{m}\left|\varphi(r_i(x),\varepsilon)\right|\right), \quad \varepsilon(0) = \varepsilon^{(0)}$$

$$\frac{dx_j}{dt} = -\mu_j\sum_{i=1}^{m}a_{ij}\,\varphi(r_i(x),\varepsilon), \quad x_j(0) = x_j^{(0)} \quad (j = 1,2,\ldots,n)$$
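The simplified dead-zone system can be simulated directly with Euler steps; a minimal sketch (all parameter values, the data, and the scalar learning rates are illustrative choices; the equilibrium ε sits slightly below the true E∞(x*) by a margin of order ν/κ):

```python
import numpy as np

def phi(r, eps):
    """Dead-zone function: zero on [-eps, eps], linear outside the band."""
    return np.where(r < -eps, r + eps, np.where(r > eps, r - eps, 0.0))

A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
b = np.array([0.0, 1.0, 1.0])

nu, kappa = 1.0, 100.0  # penalty coefficients (nu, kappa > 0)
mu, mu0 = 5.0, 1.0      # learning rates (scalars here)
dt = 0.001
x = np.zeros(2)         # x(0)
eps = 1.0               # eps(0), started above the expected optimum

# Euler integration of the coupled (x, eps) dead-zone system
for _ in range(200_000):
    r = A @ x - b
    x = x - dt * mu * A.T @ phi(r, eps)
    eps = max(eps - dt * mu0 * (nu - kappa * np.sum(np.abs(phi(r, eps)))), 0.0)

print(round(eps, 2))  # settles close to E_inf(x*) = 0.25 for this data
```

ε shrinks while the residuals fit inside the dead zone and is pushed back up once they spill out, so at equilibrium it tracks the smallest achievable maximum residual.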
$$\frac{dx_j}{dt} = -\mu_j\sum_{i=1}^{m}a_{ij}\,\varphi(r_i(x),\varepsilon)$$
$$\frac{d\varepsilon}{dt} = -\mu_0\left(\nu - \kappa\sum_{i=1}^{m}\left|\varphi(r_i(x),\varepsilon)\right|\right)$$
Least Absolute Values (L1-Norm) Energy Function

Find the design vector $x^*$ that minimizes the error function

$$E_1(x) = \sum_{i=1}^{m}|r_i(x)|$$

where

$$r_i(x) = \sum_{j=1}^{n}a_{ij}x_j - b_i$$

Why should one choose this function, knowing that it has differentiation problems?
Important L1-Norm Properties

1. Least absolute value problems are equivalent to linear programming problems and vice versa.
2. Although the energy function $E_1(x)$ is not differentiable, the terms $|r_i(x)|$ can be approximated very closely by smoothly differentiable functions.
3. For a full-rank* matrix A, there always exists a minimum L1-norm solution which passes through at least n of the m data points. The L2-norm solution does not in general interpolate any of the points.

These properties are not shared by the L2-norm.

*Matrix A is said to be of full rank if all its rows or columns are linearly independent.
Important L1-Norm Properties
Theorem: There is a minimizer $x^* \in \mathbb{R}^n$ of the energy function $E_1(x) = \sum_{i=1}^{m}|r_i(x)|$ for which the residuals $r_i(x^*) = 0$ for at least n values of i, say $i_1, i_2, \ldots, i_n$, where n denotes the rank of the matrix A.

We can say that the L1-norm solution is the median solution, while the L2-norm solution is the mean solution.
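The median-vs-mean contrast is easiest to see in the one-dimensional case A = [1, 1, ..., 1]^T, where E1 is minimized by the median of b and the least-squares energy by the mean; a quick numerical check on an illustrative data set:

```python
import numpy as np

# Fitting a single constant x to observations b: r_i(x) = x - b_i
b = np.array([1.0, 2.0, 3.0, 100.0])     # one large outlier

xs = np.linspace(-50.0, 150.0, 200_001)  # grid with step 0.001
E1 = np.abs(xs[:, None] - b).sum(axis=1)          # L1-norm energy
E2 = 0.5 * ((xs[:, None] - b) ** 2).sum(axis=1)   # least-squares energy

print(xs[E1.argmin()])  # a value in [2, 3], the median interval: outlier ignored
print(xs[E2.argmin()])  # 26.5, the mean of b: dragged far by the outlier
```

With an even number of points the L1 minimizer is any value between the two middle observations, which is why the grid may return any point of that interval.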
Least Absolute Error Implementation
The algorithm is as follows:
1. First phase:
   - Solve the problem using the ordinary least-squares technique and compute all m residuals.
   - Select from them the n residuals which are smallest in absolute value.
2. Second phase:
   - Discarding the rest of the equations, the n equations related to the selected residuals are solved by driving their residuals to zero.

The ANN implementation is done in three layers using an inhibition control circuit.
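The two phases can be sketched numerically, with a least-squares solve standing in for the analog network (ties among residual magnitudes, as in the worked example that follows, would require the rerun step described above; this sketch assumes no ties):

```python
import numpy as np

def two_phase_l1(A, b):
    """Sketch of the two-phase procedure: phase 1 ranks the residuals of
    the least-squares solution; phase 2 solves exactly the n equations
    whose residuals are smallest in absolute value."""
    m, n = A.shape
    # Phase 1: ordinary least-squares solution and its m residuals
    x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
    r = A @ x_ls - b
    keep = np.argsort(np.abs(r))[:n]      # n smallest |r_i|
    # Phase 2: discard the other equations, drive the kept residuals to zero
    return np.linalg.solve(A[keep], b[keep])

A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
b = np.array([0.0, 1.1, 1.9, 3.2])
x = two_phase_l1(A, b)
print(np.round(x, 3))  # [0.  1.1]: equations 1 and 2 are interpolated exactly
```

Consistent with property 3 above, the returned solution passes through n of the m data points exactly.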
ANN Architecture for Solving the L1-Norm Estimation Problem (Phase #1 and Phase #2)
Example
Consider matrix A and observation vector b as below. Find the solution to $Ax = b$ using the least absolute error energy function.

$$A = \begin{bmatrix} 1 & 0 & 0\\ 1 & 1 & 1\\ 1 & 2 & 4\\ 1 & 3 & 9\\ 1 & 4 & 16 \end{bmatrix}, \quad
b = \begin{bmatrix} 1\\ 2\\ 1\\ -1\\ -10 \end{bmatrix}, \quad Ax - b = 0$$
In the first phase all the switches (S1–S5) were closed and the network was able to find the following standard least-squares solution:

$$x_I^* = \begin{bmatrix} 0.6\\ 3.5\\ -1.5 \end{bmatrix}, \quad
r(x_I^*) = \begin{bmatrix} -0.4\\ 0.6\\ 0.6\\ -1.4\\ 0.6 \end{bmatrix}$$

In this case it is impossible to select the two largest residuals in absolute value, because $|r_2| = |r_3| = |r_5| = 0.6$. Phase one was rerun with switch S4 opened, and the network then found

$$x_{II}^* = \begin{bmatrix} 0.9182\\ 2.6404\\ -1.3409 \end{bmatrix}, \quad
r(x_{II}^*) = \begin{bmatrix} -0.0818\\ 0.2182\\ -0.1636\\ -2.2273\\ 0.0273 \end{bmatrix}$$
Cichocki’s Circuit Simulation Results
In the second phase (and third run of the network) the inhibitive control network opened switch S2, so in the third run only switches S1, S3, S5 were closed, and the network found the equilibrium point:

$$x^* = \begin{bmatrix} 1\\ 2.75\\ -1.375 \end{bmatrix}, \quad
r(x^*) = \begin{bmatrix} 0\\ 0.375\\ 0\\ -2.125\\ 0 \end{bmatrix}$$
Cichocki’s Circuit Simulation Results
The residuals of n = 3 of the m = 5 equations converge to zero within 50 nanoseconds.
Using MATLAB, we observed that zeroing $r_1$, $r_3$ and $r_5$ results in the minimum value of $E_1(x) = \sum_{i=1}^{m}|r_i(x)|$.
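The same check can be reproduced with a short NumPy script (a modern stand-in for the original MATLAB run): enumerate every way of zeroing three of the five residuals and compare the resulting E1 values.

```python
import numpy as np
from itertools import combinations

A = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 1.0],
              [1.0, 2.0, 4.0],
              [1.0, 3.0, 9.0],
              [1.0, 4.0, 16.0]])
b = np.array([1.0, 2.0, 1.0, -1.0, -10.0])

# For each choice of 3 equations, solve them exactly and evaluate E1
best = min(
    (np.abs(A @ np.linalg.solve(A[list(idx)], b[list(idx)]) - b).sum(), idx)
    for idx in combinations(range(5), 3)
)
E1_min, idx = best
x = np.linalg.solve(A[list(idx)], b[list(idx)])

print(idx)                 # (0, 2, 4): residuals r1, r3, r5 zeroed
print(np.round(x, 3))      # [ 1.     2.75  -1.375]
print(round(E1_min, 3))    # 2.5 = |0.375| + |-2.125|
```

The winning subset zeroes r1, r3, r5 and recovers the equilibrium point x* = [1, 2.75, -1.375] reported on the previous slide.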
Conclusion
There is great need for real-time solution of systems of linear equations.

Cichocki's proposed ANN is different from classical ANNs: one considers a proper energy function whose reduction leads to the optimal solution of Ax = b.

'Proper function' may have different meanings in different applications.

The standard least squares error function gives the optimal answer for a Gaussian distribution of the error.
Conclusion (Cont.)

The least squares function does not behave well when there are large outliers in the observations.

Various energy functions have been proposed to solve the outlier problem (e.g. the logistic function).

Minimax yields the optimal answer for a uniform distribution of the error. It also has some implementation and mathematical problems that force an indirect approach to solving the problem.

The least absolute error function has properties that distinguish it from the other error functions.