Gradient Descent Rule Tuning
Transcript of Gradient Descent Rule Tuning
![Page 1: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/1.jpg)
Gradient Descent Rule Tuning
See pp. 207–210 in the textbook.
![Page 2: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/2.jpg)
Rules
Consider a rule base with
• M rules; the r-th rule has the form
• IF x1 is T_{r,1} AND … AND x_n is T_{r,n} THEN y is ȳ_r (or y is ȳ_r + other stuff)
• The TSK fuzzy system has the mathematical form

$$f(x) = \frac{\sum_{r=1}^{M} \bar{y}_r \prod_{i=1}^{n} \mu_{r,i}(x_i;\, c_{r,i}, L_{r,i}, R_{r,i})}{\sum_{r=1}^{M} \prod_{i=1}^{n} \mu_{r,i}(x_i;\, c_{r,i}, L_{r,i}, R_{r,i})}$$
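The TSK form above (a firing-strength-weighted average of the rule consequents) can be sketched numerically. This is our own illustrative code, not the textbook's; function and argument names are assumptions, and Gaussian memberships are used for the μ's as the later slides do.

```python
import numpy as np

def tsk_output(x, ybar, centers, widths):
    """Zero-order TSK output for one input vector x.

    ybar: (M,) rule consequents; centers, widths: (M, n) Gaussian
    membership parameters. (Illustrative sketch; names are ours.)
    """
    # z_r = product over inputs i of the Gaussian membership values
    # (the premise "firing strength" of rule r)
    z = np.exp(-((x - centers) ** 2) / widths ** 2).prod(axis=1)
    # f(x) = weighted average of the consequents
    return float((ybar * z).sum() / z.sum())
```

For example, with the 3-rule system the slides build later (consequents 25/0/25, centers −5/0/5, widths 2), the output near x = −5 is close to 25 because only the first rule fires appreciably.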
![Page 3: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/3.jpg)
• Membership function parameters
  – Center, right-width, left-width
  – Consequent parameters
• 3-level (layer) structure of f(x)
  – Level (layer) 1: for each rule r, compute all membership values for each term, compute their product, store as z_r
  – Level (layer) 2: compute the product of membership values and consequents, and sum: n; sum the membership values: d
  – Level (layer) 3: compute the quotient f = n/d

$$f(x) = \frac{n(x;\, \bar{y}, \bar{x}, \sigma)}{d(x;\, \bar{x}, \sigma)}$$
![Page 4: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/4.jpg)
Rule parameters
• Membership function parameters
  – Center, right-width, left-width
  – Consequent parameters
• Why not s, z, and triangular membership functions?
• Why Gaussian membership functions?

$$f(x) = \frac{\sum_{r=1}^{M} \bar{y}_r\, z_r(x)}{\sum_{r=1}^{M} z_r(x)}, \qquad z_r(x) = \prod_{i=1}^{n} e^{-(x_i - \bar{x}_{i,r})^2 / \sigma_{i,r}^2}$$
![Page 5: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/5.jpg)
Gradient Descent
• Choose parameters to minimize the error
• Corresponds to a blind person descending a mountain by finding the steepest descending slope and moving in that direction
• Slope is determined by differentiation (computing the “gradient”)
• Chain rule helps tremendously.
![Page 6: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/6.jpg)
Gradient Descent Math
• Consider a sequence of input/output measurements: (x₀ᵖ, y₀ᵖ)
• As each input/output measurement pair arrives (and before the next pair arrives), we want to adjust our model parameters to reduce the error eₚ = [f(x₀ᵖ) − y₀ᵖ]²/2
• Dropping the sub- and superscripts, e = [f(x) − y]²/2
• The gradient descent algorithm for any vector-valued parameter s is

$$s_{new} = s_{old} - \lambda_s \left.\frac{\partial e}{\partial s}\right|_{s = s_{old}}, \qquad \lambda_s = \text{step size for } s$$
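The per-sample update rule can be sketched in a few lines. The fitting problem below is a toy of our own invention (it is not from the slides): a one-parameter model f(x) = s·x fit to data from y = 3x, where the chain rule gives de/ds = (f(x) − y)·x.

```python
def gd_update(s_old, de_ds, step_size):
    """One gradient-descent step: s_new = s_old - lambda_s * (de/ds)|_{s_old}."""
    return s_old - step_size * de_ds

# Toy illustration (ours): fit f(x) = s*x to measurements from y = 3x.
s = 0.0
for x, y in [(1.0, 3.0), (2.0, 6.0)] * 50:
    de_ds = (s * x - y) * x   # chain rule: de/ds = (f(x) - y) * df/ds
    s = gd_update(s, de_ds, 0.1)
# s converges toward 3 as the pairs stream in
```

Each measurement pair triggers exactly one update, matching the "adjust before the next pair arrives" discipline described above.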
![Page 7: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/7.jpg)
$$f(x) = \frac{n}{d}, \qquad n = \sum_{r=1}^{M} \bar{y}_r\, z_r(x;\, \bar{x}_r, \sigma_r), \qquad d = \sum_{r=1}^{M} z_r(x;\, \bar{x}_r, \sigma_r)$$

$$z_r(x;\, \bar{x}_r, \sigma_r) = \prod_{i=1}^{n} e^{-(x_i - \bar{x}_{i,r})^2 / \sigma_{i,r}^2}$$

$$s_{new} = s_{old} - \lambda_s \left.\frac{\partial e}{\partial s}\right|_{s = s_{old}}, \qquad \lambda_s = \text{step size for } s$$

Apply to: ȳ, x̄, σ
![Page 8: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/8.jpg)
For ybar

$$\bar{y}_{new} = \bar{y}_{old} - \lambda_{\bar{y}} \left.\frac{\partial e}{\partial \bar{y}}\right|_{\bar{y} = \bar{y}_{old}}$$

$$e = \frac{1}{2}\big(f(x) - y\big)^2 = \frac{1}{2}\left(\frac{\sum_{r=1}^{M} \bar{y}_r\, z_r(x;\, \bar{x}_r, \sigma_r)}{\sum_{r=1}^{M} z_r(x;\, \bar{x}_r, \sigma_r)} - y\right)^2$$

$$\frac{\partial e}{\partial \bar{y}_q} = \big(f(x) - y\big)\frac{\partial f(x)}{\partial \bar{y}_q} = \big(f(x) - y\big)\frac{z_q(x;\, \bar{x}_q, \sigma_q)}{\sum_{r=1}^{M} z_r(x;\, \bar{x}_r, \sigma_r)}$$

In vector form:

$$\frac{\partial e}{\partial \bar{y}} = \big(f(x) - y\big)\frac{1}{\sum_{r=1}^{M} z_r}\begin{bmatrix} z_1(x;\, \bar{x}_1, \sigma_1) \\ z_2(x;\, \bar{x}_2, \sigma_2) \\ \vdots \\ z_M(x;\, \bar{x}_M, \sigma_M) \end{bmatrix}$$

Given x and y.
Modify for β. Modify for xbar. Modify for sigma.
![Page 9: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/9.jpg)
Gradient Descent
• For a generic parameter p:

$$\frac{de}{dp} = (f - y)\frac{df}{dp}$$

• For ybar, see the previous slide: df/dȳᵢ = ?
• For xbar: df/dx̄ᵢ = ?
• For sigma: df/dσᵢ = ?
• Abstraction saves work.
![Page 10: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/10.jpg)
One LV Example FL System
• LV X: Term set: Negative, Zero, Positive
• 3 rules
• Antecedent matrix, Consequent matrix
• Gaussian membership functions
• Super membership function
• Fuzzy function parameters
• TSK fuzzy function
• Gradient Descent parameter tuning
![Page 11: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/11.jpg)
One LV Example FL System
• LV X: Negative5, Zero, Positive5
• 3 rules
  – If x is Negative5 then y is 25
  – If x is Zero then y is 0
  – If x is Positive5 then y is 25
• Antecedent matrix and consequent matrix

$$A = \begin{bmatrix} Negative5 \\ Zero \\ Positive5 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \qquad C = \begin{bmatrix} \bar{y}_1 \\ \bar{y}_2 \\ \bar{y}_3 \end{bmatrix} = \begin{bmatrix} 25 \\ 0 \\ 25 \end{bmatrix}$$
![Page 12: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/12.jpg)
One LV Example FL System
• LV X: Negative5, Zero, Positive5
• Gaussian membership functions

$$Negative5(x) = e^{-(x - \bar{x}_1)^2/\sigma_1^2} = e^{-(x + 5)^2/2^2}$$

$$Zero(x) = e^{-(x - \bar{x}_2)^2/\sigma_2^2} = e^{-(x - 0)^2/2^2}$$

$$Positive5(x) = e^{-(x - \bar{x}_3)^2/\sigma_3^2} = e^{-(x - 5)^2/2^2}$$

$$\begin{bmatrix} \bar{x}_1 \\ \bar{x}_2 \\ \bar{x}_3 \end{bmatrix} = \begin{bmatrix} -5 \\ 0 \\ 5 \end{bmatrix}, \qquad \begin{bmatrix} \sigma_1 \\ \sigma_2 \\ \sigma_3 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \\ 2 \end{bmatrix}$$
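The three Gaussian memberships of the example (centers −5, 0, 5; widths 2, 2, 2) can be sketched directly. Note the convention here matches the slides, e^{−(x − x̄)²/σ²}; some texts put 2σ² in the denominator instead.

```python
import math

def gaussian_mf(x, xbar, sigma):
    """Gaussian membership value e^{-(x - xbar)^2 / sigma^2} (slides' convention)."""
    return math.exp(-((x - xbar) ** 2) / sigma ** 2)

# The three terms of LV X:
def negative5(x): return gaussian_mf(x, -5.0, 2.0)
def zero(x):      return gaussian_mf(x,  0.0, 2.0)
def positive5(x): return gaussian_mf(x,  5.0, 2.0)
```

Each function peaks at 1 at its center, e.g. zero(0.0) == 1, and decays symmetrically on both sides.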
![Page 13: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/13.jpg)
One LV Example FL System
• Super membership function: a single parameterized Gaussian μ(x, LV, T) covers all three terms

$$\mu(x, LV, T): \quad \begin{matrix} Negative5(x) = \mu(x;\, \bar{x}_1, \sigma_1) \\ Zero(x) = \mu(x;\, \bar{x}_2, \sigma_2) \\ Positive5(x) = \mu(x;\, \bar{x}_3, \sigma_3) \end{matrix}$$

where LV is X and T is one of {Negative5, Zero, Positive5} ↔ {1, 2, 3}, with centers {−5, 0, 5}.
![Page 14: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/14.jpg)
One LV Example FL System
• TSK fuzzy function
• Gradient descent parameter tuning

$$f(x) = \frac{\sum_{r=1}^{3} \bar{y}_r\, z_r(x;\, \bar{x}_r, \sigma_r)}{\sum_{r=1}^{3} z_r(x;\, \bar{x}_r, \sigma_r)}$$
![Page 15: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/15.jpg)
One LV Example FL System
• TSK fuzzy function, gradient descent parameter tuning of ybar

$$f(x) = \frac{\sum_{r=1}^{3} \bar{y}_r\, z_r(x;\, \bar{x}_r, \sigma_r)}{\sum_{r=1}^{3} z_r(x;\, \bar{x}_r, \sigma_r)}, \qquad e = \frac{1}{2}\big(f(x;\, \bar{y}, \bar{x}, \sigma) - y\big)^2$$

New data: (x, y)

$$\bar{y}_n = \bar{y}_o - \lambda_{\bar{y}} \left.\frac{de}{d\bar{y}}\right|_{\bar{y} = \bar{y}_o}, \qquad \frac{de}{d\bar{y}} = \big(f(x, \cdot) - y\big)\frac{df(x, \cdot)}{d\bar{y}}$$

$$\frac{df(x, \cdot)}{d\bar{y}_i} = \frac{z_i(x;\, \bar{x}_i, \sigma_i)}{\sum_{r=1}^{3} z_r(x;\, \bar{x}_r, \sigma_r)}, \qquad \frac{df}{d\bar{y}} = \frac{1}{z_1 + z_2 + z_3}\begin{bmatrix} z_1 \\ z_2 \\ z_3 \end{bmatrix}$$
![Page 16: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/16.jpg)
One LV Example FL System
• TSK fuzzy function, gradient descent parameter tuning of ybar

New data: (x, y)

$$\bar{y}_n = \bar{y}_o - \lambda_{\bar{y}} \left.\frac{de}{d\bar{y}}\right|_{\bar{y} = \bar{y}_o} = \bar{y}_o - \lambda_{\bar{y}}\,\big(f(x;\, \bar{y}_o, \bar{x}_o, \sigma_o) - y\big)\,\frac{1}{\sum_{r=1}^{3} z_r(x;\, \bar{x}_o, \sigma_o)}\begin{bmatrix} z_1(x;\, \bar{x}_o, \sigma_o) \\ z_2(x;\, \bar{x}_o, \sigma_o) \\ z_3(x;\, \bar{x}_o, \sigma_o) \end{bmatrix}$$
Heart and soul of gradient descent algorithm to tune ybar using experimental data.
Engineers derive these expressions. Computers compute with these expressions, often iteratively, to improve designs.
Note interplay of theory and real-world data.
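As a concrete sketch of that ȳ update for one data pair (x, y), assuming the Gaussian-product firing strengths z_r defined earlier (all names below are ours, not the slides'):

```python
import numpy as np

def update_ybar(ybar, x, y, centers, widths, step):
    """One gradient step on the consequents for a new data pair (x, y):
    ybar_new = ybar_old - step * (f(x) - y) * z / sum(z).  (Our sketch.)"""
    # Firing strengths at the old membership parameters
    z = np.exp(-((x - centers) ** 2) / widths ** 2).prod(axis=1)
    f = (ybar * z).sum() / z.sum()
    # Each consequent moves in proportion to how strongly its rule fired
    return ybar - step * (f - y) * z / z.sum()
```

Only rules that fire appreciably (large z_r) have their consequents moved; rules far from the data point are left essentially untouched, which is why a single sample adjusts the model locally.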
![Page 17: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/17.jpg)
One LV Example FL System
• TSK fuzzy function, gradient descent parameter tuning of xbar

New data: (x, y)

$$\bar{x}_n = \bar{x}_o - \lambda_{\bar{x}} \left.\frac{de}{d\bar{x}}\right|_{\bar{x} = \bar{x}_o}, \qquad \frac{de}{d\bar{x}} = \big(f(x, \cdot) - y\big)\frac{df(x, \cdot)}{d\bar{x}}$$

$$\frac{df}{d\bar{x}_i} = \frac{d}{d\bar{x}_i}\left[\frac{\sum_{r=1}^{3} \bar{y}_r\, z_r(x;\, \bar{x}_r, \sigma_r)}{\sum_{r=1}^{3} z_r(x;\, \bar{x}_r, \sigma_r)}\right] = \frac{\bar{y}_i \frac{dz_i}{d\bar{x}_i} \sum_{r=1}^{3} z_r - \left(\sum_{r=1}^{3} \bar{y}_r z_r\right)\frac{dz_i}{d\bar{x}_i}}{\left(\sum_{r=1}^{3} z_r\right)^2}$$

Since z_i = e^{-(x - x̄_i)²/σ_i²}, we have dz_i/dx̄_i = z_i · 2(x − x̄_i)/σ_i², so

$$\frac{df}{d\bar{x}_i} = \big(\bar{y}_i - f(x)\big)\,\frac{z_i}{\sum_{r=1}^{3} z_r}\,\frac{2(x - \bar{x}_i)}{\sigma_i^2}$$
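The df/dx̄ expression can be checked numerically against a central finite difference. This is our own verification sketch, under our reading of the slide's derivation (scalar input, 3 rules, Gaussian memberships e^{−(x − x̄)²/σ²}):

```python
import numpy as np

def tsk_f(x, ybar, xbar, sigma):
    """Scalar-input, 3-rule TSK output (sketch)."""
    z = np.exp(-((x - xbar) ** 2) / sigma ** 2)
    return (ybar * z).sum() / z.sum()

def dfdxbar(x, ybar, xbar, sigma):
    """Analytic df/dxbar_i = (ybar_i - f) * z_i/sum(z) * 2*(x - xbar_i)/sigma_i^2."""
    z = np.exp(-((x - xbar) ** 2) / sigma ** 2)
    f = (ybar * z).sum() / z.sum()
    return (ybar - f) * (z / z.sum()) * 2.0 * (x - xbar) / sigma ** 2
```

Perturbing each center by a small h and comparing (f(x̄+h) − f(x̄−h))/(2h) against the analytic gradient confirms the quotient-rule algebra above.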
![Page 18: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/18.jpg)
One LV Example FL System
• TSK fuzzy function, gradient descent parameter tuning of xbar

New data: (x, y)

$$\frac{df}{d\bar{x}} = \frac{1}{\sum_{r=1}^{3} z_r}\begin{bmatrix} (\bar{y}_1 - f(x))\, z_1\, 2(x - \bar{x}_1)/\sigma_1^2 \\ (\bar{y}_2 - f(x))\, z_2\, 2(x - \bar{x}_2)/\sigma_2^2 \\ (\bar{y}_3 - f(x))\, z_3\, 2(x - \bar{x}_3)/\sigma_3^2 \end{bmatrix}$$

$$\bar{x}_n = \bar{x}_o - \lambda_{\bar{x}}\,\big(f(x;\, \bar{y}_o, \bar{x}_o, \sigma_o) - y\big)\,\frac{1}{\sum_{r=1}^{3} z_r}\begin{bmatrix} (\bar{y}_1 - f)\, z_1\, 2(x - \bar{x}_1)/\sigma_1^2 \\ (\bar{y}_2 - f)\, z_2\, 2(x - \bar{x}_2)/\sigma_2^2 \\ (\bar{y}_3 - f)\, z_3\, 2(x - \bar{x}_3)/\sigma_3^2 \end{bmatrix}$$

with every z_r and f on the right-hand side evaluated at the old parameters (ȳ_o, x̄_o, σ_o).
![Page 19: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/19.jpg)
One LV Example FL System
• TSK fuzzy function, gradient descent parameter tuning of sigma

New data: (x, y)

$$\sigma_n = \sigma_o - \lambda_{\sigma} \left.\frac{de}{d\sigma}\right|_{\sigma = \sigma_o}, \qquad \frac{de}{d\sigma} = \big(f(x, \cdot) - y\big)\frac{df(x, \cdot)}{d\sigma}$$

$$\frac{df}{d\sigma_i} = \frac{d}{d\sigma_i}\left[\frac{\sum_{r=1}^{3} \bar{y}_r\, z_r(x;\, \bar{x}_r, \sigma_r)}{\sum_{r=1}^{3} z_r(x;\, \bar{x}_r, \sigma_r)}\right]$$

Since z_i = e^{-(x - x̄_i)²/σ_i²}, we have dz_i/dσ_i = z_i · 2(x − x̄_i)²/σ_i³, so

$$\frac{df}{d\sigma_i} = \big(\bar{y}_i - f(x)\big)\,\frac{z_i}{\sum_{r=1}^{3} z_r}\,\frac{2(x - \bar{x}_i)^2}{\sigma_i^3}$$
![Page 20: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/20.jpg)
One LV Example FL System
• TSK fuzzy function, gradient descent parameter tuning of sigma

New data: (x, y)

$$\frac{df}{d\sigma} = \frac{1}{\sum_{r=1}^{3} z_r}\begin{bmatrix} (\bar{y}_1 - f(x))\, z_1\, 2(x - \bar{x}_1)^2/\sigma_1^3 \\ (\bar{y}_2 - f(x))\, z_2\, 2(x - \bar{x}_2)^2/\sigma_2^3 \\ (\bar{y}_3 - f(x))\, z_3\, 2(x - \bar{x}_3)^2/\sigma_3^3 \end{bmatrix}$$

$$\sigma_n = \sigma_o - \lambda_{\sigma}\,\big(f(x;\, \bar{y}_o, \bar{x}_o, \sigma_o) - y\big)\,\frac{1}{\sum_{r=1}^{3} z_r}\begin{bmatrix} (\bar{y}_1 - f)\, z_1\, 2(x - \bar{x}_1)^2/\sigma_1^3 \\ (\bar{y}_2 - f)\, z_2\, 2(x - \bar{x}_2)^2/\sigma_2^3 \\ (\bar{y}_3 - f)\, z_3\, 2(x - \bar{x}_3)^2/\sigma_3^3 \end{bmatrix}$$

with every z_r and f on the right-hand side evaluated at the old parameters (ȳ_o, x̄_o, σ_o).
![Page 21: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/21.jpg)
One LV: Gradient Descent Summary

$$\bar{y}_n = \bar{y}_o - \lambda_{\bar{y}}\,\big(f(x;\, \bar{y}_o, \bar{x}_o, \sigma_o) - y\big)\,\frac{1}{\sum_{r=1}^{3} z_r(x;\, \bar{x}_o, \sigma_o)}\begin{bmatrix} z_1 \\ z_2 \\ z_3 \end{bmatrix}$$

$$\bar{x}_n = \bar{x}_o - \lambda_{\bar{x}}\,\big(f(x;\, \bar{y}_o, \bar{x}_o, \sigma_o) - y\big)\,\frac{1}{\sum_{r=1}^{3} z_r}\begin{bmatrix} (\bar{y}_1 - f)\, z_1\, 2(x - \bar{x}_1)/\sigma_1^2 \\ (\bar{y}_2 - f)\, z_2\, 2(x - \bar{x}_2)/\sigma_2^2 \\ (\bar{y}_3 - f)\, z_3\, 2(x - \bar{x}_3)/\sigma_3^2 \end{bmatrix}$$

$$\sigma_n = \sigma_o - \lambda_{\sigma}\,\big(f(x;\, \bar{y}_o, \bar{x}_o, \sigma_o) - y\big)\,\frac{1}{\sum_{r=1}^{3} z_r}\begin{bmatrix} (\bar{y}_1 - f)\, z_1\, 2(x - \bar{x}_1)^2/\sigma_1^3 \\ (\bar{y}_2 - f)\, z_2\, 2(x - \bar{x}_2)^2/\sigma_2^3 \\ (\bar{y}_3 - f)\, z_3\, 2(x - \bar{x}_3)^2/\sigma_3^3 \end{bmatrix}$$

with every z_r and f on the right-hand sides evaluated at the old parameters (ȳ_o, x̄_o, σ_o).
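The three update equations for the one-LV example can be combined into a per-sample training loop. The code below is our own sketch: the training data (samples of y = x², which matches the 25/0/25 initialization) and the step size are made-up illustrations, not values from the slides.

```python
import numpy as np

def train_one_lv(data, ybar, xbar, sigma, lam=0.002, epochs=20):
    """Per-sample gradient descent on ybar, xbar, sigma for the
    one-LV, 3-rule TSK system (sketch of the summary equations)."""
    ybar, xbar, sigma = ybar.astype(float), xbar.astype(float), sigma.astype(float)
    for _ in range(epochs):
        for x, y in data:
            z = np.exp(-((x - xbar) ** 2) / sigma ** 2)   # firing strengths
            zsum = z.sum()
            f = (ybar * z).sum() / zsum                    # TSK output
            err = f - y
            dfdy = z / zsum
            dfdx = (ybar - f) * dfdy * 2.0 * (x - xbar) / sigma ** 2
            dfds = (ybar - f) * dfdy * 2.0 * (x - xbar) ** 2 / sigma ** 3
            # s_new = s_old - lambda * (f - y) * df/ds for each parameter vector
            ybar = ybar - lam * err * dfdy
            xbar = xbar - lam * err * dfdx
            sigma = sigma - lam * err * dfds
    return ybar, xbar, sigma
```

Starting from the slides' initialization (ȳ = 25/0/25, x̄ = −5/0/5, σ = 2), a short run of this loop reduces the sum-of-squares error on the samples, illustrating the "engineers derive, computers iterate" interplay described earlier.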
![Page 22: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/22.jpg)
• We are now ready to do gradient descent
![Page 23: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/23.jpg)
Two LV Example FL System
• Temperature term set: Comfortable, Warm, Hot
• Humidity term set: Wet, Dry
• 6 rules
• Antecedent matrix, Consequent matrix
• Gaussian membership functions
• Super membership function
• Fuzzy function parameters
• TSK Fuzzy Function
• Gradient descent parameter tuning
![Page 24: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/24.jpg)
Two LV Example FL System
• Temperature term set: Comfortable, Warm, Hot
• Humidity term set: Wet, Dry
• 6 rules
  – If T is Comfortable and H is Wet then HI is
  – If T is Comfortable and H is Dry then HI is
  – If T is Warm and H is Wet then HI is
  – If T is Warm and H is Dry then HI is
  – If T is Hot and H is Wet then HI is
  – If T is Hot and H is Dry then HI is
![Page 25: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/25.jpg)
Two LV Example FL System: Matrices
– If T is Comfortable and H is Wet then HI is ȳ₁
– If T is Comfortable and H is Dry then HI is ȳ₂
– If T is Warm and H is Wet then HI is ȳ₃
– If T is Warm and H is Dry then HI is ȳ₄
– If T is Hot and H is Wet then HI is ȳ₅
– If T is Hot and H is Dry then HI is ȳ₆

$$A = \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 2 & 1 \\ 2 & 2 \\ 3 & 1 \\ 3 & 2 \end{bmatrix}\ \begin{matrix} Comfortable,\ Wet \\ Comfortable,\ Dry \\ Warm,\ Wet \\ Warm,\ Dry \\ Hot,\ Wet \\ Hot,\ Dry \end{matrix}, \qquad C = \begin{bmatrix} \bar{y}_1 \\ \bar{y}_2 \\ \bar{y}_3 \\ \bar{y}_4 \\ \bar{y}_5 \\ \bar{y}_6 \end{bmatrix}$$
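The antecedent matrix above can drive a two-input TSK evaluation directly: row r of A picks one Temperature term and one Humidity term, and their membership product is rule r's firing strength. The slides leave the consequents (the HI values) and the membership parameters unspecified, so every number in the sketch below is a made-up illustration.

```python
import numpy as np

# Antecedent matrix from the slide: (Temperature term, Humidity term) per rule,
# with 1=Comfortable, 2=Warm, 3=Hot and 1=Wet, 2=Dry.
A = np.array([[1, 1], [1, 2], [2, 1], [2, 2], [3, 1], [3, 2]])

def two_lv_f(t, h, ybar, t_centers, t_widths, h_centers, h_widths):
    """Two-input TSK output: z_r is the product of the Temperature and
    Humidity memberships selected by antecedent row A[r].  (Our sketch.)"""
    mu_t = np.exp(-((t - t_centers) ** 2) / t_widths ** 2)
    mu_h = np.exp(-((h - h_centers) ** 2) / h_widths ** 2)
    z = mu_t[A[:, 0] - 1] * mu_h[A[:, 1] - 1]   # term indices are 1-based
    return float((ybar * z).sum() / z.sum())
```

Because f is a firing-strength-weighted average, the output always lies between the smallest and largest consequent, whatever values are eventually chosen for them.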
![Page 26: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/26.jpg)
Two LV Example FL System
• Temperature term set: Comfortable, Warm, Hot
• Humidity term set: Wet, Dry
• Gaussian membership functions
• Super membership function
![Page 27: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/27.jpg)
Two LV Example FL System
• TSK Fuzzy Function
• Gradient descent parameter tuning
![Page 28: Gradient Descent Rule Tuning](https://reader036.fdocuments.net/reader036/viewer/2022070409/568144ff550346895db1cadc/html5/thumbnails/28.jpg)
Two LV Example FL System
• Gradient descent parameter tuning