Proceedings of the ACM SIGPLAN '84 Symposium on Compiler Construction

SIGPLAN Notices Vol. 19, No. 6, June 1984

Automatic Generation of Peephole Optimizations

Jack W. Davidson

Dept. of Applied Mathematics and Computer Science

University of Virginia

Charlottesville, VA 22901

Christopher W. Fraser

Dept. of Computer Science

University of Arizona

Tucson, AZ 85721

Abstract

This paper describes a system that automatically generates peephole optimizations. A general peephole optimizer driven by a machine description produces optimizations at compile-compile time for a fast, pattern-directed, compile-time optimizer. They form part of a compiler that simplifies retargeting by substituting peephole optimization for case analysis.

1. Introduction

Code generators often create inefficient juxtapositions. For example, incrementing and testing a variable can create a redundant comparison if the code for the increment automatically sets a condition code register. Correcting this in the code generator complicates case analysis combinatorially, since each combination of language features may generate a unique juxtaposition [9]. It is often cheaper to generate code locally and then use a peephole optimizer to improve inefficient juxtapositions. Peephole optimization typically reduces code size by 10-50% [14, 17]. Even the new code generators driven by machine descriptions [6] benefit from peephole optimization [2].

Classical peephole optimizers [1, 14, 15, 17] rapidly correct a few hand-written, machine-specific patterns. For example, the ambitious FINAL optimizer in the BLISS-11 compiler [17] deletes unnecessary comparisons, exploits special-case instructions and exotic addressing modes, coalesces chains of branches, and deletes unreachable code. Unfortunately, good patterns can be hard to identify and are language-, compiler-, and machine-specific.

This work was supported in part by the National Science Foundation under Grant MCS-7802545.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.

©1984 ACM 0-89791-139-3/84/0600/0111 $00.75

A recent alternative to classical peephole optimizers [3] uses a machine description to simulate adjacent instructions, replacing them, wherever possible, with an equivalent singleton. Such machine-directed optimizers use no patterns, so they are more thorough and portable than their classical counterparts, but they are slower. Their thoroughness allows the use of naive, easily retargeted code generators, but verbose code makes optimization speed even more crucial.

This paper describes a system that automatically generates patterns for a fast classical peephole optimizer. A modern machine-directed optimizer is run at compile-compile time, and patterns for a fast, classical compile-time peephole optimizer are automatically inferred from its output. This combines the thoroughness and retargetability of a machine-directed peephole optimizer with the speed of a classical peephole optimizer. This has sped up the peephole optimization phase of a retargetable compiler by a factor of five.

2. A Machine-Directed Optimizer

The system uses a retargetable peephole optimizer called PO. Other documents elaborate on PO itself [3, 4]; this paper summarizes it only enough to introduce a new application: generating patterns for a fast, classical peephole optimizer.

Given an assembly language program and a symbolic machine description, PO simulates adjacent instructions and, where possible, replaces them with an equivalent single instruction. Each machine description is a grammar for syntax-directed translation between assembly language and register transfers. For example, the production

    movl src, dst :=: dst = src; NZ = src ? 0;

describes the VAX movl instruction, which copies its first operand onto its second and sets the condition code to reflect the sign of the result. Similar productions describe addressing modes.
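A minimal sketch of how such a production might be instantiated is given below. It is an illustration only, not PO's implementation: the string encoding of the production, the name `effect`, and the convention that operands arrive already written as register-transfer text (such as `m[X]` and `r[1]`) are all assumptions made for this example.

```python
import re

# Hypothetical encoding of the movl production quoted above:
# assembler syntax on the left of ":=:", register transfers on the right.
MOVL = "movl src, dst :=: dst = src; NZ = src ? 0;"

def effect(instruction, production=MOVL):
    """Return the register transfers of one instruction, or None when the
    instruction does not match the production's assembler syntax."""
    syntax, template = (part.strip() for part in production.split(":=:"))
    opcode, formals = syntax.split(None, 1)
    fields = instruction.split(None, 1)
    if len(fields) != 2 or fields[0] != opcode:
        return None
    names = [name.strip() for name in formals.split(",")]
    actuals = [operand.strip() for operand in fields[1].split(",")]
    if len(names) != len(actuals):
        return None
    binding = dict(zip(names, actuals))
    # Substitute every formal operand name in the template in one pass.
    formal = re.compile(r"\b(" + "|".join(map(re.escape, names)) + r")\b")
    return formal.sub(lambda match: binding[match.group(1)], template)

print(effect("movl m[X], r[1]"))   # r[1] = m[X]; NZ = m[X] ? 0;
```

In a fuller treatment the operand text itself would come from the productions that describe addressing modes; here it is supplied by hand.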

To improve an instruction, PO must know its effect, that is, the register transfers that it performs. Early versions of PO computed effects by matching assembler instructions against the assembler syntax patterns above and instantiating the corresponding register transfer patterns. The most recent version skips this with a compiler that emits register transfers directly. Register transfers are no harder to emit than assembly code.

Once PO has the effect of each instruction, it symbolically simulates two- and three-instruction sequences to form their combined effect. PO then searches the machine description for an instruction with this combined effect. If it finds one, it replaces the original instructions with the new one. For example, the effects of the VAX instructions

    movl X, r1
    subl2 Y, r1

are

    r[1] = m[X]; NZ = m[X] ? 0;
    r[1] = r[1] - m[Y]; NZ = r[1] - m[Y] ? 0;

Symbolic simulation combines these to yield

    r[1] = m[X] - m[Y]; NZ = m[X] - m[Y] ? 0;

which is realized by the instruction

    subl3 Y, X, r1

so this instruction replaces the two above.
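The combining step can be sketched as forward substitution: every cell that the second instruction reads is replaced with the value the first instruction assigned to it. The sketch below assumes a hypothetical representation in which an effect is a dictionary from cells to expression text; it substitutes textually, ignores operator precedence, and omits the subsequent search of the machine description for a matching instruction.

```python
import re

def combine(first, second):
    """Effect of executing `first` and then `second`.

    Each effect maps a cell such as "r[1]" or "NZ" to the expression
    assigned to it. Cells read by `second` are replaced with the values
    that `first` assigned to them."""
    combined = dict(first)
    if not first:
        combined.update(second)
        return combined
    # Longest cells first so a longer cell name wins over a shorter prefix.
    cells = sorted(first, key=len, reverse=True)
    defined = re.compile("|".join(re.escape(cell) for cell in cells))
    for cell, expression in second.items():
        combined[cell] = defined.sub(lambda m: first[m.group(0)], expression)
    return combined

movl = {"r[1]": "m[X]", "NZ": "m[X] ? 0"}
subl2 = {"r[1]": "r[1] - m[Y]", "NZ": "r[1] - m[Y] ? 0"}
print(combine(movl, subl2))
```

For the two effects shown, the result assigns `m[X] - m[Y]` to `r[1]` and `m[X] - m[Y] ? 0` to `NZ`, which is the combined effect given in the text.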

Unlike classical peephole optimizers, PO has no patterns: it combines all possible pairs and triples. As a result, its effect can be described formally and concisely: when it is finished, no one-, two-, or three-instruction sequence can be replaced with a cheaper single instruction having the same effect.

This thoroughness allows code generators to forgo case analysis and emit only a small subset of the machine's instructions and addressing modes (e.g., one form of add, one form of subtract). PO replaces them with better instructions as it combines adjacencies. A compiler for the programming language Y [8] based on this technique [4, 5] has been retargeted to seven different architectures, some in as few as three man-days. It emits code comparable to host-specific compilers.

This reliance on peephole optimization makes optimization speed especially crucial, and PO is slower than classical target-specific peephole optimizers. The Y compiler runs at a fourth the speed of the UNIX portable C compiler [10], and PO uses almost half of its time. Proposals to speed up optimizers like PO are already emerging¹ [7, 11, 12]. They propose to perform at compile-compile time some of the symbolic simulation that PO performs at compile time. This entails considering at compile-compile time all possible pairs of instructions [12] or all that use certain rules (like eliminate redundant instructions [7, 11]). Naturally, trade-offs appear likely: the first approach may be costly on some machines, the second may miss optimizations, and both may generate unused optimizations, though the proposals certainly merit further investigation. The software described below complements these approaches by automatically inferring patterns from PO's behavior on sample data.

3. Automatic Generation of Patterns

To improve speed, PO is now used at compile-compile time to generate patterns for a fast compile-time optimizer, called HOP, which may then be used in PO's place. HOP patterns are encoded as text with embedded pattern variables of the form $i to denote context-sensitive operands. Thus the pattern

    r[$1] = m[$2]
    r[$1] = r[$1] - m[$3]
    r[$1] = m[$2] - m[$3]

specifies that register transfers like

    r[2] = m[X]
    r[2] = r[2] - m[Y]

should be replaced with

    r[2] = m[X] - m[Y]

Other classical peephole optimizers use similar encodings [14, 16]. An appendix gives further examples of such optimizations and their application.

¹ Only one of these proposals reports a prototype [12]. It is more powerful than an early version of PO, though not the current version. It considers O(N²) pairs to PO's O(N), and, though it appears likely that adaptations could run in linear time, it is too early to compare their speed with PO's.

HOP patterns are inferred from PO's behavior on a training set. As an option, PO can record each replacement it makes. For example, when PO makes a replacement like the one above, it writes

    r[2] = m[X]
    r[2] = r[2] - m[Y]
    r[2] = m[X] - m[Y]

to a diagnostic file.

This output is automatically reduced to patterns by replacing each distinct assembly-time constant with $i. For example, the diagnostic output above would become

    r[$1] = m[$2]
    r[$1] = r[$1] - m[$3]
    r[$1] = m[$2] - m[$3]

which is the pattern at the head of this section. The syntax of assembly-time constants is potentially target-specific. HOP is retargeted by specifying this syntax.
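The reduction from diagnostic output to a pattern can be sketched in a few lines. The regular expression `CONSTANT` below is an assumed syntax for assembly-time constants (any identifier or number that is not an array name followed by `[` and is not the word `dead`); it and the name `generalize` are illustrative choices, not the syntax specification that HOP actually reads.

```python
import re

# Assumed syntax of assembly-time constants: an identifier or number that
# is not an array name such as r or m (not followed by "[") and is not the
# keyword "dead". A real target would supply its own definition.
CONSTANT = re.compile(r"\b(?!dead\b)\w+\b(?!\[)")

def generalize(replacement):
    """Replace each distinct constant in the recorded lines with $1, $2, ...
    numbered in order of first appearance across the whole replacement."""
    numbering = {}
    def variable(match):
        constant = match.group(0)
        numbering.setdefault(constant, "$%d" % (len(numbering) + 1))
        return numbering[constant]
    return [CONSTANT.sub(variable, line) for line in replacement]

for line in generalize(["r[2] = m[X]",
                        "r[2] = r[2] - m[Y]",
                        "r[2] = m[X] - m[Y]"]):
    print(line)
```

Because the numbering is shared by all lines of one replacement, a constant that recurs (here the register index 2) receives the same $i everywhere, which is what makes the operands context-sensitive.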

PO records the last use of each register in each block, because this allows it to make replacements that would otherwise change the effect of the program. When this information is used, it is also recorded in the diagnostic output:

    r[2] = i
    r[3] = m[r[2]] (r[2] dead)
    r[3] = m[i]

These obituaries are automatically reduced to patterns with the rest of the diagnostic output. Thus the example above yields the pattern

    r[$1] = $2
    r[$3] = m[r[$1]] (r[$1] dead)
    r[$3] = m[$2]

The appendix displays several such optimizations.

A few proposed patterns are too general. For example, the DECsystem-10 diagnostic output

    r[2] = m[X]
    r[2] = r[2] + 1
    m[X] = r[2] (r[2] dead)
    m[X] = m[X] + 1

should not yield the pattern

    r[$1] = m[$2]
    r[$1] = r[$1] + $3
    m[$2] = r[$1] (r[$1] dead)
    m[$2] = m[$2] + $3

because the replacement is only valid if the increment $3 is 1. The validity of proposed patterns like the one above could be checked with the machine description much as PO checks proposed combinations of instructions. When the instruction checker determined that $3 could only match 1, it could rewrite the pattern accordingly. At present, a simpler expedient is used: constants like zero and one that are special to some instructions (i.e., that appear explicitly in the machine description) are added to an exception list and never replaced with $i. This generates a few extra patterns when these constants appear in contexts where they are not special (e.g., as register indices), but the number of these is small.
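The exception list can be illustrated with a small variation on pattern generalization. In this sketch the constant syntax `CONSTANT`, the function name `generalize`, and the parameter `special` are assumptions for the example; the exception list used by the real system is derived from the machine description.

```python
import re

# Assumed constant syntax (illustrative): identifiers and numbers that are
# not array names followed by "[" and not the keyword "dead".
CONSTANT = re.compile(r"\b(?!dead\b)\w+\b(?!\[)")

def generalize(replacement, special=()):
    """Replace each distinct constant with $i, except those in `special`,
    which stay in the pattern as literals."""
    numbering = {}
    def variable(match):
        constant = match.group(0)
        if constant in special:
            return constant
        numbering.setdefault(constant, "$%d" % (len(numbering) + 1))
        return numbering[constant]
    return [CONSTANT.sub(variable, line) for line in replacement]

diagnostic = ["r[2] = m[X]",
              "r[2] = r[2] + 1",
              "m[X] = r[2] (r[2] dead)",
              "m[X] = m[X] + 1"]

print(generalize(diagnostic)[-1])                      # m[$2] = m[$2] + $3
print(generalize(diagnostic, special={"0", "1"})[-1])  # m[$2] = m[$2] + 1
```

The same mechanism shows where the extra patterns come from: a replacement that happens to use register 1 keeps the literal index, so `generalize(["r[1] = m[X]"], special={"0", "1"})` gives `r[1] = m[$1]` rather than the more general `r[$1] = m[$2]`.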

Given the established simplicity of typical programs [13], compiling a large, varied training testbed with PO should yield enough diagnostic output to generate most needed patterns. At present, the testbed is the Y compiler's front end, which compiles Y into a simple abstract machine code, plus a few extra test cases, which exercise the few operators seldom used in the compiler. Figure 1 plots for this testbed the number of VAX patterns generated versus the number of actual replacements from which the patterns are generated. The pattern file grows rapidly at first and then levels off. The 17,138 replacements generate only 627 distinct patterns. Using this pattern file, HOP yields the same result as PO when compiling routines from the testbed. When compiling other typical routines, HOP's results are only about 2% larger than PO's, which suggests that even this small testbed is adequate.

Ultimately, it should be possible to do without a testbed, by using an incremental training phase. This could be implemented by the following changes to PO. After replacing a pair or triple, PO would internally record the pattern represented by the replacement; if the pair or triple could not be replaced, PO would note this as well. Also, PO would be changed to consult this record and use the fast algorithm described below to replace or reject juxtapositions that have appeared before; it would fall back on its original, slower algorithm only for juxtapositions that had never appeared before. Thus PO would reach HOP's speed after a few compilations, and it would never miss an optimization due to insufficient training because PO's general mechanism would be available for new juxtapositions.
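One way to picture this proposed incremental scheme is a record keyed by input pattern that stores either a replacement skeleton or a note that no replacement exists. The sketch below is speculative, like the proposal itself: `IncrementalOptimizer`, `slow_replace` (a stand-in for PO's symbolic simulation), and the constant syntax are all hypothetical, and it assumes that a replacement mentions only constants that occur in the instructions it replaces.

```python
import re

CONSTANT = re.compile(r"\b(?!dead\b)\w+\b(?!\[)")  # assumed constant syntax
VARIABLE = re.compile(r"\$(\d+)")

class IncrementalOptimizer:
    def __init__(self, slow_replace):
        self.slow_replace = slow_replace  # stand-in for PO's slow algorithm
        self.record = {}       # input pattern -> replacement skeleton or None
        self.slow_calls = 0    # how often the slow algorithm was needed

    def replace(self, window):
        """Return the replacement for a pair or triple, or None."""
        operands = []
        def variable(match):
            if match.group(0) not in operands:
                operands.append(match.group(0))
            return "$%d" % (operands.index(match.group(0)) + 1)
        pattern = tuple(CONSTANT.sub(variable, line) for line in window)
        if pattern not in self.record:
            # Never seen before: fall back on the slow, general mechanism
            # and record the outcome, whether a replacement or a rejection.
            self.slow_calls += 1
            result = self.slow_replace(window)
            self.record[pattern] = (None if result is None
                                    else CONSTANT.sub(variable, result))
        skeleton = self.record[pattern]
        if skeleton is None:
            return None
        # Seen before (or just learned): instantiate the stored skeleton.
        return VARIABLE.sub(lambda m: operands[int(m.group(1)) - 1], skeleton)
```

After the first occurrence of a juxtaposition, later occurrences with different operands are answered from the record without calling `slow_replace` again. The caveat about overly general patterns and the exception list applies to this record as well.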

4. A Pattern-Directed Optimizer

HOP matches patterns without actual string manipulation, by separating each instruction's pattern or skeleton from its operands as it reads them. This is accomplished at compile time by the same procedure used to form patterns at compile-compile time. For example, the instruction

    r[2] = r[2] - m[Y]

is reduced to the skeleton

    r[$1] = r[$1] - m[$2]

plus the operands 2 and Y, respectively. That is, the instruction is represented by the triple

    r[$1] = r[$1] - m[$2], 2, Y

This representation is a little like conventional assembly code. The skeleton in the first field is determined roughly by the instruction's opcode and mode bits. The operands in the remaining fields are determined roughly by the instruction's address and register fields.

Hashing helps HOP match patterns and form replacements fast. HOP stores skeletons and operands uniquely in a hash table, so an input skeleton is compared with a line from a pattern by merely comparing two addresses. This operation is logically similar to, and costs about the same as, comparing two binary opcodes in a classical peephole optimizer. If a run of input skeletons matches some complete pattern, then inter-instruction operand consistency is checked, again by comparing addresses. Finally, HOP forms replacements without actual string manipulation. The skeleton for the replacement instruction is the last line of the successful pattern, and the operands for the replacement instruction are formed by reordering the input operands. Thus the typical pattern is matched and, if successful, replaced, by comparing and moving about a dozen pointers.
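The effect of storing strings uniquely can be sketched with a dictionary that hands back one shared object per distinct string, so that equality reduces to an identity test. The names `StringTable` and `read`, and the constant syntax, are assumptions for this illustration; HOP's own table is not described at this level of detail.

```python
import re

CONSTANT = re.compile(r"\b(?!dead\b)\w+\b(?!\[)")  # assumed constant syntax

class StringTable:
    """Stores each distinct string once, so equal strings share one object."""
    def __init__(self):
        self._unique = {}

    def intern(self, text):
        return self._unique.setdefault(text, text)

def read(instruction, table):
    """Split an instruction into (skeleton, operands), all stored uniquely.
    The $i of the skeleton are numbered from one for each instruction."""
    operands = []
    def variable(match):
        operand = match.group(0)
        if operand not in operands:
            operands.append(operand)
        return "$%d" % (operands.index(operand) + 1)
    skeleton = CONSTANT.sub(variable, instruction)
    return table.intern(skeleton), tuple(table.intern(o) for o in operands)

table = StringTable()
first, _ = read("r[2] = r[2] - m[Y]", table)
second, _ = read("r[4] = r[4] - m[B]", table)
print(first is second)   # True: one stored skeleton, compared by identity
```

Python's `is` plays the role of the address comparison described above; the string comparison happens once, when a skeleton is entered into the table, rather than each time a pattern is tried.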

One detail complicates this procedure. The $i in input skeletons are numbered from one, so pattern-matching without string operations requires renumbering the $i from each line of each pattern when the pattern file is read. For example, the input

    r[4] = m[A]
    r[4] = r[4] - m[B]

is translated into the triples

    r[$1] = m[$2], 4, A
    r[$1] = r[$1] - m[$2], 4, B

as it is read. To compare such triples with the pattern

    r[$1] = m[$2]
    r[$1] = r[$1] - m[$3]
    r[$1] = m[$2] - m[$3]

without string operations, the $i of the second line of the pattern are renumbered to yield

    r[$1] = r[$1] - m[$2]

as the pattern file is read. The two strings are now identically equal and can be compared by comparing addresses in the hash table. A record of the renumbering is retained for checking inter-instruction operand consistency.
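The renumbering of a single pattern line, together with the record that is retained, might look as follows. The function name `renumber` and the representation of the record as a tuple of original variable numbers are assumptions made for this sketch.

```python
import re

VARIABLE = re.compile(r"\$(\d+)")

def renumber(line):
    """Renumber the $i of one pattern line from one, in order of first
    appearance. Returns (renumbered line, record), where record[k - 1] is
    the original number of the variable that became $k."""
    record = []
    def local(match):
        original = int(match.group(1))
        if original not in record:
            record.append(original)
        return "$%d" % (record.index(original) + 1)
    return VARIABLE.sub(local, line), tuple(record)

print(renumber("r[$1] = r[$1] - m[$3]"))
# ('r[$1] = r[$1] - m[$2]', (1, 3))
```

The record `(1, 3)` says that the first operand of a matching input instruction must agree with pattern variable $1 and that its second operand binds pattern variable $3.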

The input triples above are compared with the pattern above as follows. First, the two input skeletons

    r[$1] = m[$2]
    r[$1] = r[$1] - m[$2]

are compared with the first two (renumbered) lines of the pattern

    r[$1] = m[$2]
    r[$1] = r[$1] - m[$2]

by comparing two pairs of pointers. Next, HOP checks that $i denotes the same operand in both input instructions. Since $1 is the only $i that appears more than once in the original (unrenumbered) pattern², this merely compares the first operand from the first instruction (the first 4) with the first operand from the second instruction (the second 4), again by comparing two string table addresses. Since all comparisons have succeeded, a replacement instruction is formed. Its skeleton is the last line of the pattern

    r[$1] = m[$2] - m[$3]

and its three operands are the 4 and A from the first instruction and the B from the second instruction. This represents the instruction

    r[4] = m[A] - m[B]

which is the desired replacement for the two instructions above.
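The whole comparison can be sketched end to end. This is an illustrative reconstruction rather than HOP's code: it uses ordinary string equality where HOP compares addresses, it assumes patterns without literal constants, and the names `split`, `renumber`, and `apply_pattern` as well as the constant syntax are hypothetical.

```python
import re

CONSTANT = re.compile(r"\b(?!dead\b)\w+\b(?!\[)")  # assumed constant syntax
VARIABLE = re.compile(r"\$(\d+)")

def split(instruction):
    """Input instruction -> (skeleton numbered from one, operands)."""
    operands = []
    def variable(match):
        if match.group(0) not in operands:
            operands.append(match.group(0))
        return "$%d" % (operands.index(match.group(0)) + 1)
    return CONSTANT.sub(variable, instruction), operands

def renumber(line):
    """Pattern line -> (line renumbered from one, original numbers)."""
    record = []
    def local(match):
        if int(match.group(1)) not in record:
            record.append(int(match.group(1)))
        return "$%d" % (record.index(int(match.group(1))) + 1)
    return VARIABLE.sub(local, line), record

def apply_pattern(pattern, instructions):
    """Return the replacement instruction, or None if the pattern fails."""
    *inputs, output = [renumber(line) for line in pattern]
    if len(inputs) != len(instructions):
        return None
    binding = {}  # original pattern variable number -> operand
    for (skeleton, record), instruction in zip(inputs, instructions):
        found, operands = split(instruction)
        if found != skeleton:              # skeleton comparison
            return None
        for number, operand in zip(record, operands):
            # Inter-instruction consistency: one operand per variable.
            if binding.setdefault(number, operand) != operand:
                return None
    skeleton, record = output
    operands = [binding[number] for number in record]
    # Rendered as text here only for display.
    return VARIABLE.sub(lambda m: operands[int(m.group(1)) - 1], skeleton)

pattern = ["r[$1] = m[$2]", "r[$1] = r[$1] - m[$3]", "r[$1] = m[$2] - m[$3]"]
print(apply_pattern(pattern, ["r[4] = m[A]", "r[4] = r[4] - m[B]"]))
# r[4] = m[A] - m[B]
```

If the second input were `r[5] = r[5] - m[B]`, both skeletons would still match, but the consistency check on $1 would fail (4 against 5) and no replacement would be formed.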

Hashing also helps locate applicable patterns rapidly. HOP stores its patterns in a hash table keyed by the hashed addresses of the (uniquely stored) skeletons that each matches. Thus HOP identifies the patterns that apply to a given input sequence by hashing the addresses of the skeletons from the input sequence. If this hash table is made large enough to make collisions rare, HOP identifies any applicable patterns in nearly constant time.

² $2 appears more than once in the renumbered pattern, but this is an artifact of renumbering and so does not require consistency checking.

These measures make HOP fast, about 5 times faster than PO. In a typical application, it read 269 lines, performed 136 replacements, and wrote out the results in 1.3 CPU seconds on a VAX-11/780. It spends most of its time reading its input and building the structures above. The actual matching and replacements take less than 5% of its time. To save time, the pattern file is incorporated into HOP at compile-compile time. For the VAX, HOP plus these incorporated patterns take 150K bytes where PO takes 120K bytes.

HOP can also be used for code generation. Abstract machines are often mapped onto real machines by macros, and single-input replacement patterns are essentially macros. A compiler can thus be retargeted by writing a machine description and some patterns for naive code generation. These will be augmented by automatically generated optimization patterns. The use of a single program for code generation and optimization should make compilers faster, simpler, and easier to retarget.

HOP can also be used on assembly code. The hand-written patterns for code generation could emit assembly code, for this can be mapped to and from register transfers for PO by translators automatically generated from the machine description [3]. Translating assembly code to register transfers would slow PO, but this is unimportant now that HOP has replaced PO at compile time.

Acknowledgments

The authors thank Dave Hanson for his many helpful comments, and Torben Nielsen for his technical assistance.

Appendix

This appendix traces the optimization of the VAX code for

    j = i + 4

The figure below gives postfix intermediate code and corresponding naive object code for this statement.

       postfix     object code
    1. push i      r[2] = m[i]
    2. pushc 4     r[3] = 4
    3. add         r[2] = r[2] + r[3] (r[3] dead)
    4. pop j       m[j] = r[2] (r[2] dead)

Initially, the pattern

    r[$1] = $2
    r[$3] = r[$3] + r[$1] (r[$1] dead)
    r[$3] = r[$3] + $2

replaces instructions 2 and 3 with

    r[2] = r[2] + 4

Next, the pattern

    r[$1] = m[$2]
    r[$1] = r[$1] + $3
    r[$1] = m[$2] + $3

combines instruction 1 with this new instruction, yielding

    r[2] = m[i] + 4

Finally, the pattern

    r[$1] = m[$2] + $3
    m[$4] = r[$1] (r[$1] dead)
    m[$4] = m[$2] + $3

replaces this last instruction and instruction 4 with

    m[j] = m[i] + 4

which represents the VAX instruction

    addl3 4, i, j

Thus the four original instructions have been replaced with one.

References

1. J. T. Bagwell, Jr., Local Optimizations, SIGPLAN Notices 5, 7 (July 1970), 52-66.

2. T. Crowley, Combining Table-driven Effect Selection and Description-Driven Peephole Optimization for Automatic Code Generation, MS thesis, MIT, September 1982.

3. J. W. Davidson and C. W. Fraser, The Design and Application of a Retargetable Peephole Optimizer, ACM Trans. Prog. Lang. and Systems 2, 2 (April 1980), 191-202.

4. J. W. Davidson, Simplifying Code Generation Through Peephole Optimization, PhD dissertation, University of Arizona, December 1981.

5. J. W. Davidson and C. W. Fraser, Code Selection Through Object Code Optimization, ACM Trans. Prog. Lang. and Systems, to appear.

6. M. Ganapathi, C. N. Fischer and J. L. Hennessy, Retargetable Compiler Code Generation, Computing Surveys 14, 4 (December 1982), 573-592.

7. R. Giegerich, A Formal Framework for the Derivation of Machine-Specific Optimizers, ACM Trans. Prog. Lang. and Systems 5, 3 (July 1983), 478-498.

8. D. R. Hanson, The Y Programming Language, SIGPLAN Notices 16, 2 (Feb. 1981), 59-68.

9. W. Harrison, A New Strategy for Code Generation - The General Purpose Optimizing Compiler, Conf. Rec. 4th ACM Symp. on Prin. of Programming Languages, January 1977, 29-37.

10. S. C. Johnson, A Portable Compiler: Theory and Practice, Conf. Rec. 5th ACM Symp. on Prin. of Programming Languages, Jan. 1978, 97-104.

11. P. B. Kessler, Machine Dependencies in Retargetable Compiler Construction, Dissertation proposal, Department of Electrical Engineering and Computer Science, University of California, Berkeley, May 1982.

12. R. R. Kessler, Peephole Optimization in COG, Operating Note 76, Utah Symbolic Computation Group, Computer Science Department, University of Utah, June 1983.

13. D. E. Knuth, An Empirical Study of Fortran Programs, Software - Practice and Experience 1, 2 (April-June 1971), 105-133.

14. D. A. Lamb, Construction of a Peephole Optimizer, Software - Practice and Experience 11 (1981), 638-647.

15. W. M. McKeeman, Peephole Optimization, Comm. ACM 8, 7 (July 1965), 443-444.

16. A. S. Tanenbaum, H. van Staveren and J. W. Stevenson, Using Peephole Optimization on Intermediate Code, ACM Trans. Prog. Lang. and Systems 4, 1 (January 1982), 21-36.

17. W. Wulf, R. K. Johnsson, C. B. Weinstock, S. O. Hobbs and C. M. Geschke, The Design of an Optimizing Compiler, North Holland, 1975.

Figure 1. VAX Pattern File Growth (vertical axis: patterns, 0 to 700; horizontal axis: replacements, 0 to 20000).