Davidson and Fraser. Automatic Generation of Peephole Optimizations

6
8/19/2019 Davidson and Fraser. Automatic Generation of Peephole Optimizations http://slidepdf.com/reader/full/davidson-and-fraser-automatic-generation-of-peephole-optimizations 1/6 Proceedings of the ACM SIGPLAN '84 Symposium on Com?iler Construction SIGPLAN Notices Vol. 19, No. 6, June 198~ Automatic Generation of Peephole Optimizations~ Jack W. Davidson Dept. of Applied Mathematics and Computer Science University of Virginia Charlottesville, VA 22901 Christopher W. Fraser Dept. of Computer Science University of Arizona Tucson, AZ 85721 Abstract This paper describes a system that automatically generates peephole optimizations. A general peephole optimizer driven by a machine description produces optimizations at compile-compile time for a fast, pattern-directed, compile-time optimizer. They form part of a compiler that simplifies retarget- ing by substituting peephole optimization for case analysis. 1. Introduction Code generators often create inefficient juxtapo- sitions. For example, incrementing and testing a variable can create a redundant comparison if the code for the increment automatically sets a condition code register. Correcting this in the code generator complicates case analysis combinatorially, since each combination of language features may generate a unique juxtaposition [9], It is often cheaper to gen- erate code locally and then use a peephole optimizer to improve inefficient juxtapositions. Peephole optimization typically reduces code size by 10-50% [14, 17]. Even the new code generators driven by machine descriptions [6] benefit from peephole optimization [2]. Classical peephole optimizers [1, 14, 15, 17] rapidly correct a few hand-written, machine-specific ]This work was supported n part by the National ScienceFounda- tion under Grant MCS-7802545. Permission to copy without fee all or part of this material s grant- ed provided that the copies are not made or distributedfor direct commercialadvantage, the ACM copyright notice and title of the publication and its date appear, and notice is given hat copying s by permission of the Association for Computing Machinery. To copy otherwise,or to republish, requires a fee and or specific per- mission. ©1984 ACM 0-89791-139-3/84[0600/0111500.75 patterns. For example, the ambitious FINAL optimizer in the BLISS-Ii compiler [17] deletes unnecessary comparisons, exploits special-case instructions and exotic addressing modes, coalesces chains of branches, and deletes unreachable code. Unfortunately, good patterns can be hard to identify and are language-, compiler-, and machine-specific. A recent alternative to classical peephole optimiz- ers [3] uses a machine description to simulate adja- cent instructions, replacing them, wherever possible, with an equivalent singleton. Such machine-directed optimizers use no patterns, so they are more thorough and portable than their classical counter- parts, but they are slower. Their thoroughness allows the use of naive, easily retargeted code genera- tors, but verbose code makes optimization speed even more crucial. This paper describes a system that automatically generates patterns for a fast classical peephole optim- izer. A modern machine-directed optimizer is run at compile-compile time, and patterns for a fast, classi- cal compile-time peephole optimizer are automati- cally inferred from its output. This combines the thoroughness and retargetability of a machine- directed peephole optimizer with the speed of a clas- sical peephole optimizer. This has sped up the peephole optimization phase of a retargetable com- piler by a factor of five. 2. A Machine Directed Optimizer The system uses a retargetable pe'ephole optim- izer called PO. Other documents elaborate on PO itself [3, 4]; this paper summarizes it only enough to introduce a new application: generating patterns for a fast, classical peephole optimizer. 111

Transcript of Davidson and Fraser. Automatic Generation of Peephole Optimizations

Page 1: Davidson and Fraser. Automatic Generation of Peephole Optimizations

8/19/2019 Davidson and Fraser. Automatic Generation of Peephole Optimizations

http://slidepdf.com/reader/full/davidson-and-fraser-automatic-generation-of-peephole-optimizations 1/6

P r o c e e d in g s o f t h e

A C M S I G P L A N '8 4

S y m p o s i u m o n C o m ? i l e r

C o n s t r u c t i o n

SI G PL AN Not ices Vol . 19 , No. 6 , June 198~

A u t o m a t i c G e n e r a t i o n o f P e e p h o l e O p t i m i z a ti o n s ~

Jack W. Davidson

Dept. o f Appl ied M athemat ics and C omputer Science

University o f Virginia

Charlottesville, VA 22901

Christopher W. Fraser

Dept . of Com puter Science

Universi ty o f Arizona

Tucson, AZ 85721

A bstr a c t

T h i s p a p e r d e s c r i b es a s y s t e m t h a t a u t o m a t i c a l l y

g e n e r a t e s p e e p h o l e o p t i m i z a t i o n s . A g e n er a l

p e e p h o l e o p t i m i z e r d r i v e n b y a m a c h i n e d e s c r i p t i o n

p r o d u c e s o p t i m i z a t i o n s a t c o m p i l e - c o m p i l e t i m e f o r

a f a s t , p a t t e r n - d i r e c t e d , c o m p i l e - t i m e o p t i m i z e r .

T h e y f o r m p a r t o f a c o m p i l e r t h a t si m p l i fi e s r e t a r g e t -

i n g b y s u b s t i t u ti n g p e e p h o l e o p t i m i z a t i o n f o r c a se

a n a l y s i s .

1 . In tr o duc t io n

C o d e g e n e r a t o r s o f t e n c r e a te i n e ff i c ie n t j u x t a p o -

s i ti o n s. F o r e x a m p l e , i n c r e m e n t i n g a n d t e s ti n g a

v a r i a b l e c a n c r e a t e a r e d u n d a n t c o m p a r i s o n i f t h e

c o d e f o r t h e i n c r e m e n t a u t o m a t i c a l l y s e ts a c o n d i t i o n

c o d e r e g i s t e r. C o r r e c t i n g t h i s i n t h e c o d e g e n e r a t o r

c o m p l i c a t e s c a s e a n al y s i s c o m b i n a t o r i a l l y , s i nc e e a c h

c o m b i n a t i o n o f la n g u a g e f e a t u r es m a y g e n e r a t e a

u n i q u e j u x t a p o s i t i o n [ 9] , I t i s o f t e n c h e a p e r to g e n -

e r a t e c o d e l o c a l l y a n d t h e n u s e a p e e p h o l e o p t i m i z e r

t o i m p r o v e i n e f fi c i en t j u x t a p o s i t i o n s . P e e p h o l e

o p t i m i z a t i o n t y p i c a l l y r e d u c e s c o d e s i z e b y 1 0 - 5 0 %

[ 1 4 , 1 7] . E v e n t h e n e w c o d e g e n e r a t o r s d r i v e n b y

m a c h i n e d e s c r i p t i o n s [ 6] b e n e f it f r o m p e e p h o l e

o p t i m i z a t i o n [ 2] .

C la s s i c a l p e e p h o le o p t im iz e r s [1 , 1 4 , 1 5, 1 7 ]

r a p i d l y c o r r e c t a f e w h a n d - w r i t t e n , m a c h i n e - s p e c i f i c

]This w ork was supported n p art by the National Science Founda-

tion und er Grant MCS-7802545.

Permission to cop y without fee all or par t of this material s grant-

ed provided that the copies are n ot mad e or distributed for direct

commercial advantage, the ACM copyright notice and title o f the

publication and its date app ear, and n otice is given hat copying s

by permission of the Association for Com puting Machinery. To

copy otherwise, or to republish, requires a fee and or specificper-

mission.

©1984 ACM 0-89791-139-3/84[0600/0111500.75

p a t te r n s. F o r e x am p l e , t h e a m b i t i o u s F I N A L

o p t i m i z e r i n th e B L I S S - I i c o m p i l e r [1 7] d el e te s

u n n e c e s s a r y c o m p a r i s o n s , e x p l o i ts s p e c ia l - ca s e

i n s t r u c t i o n s a n d e x o t i c a d d r e s s i n g m o d e s , c o a l e s c e s

c h a i n s o f b r a n c h e s , a n d d e l e t e s u n r e a c h a b l e c o d e .

U n f o r t u n a t e l y , g o o d p a t t e r n s c a n b e h a r d t o i d e n ti f y

a n d a r e l a n g u a g e - , c o m p i l e r - , a n d m a c h i n e - sp e c i fi c .

A r e c e n t a l t e r n a t i v e t o c l a s s i c a l p e e p h o l e o p t i m i z -

e r s [ 3 ] us e s a m a c h i n e d e s c r i p t i o n t o s i m u l a t e a d j a -

c e n t i n s t r u c t i o n s , r e p l a c i n g t h e m , w h e r e v e r p o s s i b l e,

w i t h a n e q u i v a l e n t si n g l et o n . S u c h m a c h i n e - d i r e c t e d

o p t i m i z e r s u s e n o p a t t e r n s , s o t h e y a r e m o r e

t h o r o u g h a n d p o r t a b l e t h a n t h e i r c la s si c al c o u n t e r -

p a r t s , b u t t h e y a r e s l o w e r . T h e i r t h o r o u g h n e s s

a l l o w s t h e u s e o f n a i v e , e a si l y r e t a r g e t e d c o d e g e n e r a -

t o r s , b u t v e r b o s e c o d e m a k e s o p t i m i z a t i o n s p e e d

e v e n m o r e c r u c i a l .

T h i s p a p e r d e s c r i b e s a s y s t e m t h a t a u t o m a t i c a l l y

g e n e r a t e s p a t t e r n s f o r a f a s t c l a s s i c a l p e e p h o l e o p t i m -

i ze r . A m o d e r n m a c h i n e - d i r e c t e d o p t i m i z e r is r u n a t

compile-compile t i m e , a n d p a t t e r n s f o r a f a s t , c l a s s i -

c a l

compile-t ime

p e e p h o l e o p t i m i z e r a r e a u t o m a t i -

c a l ly i n f e rr e d f r o m i t s o u t p u t . T h i s c o m b i n e s t h e

t h o r o u g h n e s s a n d r e t a r g e t a b il i t y o f a m a c h i n e -

d i r e c t ed p e e p h o l e o p t i m i z e r w i t h t h e s p e e d o f a c la s -

s ic a l p e e p h o l e o p t i m i z e r . T h i s h a s s p e d u p t h e

p e e p h o l e o p t i m i z a t i o n p h a s e o f a r e t a r g e t a b l e c o m -

p i l e r b y a f a c t o r o f fi v e .

2 . A Ma c h ine D ir e c te d O pt imiz er

T h e s y s t e m u s e s a r e t a r g e t a b l e p e ' ep h o l e o p t i m -

i z e r c a l l e d P O . O t h e r d o c u m e n t s e l a b o r a t e o n PO

i t s e l f [ 3 , 4 ] ; t h i s p a p e r s u m m a r i z e s i t o n l y e n o u g h t o

i n t r o d u c e a n e w a p p l i c a ti o n : g e n e r a t i n g p a t te r n s f o r

a f a s t, c l a s s ic a l p e e p h o l e o p t i m i z e r .

111

Page 2: Davidson and Fraser. Automatic Generation of Peephole Optimizations

8/19/2019 Davidson and Fraser. Automatic Generation of Peephole Optimizations

http://slidepdf.com/reader/full/davidson-and-fraser-automatic-generation-of-peephole-optimizations 2/6

G i v e n a n a s s e m b l y l a n g u a g e p r o g r a m a n d a s y m -

b o l i c m a c h i n e d e s c r i p t i o n , PO s i m u l a t e s a d j a c e n t

i n s t r u c t i o n s a n d , w h e r e p o s s i b l e , r e p l a c e s t h e m w i t h

a n e q u i v a l e n t si n g le in s t r u c t i o n . E a c h m a c h i n e

d e s c r i p t i o n i s a g r a m m a r f o r s y n t a x - d i r e c t e d t r a n s l a -

t i o n b e t w e e n a s s e m b l y l a n g u a g e a n d r e g i s t e r

t r a n sf e r s . F o r e x a m p l e , t h e p r o d u c t i o n

m o v l s r e , d s t : = : d s t = s r c ; N Z = s r e ? 0 ;

d e s c r i b e s t h e V A X m o v l i n s t r u c t i o n , w h i c h c o p i e s it s

f i r s t o p e r a n d o n t o i t s s e c o n d a n d s e t s t h e c o n d i t i o n

c o d e t o r e f le c t t h e s i g n o f t h e r e s u l t. S i m i l a r p r o d u c -

t i o n s d e s c r i b e a d d r e s s i n g m o d e s .

T o i m p r o v e a n i n s t r u c t i o n , P O m u s t k n o w i ts

e f f e c t , t h a t i s , t h e r e g i s t e r t r a n s f e r s t h a t i t p e r f o r m s .

E a r l y v e r s i o n s o f PO c o m p u t e d e f fe c ts b y m a t c h i n g

a s s e m b l e r i n s t r u c t i o n s a g a i n s t t h e a s s e m b l e r s y n t a x

p a t t e r n s a b o v e a n d i n s t a n t i a t i n g t h e c o r r e s p o n d i n g

r e g is t e r t r a n s f e r p a t t e r n s . T h e m o s t r e c e n t v e r s i o n

s k i p s t h i s w i t h a c o m p i l e r t h a t e m i t s r e g i s t e r t r a n s f e r s

d i r e c t l y . R e g i s t e r tr a n s f e r s a r e n o h a r d e r t o e m i t

t h a n a s s e m b l y c o d e .

O n c e P O h a s t h e e f f e c t o f e a c h i n s t r u c t i o n , i t s y m -

b o l i c a l l y s i m u l a t e s t w o - a n d t h r e e - i n s t r u c t i o n

s e q u e n c es t o f o r m t h e i r c o m b i n e d e f f e c t . P O t h e n

s e a rc h e s t h e m a c h i n e d e s c r i p t i o n f o r a n i n s t r u c t i o n

w i t h t h i s c o m b i n e d e f f e c t . I f i t f i n d s o n e , i t r e p l a c e s

t h e o r i g i n a l i n s t ru c t i o n s w i t h t h e n e w o n e . F o r

e x a m p l e , t h e e f fe c t s o f t h e V A X i n s t r u c t i o n s

movl X , r l

s u b l 2 Y , r l

a r e

r [ 1 ] = m [ X ] ; N Z = m [ X ] ? 0 ;

r [ 1 ] = r [ 1 ] - r e [ Y ] ; N Z - r [ 1 ] - r e [ Y ] ? 0 ;

S y m b o l i c s i m u l a t i o n c o m b i n e s t h e s e t o y i e ld

r [ 1 ] = m [ X ] - r e [ Y ] ; N Z = m [ X ] - m [ Y ] ? O ;

w h i c h i s r e a l i z e d b y t h e i n s t r u c t i o n

s ub l3 Y , X , l

s o t h i s i n s t r u c t i o n r e p l a c e s t h e t w o a b o v e .

U n l i k e c l a s s i c a l p e e p h o l e o p t i m i z e r s , P O

h a s n o

p a t t e r n s : i t c o m b i n e s a l l p o s s i b l e p a i r s a n d t r i p l e s .

A s a r e s u l t , i ts e f f e c t c a n b e d e s c r ib e d f o r m a l l y a n d

c o n c i s e l y : w h e n i t is f i n i s h e d , n o o n e - , t w o - , o r

t h r e e - i n s t r u c t i o n s e q u e n c e c a n b e r e p l a c e d w i t h a

c h e a p e r s i n g l e i n s t r u c t i o n h a v i n g t h e s a m e e f f e c t .

T h i s t h o r o u g h n e s s a l l o w s c o d e g e n e r a t o r s t o f o r g o

c a s e a n a l y s i s a n d e m i t o n l y a s m a l l s u b s e t o f t h e

m a c h i n e ' s i n s t r u c t i o n s a n d a d d r e s s i n g m o d e s ( e . g . ,

o n e f o r m o f a d d , o n e f o r m o f s u b t ra c t ) . P O r e p la c e s

t h e m w i t h b e t t e r i n s t r u c t i o n s a s it c o m b i n e s a d j a c e n -

c ie s. A c o m p i l e r f o r th e p r o g r a m m i n g l a n g u a g e v [ 8]

b a s e d o n t h i s t e c h n i q u e [ 4 , 5 ] h a s b e e n r e t a r g e t e d t o

s e v e n d i f f e r e n t a r c h i t e c t u r e s , s o m e i n a s f e w a s t h r e e

m a n - d a y s . I t e m i ts c o d e c o m p a r a b l e t o h o s t - sp e c i fi c

c o m p i l e r s .

T h i s r e l i a n c e o n p e e p h o l e o p t i m i z a t i o n m a k e s

o p t i m i z a t i o n s p e e d e s p e c i a l l y c r u c i a l, a n d P O i s

s l o w e r t h a n c l a s s i c a l t a r g e t - s p e c i f i c p e e p h o l e o p t i m -

i z er s . T h e Y c o m p i l e r r u n s a t a f o u r t h t h e s p e e d o f

t h e U N I X p o r t a b l e C c o m p i l e r [ 1 0] , a n d P O u s e s

a l m o s t h a l f o f i ts ti m e . P r o p o s a l s t o s p e e d u p o p t i m -

i z e rs l i k e e o a r e a l r e a d y e m e r g i n g ~ [ 7, 1 I , 1 2 ] . T h e y

p r o p o s e t o p e r f o r m a t c o m p i l e - c o m p i l e t i m e s o m e o f

t h e s y m b o l i c s i m u l a t i o n t h a t P O p e r f o r m s a t c o m p i l e

t i m e . T h i s e n t a il s c o n s i d e r in g a t c o m p i l e - c o m p i l e

t i m e a l l p o s s i b l e p a i r s o f i n s t r u c t i o n s [ 1 2 ] o r a l l t h a t

u s e c e r ta i n r ul e s ( l ik e e l i m i n a t e r e d u n d a n t i n s tr u c -

t i o n s [ 7, 11 ] ) . N a t u r a l l y , t r a d e - o f f s a p p e a r l i k e l y - -

t h e f ir s t a p p r o a c h m a y b e c o s t ly o n s o m e m a c h i n e s ,

t h e s e c o n d m a y m i ss o p t i m i z a t i o n s , a n d b o t h m a y

g e n e r a t e u n u s e d o p t i m i z a t i o n s t h o u g h t h e p r o p o -

s a l s c e r t a i n l y m e r i t f u r t h e r i n v e s t i g a t i o n . T h e

s o f t w a r e d e s c r i b e d b e l o w c o m p l e m e n t s t h e s e

a p p r o a c h e s b y a u t o m a t i c a l l y i n f er r i n g p a t t e r n s f r o m

P O 'S b e h a v i o r o n s a m p l e d a t a .

3 . A u t o m a t i c G e n e r a t io n o f P a t te r n s

T o i m p r o v e s p e e d ,

PO

i s n o w u s e d a t c o m p i l e -

c o m p i l e t i m e t o g e n e r a t e p a t t e r n s f o r a f a s t c o m p i l e -

t i m e o p t i m i z e r , c a l l e d H O P , w h i c h m a y t h e n b e u s e d

i n P O 's p l a c e . H O P p a t t e r n s a r e e n c o d e d a s t e x t w i t h

e m b e d d e d p a t t e r n v a r i a b le s o f t h e f o r m $ i t o d e n o t e

c o n t e x t - se n s i t iv e o p e r a n d s . T h u s t h e p a t t e r n

r [ $ 1 ] = m [ $ 2 ]

r [ $ 1 ] = r [ $ 1 ] -

m [ 3 ]

r [ 1 ] = m [ 2 ] - m [ $ 3 ]

s p e c if i e s t h a t r e g i s t e r t r a n s f e r s l i k e

r [ 2 ] = m [ X ]

r [ 2 ] = r [ 2 ] -

m [ Y ]

s h o u l d b e r e p l a c e d w i t h

r [ 2 ] = m [ X ] - m [ Y ]

O t h e r c l a s s i c al p e e p h o l e o p t i m i z e r s u s e s i m i l a r

e n c o d i n g s [ 14 , 1 6 ] . A n a p p e n d i x g i v e s f u r t h e r e x a m -

p l es o f s u c h o p t i m i z a t i o n s a n d t h e i r a p p l i c a t io n .

i Only one of these proposals reports a prototype [12]. It

is m ore powerful than an earl y version of PO, though n ot

the cu rrent version. It co nsider s O(N ) pairs to PO'S O(N),

and, thou gh it ap pears likely that a daptations could run in

linear time, it is too early to comp are their speed with PO'S.

1 1 2

Page 3: Davidson and Fraser. Automatic Generation of Peephole Optimizations

8/19/2019 Davidson and Fraser. Automatic Generation of Peephole Optimizations

http://slidepdf.com/reader/full/davidson-and-fraser-automatic-generation-of-peephole-optimizations 3/6

H O P p a t t e r n s a r e i n f e r r e d f r o m P O 's b e h a v i o r o n a

t r a i n i n g s e t. A S a n o p t i o n , P O c a n r e c o r d e a c h

r e p l a c e m e n t i t m a k e s . F o r e x a m p l e , w h e n P O m a k e s

a r e p l a c e m e n t l i k e t h e o n e a b o v e , i t w r i t e s

r [ 2 ] = m [ X ]

r [ 2 ] = r [ 2 ] - m [ Y ]

r [ 2 ] = m [ X ] - m [ Y ]

t o a d i ag nos t i c f i le .

T h i s o u t p u t is a u t o m a t i c a l l y r e du c e d t o p a t t e r n s

b y r e p l a c i n g e a c h d i s t i n c t a s s e m b l y - t i m e c o n s t a n t

w i t h $ i . F o r e x a m p l e , t h e d i a g n o s t i c o u t p u t a b o v e

w o u l d b e c o m e

r [ $ 1 ] = m [ $ 2 ]

r [ $ 1 ] = r [ $ 1 ] - m [ $ 3 ]

r [ $ 1 ] = r n [ $ 2 ] - r n [ $ 3 ]

w h i c h i s t h e p a t t e r n a t t h e h e a d o f t h is se c t i o n . T h e

s y n t a x o f a s s e m b l y - t i m e c o n s t a n t s i s p o t e n t i a l l y

t a rge t - s pec i f i c . HOP i s r e t a rge ted b y s pec i fy ing th i s

s y n t a x .

PO recor ds th e l a s t u s e o f e ach reg i s t e r in e ach

b l o c k , b e c a u s e t h i s a l l o w s i t t o m a k e r e p l a c e m e n t s

t h a t w o u l d o t h e r w i s e c h a n g e t h e e f f e c t o f t h e p r o -

g r a m . W h e n t h i s i n f o r m a t i o n i s u s e d , it is a l s o

r e c o r d e d i n t h e d i a g n o s t i c o u t p u t :

r [ 2 ] = i

r [ 3 ] = m [ r [ 2 ] ] ( r [2 ] d e a d )

r [ 3 ] = m [ i ]

T h e s e o b i t u a r i e s a r e a u t o m a t i c a l l y r e d u c e d t o p a t -

t e r n s w i t h t h e r e s t o f t h e d i a g n o s t i c o u t p u t . T h u s t h e

e x a m p l e a b o v e y i el d s th e p a t t e r n

r [ $ 1 ] = $ 2

r [ $ 3 ] = m [ r [ $ 1 ] ] ( r [ $ 1 ] d e a d )

r [ $ 3 ] = m [ $ 2 ]

T h e a p p e n d i x d i s p l a y s s e v e r a l s u c h o p t i m i z a t i o n s .

A f e w p r o p o s e d p a t t e r n s a r e to o g e n e r al . F o r

e x a m p l e , t h e D E C S y s t e m - 1 0 d i a g n o s t ic o u t p u t

r [ 2 ] = m [ X ]

r [ 2 ] = r [ 2 ] + 1

m [ X ] = r [ 2 ] ( r [ 2 ] d e a d )

r n [ x ] : m [ X ] + 1

s h o u l d n o t y i e l d th e p a t t e r n

r [ $ 1 ] = m [ $ 2 ]

r [ $ 1 ] = r [ $ 1 ] + $ 3

m [ $ 2 ] = r [ $ 1 ] ( r [ $ 1 ] d e a d )

m [ $ 2 ] = m [ $ 2 ] + $ 3

b e c a u s e t h e r e p l a c e m e n t i s o n l y v a l id i f t h e i n c r e m e n t

$3 is 1 . T he va l id i ty o f p ro pos e d p a t t e rns l ike the one

a b o v e c o u l d b e c h e c k e d w i t h t h e m a c h i n e d e s c r i p t io n

m u c h a s P O c h e c k s p r o p o s e d c o m b i n a t i o n s o f

i n s t r u c t io n s . W h e n t h e i n s t r u c t i o n c h e c k e r d e te r -

m i n e d t h a t $ 3 c o u l d o n l y m a t c h 1 , i t c o u l d r e w r i t e t h e

p a t t e r n a c c o r d i n g l y . A t p r e s e n t , a si m p l e r e x p e d i e n t

i s u s ed : con s tan t s l i ke ze ro and on e tha t a re s pec ia l

to s o m e ins t ru c t ions ( i . e. , t ha t appea r exp l i c i t ly in the

m a c h i n e d e s c r i p t i o n ) a r e a d d e d t o a n e x c e p t i o n l i s t

and neve r rep laced wi th $ i . T h i s gene ra te s a few

e x t r a p a t t e r n s w h e n t h e s e c o n s t a n t s a p p e a r i n c o n -

t ex t s whe re they a re no t s pec ia l ( e . g . , a s reg i s t e r

ind ice s ) , bu t the nu m ber o f the s e i s s ma l l .

G i v e n t h e e s t a b l i s h e d s i m p l i c i t y o f t y p i c a l p r o -

g r a m s [ 1 3] , c o m p i l i n g a l a r g e , v a r i e d t r a i n i n g

t e s t b e d w i t h P o s h o u l d y i e l d e n o u g h d i a g n o s t i c o u t -

p u t t o g e n e r a t e m o s t n e e d e d p a t t e r n s . A t p r e s e n t, t h e

t e s t b e d i s t h e Y c o m p i l e r ' s f r o n t e n d , w h i c h c o m p i l e s

Y i n t o a s i m p l e a b s t r a c t m a c h i n e c o d e , p l u s a f e w

ex t r a t e s t c a s e s , wh ich exe rc i s e the few ope ra to rs s el -

d o m u s e d i n t h e c o m p i l e r . F i g u r e 1 p l o t s f o r t h is

t e s tb e d t h e n u m b e r o f V A X p a t te r n s g e n e r at e d

v e r su s t h e n u m b e r o f a ct u a l r e p l ac e m e n t s f r o m w h i c h

t h e p a t t e r n s a r e g e n e r a t e d . T h e p a t t e r n f il e g r o w s

rap id ly a t f i r s t and then l eve l s o f f . T he 17 ,138

r e p l a c e m e n t s g e n e r a t e o n l y 6 2 7 d i s t i n c t p a t t e r n s .

Us ing th i s pa t t e rn f i l e , HOP y ie lds the s ame re s u l t a s

P O w h e n c o m p i l i n g r o u t i n e s f r o m t h e t es t b e d . W h e n

c o m p i l i n g o t h e r t y p i c a l r o u t i n e s , H O P ' s r e s u l t s a r e

o n l y a b o u t 2 % l a r g e r t h a n P O 'S , w h i c h s u g g e s t s t h a t

e v e n t h i s s m a l l t e s t b e d i s a d e q u a t e .

U l t i m a t e l y , i t s h o u l d b e p o s s i b l e t o d o w i t h o u t a

t e s t b e d , b y u s i n g a n i n c r e m e n t a l t r a i n i n g p h a s e . T h i s

c o u l d b e i m p l e m e n t e d b y t h e f o l l o w i n g c h a n g e s t o

PO. Af te r r ep lac ing a pa i r o r t r ip l e , PO wo u ld in t e r -

n a l l y r e c o r d t h e p a t t e r n r e p r e s e n t e d b y t h e r e p l a c e -

m e n t ; i f th e p a i r o r t r i p le c o u l d n o t b e r e p l ac e d , P O

w o u l d n o t e t h i s a s w e l l. A l s o , PO w o u l d b e c h a n g e d

t o c o n s u l t t h i s r e c o r d a n d u s e t h e f a s t a l g o r i t h m

d e s c r i b e d b e l o w t o r e p l a c e o r r e j e c t j u x t a p o s i t i o n s

t h a t h a v e a p p e a r e d b e f o r e ; it w o u l d f a l l b a c k o n i ts

o r i g i n a l , sl o w e r a l g o r i t h m o n l y f o r j u x t a p o s i t i o n s

t h a t h a d n e v e r a p p e a r e d b e f o r e . T h u s P O w o u l d

r e a c h H O P ' s s p e e d a f t e r a f e w c o m p i l a t i o n s , a n d i t

w o u l d n e v e r m i s s a n o p t i m i z a t i o n d u e to i n s u f f i c ie n t

t r a i n i n g b e c a u s e P O ' s g e n e r a l m e c h a n i s m w o u l d b e

a v a i l a b l e f o r n e w j u x t a p o s i t i o n s .

1 1 3

Page 4: Davidson and Fraser. Automatic Generation of Peephole Optimizations

8/19/2019 Davidson and Fraser. Automatic Generation of Peephole Optimizations

http://slidepdf.com/reader/full/davidson-and-fraser-automatic-generation-of-peephole-optimizations 4/6

4, A Pattern-Directed Optimizer

HOP matches patterns without actua l string mani-

pulation, by separating each instruction's pattern or

skel eton from its operands as it reads them. This is

accomplished at compile time by the same procedure

used to form patterns at compile-compile time. For

example, the instruction

r [ 2 ] = r [ 2 1 - m [ Y ]

is reduced to the skeleton

r [ $ 1 ] = r [ $ 1 ] - m [ $ 2 1

plus the ope rands 2 and Y, respectively. That is, the

instruction is represented by the triple

r [ $ 1 ] = r [ $ 1 ]

-

m [ $ 2 ] , 2 , Y

This representation is a little like conventional

assembly code. The skeleton in the first field is deter-

mined roughly by the instruction's opcode and mode

bits. The operands in the remaining fields are deter-

mined roughly by the instruction's address and regis-

ter fields.

Hashing helps HOP match patterns and form

replacements fast. HOP stores skeletons and

operands uniquely in a hash table, so an input skele-

ton is compared with a line from a pattern by merely

comparing two addresses. This operatio n is logically

similar to, and costs about the same as, comparing

two binary opcodes in a classical peephole optimizer.

If a run of input skeletons matches some complete

pattern, then inter-instruction op erand consistency is

checked, again by comparin g addresses. Finally,

HOP forms replacements without actual string mani-

pulation. The skeleton for the replacement instruc-

tion is the last line of the successful pattern, and the

operands for the replacement instruction are formed

by reordering the input operands. Thus the typical

pattern is matched and, if successful, replaced, by

comparing and moving about a dozen pointers.

One detail complicates this procedure. The $i in

input skeletons are numbered from one, so pattern-

matching without string operations requires

renumbering the $i from each line of each pattern

when the pattern file is read. For example, the input

r[4] = m[A]

r[4] = r[4] - re[B]

is transla ted into the triples

r [ $ 1 ] = m [ $ 2 1 , 4 , A

r [ $ 1 ] = r [ $ 1 ] - m [ $ 2 ] , 4 , B

as it is read. To compare such triples with the pattern

r [ $ 1 ] = m [ $ 2 ]

r [ $ 1 ] = r [ $ 1 ] - m [ $ 3 ]

r [ $ 1 ] = m [ $ 2 ] - m [ $ 3 ]

without string operations, the $i of the second line of

the pattern are renumbered to yield

r [ $ 1 ] = r [ $ 1 ] - m [ $ 2 ]

as the patt ern file is read. The two strings are now

identically equal and can be compared by comparing

addresses in the hash table. A record of the

renumbering is retained for checking inter-

instruction operand consistency.

The input triples above are compared with the

pattern above as follows. First, the two input skele-

tons

r[$1] = m[$2]

r[$1] = r[$1] - m[$2]

are compared with the first two (renumbered) lines of

the pattern

r [ $ 1 ] = m [ $ 2 1

r [ $ 1 ] = r [ $ 1 ] - m [ $ 2 ]

by comparing two pairs of pointers. Next, HOP

checks that $i denotes the same operand in both

input instructions. Since $1 is the only $i that

appears more than once in the original (unrenum-

bered) pattern?, this merely compares the first

operand from the first instruction (the first 4) with

the first operand from the second instruction (the

second 4), again by comparing two string table

addresses. Since all comparisons have succeeded, a

replacement inst ruction is formed. Its skeleton is the

last line of the pattern

r [ $ 1 1 = m [ $ 2 1 - m [ $ 3 ]

and its three operands are t he 4 and A from the first

instruction and the B from the second instruction.

This represents the instruction

r[4] = m[A] - m[B]

which is the desired replacement for the two instruc-

tions above.

Hashing also helps locate applicable patterns

rapidly. HOP stores its pat terns in a hash table keyed

by the hashed addresses of the (uniquely stored)

skeletons tha t each matches. Thus HOP identifies the

patterns that apply to a given input sequence by

hashing the addresses of the skeletons from the input

I'$2 appears more than once in the

renumbered

pattern,

but this is an artifact of renumbering and so does not re-

quire consistencychecking.

1 1 4

Page 5: Davidson and Fraser. Automatic Generation of Peephole Optimizations

8/19/2019 Davidson and Fraser. Automatic Generation of Peephole Optimizations

http://slidepdf.com/reader/full/davidson-and-fraser-automatic-generation-of-peephole-optimizations 5/6

sequence. If this hash table is made large enough to

make collisions rare, H O P identifies any applicable

patterns in nearly constant time.

These measures make H O P f a s t , abou t 5 times fas-

ter than PO. In a typical application, it read 269 lines,

performed 136 replacements, and wrote out the

results in 1.3 CPU seconds on a VAX-I1/780. It

spends most of its time reading its input and building

the structures above.

replacements take less

time, the pattern file

compile-compile time.

incorporated patterns

takes 120K bytes.

HOP can also be

The actual matching and

than 5% of its time. To save

is incorporated into HOP at

For the VAX, HOP plus these

take 150 K bytes where PO

used for code generation.

Abstract machines are often mapped onto real

machines by macros, and single-input replacement

patterns are essentially macros. A compiler can thus

be retargeted by writing a machine description and

some patterns for naive code generation. These will

be augmented by automatically generated optimiza-

tion patterns. The use of a single program for code

generation and optimization should make compilers

faster, simpler, and easier to retarget.

HOP can also be used on assembly code. The

hand-written patterns for code generation could emit

assembly code, for this can be mapped to and from

register transfers for PO by translators automati cally

generated from the machine description [3].

Translating assembly code to register transfers would

slow Po, but this is unimportant now that HOP has

replaced PO at compile time.

A c k n o w l e d g m e n t s

The authors thank Dave Hanson for his many

helpful comments, and Torben Nielsen for his techni-

cal assistance.

A p p e n d i x

This appendix traces the optimization of the

VAX code for

j = i + 4

The figure below gives postfix intermediate code and

corresponding naive object code for this statement.

p o s t f i x o b j e c t c o d e

1. p u s h i r [ 2 ] = m [ i ]

2 . p u s h c 4 r [ 3 ] = 4

3 . a d d r [ 2 ] = r [ 2 ] + r [ 3 ] ( r [ 3 ] d e a d )

4 . p o p j m [ j ] = r [ 2 ] ( r [ 2 ] d e a d )

Initially, the pa ttern

r [ $ 1 ] = $ 2

r [ $ 3 ] = r [ $ 3 ] + r [ $ 1 ] ( r [ $ 1 ] d e a d )

r [ $ 3 ] = r [ $ 3 ] + $ 2

replaces instructions 2 and 3 with

r [ 2 ] = r [ 2 ] ÷ 4

Next, the pattern

r [ $ 1 1 = m [ $ 2 ]

r [ $ 1 ] = r [ $ 1 ] + $ 3

r [ $ 1 ] = m [ $ 2 ] + $ 3

combines instruction 1 with this new instruction,

yielding

r[2] = mill + 4

Finally, the pattern

r [ $ 1 ] = m [ $ 2 ] * $ 3

m [ $ 4 ] = r [ $ 1 ] ( r [ $ 1 ] d e a d )

m [ $ 4 ] = m [ $ 2 ] + $ 3

replaces this last instruc tion and instruction 4 with

m [ j ] = m [ i ] + 4

which represents the VAX inst ruction

a d d l 3 4 , i , j

Thus the four original instructions have been

replaced with one.

References

1o

2.

4.

.

J. T. Bagwell, Jr., Local Optimizations,

SIGPLANNotices

5, 7 (July 1970), 52-66.

T. Crowley, Combining Table-driven Effect

Selection and Description-Driven Peephole

Optimization for Automatic Code Generation,

MS thesis, MIT, September 1982.

J. W. Davidson and C. W. Fraser, The Design

and Application of a Retargetable Peephole

Optimizer, A CM Trans. Prog. Lang . and

Systems 2, 2 (April 1980), 191-202.

J. W. Davidson, Simplifying Code Generation

Through Peephole Optimization PhD

dissertation, University of Arizona, December

1981.

J. W. Davidson and C. W. Fraser, Code

Selection Through Object Code Optimization,

A CM Trans. Prog. Lang. and Systems to

appear.

115

Page 6: Davidson and Fraser. Automatic Generation of Peephole Optimizations

8/19/2019 Davidson and Fraser. Automatic Generation of Peephole Optimizations

http://slidepdf.com/reader/full/davidson-and-fraser-automatic-generation-of-peephole-optimizations 6/6

6. M. Gan apa th i , C . N. F ischer and J . L .

H e nne ssy , R e t a r ge t a b l e C om pi l e r C ode

G e ne r a t i on , C o m p u t in g S u rv e y s 1 4 , 4

(D ecem ber 1982) , 573-592.

7 . R . G ie ge ri c h , A F o r m a l F r a m e w o r k f o r t he

D e r iva t i on o f M a c h ine - S pe c i f i c O p t im iz e r s ,

A C M Trans . Prog . L ang. an d Sys tem s 5 , 3

(Ju ly 1983) , 478-498.

8 . D . R . H a n s o n , T h e Y P r o g r a m m i n g L a n g u a g e,

S I G P L A N N o ti ce s

16, 2 ( Feb . 1981), 59-68.

9 . W . H a r r i son , A N e w S t r a t e gy f o r C od e

G e ne r a t i on - The G e ne r a l P u r pose O p t im iz ing

C om pi l e r ,

C o n f . R ec . 4 th A C M S y ru p. o n

P r in . o f P ro g ra m m in g L a n g u a g e s , J a n u a r y

1977, 29-37.

1 0. S . C . J o h n s o n , A P o r t a b le C o m p i l e r: T h e o r y

a nd P r a c t i c e , Conf . Rec . 5 th A C M Syrup . on

P r in . o f P ro g ra m m in g L a n g u a g e s , Jan. 1978,

97-104.

l l . P . B . Kess le r, Ma chine Dep enden c ies in

R e ta r ge t a b l e C om pi l e r C ons t r uc t i on ,

D i s s e r t a t i o n p r o p o s a l , D e p a r t m e n t o f

E le c t ri c a l Eng ine e r ing a nd C o m p u te r S c i e nc e,

Univers i ty of Ca l i forn ia , Berke ley , May 1982.

12. R .R . Kessler , Peep hole Op t imiza t ion in COG ,

O pe r a t i ng N o te 76 , U ta h S ym bo l i c

C o m p u t a t i o n G r o u p , C o m p u t e r S c i e n c e

D e p a r tm e n t , U n ive r s it y o f U ta h , June 1983 .

1 3. D . E . K n u t h , A n E m p i r ic a l S t u d y o f F o r t r a n

P r og r a m s , Sof tw are - -P rac t ice Exper ience 1 ,

2 (A pril-J un e 1 971), 105-133.

14. D . A . La m b , C ons t r u c t i on o f a P e e pho le

Opt imize r ,

S o f tw a re - - P ra c t i c e E x p e r i e n c e

11(1981), 638-647.

15. W . M . M c K e e m a n , P e e pho le O p t im iz a t i on ,

C o m m . A C M S , 7 (July 1965), 443-444.

16. A . S . T a ne n ba um , H . va n S t a ve r e n a nd J . W .

S te ve nson , U s ing P e e pho le O p t im iz a t i on on

I n t e r m e d ia t e C ode , A C M Trans. Prog. Lang.

a n d S y s t e m s 4, I (Jan ua ry 1982), 21-36.

17 . W. Wulf , R . K. Joh nss on , C . B . W eins tock , S .

O . H obbs a nd C . M . G e sc hke , The

D e s ig n o f

an Opt imiz ing Compi le r ,

Nor th Hol land, 1975.

700

Figure I URX Pa~ern File row~h

6 0 0

500

P

a 4

r 300

n

5

200

I00

L

L

i

L j

I - -

t

[ , , l

0 5000 I0000 15000 20000

e p l a c e m e n t s

1 1 6