Improving performance of prototype recognition in Prefab€¦ · rec ma no ha sam are n Muradov...

17
Improving performance of prototype recognition in Prefab By Orkhan Muradov A senior thesis submitted in partial fulfillment of The requirements for the degree of Bachelor of Science With Departmental Honors Computer Science & Engineering University of Washington June 2011 Thesis and presentation approved by _____________________________________ Date _____________

Transcript of Improving performance of prototype recognition in Prefab€¦ · rec ma no ha sam are n Muradov...

Page 1: Improving performance of prototype recognition in Prefab€¦ · rec ma no ha sam are n Muradov During the ether as a li e last node i tangles by r ult gives us d then saved During

Improving performance of prototype recognition in Prefab

By

Orkhan Muradov

A senior thesis submitted in partial fulfillment of

The requirements for the degree of

Bachelor of Science

With Departmental Honors

Computer Science & Engineering

University of Washington

June 2011

Thesis and presentation approved by

_____________________________________

Date

_____________

Page 2: Improving performance of prototype recognition in Prefab€¦ · rec ma no ha sam are n Muradov During the ether as a li e last node i tangles by r ult gives us d then saved During

Orkha

Abst

windoby revapplicnew wwheneanothevery im

1. In

indepeinteracinterprthe soudemoncursorpresencustomexecut

an Muradov

tract

Because mow building bverse engineeation has a g

ways. Most oever we use ter. In order tmportant.

ntroducti

Using Prefendently of tction in the creted using Purce windownstrated manr, visualizingnting softwarmization, andting this pixe

most graphicablocks, Prefaering an integreat potentiof the applicathem we opeto identify al

on and M

fab we can ntheir underlycontext of rePrefab. Thenw [Figure 1].ny potential eg state changre tutorials, ad more. All tel-based feat

Prefab

al user interfab helps to rerface and bual for identifations that werate very qull the parts o

Motivatio

now interpretying implemeeal application various enh. Papers preenhancemen

ges with phoadaptive GUthese interacture identific

b’s pixel-ba

faces that werecognize theuilding a treefying the stru

we use on daiuickly by scrf the interfac

on

t pixels of anentation. Thons. Pixels ahancements sented at CH

nts: dynamicsphor, param

UI visualizatiction techniqcation cycle

Figure

ased widge

e see nowadaem by lookine structure reructure of intily basis havrolling and sce correctly,

n existing aphis help reseaare first copiecan be adde

HI 2010 [1] aally expandi

meter spectruion, languagques recognimultiple tim

1

et identific

ays consist ong only at raepresenting tterfaces and

ve many widswitching fro, the speed o

pplication anarchers test aed from a sod, with inpuand CHI 201ing the motoum previews

ge translationze widget la

mes.

ation cycle

of various buaw pixels. Prthat interfaccustomizing

dgets and butom one windof this applic

nd modify thand deploy tource windowut then mapp11 [2] have or space withs with side vn, user interfayout in real

e

1

uttons or refab works e. This g them in ttons, and dow to cation is

hem their w and ed back to

h bubble views, face time by

Page 3: Improving performance of prototype recognition in Prefab€¦ · rec ma no ha sam are n Muradov During the ether as a li e last node i tangles by r ult gives us d then saved During

Orkha

Prefabchoosedecisiothe co

pixel ia matcThe mthread

Befo•Find•Buil

rela

an Muradov

Successfullyb’s original ies a non-tranon tree in whlor of the pix

During runis the hotspoched feature.

multi-threadeds to look for

ore Runtimed non-transparentld a decision tree wative to the hotspo

y executing tmplementatinsparent pixehich every nxel at that of

De

ntime this imot pixel of a f. If the pixeld version of r features.

t hotspots for eachwhere each node ot and edges repre

this cycle reion uses a deel hotspot pi

node stores affset.

ecision tree

mplementatiofeature. It m doesn’t mat

f this implem

h feature in the librepresents an offs

esent pixel colors

quires a metecision tree ixel for eachn offset rela

Figure 2

of a Build Tr

on scans thromoves through

tch a pixel inmentation div

braryset

Du•T

m

thod for rapi[Figure 2]. T

h feature in thative to the h

2

Tree impleme

ough the imah the image n the decisiovides a bitma

uring RuntimeTry to match the hmatches then trav

idly finding This Build The library. T

hotspot and e

entation

age only oncand traverse

on tree, then ap into segm

hotspot pixel int heerse the tree unti

features in aTree approachThen Prefab bevery edge re

ce by checkines the tree untree traversa

ments and use

e decision tree. If il the feature is fou

2

an image. h first builds a epresents

ng if the ntil it finds al ends. es multiple

it und

Page 4: Improving performance of prototype recognition in Prefab€¦ · rec ma no ha sam are n Muradov During the ether as a li e last node i tangles by r ult gives us d then saved During

Orkha

2. Ex

prototyperforalgorit

2.1 P

of thispixel ithis femore f

improvlarge, below

B•C•C

an Muradov

xamining

It is difficuypes is relati

rmance of feathms, benchm

Partially

Before the s structure is in the main ieature is addefeatures to th

The goal oving the perwe started u describes th

1 Bui2 Init3 FO4 5 6 7 8 9 10 11 12 13

efore RuntimeConvert every featCreate empty list o

g Strateg

ult to say howively small aature detectimarking soft

Matchin

execution aa match nod

image and ifed into the phe partial ma

of this approformance. B

using differenhe algorithm

ild a list of ftialize a list

OR each pixeIF pixel ENDIF IF pixel

eture into a linked of partial matches

gies for Fi

w reliable thand there areion is very imftware, and a

ng

all the featurede that specif a pixel matcpartial matchatches and ad

oach was to vBecause the int data struc

m:

features namof features nl in image l matches theMove to theIF Feature.N We hENDIF

l matches theIF Feature.N We hELSE

list of pixels (1)s (2)

inding F

he initial algoe no other almportant fora larger librar

es are conveifies that a feches the firs

hes data strucdvance corre

visit each pinitial collect

ctures in orde

med STARTnamed PART

e first pixel e next node oNext is a Mahave a match

e first pixel Next is a Mahave a match

Du•T

p•If

o

eatures

orithm is beclgorithms to r Prefab’s funary.

erted into a lieature was fot pixel of thecture. As weesponding n

xel in the imtion of linkeer to improv

TIAL

of feature inof linked listatch THEN h

of feature inatch THEN h

uring RuntimeTry to match next partial matches unf next pixel in the

of the feature then

cause Prefabcompare winctionality,

inked list of ound. The pre feature in t

e iterate throuodes in the p

mage only oned list featureve retrieval ti

n PARTIAL t in PARTIA

n START TH

pixel in the imagentil we get to the e

image is the samen add to Partial ma

b’s current liith. Because we develope

f pixels. The rogram visitthe data struugh the imagpartial match

nce, therefores could becime. Pseudo

THEN AL

HEN

e to next node in thend of linked list (4e as the starting piatches (10-16)

3

ibrary of

ed other

last node ts each ucture then ge we add hes.

re come very o code

he 4-9)ixel

Page 5: Improving performance of prototype recognition in Prefab€¦ · rec ma no ha sam are n Muradov During the ether as a li e last node i tangles by r ult gives us d then saved During

Orkha

In ord

List of

slowly Hashta

list of data stkey an

2D Ar

compathe imfeature Later ldrama 2.2 I

Intrectan

where

Corne in th integr

an Muradov

14 15 16 17 EN

er to improv

f list of featuThis was in

y with a larg

able In the first

features thattructure wasnd correspon

rray

This implearing to the h

mage. Each coes

look ahead hatically.

Integral I

tegral imagegular subset

each value

Figure 3 ers of rectanghe image forral sum calc

ENDIF

NDFOR

ve the results

ures nitial structuer library of

implementat start with t getting very

nding value w

ementation whashcode calolumn/row c

hotspot was

Image

e is an algorit of a grid [F

in the image

beffroothcom

gle r ulation

Add ENDIF

s, various da

ure that was uf features.

ation of hashthis pixel. Thy large. The was a linked

was much faslculation in hcorresponds

added to this

thm for quicigure 2]. Ev

e correspond

This algorfore the runt

om the imageher features cmparing to t

feature.Nex

ata structures

used in the i

htable, the kehis appearednext implem

d list of featu

ster than the hashtable. Tto the colum

s implement

ckly and effiery row/colu

ds to the pixe

rithm was vetime executioe and keepincan be easilythe integral s

xt from STA

s for START

implementat

ey was a pixd to behave smentation of ures.

hashtable bThis matrix hmn/row in the

tation, which

iciently geneumn in the in

el value in th

ery promisinon, which in

ng features iny filtered outsum in the in

ART to PART

T and PARTI

tion but it ap

xel and the vaslowly becauf hashtable in

because retriehas the same e image and

h improved t

erating the suntegral imag

he integer for

g because mncludes calcun a linked list by calculatntegral imag

TIAL

IAL were us

ppeared to be

alue was a liuse the partiancluded colu

eval time waheight and w

d is linked to

the results

um of valuesge is calculat

rmat.

most of the wulating integst data structting integrale.

4

sed:

ehave very

ist of linked al matches umn as a

as instant width as the list of

s in a ted by:

work is done gral image ture. Many sum and

Page 6: Improving performance of prototype recognition in Prefab€¦ · rec ma no ha sam are n Muradov During the ether as a li e last node i tangles by r ult gives us d then saved During

Orkha

togThrecres

an

horecmanohasamare

an Muradov

During thegether as a lihe last node ictangles by rsult gives us

d then saved

During the otspot pixel mctangle in thatches the va

ode. When wappens becaume. In fact, te reported if

Befo• Find• Divi• Calc• For a• Mar

HotspotNode

e preprocessiinked list. Inis a match norecursing to a minimal n

d as a check

runtime we matches the ie image of thalue in the ch

we get to the muse pixels mithroughout o

f the feature i

re Runtime Hotspotde every feature intoulate integral sum ofa linked list from thek last node a match n

t ChN

ing, stage evn this linked ode which lithe left, up,

number of re

node in the

go through timage pixel,he same areaheck node, wmatch node ight be in difour experimeis not also ex

o rectanglesf every rectanglese rectanglesnode

heck ode

very feature ilist the first inks to the feright, and th

ectangles. In

linked list. A

the main ima, then the nea as the next

we advance iwe need to efferent orderent it was coxactly match

Du• Tr• Ca• If

re• Ex

CheckNode

is divided innode is a ho

eature itself.hen down of ntegral sum f

As a result ou

age trying toxt step is tot check nodein the linkedexactly matcr in the rectaoncluded thahed.

uring Runtimery to match first nodalculate integral sumMatches advance in

eachedxactly match

k e

M

nto rectangleotspot which The feature

f the transparfor each recta

ur feature w

o match a hocalculate int

e. If integral d list until wech every pixangle but theat a large num

de by iterating througm of this rectanglen a linked list until ma

Match Node

es and then ch points to che was dividedrent pixel whangle is calc

will look as fo

otspot pixel. tegral sum fosum of this e get to the m

xel of the feae sum can stamber of false

gh the image

atch node is

FeatCla

5

combined heck nodes. d into hich as a culated by

ollows:

If the for the rectangle

match ature. This ay the e matches

ure ss

Page 7: Improving performance of prototype recognition in Prefab€¦ · rec ma no ha sam are n Muradov During the ether as a li e last node i tangles by r ult gives us d then saved During

Orkha

2.

teccalwoonke

fasmafeafea

Ps

1 2 3 4 5 6 7

8 9 10 11 12

an Muradov

.3 Findin

After matcchnique was lculating theorks becausen small featury correspond

Matching mster method.atches to filtatures, in thiatures. This t

eudo-code f

IF tree con Sav IF a

0 EN Ad

2 ENDIF

ng Hotspo

ching integraused for fin

e integral sume most of theres. During pds to the hot

more than on After runnin

ter out big nuis case, tree btree is shallo

for the two h

ntains pixel ove next levelany of the sa

FOR ev

ENDFO

NDIF dd NEW to t

Before R•Find one o•Link to an•Build a tre

hotspots

ot and Ex

al sum of recnding featurem of the recte features in pre-processitspot pixel an

ne hotspot anng various teumber of feabehaves fastow and has h

hotspot and e

on the first lel pointed froaved nodes ivery feature iIF exactly m Dele MatcENDIF

OR

the saved po

untimeor more Hotspots actual feature

ee by combining si

xactly M

ctangles in thes in the imatangle, we cathe library a

ing stage, thend the value

nd then exacests it was coatures. Insteater. Every noheight of thre

exactly match

evel THENm this noden LIST contin the featur

match THENete from savech was found

inters LIST;

imilar

atching

he image, anage. After maan exactly mare small ande features we

e is the list of

ctly matchinoncluded thaad of using hode in the treee.

hing algorith

as NEW; tain pixel THres

N ed d

;

During Run•Try to matc

through the•Match othe•If rich the e

nother similaatching hotsp

match the fead exactly maere saved in f features tha

ng the featureat it usually hashtable to ee is a hotspo

hm is shown

HEN

ntimech first node by itee image (1-2)er hotspots if availnd then exactly m

ar but very sipot pixel, in

ature right awatching work

the hashtabat start with

e proved to btakes aroundcontain all tot and leave

n below:

erating

able (3)match (4- 9)

6

imple stead of way. This ks very fast le where this pixel.

be a much d 2 hotspot the s are list of

Page 8: Improving performance of prototype recognition in Prefab€¦ · rec ma no ha sam are n Muradov During the ether as a li e last node i tangles by r ult gives us d then saved During

Orkha

2.4

dealgcofea

an Muradov

4 Thresh

This alcision whethgorithm provnsists of smaature’s size i

2.5 Aho The Ahfinds elemsimultaneoadditional transitions prefix [Figneed for ba

If the size of less than the•Find one or m•Link to an act•Build a tree b

similar hotsp

old of Pr

lgorithm comher to calculved to behavall features tis greater tha

o - Corasiho–Corasick ents of a fini

ously. This alinks betweebetween fai

gure 4]. This acktracking.

a feature is e limitmore Hotspotstual featureby combining ots

Before Runtim• Set a li

revious T

mbines aboveate and chec

ve very slowlthis algorithman the limit t

ick string matchite set of stri

algorithm conen the variouiled pattern maids the aut

memit

If thless• Fin• Div

rec• Cal

rec• For

rec• Ma

Two Tech

e described ack integral suly with smalm has an advthen we do c

hing algorithings within anstructs a finus internal nmatches to oomaton to tr

he size of a featus than the limitd Hotspotvide every feature intanglesculate integral sum otangle

r a linked list from thtangles

ark last node a match

hniques

algorithms ium or not. Fll features. Svantage by ncalculate inte

hm is a dictioan input textnite state ma

nodes. These other brancheransition bet

ure is

nto

of every

ese

h node

in such a wayFrom the resuSince the librnot calculatinegral sum.

onary-matcht and matcheachine that lo

extra internes of the trietween pattern

During Ru•Try to matcnode by itethrough the

•If check nocalculate insum

•Otherwise match

y that it makults integral rary of featung integral s

hing algorithes all patternook like a tri

nal links allowe that share an matches w

untimech first

erating e imagede exists

ntegral

exactly

7

kes a image

ures mostly sum. If the

hm that ns ie with w fast

a common without the

Page 9: Improving performance of prototype recognition in Prefab€¦ · rec ma no ha sam are n Muradov During the ether as a li e last node i tangles by r ult gives us d then saved During

Orkha

Trie

an Muradov

Tdepends onruntime, soresult in thbigger thantested on a Fo{blue, red,

e includes th

Before R• Convert • Select ho• Build a li• Build the

his algorithmnly on followo during the he node then n 1, they are artificially cror instance le green, green

Aho-Cora

he following

Runtimeevery feature into ro

ot row by selecting thnked list from hot ro

e Trie

m lets us viswing the tranruntime we we found a divided in t

reated featuret’s build a tn}, {blue, gr

sick Trie wit

features: {b

owhe hotspotow and other rows

it every pixensitions in thjust need to feature. Bec

to the linked es of height trie from the reen, green}

Figure 4th transition

blue, red}, {b{green, b

Dur• Sea• If tr

dat• Go

ma

el in the imahe trie. The t

follow the pcause featured list of 1D fe

one. following f, {green, blu

4 ns and fail-tr

blue, red, greblue, red}.

ring Runtimearch the trie for pixelrie node contains reta structures (5)through the hot rowtch them (6)

age only oncetrie and featupointers in thes most of theatures. At f

features of heue, red}.

ransitions

een, green},

ls by following the tresults add them to co

ws and middle rows

e and the perures are builthe trie. If wehe time have first this algo

eight one: {b

{blue, green

ransitions (5)orresponding

and try to

8

rformance t before the e see a height orithm was

blue, red},

n, green},

Page 10: Improving performance of prototype recognition in Prefab€¦ · rec ma no ha sam are n Muradov During the ether as a li e last node i tangles by r ult gives us d then saved During

Orkha

an Muradov

The whole

1) Divi

2) Put m

3) Buil

5) Runhead noUsing a

6) Go t

In orderow resultsnew instanbecause thimage. Theruntime a 2During runupdated to

Image repr

e process can

ide each feat

mini-feature

ld Trie

n Algorithm oodes into thea binary sear

through the q

er to improves has been usnce of result e program we new data s2D array is cntime, when

point to the

resentation o

n be represen

ture into 1D

es into the lin

on the imagee queue. Aftrch tree inste

queue and ca

e the performsed instead oclass has be

was creating tructure is mcreated wherthe hot row found hot ro

of this data s

nted as a pip

"mini-featur

nked list whe

e by markinger running aead of hashta

alculate num

mance duringof a queue. Ien created anmany result

much faster bre every rowis found, theows.

tructure is sh

eline proces

ures".

ere head nod

g middle noda few tests, oable gave fa

mber of featu

g runtime thIn the beginnand enqueuedt classes in obecause it us

w/column is re correspond

hown below

ss:

de contains h

des in the 2Done transitionaster results.

ures that wer

he following ning when a d into the qu

order to save ses only poinrepresent as ding entry in

w:

hotspot pixe

D array and pn is made on

re found

data structuhot row was

ueue. This wrow and col

nters. Beforea linked list

n the 2D arra

9

l.

putting n average.

ure for hot s found, a as slow lumn of the e the

node. ay is

Page 11: Improving performance of prototype recognition in Prefab€¦ · rec ma no ha sam are n Muradov During the ether as a li e last node i tangles by r ult gives us d then saved During

Orkhan Muradov 10

3. Benchmarking Tool

In order to measure the performance of different algorithms, we needed a benchmarking tool that captures the average times over several trials. Datasets that were selected included different programs: YouTube video in Firefox, Microsoft Visual Studio, Calculator, Microsoft Word and its dialog boxes, Microsoft Excel, Microsoft PowerPoint, Skype, Windows Live Messenger, cs.washington.edu website in Internet Explorer, iTunes, PDF document in Acrobat Reader, and the Solitaire game. People use these programs every day and we needed to confirm that algorithms behave fast.

The benchmarking tool can be divided into two parts: a first phase where the datasets are created and the second phase for benchmarking the time it takes to run given functions over the image. During the first phase this tool starts capturing screen shot images after the user specifies the title of the window to be captured and name of the output xml file. User can stop the screenshot capturing at any time by clicking stop button or wait until the time exceeds the limit. The second phase uses a Wrapper Function class that contains all the function definitions that are used to benchmark the image. Each function is run over the image according to its priority.

For instance, let’s say we have two functions in our wrapper class where the first one represents Aho-Corasick algorithm with the highest priority and the second one is the Build Tree algorithm. Then when we run the benchmarking tool, it first uses the Aho-Corasick algorithm over the image and then the Build Tree algorithm is run over the same image. As an input of the second phase the program uses the same xml file for the image dataset which was created during the first phase. Each xml file for each trial consists of average times for every function executed during the trial and time spent on every function run on every image. Summary file includes average times in milliseconds for every function execution over all trials.

Benchmarking tool was used on various find features functions such as single-threaded and multi-threaded Build Tree algorithm, Integral Sum with one hotspot match algorithm, Integral Sum with one/two/three hotspot match and then exactly match algorithm and one hotspot match with or without integral sum using threshold of previous algorithms. Please see appendix for the table.

4. Building Bigger Library

In order to improve the functionality of existing software for library building, I have added the following features:

o Software automatically detects whether a new prototype goes to positive or negative set of examples.

o Scroller feature that lets the user browse images as a gallery view. This tool shows found prototypes by running find features algorithm on every image shown.

Page 12: Improving performance of prototype recognition in Prefab€¦ · rec ma no ha sam are n Muradov During the ether as a li e last node i tangles by r ult gives us d then saved During

Orkhan Muradov 11

In order to get accurate results from our benchmark tool, Prefab needed a larger library of prototypes. Most algorithms brought satisfactory results when using our existing small library. But we were not sure if it is always the case. Having a larger library of prototypes helped identify bugs and determine disadvantages of new algorithms. The larger library I created contains around 700 prototypes from various frequently used programs, such as Mozilla Firefox, Google Chrome, Microsoft Office, YouTube, Calculator, and Microsoft Visual Studio.

5. Results

At first, the initial Build Tree algorithm was tested on various datasets of a size of 500-4000 images with the interval of 10ms. These benchmark tests were tested using initial large library of around 100 prototypes. The results showed that it takes much longer to find feature on YouTube webpage and Visual Studio datasets. In our understanding, images from Visual Studio dataset contain many buttons and widgets. On the other hand, an image’s pixels from YouTube video are more different than the image in the previous frame. [Table 1]

Dataset name Single-threaded Build Tree algorithm

Multi-threaded Build Tree algorithm

Calculator 83ms 33ms Visual Studio 265ms 159ms YouTube Video in Firefox Browser

237ms 138ms

Microsoft PowerPoint 2007

67ms 40ms

Microsoft Word 2007 110ms 63ms

Table 1

The first algorithm that was tested and compared to Build Tree approach was the Partially-matching algorithm. But the results received were much slower than single threaded build tree implementation. The bottle neck for this implementation is the number of comparisons we make in order to get a match.

In order to reduce the number of pixel comparisons we started using Integral Image algorithm which helped to filter out many features by calculating integral sum of a prototype. We also had to exactly match because some features might have same pixels but in different order, thus resulting with the same integral sum. Unfortunately we noticed that many features are alike and have the same integral sum. This approach proved to be fast with the old large library consisting of around 115 prototypes [Table 2]. But after building bigger library of prototypes probability of having features with same integral sum increased as number of prototypes in the library approached to 700 [Table 3].

Page 13: Improving performance of prototype recognition in Prefab€¦ · rec ma no ha sam are n Muradov During the ether as a li e last node i tangles by r ult gives us d then saved During

Orkhan Muradov 12

Table 2

Benchmark results from running on quad-core machine using a library of size 115

Not

epa

d-in

it

128m

s

85m

s

142m

s

107m

s

75m

s

77m

s

123m

s

Chro

me-

sett

ings

-di

alog

247m

s

137m

s

291m

s

215m

s

147m

s

153m

s

260m

s

Goog

le

talk

80m

s

50m

s

95m

s

71m

s

49m

s

50m

s

82m

s

Not

epad

130m

s

87m

s

141m

s

108m

s

75m

s

77m

s

125m

s

Mac

-fir

efox

83m

s

51m

s

53m

s

42m

s

47m

s

44m

s

52m

s

Fire

fox

74m

s

54m

s

65m

s

50m

s

44m

s

48m

s

57m

s

Calc

ulat

or

53m

s

37m

s

47m

s

35m

s

32m

s

36m

s

42m

s

Imag

e na

me

/ im

plem

enta

tion

O

ver

3 tr

ials

in m

s

Sing

leth

read

ed

Build

Tree

Mul

tiTh

read

ed

Build

Tree

1 ho

tspo

t mat

ch +

In

tegr

alSu

m m

atch

+

exac

tly m

atch

1 ho

tspo

t mat

ch +

ex

actly

mat

ch

2 ho

tspo

t mat

ch +

ex

actly

mat

ch

3 ho

tspo

t mat

ch +

ex

actly

mat

ch

Thre

shol

d ho

tspo

t mat

ch

wit

h/w

itho

ut

Inte

gral

Sum

mat

ch +

ex

actly

mat

ch

Page 14: Improving performance of prototype recognition in Prefab€¦ · rec ma no ha sam are n Muradov During the ether as a li e last node i tangles by r ult gives us d then saved During

Orkhan Muradov 13

Not

epad

-init

44m

s

25m

s

142m

s

40m

s

38m

s

77m

s

40

Chro

me-

sett

ings

-di

alog

93m

s

52m

s

181m

s

146m

s

124m

s

172m

s

97m

s

Goog

le

Talk

27m

s

17m

s

21m

s

19m

s

22m

s

24m

s

30m

s

Not

epad

46m

s

25m

s

45m

s

42m

s

41m

s

47m

s

43m

s

Mac

-fir

efox

83m

s

36m

s

65m

s

62m

s

47m

s

67m

s

88m

s

Fire

fox

93m

s

50m

s

202m

s

161m

s

44m

s

183m

s

87m

s

Calc

ulat

or

61m

s

37m

s

87m

s

63m

s

48m

s

78m

s

55m

s

Imag

e na

me

/ im

plem

enta

tion

O

ver

3 tr

ials

in m

s

Sing

leth

read

ed B

uild

Tree

Mul

tiTh

read

ed B

uild

Tree

1 ho

tspo

t mat

ch +

In

tegr

alSu

m m

atch

+

exac

tly m

atch

1 ho

tspo

t mat

ch +

exa

ctly

m

atch

2 ho

tspo

t mat

ch +

exa

ctly

m

atch

Thre

shol

d ho

tspo

t mat

ch

wit

h/w

itho

ut

Inte

gral

Sum

mat

ch +

ex

actly

mat

ch

Aho-

Cor

asic

k

Table 3

Benchmark results from running on six-core machine using a library with a size of 676 prototypes

Page 15: Improving performance of prototype recognition in Prefab€¦ · rec ma no ha sam are n Muradov During the ether as a li e last node i tangles by r ult gives us d then saved During

Orkhan Muradov 14

Our next approaches without checking integral sum gave very decent results. Results were surprisingly fast compared to the integral sum implementation. Later two hotspots prior to exactly matching approach was tested and the results were even faster than multithreaded build tree match implementation [Table 2]. As it was expected results became slower with 3 hotspot matching. Matching hotspot requires 3 times more time than matching pixel one by one. This is because we need to calculate offset and then match the pixel. We came to conclusion that this approach is not very reliable because of the difficulty of optimizing how many hotspots we need to check before exactly matching the whole feature. As we can see from Table 2, one hotspot matching is faster with Firefox settings windows for OS X that two-hotspot matching algorithm.

Integral Image threshold approach was developed by combining above described two approaches. It is not very efficient to calculate integral sum for very small features and it is much faster to exactly match right away. So, this algorithm decides whether integral sum needs to be calculated or not. We see some improvements from just plain integral image algorithm but it was a little slower than exactly matching the image after one or two hotspot matches. This happens because majority of the features in the library are very small, of a size less than 10 pixels.

So far we have tried to reduce number of pixel visits in the image, reduce number of pixel comparisons, and decide if we really need to compare every pixel for very small features. Aho-Corasick implementation is based only on following transitions in the trie and going through each pixel in the image only once. This approach combines our previous goals in order to improve efficiency. Aho-Corasick algorithm is also very stable in a way that we do not need to guess how many hotspots we need to check before checking other pixels. Because most of the work is done during pre-processing time, Aho-Corasick approach proved to be really fast. In the Table 3, all of the above algorithms were re-run on a six-core machine using library that contains 676 prototypes. The results of this approach do not depend on the size of the library as much as above described algorithms.

6. Discussion and Conclusion

So far we have tried creating various algorithms that are very different from each other but have the same goal: improve the performance of Prefab feature recognition. At first, we tried to minimize number of pixel visits in the image by putting partial matches into various data structures. As a result we had to make a large number of comparisons, which slowed the whole process. In order to reduce number of comparisons we started using so called integral image. It helped to filter out many features and we were left only with features that contained the same number of pixels but in different order. This implementation was not as fast as we expected because the library consists mostly of small features. Calculating integral image for small features was significantly slower than matching pixel by pixel. Next we tried eliminating integral image calculation and using threshold algorithm with it. This gave us decent results but the approach was not very reliable because we cannot really decide which feature is small or big. The next strategy was to use Aho-Corasick algorithm, which uses a special trie structure with pointers and back-pointers. This helped us to achieve our goals: visit each pixel in the image only once and decrease number of comparisons.

Page 16: Improving performance of prototype recognition in Prefab€¦ · rec ma no ha sam are n Muradov During the ether as a li e last node i tangles by r ult gives us d then saved During

Orkhan Muradov 15

Given what I have done I think it remains unseen which algorithm will prove to be faster at large scale. This work will help to understand if Prefab can find prototypes instantaneously in real time using a large library of features.

There are other approaches to be explored, such as reusing found features from previous frame in a new frame. This would significantly improve the performance because we do not need to recalculate the same features over the same parts of the image. Since Aho-Corasick implementation showed really good and stable results, we could try running it in parallel. The image could be divided into segments and each thread can look for features in each segment using the same trie. Also, if scaling is an issue then we could run Aho-Corasick algorithm in parallel with different libraries. In other words, a large library can be divided into many small ones and each thread would build its own trie. This might eliminate a large number of transitions while searching for features. At the end results from all the threads can be added together. All this work and new modifications would significantly help to get instantaneous widget recognition in applications using Prefab.

Acknowledgements

I want to thank James Fogarty, Morgan Dixon and Daniel Leventhal for all the help and discussions related to this work. This work was generously supported by the National Science Foundation under an REU supplement to award IIS-0812590.

Page 17: Improving performance of prototype recognition in Prefab€¦ · rec ma no ha sam are n Muradov During the ether as a li e last node i tangles by r ult gives us d then saved During

Orkhan Muradov 16

References

[1] Dixon, M. and Fogarty, J. (2010). Prefab: Implementing Advanced Behaviors Using Pixel-Based Reverse Engineering of Interface Structure. Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI 2010), pp. 1525-1534.

[2] Dixon, M., Daniel Leventhal and Fogarty, J. (2011). Content and Hierarchy in Pixel-Based Methods for Reverse-Engineering Interface Structure. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. (CHI 2011), pp. 969-978.

[3] Wikipedia, Summed Area Table, http://en.wikipedia.org/wiki/Summed_area_table [4]Wikipedia, Aho-Corasick String Matching Algorithm, http://en.wikipedia.org/wiki/Aho%E2%80%93Corasick_string_matching_algorithm

[5] Kilpeläinen P. (2005 Spring). Set matching and Aho-Corasick Algorithm, http://www.cs.uku.fi/~kilpelai/BSA05/lectures/slides04.pdf