Improving performance of prototype recognition in Prefab€¦ · rec ma no ha sam are n Muradov...
Transcript of Improving performance of prototype recognition in Prefab€¦ · rec ma no ha sam are n Muradov...
Improving performance of prototype recognition in Prefab
By
Orkhan Muradov
A senior thesis submitted in partial fulfillment of
The requirements for the degree of
Bachelor of Science
With Departmental Honors
Computer Science & Engineering
University of Washington
June 2011
Thesis and presentation approved by
_____________________________________
Date
_____________
Orkha
Abst
windoby revapplicnew wwheneanothevery im
1. In
indepeinteracinterprthe soudemoncursorpresencustomexecut
an Muradov
tract
Because mow building bverse engineeation has a g
ways. Most oever we use ter. In order tmportant.
ntroducti
Using Prefendently of tction in the creted using Purce windownstrated manr, visualizingnting softwarmization, andting this pixe
most graphicablocks, Prefaering an integreat potentiof the applicathem we opeto identify al
on and M
fab we can ntheir underlycontext of rePrefab. Thenw [Figure 1].ny potential eg state changre tutorials, ad more. All tel-based feat
Prefab
al user interfab helps to rerface and bual for identifations that werate very qull the parts o
Motivatio
now interpretying implemeeal application various enh. Papers preenhancemen
ges with phoadaptive GUthese interacture identific
b’s pixel-ba
faces that werecognize theuilding a treefying the stru
we use on daiuickly by scrf the interfac
on
t pixels of anentation. Thons. Pixels ahancements sented at CH
nts: dynamicsphor, param
UI visualizatiction techniqcation cycle
Figure
ased widge
e see nowadaem by lookine structure reructure of intily basis havrolling and sce correctly,
n existing aphis help reseaare first copiecan be adde
HI 2010 [1] aally expandi
meter spectruion, languagques recognimultiple tim
1
et identific
ays consist ong only at raepresenting tterfaces and
ve many widswitching fro, the speed o
pplication anarchers test aed from a sod, with inpuand CHI 201ing the motoum previews
ge translationze widget la
mes.
ation cycle
of various buaw pixels. Prthat interfaccustomizing
dgets and butom one windof this applic
nd modify thand deploy tource windowut then mapp11 [2] have or space withs with side vn, user interfayout in real
e
1
uttons or refab works e. This g them in ttons, and dow to cation is
hem their w and ed back to
h bubble views, face time by
Orkha
Prefabchoosedecisiothe co
pixel ia matcThe mthread
Befo•Find•Buil
rela
an Muradov
Successfullyb’s original ies a non-tranon tree in whlor of the pix
During runis the hotspoched feature.
multi-threadeds to look for
ore Runtimed non-transparentld a decision tree wative to the hotspo
y executing tmplementatinsparent pixehich every nxel at that of
De
ntime this imot pixel of a f. If the pixeld version of r features.
t hotspots for eachwhere each node ot and edges repre
this cycle reion uses a deel hotspot pi
node stores affset.
ecision tree
mplementatiofeature. It m doesn’t mat
f this implem
h feature in the librepresents an offs
esent pixel colors
quires a metecision tree ixel for eachn offset rela
Figure 2
of a Build Tr
on scans thromoves through
tch a pixel inmentation div
braryset
Du•T
m
thod for rapi[Figure 2]. T
h feature in thative to the h
2
Tree impleme
ough the imah the image n the decisiovides a bitma
uring RuntimeTry to match the hmatches then trav
idly finding This Build The library. T
hotspot and e
entation
age only oncand traverse
on tree, then ap into segm
hotspot pixel int heerse the tree unti
features in aTree approachThen Prefab bevery edge re
ce by checkines the tree untree traversa
ments and use
e decision tree. If il the feature is fou
2
an image. h first builds a epresents
ng if the ntil it finds al ends. es multiple
it und
Orkha
2. Ex
prototyperforalgorit
2.1 P
of thispixel ithis femore f
improvlarge, below
B•C•C
an Muradov
xamining
It is difficuypes is relati
rmance of feathms, benchm
Partially
Before the s structure is in the main ieature is addefeatures to th
The goal oving the perwe started u describes th
1 Bui2 Init3 FO4 5 6 7 8 9 10 11 12 13
efore RuntimeConvert every featCreate empty list o
g Strateg
ult to say howively small aature detectimarking soft
Matchin
execution aa match nod
image and ifed into the phe partial ma
of this approformance. B
using differenhe algorithm
ild a list of ftialize a list
OR each pixeIF pixel ENDIF IF pixel
eture into a linked of partial matches
gies for Fi
w reliable thand there areion is very imftware, and a
ng
all the featurede that specif a pixel matcpartial matchatches and ad
oach was to vBecause the int data struc
m:
features namof features nl in image l matches theMove to theIF Feature.N We hENDIF
l matches theIF Feature.N We hELSE
list of pixels (1)s (2)
inding F
he initial algoe no other almportant fora larger librar
es are conveifies that a feches the firs
hes data strucdvance corre
visit each pinitial collect
ctures in orde
med STARTnamed PART
e first pixel e next node oNext is a Mahave a match
e first pixel Next is a Mahave a match
Du•T
p•If
o
eatures
orithm is beclgorithms to r Prefab’s funary.
erted into a lieature was fot pixel of thecture. As weesponding n
xel in the imtion of linkeer to improv
TIAL
of feature inof linked listatch THEN h
of feature inatch THEN h
uring RuntimeTry to match next partial matches unf next pixel in the
of the feature then
cause Prefabcompare winctionality,
inked list of ound. The pre feature in t
e iterate throuodes in the p
mage only oned list featureve retrieval ti
n PARTIAL t in PARTIA
n START TH
pixel in the imagentil we get to the e
image is the samen add to Partial ma
b’s current liith. Because we develope
f pixels. The rogram visitthe data struugh the imagpartial match
nce, therefores could becime. Pseudo
THEN AL
HEN
e to next node in thend of linked list (4e as the starting piatches (10-16)
3
ibrary of
ed other
last node ts each ucture then ge we add hes.
re come very o code
he 4-9)ixel
Orkha
In ord
List of
slowly Hashta
list of data stkey an
2D Ar
compathe imfeature Later ldrama 2.2 I
Intrectan
where
Corne in th integr
an Muradov
14 15 16 17 EN
er to improv
f list of featuThis was in
y with a larg
able In the first
features thattructure wasnd correspon
rray
This implearing to the h
mage. Each coes
look ahead hatically.
Integral I
tegral imagegular subset
each value
Figure 3 ers of rectanghe image forral sum calc
ENDIF
NDFOR
ve the results
ures nitial structuer library of
implementat start with t getting very
nding value w
ementation whashcode calolumn/row c
hotspot was
Image
e is an algorit of a grid [F
in the image
beffroothcom
gle r ulation
Add ENDIF
s, various da
ure that was uf features.
ation of hashthis pixel. Thy large. The was a linked
was much faslculation in hcorresponds
added to this
thm for quicigure 2]. Ev
e correspond
This algorfore the runt
om the imageher features cmparing to t
feature.Nex
ata structures
used in the i
htable, the kehis appearednext implem
d list of featu
ster than the hashtable. Tto the colum
s implement
ckly and effiery row/colu
ds to the pixe
rithm was vetime executioe and keepincan be easilythe integral s
xt from STA
s for START
implementat
ey was a pixd to behave smentation of ures.
hashtable bThis matrix hmn/row in the
tation, which
iciently geneumn in the in
el value in th
ery promisinon, which in
ng features iny filtered outsum in the in
ART to PART
T and PARTI
tion but it ap
xel and the vaslowly becauf hashtable in
because retriehas the same e image and
h improved t
erating the suntegral imag
he integer for
g because mncludes calcun a linked list by calculatntegral imag
TIAL
IAL were us
ppeared to be
alue was a liuse the partiancluded colu
eval time waheight and w
d is linked to
the results
um of valuesge is calculat
rmat.
most of the wulating integst data structting integrale.
4
sed:
ehave very
ist of linked al matches umn as a
as instant width as the list of
s in a ted by:
work is done gral image ture. Many sum and
Orkha
togThrecres
an
horecmanohasamare
an Muradov
During thegether as a lihe last node ictangles by rsult gives us
d then saved
During the otspot pixel mctangle in thatches the va
ode. When wappens becaume. In fact, te reported if
Befo• Find• Divi• Calc• For a• Mar
HotspotNode
e preprocessiinked list. Inis a match norecursing to a minimal n
d as a check
runtime we matches the ie image of thalue in the ch
we get to the muse pixels mithroughout o
f the feature i
re Runtime Hotspotde every feature intoulate integral sum ofa linked list from thek last node a match n
t ChN
ing, stage evn this linked ode which lithe left, up,
number of re
node in the
go through timage pixel,he same areaheck node, wmatch node ight be in difour experimeis not also ex
o rectanglesf every rectanglese rectanglesnode
heck ode
very feature ilist the first inks to the feright, and th
ectangles. In
linked list. A
the main ima, then the nea as the next
we advance iwe need to efferent orderent it was coxactly match
Du• Tr• Ca• If
re• Ex
CheckNode
is divided innode is a ho
eature itself.hen down of ntegral sum f
As a result ou
age trying toxt step is tot check nodein the linkedexactly matcr in the rectaoncluded thahed.
uring Runtimery to match first nodalculate integral sumMatches advance in
eachedxactly match
k e
M
nto rectangleotspot which The feature
f the transparfor each recta
ur feature w
o match a hocalculate int
e. If integral d list until wech every pixangle but theat a large num
de by iterating througm of this rectanglen a linked list until ma
Match Node
es and then ch points to che was dividedrent pixel whangle is calc
will look as fo
otspot pixel. tegral sum fosum of this e get to the m
xel of the feae sum can stamber of false
gh the image
atch node is
FeatCla
5
combined heck nodes. d into hich as a culated by
ollows:
If the for the rectangle
match ature. This ay the e matches
ure ss
Orkha
2.
teccalwoonke
fasmafeafea
Ps
1 2 3 4 5 6 7
8 9 10 11 12
an Muradov
.3 Findin
After matcchnique was lculating theorks becausen small featury correspond
Matching mster method.atches to filtatures, in thiatures. This t
eudo-code f
IF tree con Sav IF a
0 EN Ad
2 ENDIF
ng Hotspo
ching integraused for fin
e integral sume most of theres. During pds to the hot
more than on After runnin
ter out big nuis case, tree btree is shallo
for the two h
ntains pixel ove next levelany of the sa
FOR ev
ENDFO
NDIF dd NEW to t
Before R•Find one o•Link to an•Build a tre
hotspots
ot and Ex
al sum of recnding featurem of the recte features in pre-processitspot pixel an
ne hotspot anng various teumber of feabehaves fastow and has h
hotspot and e
on the first lel pointed froaved nodes ivery feature iIF exactly m Dele MatcENDIF
OR
the saved po
untimeor more Hotspots actual feature
ee by combining si
xactly M
ctangles in thes in the imatangle, we cathe library a
ing stage, thend the value
nd then exacests it was coatures. Insteater. Every noheight of thre
exactly match
evel THENm this noden LIST contin the featur
match THENete from savech was found
inters LIST;
imilar
atching
he image, anage. After maan exactly mare small ande features we
e is the list of
ctly matchinoncluded thaad of using hode in the treee.
hing algorith
as NEW; tain pixel THres
N ed d
;
During Run•Try to matc
through the•Match othe•If rich the e
nother similaatching hotsp
match the fead exactly maere saved in f features tha
ng the featureat it usually hashtable to ee is a hotspo
hm is shown
HEN
ntimech first node by itee image (1-2)er hotspots if availnd then exactly m
ar but very sipot pixel, in
ature right awatching work
the hashtabat start with
e proved to btakes aroundcontain all tot and leave
n below:
erating
able (3)match (4- 9)
6
imple stead of way. This ks very fast le where this pixel.
be a much d 2 hotspot the s are list of
Orkha
2.4
dealgcofea
an Muradov
4 Thresh
This alcision whethgorithm provnsists of smaature’s size i
2.5 Aho The Ahfinds elemsimultaneoadditional transitions prefix [Figneed for ba
If the size of less than the•Find one or m•Link to an act•Build a tree b
similar hotsp
old of Pr
lgorithm comher to calculved to behavall features tis greater tha
o - Corasiho–Corasick ents of a fini
ously. This alinks betweebetween fai
gure 4]. This acktracking.
a feature is e limitmore Hotspotstual featureby combining ots
Before Runtim• Set a li
revious T
mbines aboveate and chec
ve very slowlthis algorithman the limit t
ick string matchite set of stri
algorithm conen the variouiled pattern maids the aut
memit
If thless• Fin• Div
rec• Cal
rec• For
rec• Ma
Two Tech
e described ack integral suly with smalm has an advthen we do c
hing algorithings within anstructs a finus internal nmatches to oomaton to tr
he size of a featus than the limitd Hotspotvide every feature intanglesculate integral sum otangle
r a linked list from thtangles
ark last node a match
hniques
algorithms ium or not. Fll features. Svantage by ncalculate inte
hm is a dictioan input textnite state ma
nodes. These other brancheransition bet
ure is
nto
of every
ese
h node
in such a wayFrom the resuSince the librnot calculatinegral sum.
onary-matcht and matcheachine that lo
extra internes of the trietween pattern
During Ru•Try to matcnode by itethrough the
•If check nocalculate insum
•Otherwise match
y that it makults integral rary of featung integral s
hing algorithes all patternook like a tri
nal links allowe that share an matches w
untimech first
erating e imagede exists
ntegral
exactly
7
kes a image
ures mostly sum. If the
hm that ns ie with w fast
a common without the
Orkha
Trie
an Muradov
Tdepends onruntime, soresult in thbigger thantested on a Fo{blue, red,
e includes th
Before R• Convert • Select ho• Build a li• Build the
his algorithmnly on followo during the he node then n 1, they are artificially cror instance le green, green
Aho-Cora
he following
Runtimeevery feature into ro
ot row by selecting thnked list from hot ro
e Trie
m lets us viswing the tranruntime we we found a divided in t
reated featuret’s build a tn}, {blue, gr
sick Trie wit
features: {b
owhe hotspotow and other rows
it every pixensitions in thjust need to feature. Bec
to the linked es of height trie from the reen, green}
Figure 4th transition
blue, red}, {b{green, b
Dur• Sea• If tr
dat• Go
ma
el in the imahe trie. The t
follow the pcause featured list of 1D fe
one. following f, {green, blu
4 ns and fail-tr
blue, red, greblue, red}.
ring Runtimearch the trie for pixelrie node contains reta structures (5)through the hot rowtch them (6)
age only oncetrie and featupointers in thes most of theatures. At f
features of heue, red}.
ransitions
een, green},
ls by following the tresults add them to co
ws and middle rows
e and the perures are builthe trie. If wehe time have first this algo
eight one: {b
{blue, green
ransitions (5)orresponding
and try to
8
rformance t before the e see a height orithm was
blue, red},
n, green},
Orkha
an Muradov
The whole
1) Divi
2) Put m
3) Buil
5) Runhead noUsing a
6) Go t
In orderow resultsnew instanbecause thimage. Theruntime a 2During runupdated to
Image repr
e process can
ide each feat
mini-feature
ld Trie
n Algorithm oodes into thea binary sear
through the q
er to improves has been usnce of result e program we new data s2D array is cntime, when
point to the
resentation o
n be represen
ture into 1D
es into the lin
on the imagee queue. Aftrch tree inste
queue and ca
e the performsed instead oclass has be
was creating tructure is mcreated wherthe hot row found hot ro
of this data s
nted as a pip
"mini-featur
nked list whe
e by markinger running aead of hashta
alculate num
mance duringof a queue. Ien created anmany result
much faster bre every rowis found, theows.
tructure is sh
eline proces
ures".
ere head nod
g middle noda few tests, oable gave fa
mber of featu
g runtime thIn the beginnand enqueuedt classes in obecause it us
w/column is re correspond
hown below
ss:
de contains h
des in the 2Done transitionaster results.
ures that wer
he following ning when a d into the qu
order to save ses only poinrepresent as ding entry in
w:
hotspot pixe
D array and pn is made on
re found
data structuhot row was
ueue. This wrow and col
nters. Beforea linked list
n the 2D arra
9
l.
putting n average.
ure for hot s found, a as slow lumn of the e the
node. ay is
Orkhan Muradov 10
3. Benchmarking Tool
In order to measure the performance of different algorithms, we needed a benchmarking tool that captures the average times over several trials. Datasets that were selected included different programs: YouTube video in Firefox, Microsoft Visual Studio, Calculator, Microsoft Word and its dialog boxes, Microsoft Excel, Microsoft PowerPoint, Skype, Windows Live Messenger, cs.washington.edu website in Internet Explorer, iTunes, PDF document in Acrobat Reader, and the Solitaire game. People use these programs every day and we needed to confirm that algorithms behave fast.
The benchmarking tool can be divided into two parts: a first phase where the datasets are created and the second phase for benchmarking the time it takes to run given functions over the image. During the first phase this tool starts capturing screen shot images after the user specifies the title of the window to be captured and name of the output xml file. User can stop the screenshot capturing at any time by clicking stop button or wait until the time exceeds the limit. The second phase uses a Wrapper Function class that contains all the function definitions that are used to benchmark the image. Each function is run over the image according to its priority.
For instance, let’s say we have two functions in our wrapper class where the first one represents Aho-Corasick algorithm with the highest priority and the second one is the Build Tree algorithm. Then when we run the benchmarking tool, it first uses the Aho-Corasick algorithm over the image and then the Build Tree algorithm is run over the same image. As an input of the second phase the program uses the same xml file for the image dataset which was created during the first phase. Each xml file for each trial consists of average times for every function executed during the trial and time spent on every function run on every image. Summary file includes average times in milliseconds for every function execution over all trials.
Benchmarking tool was used on various find features functions such as single-threaded and multi-threaded Build Tree algorithm, Integral Sum with one hotspot match algorithm, Integral Sum with one/two/three hotspot match and then exactly match algorithm and one hotspot match with or without integral sum using threshold of previous algorithms. Please see appendix for the table.
4. Building Bigger Library
In order to improve the functionality of existing software for library building, I have added the following features:
o Software automatically detects whether a new prototype goes to positive or negative set of examples.
o Scroller feature that lets the user browse images as a gallery view. This tool shows found prototypes by running find features algorithm on every image shown.
Orkhan Muradov 11
In order to get accurate results from our benchmark tool, Prefab needed a larger library of prototypes. Most algorithms brought satisfactory results when using our existing small library. But we were not sure if it is always the case. Having a larger library of prototypes helped identify bugs and determine disadvantages of new algorithms. The larger library I created contains around 700 prototypes from various frequently used programs, such as Mozilla Firefox, Google Chrome, Microsoft Office, YouTube, Calculator, and Microsoft Visual Studio.
5. Results
At first, the initial Build Tree algorithm was tested on various datasets of a size of 500-4000 images with the interval of 10ms. These benchmark tests were tested using initial large library of around 100 prototypes. The results showed that it takes much longer to find feature on YouTube webpage and Visual Studio datasets. In our understanding, images from Visual Studio dataset contain many buttons and widgets. On the other hand, an image’s pixels from YouTube video are more different than the image in the previous frame. [Table 1]
Dataset name Single-threaded Build Tree algorithm
Multi-threaded Build Tree algorithm
Calculator 83ms 33ms Visual Studio 265ms 159ms YouTube Video in Firefox Browser
237ms 138ms
Microsoft PowerPoint 2007
67ms 40ms
Microsoft Word 2007 110ms 63ms
Table 1
The first algorithm that was tested and compared to Build Tree approach was the Partially-matching algorithm. But the results received were much slower than single threaded build tree implementation. The bottle neck for this implementation is the number of comparisons we make in order to get a match.
In order to reduce the number of pixel comparisons we started using Integral Image algorithm which helped to filter out many features by calculating integral sum of a prototype. We also had to exactly match because some features might have same pixels but in different order, thus resulting with the same integral sum. Unfortunately we noticed that many features are alike and have the same integral sum. This approach proved to be fast with the old large library consisting of around 115 prototypes [Table 2]. But after building bigger library of prototypes probability of having features with same integral sum increased as number of prototypes in the library approached to 700 [Table 3].
Orkhan Muradov 12
Table 2
Benchmark results from running on quad-core machine using a library of size 115
Not
epa
d-in
it
128m
s
85m
s
142m
s
107m
s
75m
s
77m
s
123m
s
Chro
me-
sett
ings
-di
alog
247m
s
137m
s
291m
s
215m
s
147m
s
153m
s
260m
s
Goog
le
talk
80m
s
50m
s
95m
s
71m
s
49m
s
50m
s
82m
s
Not
epad
130m
s
87m
s
141m
s
108m
s
75m
s
77m
s
125m
s
Mac
-fir
efox
83m
s
51m
s
53m
s
42m
s
47m
s
44m
s
52m
s
Fire
fox
74m
s
54m
s
65m
s
50m
s
44m
s
48m
s
57m
s
Calc
ulat
or
53m
s
37m
s
47m
s
35m
s
32m
s
36m
s
42m
s
Imag
e na
me
/ im
plem
enta
tion
O
ver
3 tr
ials
in m
s
Sing
leth
read
ed
Build
Tree
Mul
tiTh
read
ed
Build
Tree
1 ho
tspo
t mat
ch +
In
tegr
alSu
m m
atch
+
exac
tly m
atch
1 ho
tspo
t mat
ch +
ex
actly
mat
ch
2 ho
tspo
t mat
ch +
ex
actly
mat
ch
3 ho
tspo
t mat
ch +
ex
actly
mat
ch
Thre
shol
d ho
tspo
t mat
ch
wit
h/w
itho
ut
Inte
gral
Sum
mat
ch +
ex
actly
mat
ch
Orkhan Muradov 13
Not
epad
-init
44m
s
25m
s
142m
s
40m
s
38m
s
77m
s
40
Chro
me-
sett
ings
-di
alog
93m
s
52m
s
181m
s
146m
s
124m
s
172m
s
97m
s
Goog
le
Talk
27m
s
17m
s
21m
s
19m
s
22m
s
24m
s
30m
s
Not
epad
46m
s
25m
s
45m
s
42m
s
41m
s
47m
s
43m
s
Mac
-fir
efox
83m
s
36m
s
65m
s
62m
s
47m
s
67m
s
88m
s
Fire
fox
93m
s
50m
s
202m
s
161m
s
44m
s
183m
s
87m
s
Calc
ulat
or
61m
s
37m
s
87m
s
63m
s
48m
s
78m
s
55m
s
Imag
e na
me
/ im
plem
enta
tion
O
ver
3 tr
ials
in m
s
Sing
leth
read
ed B
uild
Tree
Mul
tiTh
read
ed B
uild
Tree
1 ho
tspo
t mat
ch +
In
tegr
alSu
m m
atch
+
exac
tly m
atch
1 ho
tspo
t mat
ch +
exa
ctly
m
atch
2 ho
tspo
t mat
ch +
exa
ctly
m
atch
Thre
shol
d ho
tspo
t mat
ch
wit
h/w
itho
ut
Inte
gral
Sum
mat
ch +
ex
actly
mat
ch
Aho-
Cor
asic
k
Table 3
Benchmark results from running on six-core machine using a library with a size of 676 prototypes
Orkhan Muradov 14
Our next approaches without checking integral sum gave very decent results. Results were surprisingly fast compared to the integral sum implementation. Later two hotspots prior to exactly matching approach was tested and the results were even faster than multithreaded build tree match implementation [Table 2]. As it was expected results became slower with 3 hotspot matching. Matching hotspot requires 3 times more time than matching pixel one by one. This is because we need to calculate offset and then match the pixel. We came to conclusion that this approach is not very reliable because of the difficulty of optimizing how many hotspots we need to check before exactly matching the whole feature. As we can see from Table 2, one hotspot matching is faster with Firefox settings windows for OS X that two-hotspot matching algorithm.
Integral Image threshold approach was developed by combining above described two approaches. It is not very efficient to calculate integral sum for very small features and it is much faster to exactly match right away. So, this algorithm decides whether integral sum needs to be calculated or not. We see some improvements from just plain integral image algorithm but it was a little slower than exactly matching the image after one or two hotspot matches. This happens because majority of the features in the library are very small, of a size less than 10 pixels.
So far we have tried to reduce number of pixel visits in the image, reduce number of pixel comparisons, and decide if we really need to compare every pixel for very small features. Aho-Corasick implementation is based only on following transitions in the trie and going through each pixel in the image only once. This approach combines our previous goals in order to improve efficiency. Aho-Corasick algorithm is also very stable in a way that we do not need to guess how many hotspots we need to check before checking other pixels. Because most of the work is done during pre-processing time, Aho-Corasick approach proved to be really fast. In the Table 3, all of the above algorithms were re-run on a six-core machine using library that contains 676 prototypes. The results of this approach do not depend on the size of the library as much as above described algorithms.
6. Discussion and Conclusion
So far we have tried creating various algorithms that are very different from each other but have the same goal: improve the performance of Prefab feature recognition. At first, we tried to minimize number of pixel visits in the image by putting partial matches into various data structures. As a result we had to make a large number of comparisons, which slowed the whole process. In order to reduce number of comparisons we started using so called integral image. It helped to filter out many features and we were left only with features that contained the same number of pixels but in different order. This implementation was not as fast as we expected because the library consists mostly of small features. Calculating integral image for small features was significantly slower than matching pixel by pixel. Next we tried eliminating integral image calculation and using threshold algorithm with it. This gave us decent results but the approach was not very reliable because we cannot really decide which feature is small or big. The next strategy was to use Aho-Corasick algorithm, which uses a special trie structure with pointers and back-pointers. This helped us to achieve our goals: visit each pixel in the image only once and decrease number of comparisons.
Orkhan Muradov 15
Given what I have done I think it remains unseen which algorithm will prove to be faster at large scale. This work will help to understand if Prefab can find prototypes instantaneously in real time using a large library of features.
There are other approaches to be explored, such as reusing found features from previous frame in a new frame. This would significantly improve the performance because we do not need to recalculate the same features over the same parts of the image. Since Aho-Corasick implementation showed really good and stable results, we could try running it in parallel. The image could be divided into segments and each thread can look for features in each segment using the same trie. Also, if scaling is an issue then we could run Aho-Corasick algorithm in parallel with different libraries. In other words, a large library can be divided into many small ones and each thread would build its own trie. This might eliminate a large number of transitions while searching for features. At the end results from all the threads can be added together. All this work and new modifications would significantly help to get instantaneous widget recognition in applications using Prefab.
Acknowledgements
I want to thank James Fogarty, Morgan Dixon and Daniel Leventhal for all the help and discussions related to this work. This work was generously supported by the National Science Foundation under an REU supplement to award IIS-0812590.
Orkhan Muradov 16
References
[1] Dixon, M. and Fogarty, J. (2010). Prefab: Implementing Advanced Behaviors Using Pixel-Based Reverse Engineering of Interface Structure. Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI 2010), pp. 1525-1534.
[2] Dixon, M., Daniel Leventhal and Fogarty, J. (2011). Content and Hierarchy in Pixel-Based Methods for Reverse-Engineering Interface Structure. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. (CHI 2011), pp. 969-978.
[3] Wikipedia, Summed Area Table, http://en.wikipedia.org/wiki/Summed_area_table [4]Wikipedia, Aho-Corasick String Matching Algorithm, http://en.wikipedia.org/wiki/Aho%E2%80%93Corasick_string_matching_algorithm
[5] Kilpeläinen P. (2005 Spring). Set matching and Aho-Corasick Algorithm, http://www.cs.uku.fi/~kilpelai/BSA05/lectures/slides04.pdf