Putting Context into Vision
Derek Hoiem
September 15, 2004

Questions to Answer
• What is context?
• How is context used in human vision?
• How is context currently used in computer vision?
• Conclusions

What is context?
Any data or meta-data not directly produced by the presence of an object
• Nearby image data
• Scene information
• Presence, locations of other objects
[Figure labels: Context; Tree]

How do we use context?

Attention
Are there any live fish in this picture?

Clues for Function
What is this?
Now can you tell?

Low-Res Scenes
What is this?
Now can you tell?

More Low-Res
What are these blobs?
The same pixels! (a car)

Why is context useful?
Objects defined at least partially by function
• Trees grow in ground
• Birds can fly (usually)
• Door knobs help open doors
Context gives clues about function
• Not rooted into the ground → not tree
• Object in sky → {cloud, bird, UFO, plane, superman}
• Door knobs always on doors
Objects like some scenes better than others
• Toilets like bathrooms
• Fish like water
Many objects are used together and, thus, often appear together
• Kettle and stove
• Keyboard and monitor

How is context used in computer vision?

Neighbor-based Context
Markov Random Field (MRF) incorporates contextual constraints
[Figure labels: sites (or nodes); neighbors]

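A minimal sketch of how neighboring sites can constrain a label, assuming a binary 4-connected Potts-style MRF solved with iterated conditional modes (ICM). The function name `icm_denoise`, the smoothness weight `beta`, and the unit data penalty are illustrative assumptions, not taken from the slides or from any of the cited papers.

```python
def icm_denoise(noisy, beta=1.0, sweeps=5):
    """Iterated conditional modes on a binary, 4-connected Potts-style MRF.

    noisy: list of lists of 0/1 observations (one per site).
    beta:  hypothetical penalty paid for each neighbor with a different label.
    """
    rows, cols = len(noisy), len(noisy[0])
    labels = [row[:] for row in noisy]  # start from the observed labels
    for _ in range(sweeps):
        changed = False
        for r in range(rows):
            for c in range(cols):
                best_label, best_cost = labels[r][c], float("inf")
                for candidate in (0, 1):
                    # Data term: disagreeing with the observation costs 1.
                    cost = 0.0 if candidate == noisy[r][c] else 1.0
                    # Context term: each disagreeing neighbor costs beta.
                    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        nr, nc = r + dr, c + dc
                        if 0 <= nr < rows and 0 <= nc < cols:
                            if labels[nr][nc] != candidate:
                                cost += beta
                    if cost < best_cost:
                        best_label, best_cost = candidate, cost
                if best_label != labels[r][c]:
                    labels[r][c] = best_label
                    changed = True
        if not changed:
            break  # no site changed its label: a local optimum was reached
    return labels
```

With `beta=1.0`, a single flipped site surrounded by four agreeing neighbors is relabeled to match them, because the four neighbor penalties outweigh the one data penalty.
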
Blobs and Words – Carbonetto 2004
• Neighbor-based context (MRF) useful even when training data is not fully supervised
• Learns models of objects given captioned images
…

Discriminative Random Fields – Kumar 2003
Using data surrounding the label site (not just at the label site) improves results
[Figure: Buildings vs. Non-Buildings]

Multi-scale Conditional Random Field (mCRF) – He 2004
[Figure labels: Independent data-based labels; Raw image; Local context; Scene context]

mCRF
Final decision based on:
• Classification (local data-based)
• Local labels (what relation nearby objects have to each other)
• Image-wide labels (captures coarse scene context)

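The three sources listed above can be sketched as a product of per-label scores that is renormalized. This is a deliberately simplified, single-site illustration: the function name and the dictionary inputs are hypothetical, and the sketch does not reproduce the learned label features of He et al.'s model.

```python
def combine_label_distributions(local, regional, image_wide):
    """Multiply per-label probabilities from three sources and renormalize.

    Each argument maps a label name to a probability (assumed inputs):
      local:      classifier output from the local image data
      regional:   support from nearby labels
      image_wide: support from the coarse scene context
    """
    unnormalized = {
        label: local[label] * regional[label] * image_wide[label]
        for label in local
    }
    total = sum(unnormalized.values())
    return {label: value / total for label, value in unnormalized.items()}
```

In this sketch an ambiguous local classifier (equal scores for two labels) is resolved entirely by the two context terms, which is the behavior the slide attributes to local and image-wide labels.
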
mCRF Results

Neighbor-based Context References
• P. Carbonetto, N. de Freitas and K. Barnard, “A Statistical Model for General Contextual Object Recognition,” ECCV, 2004
• S. Kumar and M. Hebert, “Discriminative Random Fields: A Discriminative Framework for Contextual Interaction in Classification,” ICCV, 2003
• X. He, R. Zemel and M. Carreira-Perpiñán, “Multiscale Conditional Random Fields for Image Labeling,” CVPR, 2004

Scene-based Context
[Figure: Average pictures containing heads at three scales]

Context Priming – Torralba 2001/2003
[Equation figure; surviving annotations:]
• Object Presence
• Context (generally ignored)
• Local evidence (what everyone uses)
• Scale, Location, Pose
• Image Measurements
• Pose and Shape Priming
• Focus of Attention
• Scale Selection
• Object Priming

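The equation on this slide did not survive extraction. As a hedged reconstruction, the four priming labels above are consistent with a chain-rule factorization of the object properties given contextual measurements. The symbols below (o for object class, x for location, σ for scale, p for pose and shape, v_C for contextual measurements) are assumed notation, and the pairing of each factor with a label is an interpretation of the annotations rather than a quotation of the original.

```latex
% Assumed notation: o = object class, x = location, \sigma = scale,
% p = pose/shape, v_C = contextual image measurements.
P(o, x, \sigma, p \mid v_C)
  = \underbrace{P(p \mid \sigma, x, o, v_C)}_{\text{pose and shape priming}}
    \, \underbrace{P(\sigma \mid x, o, v_C)}_{\text{scale selection}}
    \, \underbrace{P(x \mid o, v_C)}_{\text{focus of attention}}
    \, \underbrace{P(o \mid v_C)}_{\text{object priming}}
```

The identity itself is just the chain rule of probability; what the context-priming work contributes is estimating each factor from scene-level measurements.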
Getting the Gist of a Scene
• Simple representation
• Spectral characteristics (e.g., Gabor filters) with coarse description of spatial arrangement
• PCA reduction
• Probabilities modeled with mixture of Gaussians (2003) or logistic regression (Murphy 2003)

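A rough sketch of the "spectral characteristics with coarse spatial arrangement" idea. To stay dependency-free, horizontal and vertical finite differences stand in for a Gabor filter bank, and the PCA step is omitted; the function name and the `grid` parameter are assumptions for illustration only.

```python
def coarse_gradient_descriptor(image, grid=2):
    """Average horizontal/vertical gradient energy over a grid x grid layout.

    image: list of lists of grayscale values.
    Returns a flat list with two numbers (dx energy, dy energy) per grid cell,
    in row-major cell order.
    """
    rows, cols = len(image), len(image[0])
    sums = [[[0.0, 0.0] for _ in range(grid)] for _ in range(grid)]
    counts = [[0] * grid for _ in range(grid)]
    for r in range(rows - 1):
        for c in range(cols - 1):
            dx = image[r][c + 1] - image[r][c]  # responds to vertical edges
            dy = image[r + 1][c] - image[r][c]  # responds to horizontal edges
            cell_r = r * grid // rows           # coarse spatial position
            cell_c = c * grid // cols
            sums[cell_r][cell_c][0] += dx * dx
            sums[cell_r][cell_c][1] += dy * dy
            counts[cell_r][cell_c] += 1
    descriptor = []
    for cell_r in range(grid):
        for cell_c in range(grid):
            n = max(counts[cell_r][cell_c], 1)  # avoid division by zero
            descriptor.append(sums[cell_r][cell_c][0] / n)
            descriptor.append(sums[cell_r][cell_c][1] / n)
    return descriptor
```

A real gist pipeline would use several orientations and scales and then project the resulting vector onto a few principal components; this sketch only shows how orientation energy and coarse layout end up in one low-dimensional vector.
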
Context Priming Results
[Figure panels: Object Presence; Focus of Attention; Scale Selection (Small, Large)]

Using the Forest to See the Trees – Murphy (2003)
[Diagram:]
• Adaboost Patch-Based Object Detector → Detector Confidence at Location/Scale
• Gist of Scene (boosted regression) → Expected Location/Scale of Object
• Combine (logistic regression) → Object Probability at Location/Scale

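The "Combine (logistic regression)" step can be sketched as follows. This is a one-dimensional illustration under stated assumptions: the context feature is a Gaussian log-density around the expected location, and the weights are hypothetical placeholders that would normally be learned from training data. None of the names come from Murphy et al.

```python
import math


def gaussian_log_density(x, mean, std):
    """Log of a 1-D Gaussian density, used here as the context feature."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def combined_object_probability(detector_score, location, expected_location,
                                location_std, weights=(-2.0, 3.0, 1.0)):
    """Logistic combination of local detector confidence and scene context.

    weights = (bias, detector weight, context weight); hypothetical values.
    """
    bias, w_detector, w_context = weights
    context_feature = gaussian_log_density(location, expected_location,
                                           location_std)
    activation = bias + w_detector * detector_score + w_context * context_feature
    return 1.0 / (1.0 + math.exp(-activation))
```

With the same detector score, a candidate at the location the scene gist predicts receives a higher probability than one far from it, which is the effect the diagram describes.
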
Object Detection + Scene Context Results
• Often doesn’t help that much
• May be due to poor use of context
  – Assumes independence of context and local evidence
  – Only uses expected location/scale from context
[Figure panels: Keyboard; Screen; People]

Scene-based Context References
• E. Adelson, “On Seeing Stuff: The Perception of Materials by Humans and Machines,” SPIE, 2001
• B. Bose and E. Grimson, “Improving Object Classification in Far-Field Video,” ECCV, 2004
• K. Murphy, A. Torralba and W. Freeman, “Using the Forest to See the Trees: A Graphical Model Relating Features, Objects, and Scenes,” NIPS, 2003
• U. Rutishauser, D. Walther, C. Koch, and P. Perona, “Is bottom-up attention useful for object recognition?,” CVPR, 2004
• A. Torralba, “Contextual Priming for Object Detection,” IJCV, 2003
• A. Torralba and P. Sinha, “Statistical Context Priming for Object Detection,” ICCV, 2001
• A. Torralba, K. Murphy, W. Freeman, and M. Rubin, “Context-Based Vision System for Place and Object Recognition,” ICCV, 2003

Object-based Context

Mutual Boosting – Fink (2003)
[Diagram labels: Raw Image; Filters; Local Window; Contextual Window; Object Likelihoods (eyes, faces); Confidence]

Mutual Boosting Results
[Figure panels: Learned Features; First-Stage Classifier (MIT+CMU)]

Contextual Models using BRFs – Torralba 2004
• Template features
• Build structure of CRF using boosting
• Other objects’ locations’ likelihoods propagate through network

Labeling a Street Scene

Labeling an Office Scene
[Figure labels: F = local; G = compatibility]

Object-based Context References
• M. Fink and P. Perona, “Mutual Boosting for Contextual Inference,” NIPS, 2003
• A. Torralba, K. Murphy, and W. Freeman, “Contextual Models for Object Detection using Boosted Random Fields,” AI Memo 2004-013, 2004

What else can be done?

Scene Structure
Improve understanding of scene structure
• Floor, walls, ceiling
• Sky, ground, roads, buildings

Semantics vs. Low-level
[Diagram labels: Low-Level; Semantics; Image Statistics; Presence of Object; Statistics Useful to OD; High-D; Low-D]

Putting it all together
[Diagram labels: Scene Gist; Context Priming; Basic Structure Identification; Neighbor-Based Context; Object Detection; Object-Based Context; Scene Recognition; Scene-Based Context]

Summary
Neighbor-based context
• Using nearby labels essential for “complete labeling” tasks
• Using nearby labels useful even without completely supervised training data
• Using nearby labels and nearby data is better than just using nearby labels
• Labels can be used to extract local and scene context

Summary
Scene-based context
• “Gist” representation suitable for focusing attention or determining likelihood of object presence
• Scene structure would provide additional useful information (but difficult to extract)
• Scene label would provide additional useful information

Summary
Object-based context
• Even simple methods of using other objects’ locations improve results (Fink)
• Using BRFs, systems can automatically learn to find easier objects first and to use those objects as context for other objects

Conclusions
General
• Few object detection researchers use context
• Context, when used effectively, can improve results dramatically
• A more integrated approach to use of context and data could improve image understanding

References
• E. Adelson, “On Seeing Stuff: The Perception of Materials by Humans and Machines,” SPIE, 2001
• B. Bose and E. Grimson, “Improving Object Classification in Far-Field Video,” ECCV, 2004
• P. Carbonetto, N. de Freitas and K. Barnard, “A Statistical Model for General Contextual Object Recognition,” ECCV, 2004
• M. Fink and P. Perona, “Mutual Boosting for Contextual Inference,” NIPS, 2003
• X. He, R. Zemel and M. Carreira-Perpiñán, “Multiscale Conditional Random Fields for Image Labeling,” CVPR, 2004
• S. Kumar and M. Hebert, “Discriminative Random Fields: A Discriminative Framework for Contextual Interaction in Classification,” ICCV, 2003
• J. Lafferty, A. McCallum and F. Pereira, “Conditional random fields: Probabilistic models for segmenting and labeling sequence data,” ICML, 2001
• K. Murphy, A. Torralba and W. Freeman, “Using the Forest to See the Trees: A Graphical Model Relating Features, Objects, and Scenes,” NIPS, 2003
• U. Rutishauser, D. Walther, C. Koch, and P. Perona, “Is bottom-up attention useful for object recognition?,” CVPR, 2004
• A. Torralba, “Contextual Priming for Object Detection,” IJCV, 2003
• A. Torralba and P. Sinha, “Statistical Context Priming for Object Detection,” ICCV, 2001
• A. Torralba, K. Murphy, and W. Freeman, “Contextual Models for Object Detection using Boosted Random Fields,” AI Memo 2004-013, 2004
• A. Torralba, K. Murphy, W. Freeman, and M. Rubin, “Context-Based Vision System for Place and Object Recognition,” ICCV, 2003