Deep Multi-Modal Image Correspondence Learning ...chenliu/floorplan-matching/... · Deep...



CVPR 2017 Submission #138. CONFIDENTIAL REVIEW COPY. DO NOT DISTRIBUTE.

Deep Multi-Modal Image Correspondence Learning
Supplementary Material

Anonymous CVPR submission

Paper ID 138

In the supplementary material, we show more comprehensive experimental results and evaluations for 1) receptive field visualization (Figure 1 and Figure 2), 2) object discovery visualization (Figure 3), 3) image placement application (Figure 4), 4) image retrieval application (Figure 5), and 5) localization application (Figure 6 and Figure 7).

References

[1] C. Barnes, E. Shechtman, A. Finkelstein, and D. Goldman. PatchMatch: A randomized correspondence algorithm for structural image editing. ACM TOG, 28(3):24, 2009.

[2] B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba. Object detectors emerge in deep scene CNNs. In ICLR, 2015.

Figure 1: More receptive field visualization results. We adapt the idea in [2] to visualize the learned Receptive Fields (RFs). The network learns to attend to the room in a floorplan corresponding to the photograph. These results are for bathrooms.
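The RF visualization in [2] is occlusion-based: small occluders are slid over the input and the change in the network response is recorded as a discrepancy map. A minimal sketch of that idea follows. It assumes a hypothetical `score_fn` that returns the network's similarity score between a fixed photograph and a floorplan given as a 2D grid; the patch size, stride, and fill value are illustrative choices, not the settings used in the paper.

```python
from typing import Callable, List

Image = List[List[float]]


def occlusion_map(image: Image,
                  score_fn: Callable[[Image], float],
                  patch: int = 2,
                  stride: int = 1,
                  fill: float = 0.0) -> Image:
    """Average score drop per pixel when a square occluder covers it.

    `score_fn` is a hypothetical stand-in for the trained network's
    photo-floorplan similarity (an assumption, not the authors' API).
    """
    h, w = len(image), len(image[0])
    base = score_fn(image)
    drop_sum = [[0.0] * w for _ in range(h)]
    count = [[0] * w for _ in range(h)]
    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            # Copy the image and blank out one patch.
            occluded = [row[:] for row in image]
            for y in range(top, top + patch):
                for x in range(left, left + patch):
                    occluded[y][x] = fill
            drop = base - score_fn(occluded)
            # Credit the drop to every pixel the occluder covered.
            for y in range(top, top + patch):
                for x in range(left, left + patch):
                    drop_sum[y][x] += drop
                    count[y][x] += 1
    return [[drop_sum[y][x] / count[y][x] if count[y][x] else 0.0
             for x in range(w)]
            for y in range(h)]
```

Pixels with a large average drop are those the score depends on most, which is what the highlighted floorplan regions in the figure are meant to convey.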


Figure 2: More receptive field visualization results. Rows 1-2 are for kitchens and rows 3-4 are for living-rooms. Rows 5-7 show failure cases (bathrooms, kitchens, and living-rooms, respectively).


Figure 3: More object discovery visualization results. We remove an object from a photograph using PatchMatch [1] and manually remove the corresponding object icon from the floorplan. The similarity scores often suggest that the network has learned to detect objects and match their detections to overcome the modality differences. The two examples in the bottom row are failure cases where the similarity scores fail to follow the expected pattern.
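The comparison behind this figure can be organized as a table of similarity scores over the four combinations of original and edited inputs. The sketch below only tabulates those scores. The `similarity` callable is a hypothetical stand-in for the trained network, and the toy version in the usage note (which compares sets of object labels) is purely illustrative and makes no claim about the scores reported in the paper.

```python
from typing import Callable, Dict, Tuple, TypeVar

P = TypeVar("P")  # photograph representation
F = TypeVar("F")  # floorplan representation


def removal_score_table(photo: P,
                        photo_without_object: P,
                        plan: F,
                        plan_without_icon: F,
                        similarity: Callable[[P, F], float]
                        ) -> Dict[Tuple[str, str], float]:
    """Score all four (photo, floorplan) combinations.

    `similarity` is assumed to return a higher value for a better match.
    """
    photos = {"original": photo, "object removed": photo_without_object}
    plans = {"original": plan, "icon removed": plan_without_icon}
    return {(p_name, f_name): similarity(p, f)
            for p_name, p in photos.items()
            for f_name, f in plans.items()}
```

As a toy usage, representing each image by the set of objects it contains and scoring with `-len(a ^ b)` gives the highest score when the photograph and the floorplan agree on the objects present, and a lower score when only one of the two has been edited.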


Figure 4: More image placement results. The success of RF visualization allows us to visualize photographs in the context of a floorplan image, an effective application for real estate websites.


Figure 5: More image retrieval results. Our network is able to show likely room appearances from a floorplan alone. Rows 1-3, 4-6, and 7-8 are for bathrooms, kitchens, and living-rooms, respectively.
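Retrieval of this kind can be sketched as ranking a pool of candidate photographs by their similarity to the query floorplan. The function below is a minimal illustration, assuming a hypothetical `similarity(photo, floorplan)` callable in place of the trained network; the candidate pool and the cutoff `k` are likewise illustrative.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

P = TypeVar("P")  # photograph representation
F = TypeVar("F")  # floorplan representation


def retrieve_photos(floorplan: F,
                    photos: Sequence[P],
                    similarity: Callable[[P, F], float],
                    k: int = 3) -> List[Tuple[float, P]]:
    """Return the k photographs scoring highest against the floorplan.

    Ties are broken by the original order of `photos`.
    """
    scored = [(similarity(photo, floorplan), index, photo)
              for index, photo in enumerate(photos)]
    # Sort by descending score, then by ascending input position.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(score, photo) for score, _, photo in scored[:k]]
```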


Figure 6: More localization results. The simplification technique [2] is particularly effective for floorplan images, enabling the localization of an exact image region corresponding to the photograph.
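The simplification in [2] is greedy: it repeatedly removes the image segment whose removal lowers the score the least, and stops once further removal would hurt the score too much. The sketch below illustrates that loop on abstract segment identifiers. It assumes a hypothetical `score_fn` that scores the floorplan with only the kept segments visible, and a stopping threshold `min_score`; how the floorplan is segmented and how the stopping point is chosen in the paper are not specified here.

```python
from typing import Callable, FrozenSet, Hashable, Iterable, Set


def simplify(segments: Iterable[Hashable],
             score_fn: Callable[[FrozenSet[Hashable]], float],
             min_score: float) -> Set[Hashable]:
    """Greedily drop segments while the score stays at or above min_score.

    `score_fn` is a hypothetical stand-in for evaluating the network on a
    floorplan that shows only the kept segments.
    """
    kept = set(segments)
    while len(kept) > 1:
        best_segment, best_score = None, float("-inf")
        # Try each removal; remember the one that hurts the score least.
        for segment in sorted(kept, key=repr):
            score = score_fn(frozenset(kept - {segment}))
            if score > best_score:
                best_segment, best_score = segment, score
        if best_score < min_score:
            break  # every remaining segment matters too much to remove
        kept.remove(best_segment)
    return kept
```

The segments that survive form the localized region: they are the parts of the floorplan the score cannot do without.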


Figure 7: More localization results.
