Li-Jia Li, Richard Socher, Li Fei-Fei

Towards Total Scene Understanding: Classification, Annotation and Segmentation in an Automatic Framework. Li-Jia Li, Richard Socher, Li Fei-Fei. City Travel. Pagoda. Sunrise Sunshine Sun. Weber et al. 00, Fergus et al. 03, Felzenszwalb et al. 04. Classification.

Transcript of Li-Jia Li, Richard Socher, Li Fei-Fei

  • Towards Total Scene Understanding: Classification, Annotation and Segmentation in an Automatic Framework. Li-Jia Li, Richard Socher, Li Fei-Fei

  • City Travel; Pagoda; Sunrise Sunshine Sun

  • City Travel; Pagoda; Sunrise Sunshine Sun. Segmentation, Classification, Annotation. Remark: approaches in yellow will be compared with our model in the later experiments.

  • City Travel; Pagoda; Sunrise Sunshine Sun. Segmentation, Classification, Annotation: Total Scene Understanding

  • Application

  • Classification, Annotation, Segmentation: mutually beneficial!

  • Athlete, Horse, Grass, Trees, Sky, Saddle. Classification, Annotation, Segmentation. Horse, Horse. Class: Polo

  • Horse, Horse, Horse, Horse, Horse, Sky, Tree, Grass. Athlete, Horse, Grass, Trees, Sky, Saddle. Classification, Annotation, Segmentation. Horse, Athlete. Class: Polo

  • Class: Polo. Horse, Horse, Horse, Horse, Horse. Athlete, Horse, Grass, Trees, Sky, Saddle. Classification, Annotation, Segmentation

  • Related Work: Tu et al. 03; Annotation, Segmentation; Li & Fei-Fei 07; Annotation, Classification; Sky, Grass, Horse, Athlete; Class: Polo; Classification, Segmentation; Tree; Heitz et al. 08; Class: Polo

  • Outline: Model, Learning, Recognition & Experiment. Classification, Annotation, Segmentation

  • Model diagram labels: C, Nr, O, R, NF, X, Ar, Nt, Z, S, T, D. Athlete, Horse, Grass, Trees, Sky, Saddle

  • C (class: Polo). Visual, Text. Athlete, Horse, Grass, Trees, Sky, Saddle

    Joint distribution of the random variables: Visual Component, Text Component. D

  • O. Text Component. D. Visual, Text. C (class: Polo)

  • R, NF. Color, Location, Texture, Shape. Text Component. O, D. Visual, Text. C (class: Polo)

  • R, NF, O, D. Visual, Text. C (class: Polo). X, Ar. Text Component.

  • R, NF, O, D. Visual, Text. C (class: Polo). X, Ar, Z, Nr, Nt. Connector variable. Athlete, Horse, Grass, Trees, Sky, Saddle

    Text Component.

  • R, NF, O, D. Visual, Text. C (class: Polo). X, Ar, Z, Nr, Nt. Connector variable. S. Athlete, Horse, Grass, Trees, Sky, Saddle

    Athlete, Horse, Grass, Trees, Sky, Saddle. Visible, Not visible. Switch variable

  • R, NF, O, D. Visual, Text. C (class: Polo). X, Ar, Z, Nr, Nt. Connector variable. S. Athlete, Horse, Grass, Trees, Sky, Saddle

    Visible, Not visible. Switch variable. T. Horse.

  • Outline: Model, Learning, Recognition & Experiment

  • Learning. Exact inference is intractable! Relationship of the random variables

  • Relationship of the random variables: top-down force; bottom-up force from visual information; bottom-up force from text information. Collapsed Gibbs Sampling (R. Neal, 2000)
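
The slide names collapsed Gibbs sampling, but the transcript gives no update equations. As a rough illustration only, the sketch below runs collapsed Gibbs sampling on a much simpler stand-in: a finite mixture in which each region carries one discrete visual word and is assigned to one object, with the mixing weights and per-object word distributions integrated out. This is not the authors' model. The function name `collapsed_gibbs`, the symmetric Dirichlet hyperparameters `alpha` and `beta`, and the toy data are all assumptions made for the example.

```python
import random


def collapsed_gibbs(words, num_objects, vocab_size,
                    alpha=1.0, beta=0.1, sweeps=50, seed=0):
    """Collapsed Gibbs sampling for a toy mixture of categoricals.

    words[i] is the visual-word index of region i (hypothetical data).
    Returns the object assignment per region and the count tables.
    """
    rng = random.Random(seed)
    # Random initial assignment of every region to an object.
    assign = [rng.randrange(num_objects) for _ in words]
    n_k = [0] * num_objects                                # regions per object
    n_kw = [[0] * vocab_size for _ in range(num_objects)]  # word counts per object
    for w, k in zip(words, assign):
        n_k[k] += 1
        n_kw[k][w] += 1

    for _ in range(sweeps):
        for i, w in enumerate(words):
            # Remove region i from the counts (the "-i" statistics).
            k_old = assign[i]
            n_k[k_old] -= 1
            n_kw[k_old][w] -= 1
            # Conditional with the Dirichlet-distributed parameters integrated out:
            # p(o_i = k | rest) ∝ (n_k + alpha) * (n_kw + beta) / (n_k + V * beta)
            weights = [
                (n_k[k] + alpha) * (n_kw[k][w] + beta) / (n_k[k] + vocab_size * beta)
                for k in range(num_objects)
            ]
            k_new = rng.choices(range(num_objects), weights=weights)[0]
            # Put region i back under its newly sampled object.
            assign[i] = k_new
            n_k[k_new] += 1
            n_kw[k_new][w] += 1
    return assign, n_k, n_kw


# Toy usage: eight regions, three visual words, two objects.
words = [0, 0, 1, 1, 2, 2, 0, 1]
assign, n_k, n_kw = collapsed_gibbs(words, num_objects=2, vocab_size=3)
```

Only count tables are stored because the parameters are marginalized; each update removes one region, samples from its conditional, and restores the counts.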

  • Scene/Event images from the Internet. There is no object-text correspondence

  • Scene/Event images from the Internet. Our model builds the correspondence. C, Nr, O, R, NF, X, Ar, Nt, Z, S, T, D

  • Athlete, Horse, Grass, Trees, Sky, Saddle. Athlete, Horse, Grass, Ball

    However, there is a big obstacle: many objects always co-occur. Scene/Event images from the Internet

  • C, R, NF, X, Ar, Nr, Z, Nt, T, S, O. One solution: a good initialization of O. Athlete, Horse, Grass, Trees, Sky, Saddle. Scene/Event images from the Internet

  • Scene/Event images from the Internet. Initializing O: obtain Internet images for each O. Object images

  • Scene/Event images. C, R, NF, X, Ar, Nr, Z, Nt, T, S, O, D. Any object detection & segmentation algorithm. Initializing O: train an object detector for each O. Object images. Event/Scene images

  • Scene/Event images. Black box object detection & segmentation. C, R, NF, X, Ar, Nr, Z, Nt, T, S, O, D. Initialize O in the scene image by the trained object detectors. Object images. Event/Scene images. Any object detection & segmentation algorithm

  • Scene/Event images. Black box object detection & segmentation. C, R, NF, X, Ar, Nr, Z, Nt, T, S, O, D. Initialize O in the scene image by the trained object detectors. Cao & Fei-Fei, 2007. C, X, R, O, Nr, Ar. Our Model. Object images. Event/Scene images

  • C, R, NF, X, Ar, Nr, Z, Nt, T, S, O, D. Auto-semi-supervised learning: small # of initialized images + large # of uninitialized images. Our Model. Scene/Event images
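
One way to read the auto-semi-supervised setup is that the object labels O start from detector output in a small subset of images and from an uninformed guess everywhere else, after which learning runs over all images together. The sketch below covers only that initialization step. The names `initialize_object_labels`, `region_counts` and `detector_labels`, and the choice of uniform random labels for the uninitialized images, are assumptions for illustration; the slides do not say how the uninitialized images start.

```python
import random


def initialize_object_labels(region_counts, detector_labels, num_objects, seed=0):
    """Return {image_id: [object label per region]}.

    region_counts:   {image_id: number of regions in that image}
    detector_labels: {image_id: [label per region]} for the small subset
                     covered by trained detectors (hypothetical input)
    """
    rng = random.Random(seed)
    labels = {}
    for image_id, n_regions in region_counts.items():
        if image_id in detector_labels:
            detected = detector_labels[image_id]
            if len(detected) != n_regions:
                raise ValueError(f"label/region mismatch for {image_id}")
            # Small set: start from the detector's labels.
            labels[image_id] = list(detected)
        else:
            # Large set: assumed uniform random start, refined during learning.
            labels[image_id] = [rng.randrange(num_objects) for _ in range(n_regions)]
    return labels


# Toy usage: one initialized image, two uninitialized ones.
region_counts = {"polo_001": 3, "polo_002": 2, "polo_003": 4}
detector_labels = {"polo_001": [0, 1, 1]}
labels = initialize_object_labels(region_counts, detector_labels, num_objects=3)
```

The returned labels would serve as the starting state for a sampler over all images; the detector output is an initialization, not a fixed constraint.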

  • Outline: Model, Learning, Recognition & Experiment. Small # of automatically initialized images; large # of uninitialized images

  • 8 Event/Scene Classes: Badminton, Bocce, Croquet, Polo. Remark: tags are not used during testing

  • 8 Event/Scene Classes: Rock climbing, Rowing, Sailing, Snowboarding

  • C, Nr, R, NF, X, Ar, Nt, Z, S, T, D, O. Learned model: O

  • Athlete, Grass, Horse. C, Nr, O, NF, X, Ar, Nt, Z, S, T, D, R. Learned model: R

  • C, Nr, O, R, NF, X, Ar, Nt, Z, T, D, S. Learned model: S

  • 8-way classification: 54%. Classification, Annotation, Segmentation

  • Classification, Annotation, Segmentation. Alipr: Li et al. 03; Corr LDA: Blei et al. 03

  • Classification, Annotation, Segmentation

  • Effect of top-down class context. Horse. Model w/o top-down class; Full Model

  • Outline: Model, Learning, Recognition & Experiment. Small # of automatically initialized images; large # of uninitialized images

  • Thanks to Prof. Silvio Savarese, Juan Carlos Niebles, Chong Wang, Barry Chai, Min Sun, Bangpeng Yao, Hao Su, Jia Deng, and the anonymous reviewers

    And you