Transcript of a presentation by Roberto Vezzani - Imagelab – Università di Modena e Reggio Emilia.

Page 1:

Agenda

Page 2:

Presentation of ImageLab

• Computer Vision for robotic automation
• Digital Library content-based retrieval
• Medical Imaging
• Off-line video analysis for telemetry and forensics
• People and vehicle surveillance
• Video analysis for indoor/outdoor surveillance
• Multimedia: video annotation

Imagelab-Softech - Lab of Computer Vision, Pattern Recognition and Multimedia

Dipartimento di Ingegneria dell’Informazione
Università di Modena e Reggio Emilia, Italy - http://imagelab.ing.unimore.it

Page 3:

Imagelab: recent projects in surveillance

European projects:
• THIS (Transport Hubs Intelligent Surveillance), EU JLS/CHIPS Project, 2009-2010
• VIDI-Video, STREP VI FP EU (ViSOR: Video Surveillance Online Repository), 2007-2009

International projects:
• BE SAFE, NATO Science for Peace project, 2007-2009
• Detection of infiltrated objects for security, Australian Council, 2006-2008

Italian & regional projects:
• Behave_Lib, Regione Emilia Romagna Tecnopolo Softech, 2010-2013
• LAICA, Regione Emilia Romagna, 2005-2007
• FREE_SURF, MIUR PRIN Project, 2006-2008

Projects with companies:
• Building site surveillance, with Bridge-129 Italia, 2009-2010
• Stopped vehicles, with Digitek Srl, 2007-2008
• SmokeWave, with Bridge-129 Italia, 2007-2010
• Sakbot for traffic analysis, with Traficon, 2004-2006
• Mobile surveillance, with Sistemi Integrati, 2007
• Domotica per disabili (home automation for disabled people): posture detection, FCRM, 2004-2005

Page 4:

AD-HOC: Appearance Driven Human tracking with Occlusion Handling

Page 5:

Key aspects

• Based on the SAKBOT system
  – Background estimation and updating
  – Shadow removal
• Appearance-based tracking
  – We aim at recovering a pixel-based foreground mask, even during an occlusion
  – Recovering of missing parts from the background subtraction
  – Managing split and merge situations
• Occlusion detection and classification
  – Classify the differences as real shape changes or occlusions
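The background-subtraction step above can be sketched as follows. This is a minimal illustrative version, not the actual SAKBOT implementation: the learning rate `alpha` and threshold `thresh` are assumed values, and shadow removal is omitted. It shows the two ingredients the bullets describe: a per-pixel difference mask and a running-average background update that skips foreground pixels so moving objects do not pollute the model.

```python
import numpy as np

def foreground_mask(bg, frame, thresh=25):
    """Per-pixel absolute difference against the background model;
    a pixel is foreground if any colour channel exceeds the threshold."""
    diff = np.abs(frame.astype(np.int16) - bg.astype(np.int16))
    return diff.max(axis=-1) > thresh

def update_background(bg, frame, alpha=0.05, mask=None):
    """Running-average background update. Pixels flagged as foreground
    (mask == True) are left unchanged, so the model only absorbs slow
    background changes such as illumination drift."""
    bg = bg.astype(np.float32)
    updated = (1 - alpha) * bg + alpha * frame.astype(np.float32)
    if mask is not None:
        updated[mask] = bg[mask]
    return updated
```

In a real loop one would alternate the two calls frame by frame, feeding each new mask back into the update.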

Page 6:

Example 1 (from ViSOR)

Page 7:

Example 2 (from PETS 2002)

Page 8:

Example 3

Page 9:

Other experimental results

Imagelab videos (available on ViSOR)

PETS series

Page 10:

Results on the PETS2006 dataset

Working in real time at 10 fps!

Page 11:

Posture classification

Page 12:

Distributed surveillance with non-overlapping fields of view

Page 13:

Exploit the knowledge about the scene

• To avoid all-to-all matches, the tracking system can exploit knowledge about the scene:
  – Preferential paths -> pathnodes
  – Border lines / exit zones
  – Physical constraints & forbidden zones
  – Temporal constraints

Page 14:

Tracking with pathnodes

A possible path between Camera 1 and Camera 4

Page 15:

Pathnodes lead particle diffusion
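As a rough sketch of this idea (not the authors' exact diffusion model), particles can be propagated with a random-walk step biased toward the nearest pathnode, so that tracking hypotheses spread along preferential paths instead of uniformly; the `drift` and `noise` parameters below are assumed values for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def diffuse_particles(particles, pathnodes, drift=0.3, noise=1.0):
    """Move each particle partway toward its nearest pathnode, plus
    Gaussian noise. particles: (P, 2) positions; pathnodes: (N, 2)."""
    # distance from every particle to every pathnode -> nearest node
    d = np.linalg.norm(particles[:, None, :] - pathnodes[None, :, :], axis=-1)
    nearest = pathnodes[d.argmin(axis=1)]
    # biased random-walk step: drift toward the node + diffusion noise
    step = drift * (nearest - particles) + rng.normal(0, noise, particles.shape)
    return particles + step
```

With `noise=0` the cloud moves deterministically toward the pathnodes; increasing `noise` trades path adherence for exploration.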

Page 16:

Results with PF and pathnodesResults with PF and pathnodes

Single camera tracking: Multicamera tracking Recall=90.27% Recall=84.16%

Precision=88.64% Precision=80.00%

Page 17:

“VIP: Vision tool for comparing Images of People”, Lantagne et al., Vision Interface 2003

Each extracted silhouette is segmented into significant regions using the JSEG algorithm (Y. Deng, B.S. Manjunath, “Unsupervised segmentation of color-texture regions in images and video”).

Colour and texture descriptors are calculated for each region.

The colour descriptor is a modified version of the descriptor presented in Y. Deng et al., “Efficient color representation for image retrieval”: basically, an HSV histogram of the dominant colors.

The texture descriptor is based on D.K. Park et al., “Efficient Use of Local Edge Histogram Descriptor”: essentially, it characterizes the edge density inside a region along different orientations (0°, 45°, 90° and 135°).

The similarity between two regions is the weighted sum of the two descriptor similarities:
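Written out, the weighted sum has the following form (the slide's equation image was not captured in this transcript; the weight names and the normalization constraint are assumptions consistent with the sentence above):

```latex
S(R_a, R_b) = w_c\, S_{\mathrm{colour}}(R_a, R_b)
            + w_t\, S_{\mathrm{texture}}(R_a, R_b),
\qquad w_c + w_t = 1
```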

Page 18:

To compare the regions inside two silhouettes, a region-matching scheme is used, involving a modified version of the IRM algorithm presented in J.Z. Wang et al., “SIMPLIcity: Semantics-sensitive integrated matching for picture libraries”.

The IRM algorithm is simple and works as follows:

1) The first step is to calculate all of the similarities between all regions.

2) Similarities are sorted in decreasing order; the first one is selected, and the areas of the respective pair of regions are compared. A weight, equal to the smallest percentage area between the two regions, is assigned to the similarity measure.

3) Then, the percentage area of the largest region is updated by removing the percentage area of the smallest region, so that it can be matched again. The smallest region will not be matched anymore with any other region.

4) The process continues in decreasing order for all of the similarities.

In the end, the overall similarity between the two region sets is calculated as:
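The four steps above can be sketched in code. This is an illustrative greedy implementation of the described scheme, not the authors' exact IRM variant; the value it returns is the weighted sum of region similarities (the quantity the slide's missing final equation expressed), with the weights produced by the area-consumption rule of steps 2-3.

```python
def irm_similarity(sims, areas_a, areas_b):
    """Greedy IRM-style region matching.
    sims[i][j]: similarity between region i of silhouette A and
    region j of silhouette B; areas_a / areas_b: each region's
    percentage area (each list summing to 1)."""
    ra = list(areas_a)          # remaining unmatched area per region of A
    rb = list(areas_b)          # remaining unmatched area per region of B
    # step 1-2: all pairwise similarities, sorted in decreasing order
    pairs = sorted(
        ((sims[i][j], i, j) for i in range(len(ra)) for j in range(len(rb))),
        reverse=True,
    )
    total = 0.0
    for s, i, j in pairs:
        w = min(ra[i], rb[j])   # weight = smallest remaining percentage area
        if w <= 0:
            continue            # the smaller region was fully consumed
        total += w * s          # accumulate weighted similarity
        # step 3: the larger region keeps its leftover area for re-matching
        ra[i] -= w
        rb[j] -= w
    return total
```

With identical region sets and perfect per-region similarities the result is 1, since the weights sum to the total area.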

Page 19:

ViSOR: Video Surveillance Online Repository

Page 20:

The ViSOR video repository

Page 21:

Aims of ViSOR

• Gather and make freely available a repository of surveillance videos

• Store metadata annotations, both manually provided as ground-truth and automatically generated by video surveillance tools and systems

• Execute online performance evaluation and comparison

• Create an open forum to exchange, compare and discuss problems and results on video surveillance

Page 22:

Different types of annotation

• Structural Annotation: video size, authors, keywords,…

• Base Annotation: ground-truth, with concepts referring to the whole video. Annotation tool: online!

• GT Annotation: ground-truth at the frame level; concepts can refer to the whole video, to a frame interval, or to a single frame. Annotation tool: ViPER-GT (offline)

• Automatic Annotation: output of automatic systems shared by ViSOR users.

Page 23:

Video corpus set: the 14 categories

Page 24:

Outdoor multicamera

Synchronized views

Page 25:

Surveillance of the entrance door of a building

• About 10 hours of video!

Page 26:

Videos for smoke detection, with GT

Page 27:

Videos for shadow detection

• Already used by many researchers working on shadow detection

• Some videos with GT

A. Prati, I. Mikic, M.M. Trivedi, R. Cucchiara, "Detecting Moving Shadows: Algorithms and Evaluation" in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, n. 7, pp. 918-923, July, 2003

Page 28:

Some statistics

We need videos and annotations!

Page 29:

SIMULTANEOUS HMM ACTION SEGMENTATION AND RECOGNITION

Action recognition

Page 30:

Probabilistic Action Classification

• Classical approach:
  – Given a set of training videos, each containing one atomic action (manually labelled)
  – Given a new video with a single action
  – … find the most likely action

Dataset: M. Blank, L. Gorelick, E. Shechtman, M. Irani, R. Basri, "Actions as Space-Time Shapes" (ICCV '05)

Page 31:

Classical HMM Framework

• Definition of a feature set
• For each frame t, computation of the feature set Ot (observations)
• Given a set of training observations O = {O1 … OT} for each action k, training of an HMM λ(k)
• Given a new set of observations O = {O1 … OT}, find the model λ(k) which maximises P(λ(k) | O)
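The framework above can be illustrated with a tiny discrete-observation scorer: one HMM per action, and classification by the model with the highest likelihood. This is a sketch using the standard scaled forward algorithm (as in the Rabiner tutorial); the model parameters and action names used in any example are toy assumptions, not the actual trained action models.

```python
import numpy as np

def forward_loglik(obs, pi, A, B):
    """log P(O | lambda) for a discrete-observation HMM, computed with
    the forward algorithm and per-step scaling to avoid underflow.
    pi: (N,) initial state probs, A: (N, N) transition matrix,
    B: (N, M) emission matrix, obs: sequence of symbol indices."""
    alpha = pi * B[:, obs[0]]
    loglik = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]   # predict, then weight by emission
        s = alpha.sum()
        loglik += np.log(s)
        alpha /= s                      # rescale to keep alpha normalised
    return loglik

def classify(obs, models):
    """Pick the action k maximising P(O | lambda_k); with uniform action
    priors this is equivalent to maximising P(lambda_k | O)."""
    return max(models, key=lambda k: forward_loglik(obs, *models[k]))
```

`models` maps each action name to its `(pi, A, B)` triple; training those parameters (e.g. with Baum-Welch) is outside this sketch.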

Page 32:

A sample 17-dim feature set

• Computed on the extracted blob after the foreground segmentation and people tracking

Page 33:

From the Rabiner tutorial

Page 34:

Online Action Recognition

• Given a video with a sequence of actions:
  – Which is the current action? Frame-by-frame action classification (online – action recognition)
  – When does an action finish and the next one start? (offline – action segmentation)

R. Vezzani, M. Piccardi, R. Cucchiara, "An efficient Bayesian framework for on-line action recognition", in Proceedings of the IEEE International Conference on Image Processing, Cairo, Egypt, November 7-11, 2009

Page 35:

Main problem of this approach

• We do not know when the action starts and when it finishes.
• Using all the observations, only the first action is recognized.
• A possible solution is "brute force": for each action, for each starting frame, for each ending frame, compute the model likelihood and select the maximum. UNFEASIBLE.

Page 36:

Our approach

• Subsampling of the starting frames (1 every 10)
• Adoption of recursive formulas
• Computation of the emission probability only once per frame for each model (action)
• Current frame used as the ending frame
• Maximum length imposed on each action
• The resulting computational complexity is compliant with real-time requirements
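The subsampled search can be sketched as follows. This is an illustrative reduction of the brute-force scheme, not the paper's exact recursive formulation: `loglik(model, seq)` stands for any sequence scorer (e.g. an HMM forward log-likelihood), and `step`, `max_len`, and the plain length normalisation are assumed simplifications.

```python
def best_action_online(obs, t, models, loglik, step=10, max_len=100):
    """At current frame t, try candidate starting frames every `step`
    frames (looking back at most `max_len` frames), score obs[s:t+1]
    under every action model, and normalise by sequence length so
    candidate segments of different lengths are comparable.
    Returns (best_action_name, best_start_frame)."""
    best = (float("-inf"), None, None)              # (score, action, start)
    for s in range(max(0, t + 1 - max_len), t + 1, step):
        seq = obs[s:t + 1]
        for name, model in models.items():
            score = loglik(model, seq) / len(seq)   # length-normalised
            if score > best[0]:
                best = (score, name, s)
    return best[1], best[2]
```

Calling this once per frame gives the frame-by-frame classification; the winning start frame also yields the segmentation boundary of the current action.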

Page 37:

Different-length sequences

• Sequences with different starting frames have different lengths
• This makes comparisons using the traditional HMM schema unfair
• The output of each HMM is therefore normalized using the sequence length and a term related to the mean duration of the considered action
• This allows classifying the current action and, at the same time, performing online action segmentation