Jayanth Nayak 1 , Luis Gonzalez-Argueta 2 , Bi Song 2 , Amit Roy-Chowdhury 2 , Ertem Tuncel 2
ICDSC'08 1
MULTI-TARGET TRACKING THROUGH OPPORTUNISTIC CAMERA CONTROL IN A RESOURCE CONSTRAINED MULTIMODAL SENSOR NETWORK
Jayanth Nayak1, Luis Gonzalez-Argueta2, Bi Song2, Amit Roy-Chowdhury2, Ertem Tuncel2
Department of Electrical Engineering, University of California, Riverside
9/8/2008
Bourns College of Engineering
Information Processing Laboratory
www.ipl.ee.ucr.edu
Overview
- Introduction
- Problem Formulation
- Audio And Video Processing
- Camera Control Strategy
- Computing Final Tracks Of All Targets
- Experimental Results
- Conclusion
- Acknowledgements
Motivation
- Obtaining multi-resolution video from a highly active environment requires a large number of cameras.
- Disadvantages
  - Cost of buying, installing, and maintaining
  - Bandwidth limitations
  - Processing and storage
  - Privacy
- Our goal: minimize the number of cameras by using a control mechanism that directs the cameras' attention to the interesting parts of the scene.
Proposed Strategy
- Audio sensors direct the pan/tilt/zoom of the camera to the location of the event.
- Audio data intelligently turns on the camera and video data turns off the camera.
- Audio and video data are fused to obtain tracks of all targets in the scene.
Example Scenario
An example scenario where audio can be used to efficiently control two video cameras. There are four tracks that need to be inferred. Directly indicated on tracks are time instants of interest, i.e., initiation and end of each track, mergings, splittings, and cross-overs. The mergings and crossovers are further emphasized by X. Two innermost tracks coincide in the entire time interval (t2, t3). The cameras C1 and C2 need to be panned, zoomed, and tilted as decided based on their own output and that of the audio sensors a1, . . . , aM.
Relation To Previous Work
- Fusion of simultaneous audio and video data.
  - Our audio and video data are captured at disjoint time intervals.
- Dense network of vision sensors.
  - In order to cover a large field, we focus on controlling a reduced set of vision sensors.
- Our video and audio data are analyzed from dynamic scenes.
Problem Formulation
- Audio sensors A = {a1, . . . , aM} are distributed across the ground plane R.
- R is also observable from a set of controllable cameras C = {c1, . . . , cL}.
- However, the entire region R may not be covered with one set of camera settings.
- p-tracks: tracks belonging to targets
- a-tracks: tracks obtained by clustering audio
- Resolving p-track ambiguity
  - Camera Control
  - Person Matching
Tracking System Overview
[Figure: overall camera control system, including the a-tracks block.]
Overall camera control system. Audio sensors A = {a1, . . . , aM} are distributed across regions Ri. The set of audio clusters is denoted by Bt, and Kt− represents the set of confirmed a-tracks estimated based on observations before time t. P/T/Z cameras are denoted by C = {c1, . . . , cL}. Ground plane positions are denoted by O_k^t.
Processing Audio and Video
- a-tracks are clusters of audio data that are above an amplitude threshold
  - Tracked using a Kalman filter
- In video, people are detected using histograms of oriented gradients and tracked using an auxiliary particle filter
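As a rough illustration of the a-track step, the sketch below keeps only audio detections above an amplitude threshold and smooths their positions with a one-dimensional constant-velocity Kalman filter. The slides do not give the state model or noise settings, so the constant-velocity model, the values of `q`, `r`, and `p0`, the assumption that each detection already carries a position, and all function and class names are assumptions made for this example.

```python
def loud_positions(detections, threshold):
    """Keep positions of (position, amplitude) detections above the threshold."""
    return [pos for pos, amplitude in detections if amplitude > threshold]


class ConstantVelocityKalman1D:
    """1-D constant-velocity Kalman filter with state [position, velocity]."""

    def __init__(self, q=0.01, r=0.25, p0=10.0):
        self.pos, self.vel = 0.0, 0.0
        # 2x2 state covariance stored as four scalars.
        self.p00, self.p01, self.p10, self.p11 = p0, 0.0, 0.0, p0
        self.q, self.r = q, r  # process / measurement noise (assumed values)

    def step(self, z, dt=1.0):
        # Predict: x = F x and P = F P F^T + Q, with F = [[1, dt], [0, 1]].
        pos = self.pos + dt * self.vel
        a = self.p00 + dt * self.p10
        b = self.p01 + dt * self.p11
        p00 = a + dt * b + self.q
        p01 = b
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + self.q
        # Update with a position-only measurement z (H = [1, 0]).
        innovation = z - pos
        s = p00 + self.r
        k0, k1 = p00 / s, p10 / s
        self.pos = pos + k0 * innovation
        self.vel = self.vel + k1 * innovation
        self.p00, self.p01 = (1 - k0) * p00, (1 - k0) * p01
        self.p10, self.p11 = p10 - k1 * p00, p11 - k1 * p01
        return self.pos, self.vel


# Synthetic target walking at 1 m/s, heard once per second.
detections = [(float(t), 1.0) for t in range(1, 21)]  # (position, amplitude)
kf = ConstantVelocityKalman1D()
for z in loud_positions(detections, threshold=0.5):
    position, velocity = kf.step(z)
print(round(position, 2), round(velocity, 2))
```

A real deployment would track two-dimensional ground-plane positions and would first cluster the readings of several sensors; the scalar form is used here only to keep the predict and update steps visible.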
Mapping From Image Plane to Ground Plane
- Learned parameters are used to transform tracks from the image plane to the ground plane
- Estimate the projective transformation matrix H during a calibration phase
- Precompute H for each PTZ setting of each camera
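The image-to-ground mapping can be sketched as applying a 3x3 projective transformation to a pixel in homogeneous coordinates and dividing by the third component. The matrices below are made-up values for illustration, and the lookup table keyed by `(camera_id, pan, tilt, zoom)` is a hypothetical way to store one precomputed H per PTZ setting; neither comes from the slides.

```python
def to_ground_plane(H, u, v):
    """Map image point (u, v) to ground-plane coordinates with a 3x3 matrix H."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if w == 0:
        # Such image points lie on the vanishing line of the ground plane.
        raise ValueError("point maps to infinity on the ground plane")
    return x / w, y / w


# Hypothetical table of precomputed matrices: one per camera and PTZ setting.
H_TABLE = {
    (0, 0, 0, 1): [[2.0, 0.0, 1.0], [0.0, 2.0, 3.0], [0.0, 0.0, 1.0]],
    (0, 30, 0, 1): [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.5, 1.0]],
}

H = H_TABLE[(0, 30, 0, 1)]
print(to_ground_plane(H, 2.0, 2.0))
```

Precomputing the table trades a small amount of memory for not having to recalibrate every time a camera is redirected.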
[Figure label: vanishing line]
Camera Control
Camera control
- Goal: avoid ambiguity or disambiguate when tracks
  - are created or deleted
  - intersect
  - merge
- Set pan/tilt/zoom parameters
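One way to picture the trigger logic is a function that compares the a-tracks at two consecutive time steps and returns a ground-plane point worth looking at whenever a track appears, disappears, or comes close to another track. This is only a sketch: the slides describe the scheme as probabilistic and do not give a decision rule, so the fixed distance threshold `min_separation`, the dictionary representation, and the function name are assumptions.

```python
import math
from itertools import combinations


def point_of_interest(prev, curr, min_separation=1.0):
    """Return a ground-plane point a camera should cover, or None.

    prev, curr: dicts mapping a-track id -> (x, y) at consecutive time steps.
    """
    # Track created: look where it was first heard.
    created = sorted(curr.keys() - prev.keys())
    if created:
        return curr[created[0]]
    # Track deleted: look at its last known position.
    deleted = sorted(prev.keys() - curr.keys())
    if deleted:
        return prev[deleted[0]]
    # Possible merge or intersection: two tracks closer than the threshold.
    close = [
        (math.dist(curr[a], curr[b]), a, b)
        for a, b in combinations(sorted(curr), 2)
        if math.dist(curr[a], curr[b]) < min_separation
    ]
    if close:
        _, a, b = min(close)
        (xa, ya), (xb, yb) = curr[a], curr[b]
        return ((xa + xb) / 2, (ya + yb) / 2)
    return None  # no ambiguity: the camera can stay off


before = {"a": (0.0, 0.0), "b": (5.0, 0.0)}
after = {"a": (2.0, 0.0), "b": (2.5, 0.0)}
print(point_of_interest(before, after))
```

Returning `None` corresponds to the audio-only regime, in which no camera needs to be switched on.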
Setting Camera Parameters
- Heuristic algorithm
  - Cover the ground plane by regions R_i^l
  - R_i^l is in the field of view of camera C_l
  - Camera parameters (P_i^l, T_i^l, Z_i^l)
- Tracking algorithm specifies point of interest x from the last known a-track
- If no camera is on, find the R_i^l containing x
- Reassign a camera and set its parameters if x approaches the boundary of the current R_i^l
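The region heuristic for setting camera parameters can be sketched as a lookup over precomputed regions, each tied to one camera and one pan/tilt/zoom triple. The slides do not say what shape the regions R_i^l have or how "approaches the boundary" is measured, so the axis-aligned rectangles, the `margin` parameter, and every name below are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    camera: int   # l, index of camera C_l
    index: int    # i, index of region R_i^l
    xmin: float
    xmax: float
    ymin: float
    ymax: float
    ptz: tuple    # (pan, tilt, zoom) with which C_l covers this region

    def contains(self, x, margin=0.0):
        px, py = x
        return (self.xmin + margin <= px <= self.xmax - margin
                and self.ymin + margin <= py <= self.ymax - margin)


def choose_region(regions, x, current=None, margin=0.5):
    """Pick the region (and hence camera setting) for point of interest x."""
    # Keep the current setting while x stays away from the region boundary.
    if current is not None and current.contains(x, margin):
        return current
    # Otherwise prefer a region holding x with room to spare, then any region.
    for m in (margin, 0.0):
        for region in regions:
            if region.contains(x, m):
                return region
    return None  # x is outside every precomputed region


regions = [
    Region(0, 0, 0.0, 10.0, 0.0, 10.0, (0, 0, 1)),
    Region(1, 0, 8.0, 20.0, 0.0, 10.0, (45, 0, 1)),
]
first = choose_region(regions, (5.0, 5.0))
second = choose_region(regions, (9.8, 5.0), current=first)
print(first.camera, second.camera)
```

Overlapping regions let the hand-off happen before the target leaves the current field of view.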
Camera Control Based on Track Trajectories
[Figure: six plots of location (meters) against time (seconds), one per track event: Intersection, Separation, Merger, Sudden Appearance, Undetected Disappearance, Sudden Disappearance. One panel is annotated "Switch to video".]
Creating Final Tracks Of All Targets
Bipartite graph matching over a set of color histograms
- We collect features as the target enters and exits the scene in video.
- For every new a-track, features are collected from a small set of frames.
- The weight of an edge is the distance between the observed video features.
- Additionally, constraints from the audio data are enforced on the weights.
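A minimal sketch of the matching step follows, assuming normalized color histograms, an L1 distance as the edge weight, and an audio constraint expressed as a table of allowed pairings. The slides do not name the distance or the solver; the brute-force search over permutations is a stand-in that is practical only for a handful of nodes, and larger problems would normally use the Hungarian algorithm. All names are hypothetical.

```python
import math
from itertools import permutations


def l1_distance(h1, h2):
    """L1 distance between two histograms of equal length."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def match_tracks(exit_hists, entry_hists, allowed=None):
    """Pair each exit node with one entry node at minimum total distance.

    allowed[i][j] is False when the audio tracks rule out pairing exit i
    with entry j; such edges get infinite weight.
    Returns a list m where exit i is matched to entry m[i].
    """
    n = len(exit_hists)
    if n != len(entry_hists):
        raise ValueError("this sketch assumes equally many exits and entries")
    cost = [
        [
            l1_distance(e, s) if allowed is None or allowed[i][j] else math.inf
            for j, s in enumerate(entry_hists)
        ]
        for i, e in enumerate(exit_hists)
    ]
    best = min(
        permutations(range(n)),
        key=lambda p: sum(cost[i][p[i]] for i in range(n)),
    )
    return list(best)


exits = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]]
entries = [[0.1, 0.7, 0.2], [0.7, 0.2, 0.1]]
print(match_tracks(exits, entries))
print(match_tracks(exits, entries, allowed=[[True, False], [False, True]]))
```

The second call shows how an audio constraint can override the appearance-only pairing by removing edges from the graph.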
Creating Final Tracks Using Bipartite Matching
[Figure: plots of location (meters) against time (seconds) with alternating audio and video segments and entry/exit nodes [a+, a-], [b+, b-], [c+, c-], [d+, d-], [e+, e-], [f+], [g+], together with bipartite graphs over nodes a to g. Panel titles: "Tracking in Audio and Video", "Tracking in Audio Only", "Bipartite Graph Matching", "Bipartite Graph Matching Without Audio Constraint".]
- Three tracks are recovered by matching every node (entry and exit from the scene) where video was captured.
- Two tracks are recovered. However, red and green show the wrong path.
- Audio alone cannot tell the targets apart once their clusters have merged.
Experimental Results
[Figures: Inter P-Track Distance at a Merge Event; Inter P-Track Distance at a Crossover Event]
Experimental Results (Cont.)
Conclusion
- Goal: minimize camera usage in a surveillance system
  - Save power, bandwidth, storage, and money
  - Alleviate privacy concerns
- Proposed a probabilistic scheme for opportunistically deploying cameras in a multimodal network.
- Showed detailed experimental results on real data collected in multimodal networks.
- The final set of tracks is computed by bipartite matching.
Acknowledgements
This work was supported by Aware Building: ONR-N00014-07-C-0311 and the NSF CNS 0551719.
Bi Song2 and Amit Roy-Chowdhury2 were additionally supported by NSF-ECCS 0622176 and ARO-W911NF-07-1-0485.
Thank You.
Questions?
Jayanth Nayak1
[email protected]
Luis Gonzalez-Argueta2, Bi Song2, Amit Roy-Chowdhury2, Ertem Tuncel2
{largueta,bsong,amitrc,ertem}@ee.ucr.edu