![Page 1: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/1.jpg)
Similarity Search in Non-text Data
Pavel Zezula
Faculty of Informatics
Masaryk University, Brno
24.11.2011 KEG 24.11.
![Page 2: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/2.jpg)
Real-Life Motivation
The social psychology view
• Any event in the history of an organism is, in a sense, unique.
• Recognition, learning, and judgment presuppose an ability to categorize stimuli and classify situations by similarity.
• Similarity (proximity, resemblance, communality, representativeness, psychological distance, etc.) is fundamental to theories of perception, learning, judgment, etc.
![Page 3: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/3.jpg)
Contemporary Networked Media
The digital data view
• Almost everything that we see, read, hear, write, measure, or observe can be digital.
• Users autonomously contribute to the production of global media, and the growth is exponential.
• Sites like Flickr, YouTube, and Facebook host user-contributed content for a variety of events.
• The elements of networked media are related by numerous multi-facet links of similarity.
![Page 4: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/4.jpg)
Examples
• Does the computer disk of a suspected criminal contain illegal multimedia material?
• What are the stocks with similar price histories?
• Which companies advertise their logos in the live TV broadcast of a football match?
• Is the situation on the web getting close to any of the network attacks that caused significant damage in the past?
![Page 5: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/5.jpg)
Challenge
• Networked media is getting close to the human “fact-bases”.
• Similarity data management is needed to connect, search, filter, merge, relate, rank, cluster, classify, identify, or categorize objects across various collections.
WHY? It is the similarity which is in the world revealing.
![Page 6: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/6.jpg)
Limitations: Data Types
We have
• Attributes
– Numbers, strings, etc.
• Text (text-based)
– Documents, annotations
We need
• Multimedia
– Image, video, audio
• Security
– Biometrics
• Medicine
– EKG, EEG, EMG, EMR, CT, etc.
• Scientific data
– Biology, chemistry, physics, life sciences, economics
• Others
– Motion, emotion, events, etc.
![Page 7: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/7.jpg)
Limitations: Models of Similarity
We have
• Simple geometric models, typically vector spaces
We need
• More complex models
• Non-metric models
• Asymmetric similarity
• Subjective similarity
• Context-aware similarity
• Complex similarity
• Etc.
![Page 8: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/8.jpg)
Limitations: Queries
We have
• Simple query
– Nearest neighbor
– Range
We need
• More query types
– Reverse NN, distinct NN, similarity join
• Other similarity-based operations
– Filtering, classification, event detection, clustering, etc.
• Similarity algebra
– May become the basis of a “Similarity Data Management System”
![Page 9: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/9.jpg)
Limitations: Implementation Strategies
We have
• Centralized or parallel processing
We need
• Scalable and distributed architectures
• MapReduce-like approaches
• P2P architectures
• Cloud computing
• Self-organized architectures
• Etc.
![Page 10: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/10.jpg)
Search Strategy Evolution
Scalability
● data volume - exponential
● number of users (queries)
● variety of data types
● multi-lingual, multi-feature, multi-modal queries
Determinism
exact match ► similarity
precise ► approximate
same answer ► good answer; recommendation
fixed query ► personalized; context aware
fixed infrastr. ► dynamic mapping; mobile dev.
[Figure: axis labels: grade (low, high); well established, cutting-edge research; centralized, parallel, distributed, peer-to-peer, self-organized]
![Page 11: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/11.jpg)
Word Cloud of Applications
![Page 12: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/12.jpg)
Metric Search Grows in Popularity
Hanan Samet: Foundations of Multidimensional and Metric Data Structures. Morgan Kaufmann, 2006
P. Zezula, G. Amato, V. Dohnal, and M. Batko: Similarity Search: The Metric Space Approach. Springer, 2006
![Page 13: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/13.jpg)
The MUFIN Approach
MUFIN: MUlti-Feature Indexing Network
SEARCH
infrastructure
Scalability: P2P structure
Extensibility: metric space
Tuning of performance
Network independence: Internet / GRID / LAN
![Page 14: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/14.jpg)
Metric Space
an Abstraction of Similarity
• Metric space: M = (D, d)
– D – domain
– distance function d(x, y)
For all x, y, z ∈ D:
• d(x, y) ≥ 0 - non-negativity
• d(x, y) = 0 ⇔ x = y - identity
• d(x, y) = d(y, x) - symmetry
• d(x, y) ≤ d(x, z) + d(z, y) - triangle inequality
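The four postulates can be tested empirically on a finite sample of objects. The following is a minimal sketch, not part of the original slides: the helper name `check_metric_postulates` and the two example distance functions are illustrative assumptions. Passing the check on a sample does not prove a function is a metric; failing it proves that it is not.

```python
from itertools import product

def check_metric_postulates(objects, d, tol=1e-12):
    """Return the names of the metric postulates violated on the sample."""
    violated = set()
    for x, y in product(objects, repeat=2):
        if d(x, y) < 0:
            violated.add("non-negativity")
        if (d(x, y) <= tol) != (x == y):
            violated.add("identity")
        if abs(d(x, y) - d(y, x)) > tol:
            violated.add("symmetry")
    for x, y, z in product(objects, repeat=3):
        if d(x, y) > d(x, z) + d(z, y) + tol:
            violated.add("triangle inequality")
    return violated

def absolute_difference(x, y):
    return abs(x - y)

def squared_difference(x, y):
    # Not a metric: squaring breaks the triangle inequality.
    return (x - y) ** 2

sample = [0.0, 1.0, 2.5, 4.0]
print(check_metric_postulates(sample, absolute_difference))
print(check_metric_postulates(sample, squared_difference))
```

The squared difference is a useful counter-example because it satisfies the first three postulates and fails only the triangle inequality, which is the property metric indexes rely on for pruning.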
![Page 15: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/15.jpg)
Examples of Distance Functions
• Lp metric functions (for vectors)
– L1 – city-block distance: $L_1(x, y) = \sum_{i=1}^{n} |x_i - y_i|$
– L2 – Euclidean distance: $L_2(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$
– L∞ – infinity (maximum) distance: $L_\infty(x, y) = \max_{i=1}^{n} |x_i - y_i|$
• edit distance (for strings)
– minimal number of insertions, deletions and substitutions
– d(‘application’, ‘applet’) = 6
• Jaccard’s coefficient (for sets A, B): $d(A, B) = 1 - \frac{|A \cap B|}{|A \cup B|}$
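The distance functions listed on this slide are short enough to write out directly. The sketch below is illustrative only; the function names are assumptions, and the edit distance uses the standard dynamic-programming formulation with unit costs.

```python
import math

def l1(x, y):
    """City-block distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(x, y))

def l2(x, y):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def l_inf(x, y):
    """Maximum (infinity) distance between two equal-length vectors."""
    return max(abs(a - b) for a, b in zip(x, y))

def edit_distance(s, t):
    """Minimal number of insertions, deletions and substitutions."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            curr.append(min(prev[j] + 1,                # delete cs
                            curr[j - 1] + 1,            # insert ct
                            prev[j - 1] + (cs != ct)))  # substitute or match
        prev = curr
    return prev[-1]

def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B|; two empty sets are treated as identical."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)

print(edit_distance("application", "applet"))  # the slide's example pair
```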
![Page 16: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/16.jpg)
Examples of Distance Functions
• Quadratic-form distance
– for vectors with correlated dimensions
• Hausdorff distance
– for sets with elements related by another distance
• Earth mover's distance
– primarily for histograms (sets of weighted features)
• and many others
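As an illustration of a distance defined on top of another distance, the Hausdorff distance between two finite, non-empty sets can be sketched as follows. The function name and the choice of absolute difference as the element distance are assumptions made for the example.

```python
def hausdorff(a, b, d):
    """Hausdorff distance between finite, non-empty sets a and b,
    where d is the distance between individual elements."""
    def directed(s, t):
        # Largest distance from an element of s to its nearest element of t.
        return max(min(d(x, y) for y in t) for x in s)
    return max(directed(a, b), directed(b, a))

def absolute_difference(x, y):
    return abs(x - y)

print(hausdorff([0, 1], [0, 5], absolute_difference))
```

Each evaluation needs |A| · |B| element distances, which is one reason why reducing the number of distance computations matters for indexing.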
![Page 17: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/17.jpg)
Similarity Search Problem
• For X ⊆ D in metric space M, pre-process X so that the similarity queries are executed efficiently.
• similarity queries
– range search
– R(q, r) = { x ∈ X | d(q, x) ≤ r }, q ∈ D, r ≥ 0
[Figure: range query with query object q and radius r]
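One simple way to pre-process X for range search is to store the distance of every object to a fixed pivot and use the triangle inequality to skip objects at query time. The sketch below is an assumption-laden illustration, not the method of any specific index in this talk: the class name `PivotTable`, the single pivot, and the 2-D grid data are all hypothetical.

```python
import math

class PivotTable:
    """Stores d(pivot, x) for every x so that range queries can skip objects."""

    def __init__(self, data, d, pivot):
        self.d = d
        self.pivot = pivot
        self.entries = [(x, d(pivot, x)) for x in data]  # pre-processing step

    def range_search(self, q, r):
        """Return (result, number of distance computations at query time)."""
        dq = self.d(q, self.pivot)
        computed = 1
        result = []
        for x, dx in self.entries:
            # |d(q,p) - d(x,p)| <= d(q,x), so a larger gap rules x out.
            if abs(dq - dx) > r:
                continue
            computed += 1
            if self.d(q, x) <= r:
                result.append(x)
        return result, computed

data = [(i, j) for i in range(10) for j in range(10)]
table = PivotTable(data, math.dist, pivot=(0, 0))
query, radius = (7.2, 3.1), 1.5
result, computed = table.range_search(query, radius)
brute_force = [x for x in data if math.dist(query, x) <= radius]
print(len(result), computed, len(data))
```

The answer is identical to a linear scan; only the number of distance evaluations changes.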
![Page 18: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/18.jpg)
Similarity Queries
• k-nearest neighbours
– NN(q, k) = A, q ∈ D, k > 0
– A ⊆ X, |A| = k
– ∀x ∈ A, ∀y ∈ X – A: d(q, x) ≤ d(q, y)
• similarity join
– X = {x1, x2, … xN}, Y = {y1, y2, … yM}
– {(xi, yj) | d(xi, yj) < μ}
– similarity “self” join: X = Y
[Figure: nearest-neighbour query with query object q and k = 5]
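Both query types have a direct brute-force reading, sketched below as a baseline that an index is meant to beat. The function names and the threshold parameter name `mu` are assumptions for the example; ties in the k-nearest-neighbour query are broken arbitrarily.

```python
import heapq

def knn(data, d, q, k):
    """The k objects of data closest to q, nearest first."""
    return heapq.nsmallest(k, data, key=lambda x: d(q, x))

def similarity_join(xs, ys, d, mu):
    """All pairs (x, y) with d(x, y) below the threshold mu."""
    return [(x, y) for x in xs for y in ys if d(x, y) < mu]

def absolute_difference(x, y):
    return abs(x - y)

print(knn([1, 5, 9, 12, 20], absolute_difference, q=10, k=2))
print(similarity_join([1, 5], [2, 7, 5], absolute_difference, mu=2.5))
```

The join evaluates |X| · |Y| distances, so its cost grows quadratically for a self join.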
![Page 19: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/19.jpg)
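Basic Partitioning Principles
• ball partitioning
– { x ∈ X | d(p, x) ≤ r }
– { x ∈ X | d(p, x) > r }
• multiple ball partitioning
– { x ∈ X | d(p, x) ≤ r1 }
– { x ∈ X | d(p, x) > r1 and d(p, x) ≤ r2 }
– { x ∈ X | d(p, x) > r2 }
[Figure: ball partitioning around pivot p with radius r; multiple ball partitioning around pivot p with radii r1 and r2]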
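Ball partitioning pays off at query time because the triangle inequality tells which partitions a range query can possibly touch. The following sketch assumes hypothetical function names and uses `query_radius` for the query radius to keep it apart from the ball radius r.

```python
def ball_partition(data, d, p, r):
    """Split data into the ball around pivot p and its complement."""
    inner = [x for x in data if d(p, x) <= r]
    outer = [x for x in data if d(p, x) > r]
    return inner, outer

def partitions_to_visit(d, p, r, q, query_radius):
    """Partitions that may contain an answer to the range query R(q, query_radius)."""
    dq = d(p, q)
    visit = []
    if dq - query_radius <= r:  # the query ball reaches inside the pivot ball
        visit.append("inner")
    if dq + query_radius > r:   # the query ball reaches outside the pivot ball
        visit.append("outer")
    return visit

def absolute_difference(x, y):
    return abs(x - y)

inner, outer = ball_partition(list(range(10)), absolute_difference, p=0, r=4)
print(inner, outer)
print(partitions_to_visit(absolute_difference, 0, 4, q=1, query_radius=2))
```

Only one distance, d(p, q), is computed to decide which partition can be skipped entirely.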
![Page 20: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/20.jpg)
Basic Partitioning Principles
• generalised hyperplane
– { x ∈ X | d(p1, x) ≤ d(p2, x) }
– { x ∈ X | d(p1, x) > d(p2, x) }
[Figure: generalised hyperplane partitioning with pivots p1 and p2]
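The same pruning idea carries over to generalised hyperplane partitioning: comparing the query's distances to the two pivots shows which side a range query may reach. This is a minimal sketch with assumed function names; "left" denotes the objects closer to p1 (or equidistant) and "right" the objects closer to p2.

```python
def hyperplane_partition(data, d, p1, p2):
    """Assign each object to the closer of the two pivots (ties go to p1)."""
    left = [x for x in data if d(p1, x) <= d(p2, x)]
    right = [x for x in data if d(p1, x) > d(p2, x)]
    return left, right

def sides_to_visit(d, p1, p2, q, r):
    """Sides that may contain an answer to the range query R(q, r)."""
    d1, d2 = d(p1, q), d(p2, q)
    visit = []
    # For an answer x: d1 - r <= d(p1, x) and d(p2, x) <= d2 + r.
    if d1 - r <= d2 + r:
        visit.append("left")
    # For an answer x: d(p1, x) <= d1 + r and d2 - r <= d(p2, x).
    if d1 + r > d2 - r:
        visit.append("right")
    return visit

def absolute_difference(x, y):
    return abs(x - y)

left, right = hyperplane_partition(list(range(11)), absolute_difference, p1=0, p2=10)
print(left, right)
print(sides_to_visit(absolute_difference, 0, 10, q=2, r=1))
```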
![Page 21: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/21.jpg)
The M-tree [Ciaccia, Patella, Zezula, VLDB 1997]
1) Paged organization
2) Dynamic
3) Suitable for arbitrary metric spaces
4) I/O and CPU optimization - computing d can be time-consuming
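Because computing d can be expensive, metric trees try to discard entries without evaluating d at all. One way to do this, sketched here under the assumption that each entry stores its distance to the parent routing object and a covering radius, is a test that uses only numbers already known at query time. The function and parameter names are hypothetical.

```python
def must_compute_distance(d_q_parent, d_parent_entry, covering_radius, query_radius):
    """Return False when the triangle inequality alone shows that the
    entry's region cannot intersect the query ball, so d(q, entry)
    does not need to be computed.

    d_q_parent     -- d(q, parent), already computed one level up
    d_parent_entry -- d(parent, entry), stored in the tree at insertion time
    """
    # |d(q, parent) - d(parent, entry)| is a lower bound on d(q, entry).
    lower_bound = abs(d_q_parent - d_parent_entry)
    return lower_bound <= query_radius + covering_radius

print(must_compute_distance(10.0, 2.0, covering_radius=1.0, query_radius=3.0))
print(must_compute_distance(5.0, 4.0, covering_radius=1.0, query_radius=3.0))
```

When the function returns False, the whole subtree under the entry is skipped, which saves both the distance computation and the page read.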
![Page 22: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/22.jpg)
The M-tree Idea
• Depending on the metric, the “shape” of index regions changes
[Figure: tree with nodes A, B and C, D, E, F, and the corresponding index regions]
Metric: L2 (Euclidean), L1 (city-block), L∞ (max-metric), weighted-Euclidean, quadratic form
![Page 23: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/23.jpg)
The M-tree on the Web
• Home page: http://www-db.deis.unibo.it/Mtree/
• M-tree software can be freely downloaded
– Based on GiST package (Berkeley Univ.)
• Google Scholar: 1 300 citations in Dec. 2011
![Page 24: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/24.jpg)
M-tree family
• Bulk loading
• Slim-tree
• Multi-way insertion
• PM-tree
• M2-tree
• etc.
![Page 25: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/25.jpg)
D-Index [Dohnal, Gennaro, Zezula, MTA 2002]
• 4 separable buckets at the first level
• 2 separable buckets at the second level
• exclusion bucket of the whole structure
![Page 26: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/26.jpg)
D-index: Insertion
![Page 27: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/27.jpg)
D-index: Range Search
[Figure: six panels, each showing a range query with query object q and radius r]
![Page 28: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/28.jpg)
Implementation Postulates of Distributed Indexes
• scalability – nodes (computers) can be added (removed)
• no hot-spots – no centralized nodes, no flooding by messages
• update independence – network update at one site does not require an immediate change propagation to all the other sites
![Page 29: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/29.jpg)
Peer-to-Peer Indexing
• Native metric techniques: GHT*, VPT*
• Transformation techniques: M-CAN, M-Chord
![Page 30: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/30.jpg)
Image search
Image base
![Page 31: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/31.jpg)
Images and their Descriptors
[Figure: image level (R, G, B) and descriptor level]
![Page 32: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/32.jpg)
CoPhIR: Content-based Photo Image Retrieval
• Largest publicly available collection of high-quality image metadata: 106 million images
• Each image contains:
– Five MPEG-7 visual descriptors (VDs): Scalable Color, Color Structure, Color Layout, Edge Histogram, Homogeneous Texture
– Other textual information: title, tags, comments, etc.
• Photos have been crawled from the Flickr photo-sharing site.
http://cophir.isti.cnr.it/
[Figure: 100M images + metadata + MPEG-7 VDs]
![Page 33: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/33.jpg)
MUFIN SEARCH ENGINE
infrastructure
Scalability: M-Chord + M-Tree
Extensibility: CoPhIR
edge histogram
color structure
scalable color
homogeneous texture
color layout
6 x IBM server x3400
Image Search Demo: http://mufin.fi.muni.cz/imgsearch/
![Page 35: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/35.jpg)
MUFIN Trends Summary
• MUFIN - a universal similarity search technology
• Research directions in:
– Core technology
– Applications
– A style of computing
MUFIN Search Engine
infrastructure
Scalability: P2P structures
Extensibility: metric space
Performance Tuning
![Page 36: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/36.jpg)
Core Technology
• Development of the MUFIN core technology
October 28, 2011
MUFIN Search Engine
infrastructure
Performance Tuning
![Page 37: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/37.jpg)
Applications
– Images:
• Sub-image retrieval
• Ranking
• Annotation
• Categorization
• Benchmarking
– Biometrics:
• Face recognition
• Fingerprint recognition
• Gait recognition
– Signals:
• Audio recognition
• Time series similarity
– Videos:
• Event detection
![Page 38: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/38.jpg)
A New Style of Computing
• From the project-oriented approach towards a similarity cloud
• Advantages:
– Cloud makes similarity search accessible to common users
– Computational resources are shared – users don’t need to maintain any hardware infrastructure
– Users don’t need to take care of the OS, security, software platform, etc.
![Page 39: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/39.jpg)
Current Research Activities
• Image Query Postprocessing
• Sub-image Searching
• Remote Biometrics
• Event Detection in Video
• Signal Processing
![Page 40: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/40.jpg)
Query Postprocessing
• The understanding of similarity is:
– subjective
– context-dependent
– multi-modal
• Semantic gap
• Overcoming semantic gap by combining aspects
– semantics-learning
– result postprocessing
– relevance feedback & iterative search
• Our objectives
– Large general data collections with metadata of varying quality
– Online search response times
![Page 41: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/41.jpg)
Query Postprocessing by Ranking
• Two-phase query evaluation model
– Search the whole collection by some aspects => candidate set
– Rank the candidate set – sort by other aspects
[Figure: initial search followed by ranking]
Advantages
– Fast, makes it possible to combine more similarity measures
– Enables cooperation with the user
Disadvantages
– Only a subset of the whole dataset is used in the ranking phase
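The two-phase model can be sketched in a few lines. Everything below is illustrative: the function `two_phase_search`, the toy objects (pairs of a "colour" and a "shape" value), and the two distance functions are assumptions, and the initial search is done by brute force where a real system would use an index.

```python
import heapq

def two_phase_search(collection, q, d_initial, d_rank, candidates, k):
    """Phase 1: take the `candidates` nearest objects under d_initial.
    Phase 2: re-rank only that candidate set under d_rank and keep the top k."""
    candidate_set = heapq.nsmallest(candidates, collection,
                                    key=lambda x: d_initial(q, x))
    return heapq.nsmallest(k, candidate_set, key=lambda x: d_rank(q, x))

def colour_distance(a, b):
    return abs(a[0] - b[0])

def shape_distance(a, b):
    return abs(a[1] - b[1])

collection = [(1, 9), (2, 1), (3, 5), (8, 0), (9, 0)]
query = (0, 0)
print(two_phase_search(collection, query, colour_distance, shape_distance,
                       candidates=3, k=2))
```

The example also shows the disadvantage named on the slide: the object (8, 0) has the best shape match but never reaches the ranking phase, because it is not among the three candidates selected by colour.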
![Page 42: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/42.jpg)
Sub-image Searching
• Retrieves all images containing the query image
• Based on local image descriptors
– Scale Invariant Feature Transform (SIFT):
• Descriptor – content of a small neighborhood
• Locator – coordinates of the neighborhood
• Scale – importance of the descriptor
– Image: a set of features, descriptors
– Task: Find matching pairs (similar features)
[Figure: query image and answer]
![Page 43: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/43.jpg)
Remote Biometrics: Motivation
• Most biometrics require the subject’s cooperation
– Fingerprint, iris, palmprint, handwriting, voice recognition
• Challenge – recognizing people at a distance
– Capture devices do not require close contact with the subject (e.g., surveillance cameras)
• It can be applied unobtrusively
– Face and gait recognition at a distance
– Problems – camera view, lighting, pose
– Applications – surveillance, security
![Page 44: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/44.jpg)
Remote Biometrics: Approaches
• Detection, normalization, extraction, recognition
• Face recognition
– Methods:
• Appearance-based – analyze the face as a whole
• Model-based – compare individual features (e.g., eyes, mouth)
– MUFIN face recognition demo: http://mufin.fi.muni.cz/faces-feret/
• Gait recognition
– Less likely to be obscured, low resolution suffices
– Methods are based on shape or dynamics of the person:
• Appearance-based – analyze person’s silhouettes
• Model-based – compare features (e.g., trajectory, angular velocity)
![Page 45: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/45.jpg)
Event Detection in Video
• Video
– continuous data
– several aspects
• image, sound, text, motion, temporal
• Event
– defined aspects occurring in a given time interval
• definition of a sample aspect by example or value
• definition is imprecise – looking for “similar” aspects
– combination of aspects
• aggregation function
• Current approaches
– annotation-based, learning-based (classifiers)
– specific domains
Example: TV news (by image) AND about IRAQ (by text) AND burning vehicles (by image) AND time interval < 1 minute (by temporal)
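A conjunctive query such as the example above needs an aggregation function that turns per-aspect similarity scores into one score per video segment. The sketch below assumes scores in [0, 1] that have already been computed, uses the minimum as one possible interpretation of AND, and invents the segment names and the threshold for illustration; the slides do not specify which aggregation function is used.

```python
def aggregate_and(scores):
    """Interpret AND as the weakest aspect: a segment matches only as
    well as its worst-matching aspect."""
    return min(scores.values())

def detect_events(segments, threshold, aggregate=aggregate_and):
    """Names of the segments whose aggregated score reaches the threshold."""
    return [name for name, scores in segments if aggregate(scores) >= threshold]

segments = [
    ("clip-1", {"image": 0.9, "text": 0.8, "temporal": 1.0}),
    ("clip-2", {"image": 0.9, "text": 0.2, "temporal": 1.0}),
]
print(detect_events(segments, threshold=0.7))
```

Other aggregation functions, such as a weighted sum, can be passed through the `aggregate` parameter without changing the rest of the code.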
![Page 46: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/46.jpg)
Signal Processing
• Vast amount of signals produced:
– Biomedicine data – ECG, CT
– Biometric data – personal identification
– Audio data – audio similarity, recognition
– Sub-image searching
– Financial time series – analysis, forecasting
– Time series streams
• Demand for
– a graceful handling of this data
– flexible reactions to new application needs
![Page 47: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/47.jpg)
Flexible Subsequence Matching
• Generic engine for rapid development of subsequence matching applications
– can be used for any class of one-dimensional signals
– Implementation of various subsequence matching approaches
– Demo web application
[Figure: Subsequence Matching Layer; User Application]
![Page 48: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/48.jpg)
Demo application
![Page 49: Similarity Search in Non-text Datacontributed content for a variety of events. •The elements of networked media are related by numerous multi-facet links of similarity. 24.11.2011](https://reader033.fdocuments.net/reader033/viewer/2022060513/5f2be61834c09f6cf75e8c55/html5/thumbnails/49.jpg)
Face Retrieval Application
• 10,000 images with people
• 14,000 faces
• Face detection – MPEG-7