Efficient summarization framework for multi-attribute uncertain data

Post on 11-Jan-2016


Efficient summarization framework for multi-attribute uncertain data

Jie Xu, Dmitri V. Kalashnikov, Sharad Mehrotra

The Summarization Problem

An uncertain data set consists of objects O1, O2, ..., On, each carrying several attributes, e.g.:
- face (e.g. Jeff, Kate)
- location (e.g. LA)
- visual concepts (e.g. water, plant, sky)

An extractive summary selects a representative subset of the objects, e.g. {O1, O8, O11, O25}. An abstractive summary instead generates a new description, e.g. "Kate Jeff wedding at LA".

Modeling Information & Summarization Process

What information does this image contain? Each object in the dataset is modeled by the information it contains; summarization then extracts the best subset of objects as the dataset summary.

Which metrics should guide the extraction?
- Coverage: Agrawal, WSDM'09; Li, WWW'09; Liu, SDM'09; Sinha, WWW'11
- Diversity: Vee, ICDE'08; Ziegler, WWW'05
- Quality: Sinha, WWW'11

Existing Techniques

- Images: Kennedy et al. WWW'08; Simon et al. ICCV'07; Sinha et al. WWW'11
- Customer reviews: Hu et al. KDD'04; Ly et al. CoRR'11
- Documents/micro-blogs: Inouye et al. SocialCom'11; Li et al. WWW'09; Liu et al. SDM'09

These techniques:
- do not consider information in multiple attributes;
- do not deal with uncertain data.

Challenges

Design a summarization framework for:
- Multi-attribute data: objects carry several attributes, e.g. visual concepts, face tags, location, time, and event.
- Uncertain/probabilistic data: automated data processing (e.g. vision analysis) yields probabilistic attribute values, e.g. P(sky) = 0.7, P(people) = 0.9.

Limitations of existing techniques - 1

Existing techniques typically model and summarize a single information dimension:
- only information about visual content (Kennedy et al. WWW'08; Simon et al. ICCV'07)
- only information about review content (Hu et al. KDD'04; Ly et al. CoRR'11)

What information is in the image?

- Elemental IUs: single attribute values, e.g. {sky}, {plant}, {Kate}, {Jeff}, {wedding}, {12/01/2012}, {Los Angeles}.
- Is that all? Intra-attribute IUs combine values within one attribute, e.g. {Kate, Jeff}, {sky, plant}.
- Even more information from the attributes? Inter-attribute IUs combine values across attributes, e.g. {Kate, LA}, {Kate, Jeff, wedding}.
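The three IU types can be illustrated with a short sketch (the attribute-to-values dictionary and the function names are hypothetical, chosen only for illustration):

```python
from itertools import combinations

# Hypothetical object representation: attribute name -> set of values.
photo = {
    "face": {"Kate", "Jeff"},
    "visual": {"sky", "plant"},
    "event": {"wedding"},
    "location": {"Los Angeles"},
}

def elemental_ius(obj):
    """One IU per single attribute value, e.g. {Kate}, {sky}."""
    return [frozenset([v]) for vals in obj.values() for v in vals]

def intra_attribute_ius(obj):
    """Value combinations within one attribute, e.g. {Kate, Jeff}."""
    ius = []
    for vals in obj.values():
        for r in range(2, len(vals) + 1):
            ius += [frozenset(c) for c in combinations(sorted(vals), r)]
    return ius

def inter_attribute_ius(obj):
    """Pairs of values drawn from two different attributes, e.g. {Kate, LA}."""
    ius = []
    for vals1, vals2 in combinations(obj.values(), 2):
        ius += [frozenset([x, y]) for x in vals1 for y in vals2]
    return ius
```

On the example photo this yields {Kate} and {sky} as elemental IUs, {Kate, Jeff} as an intra-attribute IU, and {Kate, wedding} as an inter-attribute IU.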

Are all information units interesting?

Is {Sharad, Mike} an interesting intra-attribute IU? Yes: they often have coffee together and appear frequently in other photos.

Are all of the 2^n combinations of people interesting? Should we select a summary that covers all of this information? Probably not: I do not care about person X and person Y who merely happen to appear together in a photo of a large group.

Is {Liyan, Ling} interesting? Yes, from my perspective: they are both my close friends.

Mine for interesting information units

Treat the face tags of each object as a transaction:

T1: O1.face = {Jeff, Kate}
T2: O2.face = {Tom}
T3: O3.face = {Jeff, Kate, Tom}
T4: O4.face = {Kate, Tom}
T5: O5.face = {Jeff, Kate}
...
Tn: On.face = {Jeff, Kate}

A modified item-set mining algorithm extracts item sets that are both frequent and correlated, e.g. {Jeff, Kate}.
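A minimal version of the counting step can be sketched as follows (plain frequent-pair counting over toy transactions; the modified algorithm in the talk also scores correlation, which this sketch omits):

```python
from collections import Counter
from itertools import combinations

# Toy "transactions": the face tags of each photo (one row per object).
transactions = [
    {"Jeff", "Kate"},
    {"Tom"},
    {"Jeff", "Kate", "Tom"},
    {"Kate", "Tom"},
    {"Jeff", "Kate"},
]

def frequent_pairs(transactions, min_support):
    """Count how often each pair of people co-occurs; keep the frequent ones."""
    counts = Counter()
    for t in transactions:
        for pair in combinations(sorted(t), 2):
            counts[pair] += 1
    return {p for p, c in counts.items() if c >= min_support}

# {Jeff, Kate} co-occurs in 3 of the 5 photos, so with min_support=3 it is
# the only frequent pair.
```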

Mine for interesting information units (continued)

O1.face = {Jeff, Kate}
O2.face = {Jeff}
O3.face = {Jeff, Kate, Tom}
O4.face = {Kate, Tom}
O5.face = {Jeff, Kate}
...
On.face = {Jeff, Kate}

Interesting IUs can also be mined from social context (e.g. Jeff is a friend of Kate; Tom is a close friend of the user), yielding IUs such as {Jeff, Kate} and {Tom}.

Limitation of existing techniques – 2

Existing techniques cannot handle probabilistic attributes. For example, face detection may report P(Jeff) = 0.8 in one object and P(Jeff) = 0.6 in another, so we are not sure whether a summary object covers an IU appearing in another object of the dataset.


Deterministic Coverage Model --- Example

In the deterministic model each object contributes a set of information units, and coverage is the fraction of the dataset's IUs that appear in the summary. In the slide's example the summary covers 8 of the dataset's 14 IUs, so Coverage = 8/14.
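A minimal sketch of the deterministic model (objects are plain sets of IUs; the toy data below is made up and is not the slide's 8/14 example):

```python
def coverage(summary, dataset):
    """Fraction of all IUs in the dataset that appear in some summary object."""
    covered = set().union(*summary)
    total = set().union(*dataset)
    return len(covered & total) / len(total)

dataset = [{"a", "b"}, {"b", "c"}, {"d"}, {"e", "f"}]
summary = [dataset[0], dataset[3]]
# coverage(summary, dataset) == 4 / 6: the summary covers {a, b, e, f}
# out of the six distinct IUs {a, b, c, d, e, f}.
```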

Probabilistic Coverage Model

Cov(S, O) is defined as the expected amount of information covered by the summary S divided by the expected amount of total information in the dataset O. We simplify the formula so that it can be computed efficiently: the simplified form is computable in polynomial time, and the function is submodular.
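Under an independence assumption, the expected-coverage idea can be sketched as below. The encoding (each object maps an IU to the probability that the object contains it) and the function names are hypothetical, not the paper's exact formulation:

```python
from math import prod

def expected_num_ius(objects):
    """E[# distinct IUs present in at least one object], assuming independence."""
    ius = set().union(*objects)  # dict keys are the candidate IUs
    return sum(1 - prod(1 - o.get(u, 0.0) for o in objects) for u in ius)

def expected_coverage(summary, dataset):
    """Expected information covered by the summary over expected total."""
    return expected_num_ius(summary) / expected_num_ius(dataset)

dataset = [{"Jeff": 0.8}, {"Jeff": 0.6, "sky": 0.5}]
# E[total] = P(some object has Jeff) + P(some object has sky)
#          = (1 - 0.2 * 0.4) + 0.5 = 1.42
# expected_coverage([dataset[0]], dataset) == 0.8 / 1.42
```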

Optimization Problem for summarization

Parameters: a dataset O = {o1, o2, ..., on} and a positive number K. The goal is to find a summary S ⊆ O with |S| ≤ K that maximizes Cov(S, O). Finding the summary with maximum expected coverage is NP-hard, so we developed an efficient greedy algorithm to solve it approximately.

Basic Greedy Algorithm

Initialize S = the empty set. Then, while |S| < K:
- For each object o in O \ S, compute Cov(S ∪ {o}, O).
- Select the o* with maximum Cov(S ∪ {o*}, O) and add it to S.

Two costs dominate. Computing Cov for a single object is expensive; this is addressed by an object-level optimization. And Cov is computed for too many objects in each iteration; this is addressed by an iteration-level optimization.

Efficiency optimization – Object-level

Reduce the time required to compute the coverage for one object. Instead of directly computing and maximizing the coverage in each iteration, compute the gain of adding one object o to the summary S:

gain(S, o) = Cov(S ∪ {o}, O) - Cov(S, O)

Updating gain(S, o) across iterations is much more efficient than recomputing Cov from scratch.
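The greedy loop with marginal gains can be sketched as follows (using deterministic set coverage for brevity; the actual algorithm uses the expected-coverage gain, and the names here are illustrative):

```python
def greedy_summary(objects, k):
    """Greedy max-coverage: objects are sets of IUs (deterministic sketch)."""
    summary, covered = [], set()
    candidates = list(objects)
    for _ in range(min(k, len(candidates))):
        # Marginal gain of o = number of not-yet-covered IUs it would add.
        best = max(candidates, key=lambda o: len(o - covered))
        if not (best - covered):
            break  # no remaining object adds new information
        summary.append(best)
        covered |= best
        candidates.remove(best)
    return summary

objects = [{"Jeff", "Kate"}, {"Jeff"}, {"sky", "plant"}, {"wedding"}]
# greedy_summary(objects, 2) picks {"Jeff", "Kate"} first, then
# {"sky", "plant"}, since {"Jeff"} adds nothing new after the first pick.
```

Because the coverage function is submodular (next slide), this greedy procedure carries the standard (1 - 1/e) approximation guarantee.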

Submodularity of Coverage

Expected coverage Cov(S, O) is submodular: for any summaries S ⊆ T ⊆ O and any object o ∉ T,

Cov(S ∪ {o}, O) - Cov(S, O) ≥ Cov(T ∪ {o}, O) - Cov(T, O)

that is, an object's marginal gain can only shrink as the summary grows.

Efficiency optimization – Iteration-level

Reduce the number of object-level computations (i.e. evaluations of gain(S, o)) in each iteration of the greedy process. While traversing the objects in O \ S, we maintain the maximum gain seen so far, gain*, together with an upper bound Upper(S, o) ≥ gain(S, o) for every candidate o; we can safely prune an object o if Upper(S, o) < gain*. The bound holds by definition and by submodularity (a gain computed against an earlier, smaller summary upper-bounds the current gain), and it can be updated in constant time.
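This pruning is in the spirit of the classic lazy-greedy idea: cached gains from earlier iterations are valid upper bounds by submodularity. A sketch with a max-heap of cached gains (deterministic set coverage again; the function name and data are illustrative):

```python
import heapq

def lazy_greedy(objects, k):
    """Lazy greedy selection; returns indices of the chosen objects."""
    covered = set()
    # Max-heap of (-cached_gain, index); initial gain = object's size.
    heap = [(-len(o), i) for i, o in enumerate(objects)]
    heapq.heapify(heap)
    summary = []
    while heap and len(summary) < k:
        neg_gain, i = heapq.heappop(heap)
        fresh = len(objects[i] - covered)  # recompute the true gain
        if heap and fresh < -heap[0][0]:
            # Stale and beaten by another bound: push back, try the next one.
            heapq.heappush(heap, (-fresh, i))
            continue
        if fresh == 0:
            break  # nothing left adds new information
        summary.append(i)
        covered |= objects[i]
    return summary
```

Objects whose cached bound never rises to the top are pruned without ever recomputing their gain, which is where the iteration-level savings come from.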

Experiment -- Datasets

- Facebook Photo Set: 200 photos uploaded by 10 Facebook users (attributes: visual concepts, faces, event, time).
- Review Dataset: reviews of 10 hotels from TripAdvisor, with about 250 reviews per hotel on average (attributes: facets, ratings).
- Flickr Photo Set: 20,000 photos from Flickr (attributes: visual concepts, event, time).

Experiment – Quality

(Quality results are shown as charts in the original slides.)

Experiment – Efficiency

The basic greedy algorithm without the optimizations runs for more than 1 minute.

Summary

We developed a new extractive summarization framework that:
- handles multi-attribute data;
- handles uncertain/probabilistic data;
- generates high-quality summaries;
- is highly efficient.