Query personalization

Query Personalization (Subscriber Friendly)


Typical Pub/Sub
• All subscriptions are considered equally
• A publication is simply delivered whenever it satisfies a subscription

Top-k Pub/Sub
• Users can express that some events are more important than others by ranking subscriptions
• A publication is scored against the space of subscriptions it satisfies

[Figure: example subscriptions {Item = Smartphone}, {Item = Smartphone, Carrier = AT&T} and {Carrier = AT&T}, repeated for the two models above]

How is a publication covered by a subscription?
Let's assume,

o Subscription (S) = {b1 ∧ b2 ∧ … ∧ bq}
o Publication (P) = {a1 ∧ a2 ∧ … ∧ ap}
o P is covered by S iff ∀ bi ∈ S, ∃ aj ∈ P such that aj satisfies bi

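[Figure: publication (a1, a2, a3, …, ap) against subscription (b1, b2, …, bq): not covered]

[Figure: publication (a1, a2, a3, …, ap) against subscriptions (b1, b2, b3), (b1, b2, b3, b4), …, (b1, b2, …, bj): covered!]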
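The cover test above can be sketched in a few lines. This is a minimal sketch, assuming that every condition is an equality on an attribute, so that subscriptions and publications can both be modelled as sets of (attribute, value) pairs. The function name `covers` and the value `Verizon` are hypothetical and only serve the illustration.

```python
# Assumption: equality is the only operator, so a condition b_i is satisfied
# exactly when the same (attribute, value) pair appears in the publication.
def covers(subscription, publication):
    """Return True if every condition of the subscription appears in the publication."""
    return set(subscription) <= set(publication)

publication = {("Item", "Smartphone"), ("Carrier", "AT&T"), ("OS", "Android")}
s_general = {("Item", "Smartphone")}
s_other = {("Item", "Smartphone"), ("Carrier", "Verizon")}  # hypothetical value

print(covers(s_general, publication))  # every condition is met
print(covers(s_other, publication))    # the Carrier condition is not met
```

With richer operators (!=, >, <) the subset test would have to be replaced by a per-condition predicate check.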

Worst case scenario

• Bob has subscribed to every subscription that matches the publication

Publication: Item = Smartphone, Carrier = AT&T, OS = Android
|P| = n = 3

Subscription Space, |S| = 2^n = 8:
• {φ}
• Item = Smartphone
• Carrier = AT&T
• OS = Android
• Item = Smartphone, Carrier = AT&T
• Item = Smartphone, OS = Android
• OS = Android, Carrier = AT&T
• Item = Smartphone, Carrier = AT&T, OS = Android
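To make the size of this worst case concrete, the sketch below enumerates every subset of the three attribute-value pairs of the publication, counting the empty subscription {φ}. The function name `subscription_space` is an assumption made for the example.

```python
from itertools import combinations

def subscription_space(publication):
    """Enumerate every subset of the publication's attribute-value pairs."""
    pairs = sorted(publication)
    return [frozenset(combo)
            for size in range(len(pairs) + 1)
            for combo in combinations(pairs, size)]

publication = {("Item", "Smartphone"), ("Carrier", "AT&T"), ("OS", "Android")}
space = subscription_space(publication)
print(len(space))  # number of subsets of a 3-element set, empty set included
```

The count doubles with every additional attribute-value pair in the publication.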

How is a subscription covered by a subscription?
• Can be represented using a preference graph
• Given two subscriptions s1 and s2, s1 covers s2 iff
  • for each publication p
  • s.t. s2 covers p,
  • it holds that s1 covers p

• Node ≈ Subscription

[Figure: preference graph with the nodes {Item = Smartphone}, {Carrier = AT&T}, {OS = Android}, {Item = Smartphone, Carrier = AT&T}, {OS = Android, Carrier = AT&T}, {Item = Smartphone, OS = Android} and {Item = Smartphone, Carrier = AT&T, OS = Android}]
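A preference graph of this kind could be built as follows. This is a sketch under the assumption that conditions are equalities only, in which case one subscription covers another exactly when its set of attribute-value pairs is a subset of the other's. The names `sub_covers` and `preference_graph` are hypothetical.

```python
def sub_covers(general, specific):
    """general covers specific when every publication matching specific also
    matches general; for equality-only conditions this is a subset test."""
    return general <= specific

def preference_graph(subscriptions):
    """Return (general, specific) edges, keeping direct cover relations only."""
    edges = []
    for a in subscriptions:
        for b in subscriptions:
            if a == b or not sub_covers(a, b):
                continue
            # Drop the edge when another subscription sits between a and b.
            has_middle = any(c not in (a, b) and sub_covers(a, c) and sub_covers(c, b)
                             for c in subscriptions)
            if not has_middle:
                edges.append((a, b))
    return edges

item = frozenset({("Item", "Smartphone")})
carrier = frozenset({("Carrier", "AT&T")})
item_carrier = item | carrier
full = item_carrier | {("OS", "Android")}

edges = preference_graph([item, carrier, item_carrier, full])
print(len(edges))
```

Only direct edges are kept, so the general subscription on Item is linked to the full subscription through the intermediate Item and Carrier node rather than directly.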

How to assign preference over subscription?

Quantitative approach
• Assign an interest score to each subscription

Qualitative approach
• Specify the relative interest between two subscriptions

[Figure, quantitative: the subscriptions {Item = Smartphone}, {Item = Smartphone, Carrier = AT&T} and {Carrier = AT&T}, annotated with the scores 0.7, 0.5 and 0.9]

[Figure, qualitative: the same three subscriptions, related pairwise by > and <]

Interesting Question

• How can we compare the quantitative and qualitative models when they are used by a specific user?

• For the moment, let's go with the quantitative approach

Worst case scenario

• Bob subscribed to all matching subscriptions

[Figure: the publication {Item = Smartphone, Carrier = AT&T, OS = Android} matched against Bob's subscription space; the non-empty subscriptions carry the scores 0.7, 0.5, 0.9, 0.8, 0.6, 0.9 and 0.8]

Pref_score = Aggregation_op(score1, …, score8),
s.t. Aggregation_op ∈ {Max, Min, Average}
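The aggregation step can be illustrated with a short sketch. It assumes equality-only conditions and an arbitrary assignment of scores to subscriptions; the pairing of scores with subscriptions below, the value `Verizon`, and the names `pref_score` and `AGGREGATIONS` are assumptions made for the example, not taken from the slides.

```python
from statistics import mean

AGGREGATIONS = {"max": max, "min": min, "average": mean}

def pref_score(publication, scored_subscriptions, op="max"):
    """Aggregate the scores of all subscriptions that cover the publication."""
    matched = [score for sub, score in scored_subscriptions.items()
               if sub <= publication]  # equality-only cover test
    if not matched:
        return None  # no subscription is satisfied, so there is nothing to score
    return AGGREGATIONS[op](matched)

publication = frozenset({("Item", "Smartphone"), ("Carrier", "AT&T"), ("OS", "Android")})
scored = {
    frozenset({("Item", "Smartphone")}): 0.7,                       # assumed pairing
    frozenset({("Item", "Smartphone"), ("Carrier", "AT&T")}): 0.5,  # assumed pairing
    frozenset({("Carrier", "AT&T")}): 0.9,                          # assumed pairing
    frozenset({("Carrier", "Verizon")}): 0.2,                       # hypothetical, does not match
}

for op in ("max", "min", "average"):
    print(op, pref_score(publication, scored, op))
```

The choice of operator changes the ranking: Max rewards a single strongly preferred subscription, while Min and Average penalise publications that also satisfy weakly preferred ones.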

Preference graph performance

• Can prune useless subscriptions when walking along the graph to match a publication
• But in the worst case, when the number of nodes grows exponentially, it becomes a bottleneck when
  • we have many users associated with each subscription
  • the subscriptions support many operators
    • Attr {=, !=, >, <, … etc.} value

• Proposed solution
  • Reduce the size of the subscription space!
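The pruning mentioned in the first bullet can be sketched as a top-down walk: when a general subscription does not cover the publication, none of its more specific descendants can, so they are not expanded from that node. This is a minimal sketch assuming equality-only conditions; the graph layout (`roots`, `children`) and the function name are hypothetical.

```python
from collections import deque

def matching_subscriptions(publication, roots, children):
    """Walk from the most general subscriptions towards more specific ones,
    expanding a node only if it covers the publication."""
    matched, seen = [], set()
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        if node <= publication:  # equality-only cover test
            matched.append(node)
            queue.extend(children.get(node, ()))
        # otherwise the node is pruned: its children are not enqueued from here
    return matched

item = frozenset({("Item", "Smartphone")})
os_ = frozenset({("OS", "Android")})
item_carrier = item | {("Carrier", "AT&T")}
item_os = item | os_

children = {item: [item_carrier, item_os], os_: [item_os]}
publication = frozenset({("Item", "Smartphone"), ("Carrier", "AT&T")})

result = matching_subscriptions(publication, [item, os_], children)
print(len(result))
```

The walk still visits every node in the worst case, which is why the slides turn to shrinking the subscription space itself.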

So how? (Open to discussion)

• For a particular user, we keep only the most specific subscription, the one that can cover the largest number of other subscriptions

[Figure: subscription space with the nodes {Item = Smartphone}, {Carrier = AT&T}, {OS = Android}, {Item = Smartphone, Carrier = AT&T}, {OS = Android, Carrier = AT&T}, {Item = Smartphone, OS = Android} and {Item = Smartphone, Carrier = AT&T, OS = Android}]

So how? (Open to discussion)

• Instead of assigning a score to the whole subscription, we assign a comparison score to each attribute-value tuple

Subscription Space (Bob): Item = Smartphone (0.4), Carrier = AT&T (0.4), OS = Android (0.2)

How to assign comparison scores?

• Static way
  • When the user assigns scores, we keep them as the final scores for the subscription

• Dynamic way
  • When the user assigns scores, we adjust them based on the user's previous score assignments

Static assignment (On user demand)

Subscription Space:
• Item = Smartphone (0.7)
• Item = Smartphone (0.4), Carrier = AT&T (0.4)
• Item = Smartphone (0.4), OS = Android (0.2)

Dynamic assignment (Statistical model)

Subscription Space:
• Item = Smartphone (0.7)
• Item = Smartphone (0.4) (operation[0.7, 0.4]), Carrier = AT&T (0.4)
• Item = Smartphone (0.3) (operation[0.7, 0.3]), OS = Android (0.3)
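The slides leave `operation` unspecified. As one possible reading, the sketch below combines the previously stored score of a tuple with the newly entered one, using a plain average as a stand-in; the averaging rule and the names `assign_dynamic` and `history` are assumptions, not the author's model.

```python
def average(old, new):
    """Stand-in for the unspecified `operation`: the mean of old and new score."""
    return (old + new) / 2

def assign_dynamic(history, pair, new_score, operation=average):
    """Store a score for an attribute-value pair, adjusting it against the
    score the user previously gave to the same pair, if any."""
    if pair in history:
        history[pair] = operation(history[pair], new_score)
    else:
        history[pair] = new_score
    return history[pair]

history = {}
first = assign_dynamic(history, ("Item", "Smartphone"), 0.7)
second = assign_dynamic(history, ("Item", "Smartphone"), 0.4)  # operation[0.7, 0.4]
print(first, second)
```

Any other statistical rule (for instance an exponentially weighted mean) could be passed in through the `operation` parameter; the static way corresponds to storing the entered score without consulting the history.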

Goal: In worst case

• A minimum number of most specific subscriptions can represent all the others, based on tuples with assigned scores

[Figure: subscription space with the nodes {Item = Smartphone}, {Carrier = AT&T}, {OS = Android}, {Item = Smartphone, Carrier = AT&T}, {OS = Android, Carrier = AT&T} and {Item = Smartphone, OS = Android}, represented by the scored subscription below]

Item = Smartphone (0.4), Carrier = AT&T (0.3), OS = Android (0.1)

But what about the publications' cover relation?
Let's recap,

o Subscription (S) = {b1 ∧ b2 ∧ … ∧ bq}
o Publication (P) = {a1 ∧ a2 ∧ … ∧ ap}
o P is covered by S iff ∀ bi ∈ S, ∃ aj ∈ P such that aj satisfies bi

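[Figure: publication (a1, a2, a3, …, ap) against subscription (b1, b2, …, bq): not covered!]

[Figure: publication (a1, a2, a3, …, ap) against subscriptions (b1, b2, b3), (b1, b2, b3, b4), …, (b1, b2, …, bj): covered!]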

Worst case scenario

• Bob has subscribed to every subscription that matches the publication

Publication: Item = Smartphone, Carrier = AT&T, OS = Android
|P| = n = 3

Subscription Space, |S| = 2^n = 8:
• {φ}
• Item = Smartphone
• Carrier = AT&T
• OS = Android
• Item = Smartphone, Carrier = AT&T
• Item = Smartphone, OS = Android
• OS = Android, Carrier = AT&T
• Item = Smartphone, Carrier = AT&T, OS = Android

Let’s change it a bit

• Recap!
o Subscription (S) = {b1 ∧ b2 ∧ … ∧ bq}
o Publication (P) = {a1 ∧ a2 ∧ … ∧ ap}
o P is covered by S iff there is at least one bi ∈ S for which ∃ aj ∈ P such that aj satisfies bi

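[Figure: publication (a1, a2, a3, …, ap) against subscription (b1, b2, …, bq): covered!]

[Figure: publication (a1, a2, a3, …, ap) against subscriptions (b1, b2, b3), (b1, b2, b3, b4), …, (b1, b2, …, bj): covered!]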
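The relaxed relation can be sketched as a non-empty intersection test. As before, this assumes equality-only conditions modelled as (attribute, value) pairs; the function name `partially_covers` and the value `Tablet` are hypothetical.

```python
def partially_covers(subscription, publication):
    """True when at least one condition of the subscription appears in the publication."""
    return not set(subscription).isdisjoint(publication)

bob = {("Item", "Smartphone"), ("Carrier", "AT&T"), ("OS", "Android")}

pub_partial = {("Item", "Smartphone")}  # matches one of Bob's three conditions
pub_none = {("Item", "Tablet")}         # hypothetical value, matches nothing

print(partially_covers(bob, pub_partial))
print(partially_covers(bob, pub_none))
```

Under the original "for all" relation, `pub_partial` would not be covered by Bob's three-condition subscription; under the relaxed relation it is.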

Worst case scenario

• Now Bob's single subscription is open to all partially matching publications

Publication Space:
• Item = Smartphone
• Item = Smartphone, Carrier = AT&T
• Carrier = AT&T
• OS = Android
• OS = Android, Carrier = AT&T
• Item = Smartphone, OS = Android
• Item = Smartphone, Carrier = AT&T, OS = Android
• ….

Subscription Space:
• Item = Smartphone (0.4), Carrier = AT&T (0.3), OS = Android (0.1)

[Figure: the subscriptions {Item = Smartphone}, {Item = Smartphone, Carrier = AT&T}, {Carrier = AT&T}, {OS = Android}, {OS = Android, Carrier = AT&T} and {Item = Smartphone, OS = Android}, represented by the single scored subscription above]
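One way to rank partially matching publications against Bob's single scored subscription is sketched below. The slides do not say how tuple scores are combined, so summing the scores of the matched tuples is an assumption, as are the names `partial_match_score` and `top_k`.

```python
def partial_match_score(tuple_scores, publication):
    """Assumed rule: add up the scores of the subscription tuples that the
    publication contains."""
    return sum(score for pair, score in tuple_scores.items() if pair in publication)

def top_k(tuple_scores, publications, k):
    """Return the k best partially matching publications with their scores."""
    scored = [(partial_match_score(tuple_scores, p), p) for p in publications]
    scored = [entry for entry in scored if entry[0] > 0]  # drop non-matching ones
    scored.sort(key=lambda entry: entry[0], reverse=True)
    return scored[:k]

bob = {("Item", "Smartphone"): 0.4, ("Carrier", "AT&T"): 0.3, ("OS", "Android"): 0.1}

p_item = frozenset({("Item", "Smartphone")})
p_item_carrier = frozenset({("Item", "Smartphone"), ("Carrier", "AT&T")})
p_os = frozenset({("OS", "Android")})

best = top_k(bob, [p_item, p_item_carrier, p_os], k=2)
print([sorted(p) for _, p in best])
```

With this rule a publication that matches more of Bob's highly scored tuples ranks higher, which is the behaviour the worst-case goal relies on; a different combination rule would change the ranking.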

Correctness

• Our score assignment to the subscription tuples
  • Does it do the trick?

• We should take care when applying other metrics too
  • Publications' diversification
    • Minimize redundancy
  • Source authority
    • Reliable publication sources; e.g. top seller
  • Freshness
    • Event windows
    • To increase the novelty of delivered publications
