Sven Bittner, 12 April 2007 Talk at the 5th New Zealand Computer Science Research Student Conference...


Sven Bittner, 12 April 2007

Talk at the 5th New Zealand Computer Science Research Student Conference

NEWS ALERT:

(Kiwi or Cow) and Chainsaw = (Kiwi and Chainsaw) or (Cow and Chainsaw)?

YEAH RIGHT...

2/23

Structure of Talk
• Motivation & Problem
• Undertaken Research
  – Filtering
  – Event Routing
  – Advertisements
• Current Steps & Summary

Sven Bittner – Expressive Subscriptions and Advertisements in Pub/Sub Systems


4/23

Problem: Information Overflow
• More and more electronically available information
• Users only want information they are interested in

Motivation Research Filtering Routing Advertisements Summary

5/23

Solution: Publish/Subscribe
[Figure: incoming information, information of interest]
Filtering of information (event messages) according to user interests (subscriptions)

6/23

Pub/Sub Systems: Details
[Figure: publishers, the pub/sub system, and subscribers. Publishers advertise future event messages and publish event messages. Subscribers register subscriptions. The pub/sub system, a network of brokers B1 to B9, performs filtering and routing and sends notifications. Labels inside the system: event routing table, subscription routing table, advertisement and subscription index structures.]

7/23

Problem/Hypothesis
• Focus on conjunctive subscriptions/advertisements
• Argument: Boolean forms can be converted to DNF
  – DNF exponential in size
  – Already many subscriptions/advertisements without conversion
• Hypothesis [B06]
  – Direct support of Boolean form decreases memory usage without degrading efficiency

Original: (Kiwi or Cow) and Chainsaw

DNF: (Kiwi and Chainsaw) or (Cow and Chainsaw)
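
The size argument on this slide can be illustrated with a minimal sketch. It is not taken from the talk: the tuple encoding of expressions and the names `to_dnf`, `count_predicates`, and `chain` are assumptions made for illustration, and negation is left out for brevity.

```python
from itertools import product

# Assumed encoding: a predicate is a string; an inner node is a tuple
# ("and", child, ...) or ("or", child, ...).

def to_dnf(expr):
    """Return the DNF as a list of conjunctions (frozensets of predicates)."""
    if isinstance(expr, str):
        return [frozenset([expr])]
    op, *children = expr
    child_dnfs = [to_dnf(child) for child in children]
    if op == "or":
        # A disjunction simply collects the conjunctions of its children.
        return [conj for dnf in child_dnfs for conj in dnf]
    if op == "and":
        # A conjunction is distributed over every combination of child conjunctions.
        return [frozenset().union(*combo) for combo in product(*child_dnfs)]
    raise ValueError(f"unknown operator: {op}")

def count_predicates(expr):
    """Number of predicate occurrences in the original Boolean form."""
    if isinstance(expr, str):
        return 1
    return sum(count_predicates(child) for child in expr[1:])

def chain(n):
    """Hypothetical worst case: n two-way disjunctions joined by 'and'."""
    return ("and", *[("or", f"a{i}", f"b{i}") for i in range(n)])

talk_example = ("and", ("or", "Kiwi", "Cow"), "Chainsaw")
talk_dnf = to_dnf(talk_example)

# For each n: (predicates in Boolean form, conjunctions in DNF)
sizes = {n: (count_predicates(chain(n)), len(to_dnf(chain(n)))) for n in (1, 5, 10)}
```

For the `chain` expressions the Boolean form grows linearly (2n predicates) while the DNF grows to 2^n conjunctions, which is the blow-up the hypothesis aims to avoid.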


9/23

Three Research Areas
1. Central filtering of arbitrary Boolean subscriptions
2. Event routing optimizations for arbitrary Boolean subscriptions
3. Support of arbitrary Boolean advertisements
   a) Calculation of overlappings
   b) Subscription routing optimization

10/23

Central Filtering Algorithm (1)
• Utilization of one-dimensional indexes
• Extension of conjunctive counting approach
• Three-step filtering [BH05a]

[Figure: incoming event → predicate matching → matching predicates → candidate subscription matching → candidate subscriptions → real subscription matching → matching subscriptions. Data structures shown: predicate indexes, subscription indexes, subscriptions.]
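
The three filtering steps named on this slide could be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the algorithm of [BH05a]: the predicate and subscription tables, the linear scan standing in for a one-dimensional index, and the counting threshold `min_required` are all hypothetical choices.

```python
import operator

OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}

# Hypothetical data: predicate id -> (attribute, operator, constant)
PREDICATES = {
    "p1": ("animal", "=", "Kiwi"),
    "p2": ("animal", "=", "Cow"),
    "p3": ("tool", "=", "Chainsaw"),
    "p4": ("price", "<", 100),
}
# Subscriptions stay in their Boolean form (no DNF conversion).
SUBSCRIPTIONS = {
    "s1": ("and", ("or", "p1", "p2"), "p3"),
    "s2": ("and", "p2", "p4"),
}

def leaves(expr):
    """All predicate occurrences in a subscription tree."""
    if isinstance(expr, str):
        return [expr]
    return [p for child in expr[1:] for p in leaves(child)]

def min_required(expr):
    """Lower bound on fulfilled predicate occurrences needed to match."""
    if isinstance(expr, str):
        return 1
    counts = [min_required(child) for child in expr[1:]]
    return sum(counts) if expr[0] == "and" else min(counts)

def evaluate(expr, fulfilled):
    if isinstance(expr, str):
        return expr in fulfilled
    results = [evaluate(child, fulfilled) for child in expr[1:]]
    return all(results) if expr[0] == "and" else any(results)

def build_indexes(predicates, subscriptions):
    by_attribute = {}   # stand-in for one-dimensional predicate indexes
    for pid, (attr, _, _) in predicates.items():
        by_attribute.setdefault(attr, []).append(pid)
    by_predicate = {}   # subscription index: one entry per leaf occurrence
    for sid, expr in subscriptions.items():
        for pid in leaves(expr):
            by_predicate.setdefault(pid, []).append(sid)
    thresholds = {sid: min_required(e) for sid, e in subscriptions.items()}
    return by_attribute, by_predicate, thresholds

def match(event, predicates, subscriptions, indexes):
    by_attribute, by_predicate, thresholds = indexes
    # Step 1: predicate matching
    fulfilled = set()
    for attr, value in event.items():
        for pid in by_attribute.get(attr, []):
            _, op, const = predicates[pid]
            if OPS[op](value, const):
                fulfilled.add(pid)
    # Step 2: candidate subscription matching (counting)
    hits = {}
    for pid in fulfilled:
        for sid in by_predicate.get(pid, []):
            hits[sid] = hits.get(sid, 0) + 1
    candidates = [sid for sid, n in hits.items() if n >= thresholds[sid]]
    # Step 3: real subscription matching on the Boolean form
    return sorted(s for s in candidates if evaluate(subscriptions[s], fulfilled))

INDEXES = build_indexes(PREDICATES, SUBSCRIPTIONS)
```

In this sketch, step 2 only discards subscriptions that cannot match; step 3 evaluates the surviving Boolean trees, so the final result does not depend on how tight the threshold is.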

11/23

Central Filtering Algorithm (2)
• Evaluation [BH05b]
  – Memory requirements
    • Development of characterization scheme
    • Theoretical analysis based on scheme
    • Practical verification
  – Efficiency
    • Empirical experiments, similar/better results

Proves hypothesis for central components


13/23

Event Routing Optimization (1)
• Subscription pruning [BH06a]
  – Applicable to all kinds of subscriptions
  – Tailored for various target parameters (memory usage, filter efficiency, network load) [BH06c]
• Optimization idea
  – Broadening of subscriptions by pruning
  – No effect on filtering accuracy (only internal)

Reduction of complexity of routing table entries
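
The idea of broadening a subscription by pruning can be sketched as below. This is a simplified illustration, not the method of [BH06a]: it assumes negation-free subscription trees, and `TRUE`, `simplify`, and `prune` are hypothetical names. Pruning replaces a subtree with "always true", so the pruned entry matches at least every event the original matched.

```python
TRUE = ("true",)  # assumed marker for a pruned (always fulfilled) subtree

def simplify(expr):
    """Remove TRUE children from 'and'; collapse 'or' containing TRUE."""
    if isinstance(expr, str) or expr == TRUE:
        return expr
    op, *children = expr
    children = [simplify(child) for child in children]
    if op == "and":
        children = [c for c in children if c != TRUE]
        if not children:
            return TRUE
    elif any(c == TRUE for c in children):  # op == "or"
        return TRUE
    return children[0] if len(children) == 1 else (op, *children)

def prune(expr, path):
    """Replace the subtree reached via `path` (child indexes) with TRUE."""
    if not path:
        return TRUE
    op, *children = expr
    children[path[0]] = prune(children[path[0]], path[1:])
    return simplify((op, *children))

def evaluate(expr, fulfilled):
    if expr == TRUE:
        return True
    if isinstance(expr, str):
        return expr in fulfilled
    results = [evaluate(child, fulfilled) for child in expr[1:]]
    return all(results) if expr[0] == "and" else any(results)

# Hypothetical routing table entry
subscription = ("and", ("or", "Kiwi", "Cow"), "Chainsaw", "Cheap")

pruned_leaf = prune(subscription, [2])       # drop the "Cheap" predicate
pruned_branch = prune(subscription, [0, 0])  # pruning inside the "or" removes it
```

Because a pruned entry only ever matches more events, it can forward some events that the original subscription would reject; the slide's "only internal" note suggests these are filtered out before reaching subscribers.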

14/23

Event Routing Optimization (2)
[Figure: an un-optimized routing table (columns Subscription and Neighbor, with neighbors N1, N2, N3, …) and an optimized routing table with the same columns, obtained by selected pruning on selected entries]

15/23

Event Routing Optimization (3)
• Analysis (empirical experiments)
  – Strong reduction in table size (e.g., by 80%)
  – Strong increase in throughput (e.g., by 50%)
  Promising optimization effect
• Comparison to covering optimization
  – Stable optimization behavior of pruning
  – Applicable if other optimizations fail (both subscription structure and relationships)

Proves hypothesis for distributed setting


17/23

Support of Advertisements (1)
1. Calculation of overlappings (two directions)
   – Overlapping subscriptions: all subscriptions that potentially match messages described by advertisement
   – Similar to matching algorithm, three steps
     • Disjoint predicate matching
     • Candidate overlapping subscription matching
     • Real overlapping subscription matching
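
A rough sketch of the three overlap steps follows. It makes strong simplifying assumptions that are not from the talk or from [BH06b]: the advertisement is a plain conjunction of closed intervals (one per attribute) rather than an arbitrary Boolean expression, subscription predicates are closed intervals too, and all names are hypothetical. The test is conservative: it returns subscriptions that potentially overlap and may include some that do not.

```python
# Hypothetical subscription predicates: id -> (attribute, low, high)
SUB_PREDICATES = {
    "p1": ("price", 0, 50),
    "p2": ("price", 200, 300),
    "p3": ("weight", 0, 10),
}
SUBSCRIPTIONS = {
    "s1": ("and", ("or", "p1", "p2"), "p3"),
    "s2": ("and", "p1", "p3"),
}

def leaves(expr):
    if isinstance(expr, str):
        return [expr]
    return [p for child in expr[1:] for p in leaves(child)]

def min_required(expr):
    """Lower bound on usable predicate occurrences needed for an overlap."""
    if isinstance(expr, str):
        return 1
    counts = [min_required(child) for child in expr[1:]]
    return sum(counts) if expr[0] == "and" else min(counts)

def satisfiable(expr, disjoint):
    """Evaluate the tree, treating every non-disjoint predicate as fulfillable."""
    if isinstance(expr, str):
        return expr not in disjoint
    results = [satisfiable(child, disjoint) for child in expr[1:]]
    return all(results) if expr[0] == "and" else any(results)

def disjoint_predicates(advertisement, predicates):
    """Step 1: predicates that no advertised event can fulfil."""
    disjoint = set()
    for pid, (attr, low, high) in predicates.items():
        if attr in advertisement:  # unconstrained attributes never rule anything out
            adv_low, adv_high = advertisement[attr]
            if high < adv_low or adv_high < low:
                disjoint.add(pid)
    return disjoint

def overlapping(advertisement, predicates, subscriptions):
    disjoint = disjoint_predicates(advertisement, predicates)
    result = []
    for sid, expr in subscriptions.items():
        # Step 2: candidate overlapping subscriptions (counting)
        usable = [p for p in leaves(expr) if p not in disjoint]
        if len(usable) < min_required(expr):
            continue
        # Step 3: real overlapping subscriptions (evaluate the Boolean form)
        if satisfiable(expr, disjoint):
            result.append(sid)
    return sorted(result)

# Advertisement: attribute -> (low, high) of announced future events
result_a = overlapping({"price": (100, 250)}, SUB_PREDICATES, SUBSCRIPTIONS)
result_b = overlapping({"weight": (20, 30)}, SUB_PREDICATES, SUBSCRIPTIONS)
```

In the second call, subscription `s1` passes the counting step but is rejected when its tree is evaluated, which shows why the third step is still needed in this sketch.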

18/23

Support of Advertisements (2)
1. Calculation of overlappings (two directions)
   – Evaluation and comparison [BH06b]
     • Similar efficiency for function problem (all overlappings)
     • Higher performance (e.g., 85% more efficient) for decision problem (at least one overlapping)

19/23

Support of Advertisements (3)
2. Advertisement-based optimization
   – Advertisement pruning [BH06d]
     • First designated subscription routing optimization
     • Pruning of advertisements
     • Target parameter: minimal increase of overlappings when pruning
   – Evaluation
     • Proposed measure fulfils design goal

Proves hypothesis for advertisements


21/23

Current Steps
• Detailed experimental study
• Writing up
• Redrafting
• Redrafting…

22/23

Summary
• Claim
  Pub/sub systems should not convert subscriptions and advertisements to DNF for time and space efficiency reasons
• Proof
  – Filtering algorithm (central system)
  – Event routing optimization (distributed system)
  – Support for advertisements (if used)

23/23

Conclusions
• Pub/sub systems should directly work on Boolean expressions (as opposed to DBMSs)

(Kiwi or Cow) and Chainsaw = (Kiwi and Chainsaw) or (Cow and Chainsaw)?

Not the same in pub/sub, because the Boolean form is
• More space efficient
• More time efficient

Sven Bittner, [email protected]

Talk: Expressive Subscriptions and Advertisements in Pub/Sub Systems

Selected further reading:

[BH05a] S. Bittner and A. Hinze. On the Benefits of Non-Canonical Filtering in Publish/Subscribe Systems. In Proceedings of the 25th IEEE International Conference on Distributed Computing Systems Workshops (ICDCSW '05), Columbus, USA, June 2005.

[BH05b] S. Bittner and A. Hinze. A Detailed Investigation of Memory Requirements for Publish/Subscribe Filtering Algorithms. In Proceedings of the 13th International Conference on Cooperative Information Systems (CoopIS 2005), Agia Napa, Cyprus, 31 October-4 November 2005.

[BH06a] S. Bittner and A. Hinze. Pruning Subscriptions in Distributed Publish/Subscribe Systems. In Proceedings of the 29th Australasian Computer Science Conference (ACSC 2006), Hobart, Australia, 16-19 January 2006.

[BH06b] S. Bittner and A. Hinze. Arbitrary Boolean Advertisements: The Final Step in Supporting the Boolean Pub/Sub Model. Technical Report 06/2006. Computer Science Department, Waikato University, June 2006.

[BH06c] S. Bittner and A. Hinze. Dimension-Based Subscription Pruning for Publish/Subscribe Systems. In Proceedings of the 26th IEEE International Conference on Distributed Computing Systems Workshops (ICDCSW '06), Lisbon, Portugal, July 2006.

[BH06d] S. Bittner and A. Hinze. Optimizing Pub/Sub Systems by Advertisement Pruning. In Proceedings of the 8th International Symposium on Distributed Objects and Applications (DOA 2006), Montpellier, France, 30 October-1 November 2006.

[B06] S. Bittner. Supporting Arbitrary Boolean Subscriptions in Distributed Pub/Sub Systems. In Proceedings of the 3rd Intern. Middleware Doctoral Symposium (MDS 2006), Australia, November 2006.

Thank you for your attention!