IEEE ICCCN 2013 - Continuous Gossip-based Aggregation through Dynamic Information Aging
Continuous Gossip-based Aggregation Through Dynamic Information Aging
Vitaliy Rapp, Kalman Graffi
Technology of Social Networks Group,
University of Düsseldorf, Germany
Email: [email protected]
Kalman Graffi Heinrich Heine Universität Düsseldorf 13. April 2023 2
P2P Systems
Peer-to-Peer Network
– Decentralized, self-organizing overlay network with shared resource usage
– Consists of several independent peers cooperating with each other
Advantages:
– Scalability through distribution of responsibility
– No single point of failure
Types of P2P Networks
– Structured
  • Use of a distributed index structure (DHT)
  • Peers are assigned unique IDs and can be addressed directly
– Unstructured
  • Peers can communicate only with their direct neighbors
  • Peers do not have special responsibilities
[Figure: Peer-to-Peer Service Delivery. Overlay connections on top of the IP network (underlay); DHT example with H("my data") = 3107 and peer IDs 611, 709, 1008, 1622, 2011, 2207, 2906, 3485; PeerID = PubKey; direct communication.]
Future Peer-to-Peer Applications: Social Networks
A P2P Framework for Social Networks (LifeSocial)
– Framework: combines a wide set of useful modules
  • Storage, messaging, security, caching, app hosting, multicast, pub/sub, …
  • Distributed data structures, monitoring
– Social network on top of the platform
  • Built through “plugins” (apps)
  • Configurable GUI supports app growth
See p2pframework.com
Main Challenges for Future P2P Applications
Security:
– Secure overlays, user management, key infrastructure
– Secure (encrypted, authenticated, integrity-protected) communication
– Access control: role-based, identity-based
Controlled quality / performance
– First step is monitoring: statistical aggregation over all nodes
– Hop count, node count, reply times, traffic overhead, used overlay functions, …
– Statistics:
  • Min, max, average, standard deviation
– Requirements
  • Precise
  • Timely
  • Low-cost
Agenda
Gossip-based Aggregation
Continuous Gossip-based Aggregation
Evaluation
Conclusions
Gossiping Protocols
Idea:
– Communicate only with neighbors (gossip)
  • Assumes no specific overlay topology
– Exchange and aggregate information
  • E.g. calculate averages, minimum, maximum
Characteristics
– Gossip protocols are round-based (epochs)
– In every round
  • Each node selects a subset of nodes to interact with (pairwise)
  • The selection function is often probabilistic
  • Nodes interact via “small” messages
  • Local state changes due to new information
– In general: “quick” convergence
D. Kempe, A. Dobra, J. Gehrke, “Gossip-Based Computation of Aggregate Information,” IEEE Symposium on Foundations of Computer Science (FOCS’03)
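Gossip-Protocol: PushSum
Assumptions
– Input: local state $x_i$ of each peer $i$ at the start of the computation
– The initialization defines the aggregation function
Round 0 {
– 1. $s_{0,i} = x_i$ (for average calculation)
– 2. $w_{0,i} = 1$ (for average calculation)
– 3. Send the pair $(s_{0,i}, w_{0,i})$ to self
}
Round r > 0 {
– 1. Let $\{(\hat{s}_k, \hat{w}_k)\}$ be all pairs sent to $i$ during round $r-1$
– 2. $s_{r,i} = \sum_k \hat{s}_k$; $w_{r,i} = \sum_k \hat{w}_k$
– 3. Choose a target node $j$ uniformly at random
– 4. Send the pair $(\tfrac{1}{2} s_{r,i}, \tfrac{1}{2} w_{r,i})$ to $j$ and self
– 5. $s_{r,i} / w_{r,i}$ is the estimate of the aggregate in round $r$
}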
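The PushSum rounds can be sketched as a small synchronous simulation. This is only an illustration under assumptions that are not on the slide: the function name `push_sum`, the `seed` parameter, lock-step rounds, and targets drawn uniformly from all nodes (including the sender itself) are chosen here for simplicity.

```python
import random


def push_sum(values, rounds, seed=0):
    """Synchronous Push-Sum sketch: returns every node's estimate s/w."""
    rng = random.Random(seed)
    n = len(values)
    s = list(values)   # sums, initialized with the local states
    w = [1.0] * n      # weights, all 1 for average calculation
    for _ in range(rounds):
        new_s = [0.0] * n
        new_w = [0.0] * n
        for i in range(n):
            j = rng.randrange(n)  # target chosen uniformly at random
            # Half of the pair goes to the target, half stays with the sender.
            for target in (i, j):
                new_s[target] += s[i] / 2
                new_w[target] += w[i] / 2
        s, w = new_s, new_w
    # Every node keeps half of its own weight, so w[i] never reaches zero.
    return [si / wi for si, wi in zip(s, w)]


estimates = push_sum([1, 5, 7, 3] * 3, rounds=60)
```

The total of all sums and the total of all weights stay constant in every round, which is why the ratios approach the global average (here 4 for the 12 input values).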
Initialization of PushSum
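Result of gossiping:
– Input: $(x_i, w_i)$ at every node $i$
– Output: $\sum_i x_i / \sum_i w_i$
Calculating the average:
– For all nodes: $w_i = 1$
– Output: $\frac{1}{n} \sum_i x_i$
Node count (with $w_i = 1$ for all nodes):
– One single node: $x_i = 1$
– All other nodes: $x_i = 0$
– Output: $n = 1/a$, with $a$ being the average share of 1 among n peers
Calculating the sum:
– One single node: $w_i = 1$
– All other nodes: $w_i = 0$
– Output: $\sum_i x_i$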
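How the initialization selects the aggregate can be checked with a few lines of arithmetic. The sketch assumes the usual Push-Sum initializations from Kempe et al. (all weights 1 for the average, a single weight 1 for the sum, a single value 1 for the node count) and computes only the value that the ratio s/w converges to; `push_sum_limit` is a hypothetical helper name, not part of the slides.

```python
def push_sum_limit(sums, weights):
    """Value every node's s/w approaches: total sum over total weight."""
    return sum(sums) / sum(weights)


x = [1.0, 5.0, 7.0, 3.0]   # example local states
n = len(x)

# Average: every node starts with weight 1.
average = push_sum_limit(x, [1.0] * n)

# Node count: one node holds the value 1, all others 0, all weights are 1.
share = push_sum_limit([1.0] + [0.0] * (n - 1), [1.0] * n)
node_count = 1.0 / share

# Sum: only one node starts with weight 1, all others with weight 0.
total = push_sum_limit(x, [1.0] + [0.0] * (n - 1))
```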
Example Average Calculation
Example: 12 nodes
– Initial state
– After 1 round
  • With communication links
– After 5 rounds
– After 10 rounds
Performance and Complexity of Push-Sum
[Figure annotation: “Imprecise”]
Performance: precision
– Simulations with 1M nodes
  • Gossip every 5 seconds
– Most of the time:
  • Reported values are wrong
  • Although the protocol does converge
– Problem
  • Peer count always starts at 0
Convergence time
  • n = number of nodes
  • ε = accepted relative error
– Push-Sum converges quickly
– Problem:
  • Huge message overhead per node
W. Terpstra, C. Leng, A. Buchmann, “Brief Announcement: Practical Summation via Gossip,” ACM Symposium on Principles of Distributed Computing (PODC 2007)
Churn – Reference: Node Count
Comparative Evaluation
– Node count: 1000, churn
– Tree-based monitoring: update interval 15 sec, branching factor 8
– PushSum: 30 messages per epoch
– Centralized for comparison, update interval 60 sec
– Same overhead allowed for all monitoring approaches
Simulation setup
– Churn with joining and instantly leaving nodes
– Both decentralized solutions
  • Use ca. 200 bytes/s per node
  • For better comparability
Reference Signals: Steps, Sawtooth and Sine
PushSum
– Imprecise monitoring
– Epochs are visible
– Despite the same traffic overhead
Centralized and tree-based
– Precise
– Tree becomes imprecise with too much churn
Epoch-based Approach
Idea:
– Restart the calculation after N rounds to consider new measurements
Implementation:
– Bind N rounds into one so-called epoch
– At the start of each epoch, all peers reset their estimates
– Peers which join the network do not participate in the current epoch
  • All joining peers receive the current round of the running epoch
Advantages:
– Robust, easy to implement, works with any algorithm
Disadvantages:
– Requires synchronization for epoch starts
– Hard to estimate a good epoch length
  • Long: good convergence on old data
  • Short: bad convergence on fresh data
– Restarts the algorithm even when it is not necessary
Aging Approach
Idea:
– Consider new measurements in each round with a ratio of α
Approach:
– Let the calculated values converge toward the values currently held by the peers
– The convergence rate is the same at every peer
– Proposed function: $(v, c) \mapsto (1 - \alpha) \cdot c + \alpha \cdot v$
  • c – current estimate / statistic
  • v – fresh measurement
  • α – aging factor (e.g. 0.01)
Advantages:
– Dynamic adaptation, no need to restart
– No synchronization required, neither when joining nor for epoch starts
Disadvantages:
– Calculated aggregate values do not converge in the strict sense
– Sum calculation needs adjustment
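The proposed aging function is a weighted blend of the current estimate and the fresh measurement. A minimal sketch follows; the names `age` and `age_repeatedly` are hypothetical, and the repeated application looks at a single peer in isolation (no gossip exchange between the steps), which is a simplification made only for illustration.

```python
def age(current, fresh, alpha=0.01):
    """Blend the current estimate c with a fresh measurement v."""
    return (1.0 - alpha) * current + alpha * fresh


def age_repeatedly(current, fresh, alpha, rounds):
    """Apply the aging step several times against the same measurement."""
    for _ in range(rounds):
        current = age(current, fresh, alpha)
    return current


one_step = age(4.0, 1.0, alpha=0.2)                   # 0.8 * 4 + 0.2 * 1
two_steps = age_repeatedly(4.0, 1.0, alpha=0.2, rounds=2)
```

In this isolated setting the gap between estimate and measurement shrinks by the factor (1 - α) per round, so with α = 0.01 it takes roughly 69 rounds to halve it.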
Aging - Example
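[Figure: four nodes holding the measurements 1, 5, 7 and 3; the labels give the node estimates step by step]
– Start: all estimates = 4
– Aging step with α = 0.2: estimates 3.4, 4.6, 3.8, 4.2
– After gossiping: 3.8, 3.8, 4.2, 4.2
– After further gossiping: 4, 4, 4, 4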
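The numbers in this example can be reproduced with the aging function followed by pairwise averaging. The sketch assumes that a gossip exchange replaces both partners' estimates by their mean and that the nodes pair up as (0, 1), (2, 3) and then (0, 2), (1, 3); this pairing is chosen here because it yields the labels shown, it is not stated in the slides.

```python
def age(current, fresh, alpha):
    """Aging step: (1 - alpha) * c + alpha * v."""
    return (1.0 - alpha) * current + alpha * fresh


def average_pairs(estimates, pairs):
    """Each listed pair of nodes replaces both estimates by their mean."""
    result = list(estimates)
    for a, b in pairs:
        mean = (estimates[a] + estimates[b]) / 2
        result[a] = result[b] = mean
    return result


measurements = [1.0, 5.0, 7.0, 3.0]
estimates = [4.0, 4.0, 4.0, 4.0]

# Aging step with alpha = 0.2: approximately 3.4, 4.2, 4.6, 3.8
aged = [age(c, v, 0.2) for c, v in zip(estimates, measurements)]
# First exchange: approximately 3.8, 3.8, 4.2, 4.2
step1 = average_pairs(aged, [(0, 1), (2, 3)])
# Second exchange: all nodes are back at approximately 4
step2 = average_pairs(step1, [(0, 2), (1, 3)])
```

Because the measurements already average to 4, the aging step leaves the mean of the estimates unchanged and gossiping only has to even out the per-node deviations again.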
Aging - Example (continued)
[Figure: the four nodes from the previous example (measurement labels 1, 5, 7, 3; estimates initially 4; α = 0.2) over further rounds. The estimate labels pass through intermediate values such as 3.6, 3.9, 3.75, 3.66, 3.63 and 3.645 and end at about 3.51 (last labels: 3.5115).]
Sum Calculation with Aging
Idea:
– Apply “aging”
– Restart only when the peer holding the 1 leaves the system
Basic implementation:
– Every peer holds the following values:
  • MAX
  • VERSION
  • AVG
– MAX is used to identify the loss of the initial value
  • When the MAX value falls below a defined threshold, the calculation is restarted
  • With a small probability, every peer can initiate the restart
  • The peer initiating the restart sets its AVG value to 1
– VERSION is used to identify duplicates
Evaluation through Simulations
Main questions
– Monitoring precision (relative error)
– Costs (traffic and messages)
Setup
– 5000 nodes, just join, no lookups
– Two scenarios: without churn and with KAD-based churn
  • Minutes 0 to 60: joining phase
  • Minutes 65 to 240: churn (if activated)
– Aging factor: α = 0.01
– Gossip round: 10 seconds, unsynchronized
Layer setup
– User / application: no overlay usage, just maintenance
– Overlay: Chord (as graph)
– Network model:
  • Global Network Positioning delay model
  • OECD bandwidth model
PeerfactSim.KOM (see www.peerfact.org)
Type
– Event-based simulator in Java
– Focus on simulation of P2P systems on various layers
  • User, application
  • Services: monitoring, replication, …
  • Overlays
  • Network models
Layered Architecture
– Easy exchange of components
– Testing of new applications / mechanisms
Main idea
– Layers have several implementations
– Enables testing of individual layer mechanisms
  • on their own and
  • in combination with other layers
[Figure: layered simulator architecture with the layers User, Application, Service, Overlay, Transport and Network next to the Simulation Engine]
Network Size Estimation
Online Time Estimation
Node estimation
– Calculation of the sum is the worst-case scenario
  • Average of a single 1 and (n-1) times 0
– Relative error
  • Without churn < 0.01; with churn < 0.1, on average 0.05
Average calculations are easier, e.g. online time estimation
Operation Cost Estimation
Conclusions
Gossiping
– Monitoring is needed for future P2P applications
– Gossiping can be used in any topology
  • Very robust and versatile
– Problems with epoch-based gossiping
  • New measurements are considered only at the restart of epochs
  • Results of previous epochs are not reused
  • Hard to identify the ideal epoch length
– Tradeoff between convergence and freshness
Continuous Gossip-based Aggregation
– Continuously measures the current network status
– Integrates fresh measurements in every round with a fixed ratio
– High precision (relative error: 0.01 for average, 0.05 for sum under churn)
– Low costs (1.5 kb/s on average at a round length of 10 s)
Thank You for Your Attention
Jun.-Prof. Dr.-Ing. Kalman Graffi
Technology of Social Networks Group
Institute of Computer Science
Heinrich-Heine-Universität Düsseldorf
eMail: [email protected]
Web: www.p2pframework.com
Web: www.peerfact.org