Linear Time Byzantine Self-Stabilizing Clock Synchronization
Ariel Daliot¹, Danny Dolev¹, Hanna Parnas²
¹School of Engineering and Computer Science, ²Department of Neurobiology and the Otto Loewi Center for Cellular and Molecular Neurobiology,
The Hebrew University of Jerusalem, Israel
This research is supported in part by IntelCOMM Grant - Internet Network/Transport Layer & QoS Environment (IXA)
Lecture Outline
What is “Pulse Synchronization”
Examples of pulse synchronization in nature
A biologically inspired pulse synchronization algorithm for distributed computer networks
Efficient Byzantine Self-Stabilizing clock synchronization above pulse synchronization
The goal is to synchronize the pulses starting from any state and despite any faults
[Figure: pulse timelines of several nodes over real time t. In the arbitrary state the pulses are scattered; in the synchronized state all nodes pulse together, once per cycle.]
Self-Stabilizing “Pulse Synchronization”
Convergence: Starting from an arbitrary state s, the system reaches a synchronized state in finite time
Closure: If s is a synchronized state of the system at real time t0, then for every real time t ≥ t0:
1. The system state at time t is a synchronized state
2. «Linear Envelope»: for every correct node p, a[t − t0] + b ≤ ψp(t0, t) ≤ g[t − t0] + h
ψp(t1, t2) is the number of pulses a correct node p invoked during a real-time interval [t1, t2] within which p was continuously correct
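The linear envelope condition can be checked mechanically on a recorded pulse trace. The sketch below is only an illustration: the function names, the sample trace and the constants a, b, g, h are assumptions chosen for the example, not values from the talk.

```python
from bisect import bisect_left, bisect_right

def pulses_in_interval(pulse_times, t1, t2):
    """psi_p(t1, t2): number of pulses with t1 <= time <= t2.

    pulse_times must be sorted in increasing order.
    """
    return bisect_right(pulse_times, t2) - bisect_left(pulse_times, t1)

def within_linear_envelope(pulse_times, t0, t, a, b, g, h):
    """Check a*(t - t0) + b <= psi_p(t0, t) <= g*(t - t0) + h."""
    psi = pulses_in_interval(pulse_times, t0, t)
    elapsed = t - t0
    return a * elapsed + b <= psi <= g * elapsed + h

# Hypothetical trace: a node pulsing once every 10 time units.
regular_node = [10.0 * k for k in range(1, 11)]   # 10, 20, ..., 100
# Hypothetical trace: a node that stops pulsing after two pulses.
stalled_node = [10.0, 20.0]
```

With a = g = 0.1 (one pulse per 10 time units) and b = −1, h = 1, the regular trace stays inside the envelope while the stalled trace falls below the lower bound.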
Fault Models
Many problems are trivial with no faults; some are unsolvable even with a single fault (e.g. Byzantine Generals)
Common fault models: Crash/Link/Message faults
Byzantine failures (“malicious” faults)
– Usually proven to require n > 3f to tolerate f faults
– Not solvable for some problems
Transient faults (system in an arbitrary state, or total chaos)
– Require self-stabilizing algorithms in order to overcome
– Not solvable for some problems (Clock Synchronization)
Self-Stabilization
Addresses the situation in which ALL nodes can concurrently be faulty for a limited period of time
A self-stabilizing (SS) algorithm realizes its task once the system is back within the assumption boundaries
Is orthogonal to Byzantine failures, i.e. these are uncorrelated fault models
Byzantine algorithms typically focus on limiting the influence of faulty nodes once the task has been realized
Self-stabilizing algorithms focus on realizing the task following a “catastrophic” state
Synchrony phenomena in biology
The phenomenon of synchronization is displayed by many biological systems – Synchronized flashing of the male malaccae fireflies
– Oscillations of the neurons in the circadian pacemaker, determining the day-night rhythm
– Crickets that chirp in unison
– Coordinated mass spawning in corals
– Audience clapping together after a “good” performance
We were inspired by the pacemaker network in the cardiac ganglion of lobsters
Cardiac ganglion of the lobster (Sivan, Dolev & Parnas, 2000)
Four interneurons tightly synchronize their pulses in order to give the heart its optimal pulse rate (though one is enough for activation)
Able to adjust the synchronized firing pace, up to a certain bound (e.g. while escaping a predator)
[Figure: pulse trains of the ganglion neurons; label: motor neurons]
A related problem – real-time-Clock Synchronization (rCS)
There exist γ, t0, ν, a and b such that for every t ≥ t0:
Agreement. For any correct nodes p, q: |Cp(t) − Cq(t)| ≤ γ (precision)
Validity. For every correct node p: (1+ν)^−1·t + a ≤ Cp(t) ≤ (1+ν)·t + b (accuracy)
Optimal precision is d·(1 − 1/n)
Optimal accuracy is ν =
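The two rCS conditions translate directly into checks on clock readings. This is a minimal sketch; the function names and the sample numbers are assumptions made for illustration.

```python
def precision_holds(clock_readings, gamma):
    """Agreement: |Cp(t) - Cq(t)| <= gamma for all correct p, q.

    clock_readings holds the clocks of the correct nodes at one real time t,
    so it is enough to compare the largest and the smallest reading.
    """
    return max(clock_readings) - min(clock_readings) <= gamma

def accuracy_holds(clock_value, t, nu, a, b):
    """Validity: t / (1 + nu) + a <= Cp(t) <= (1 + nu) * t + b."""
    return t / (1 + nu) + a <= clock_value <= (1 + nu) * t + b

# Hypothetical readings of three correct nodes at real time t = 1000.
readings = [1000, 1003, 1001]
```

Here the readings agree to within γ = 5 but not within γ = 2, and with ν = 0.01, a = −5, b = 5 every one of them lies inside the accuracy envelope at t = 1000.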
real-time-Clock Synchronization
rCS has two additional constraints over pulse synchronization:
– The pulses have labels (“the time”)
– The time needs to approximate real time
Most Byzantine rCS algorithms use the following principles:
– At certain times the computers exchange clock values
– They apply some function to the received values (which seeks to neutralize the effect of the Byzantine values and set the clocks close to each other)
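One well-known function of this kind is the fault-tolerant midpoint: discard the f lowest and f highest received values and take the midpoint of what remains. The sketch below shows that technique as an example only; the talk does not say which function a particular rCS algorithm uses, and the name fault_tolerant_midpoint is an assumption.

```python
def fault_tolerant_midpoint(values, f):
    """Midpoint of the received clock values after trimming f from each end.

    With at most f Byzantine values among them, every value that survives
    the trimming lies within the range spanned by the correct values.
    """
    if len(values) <= 2 * f:
        raise ValueError("need more than 2f values to trim f from each end")
    trimmed = sorted(values)[f:len(values) - f]
    return (trimmed[0] + trimmed[-1]) / 2

# Three correct clocks plus one wildly wrong (Byzantine) value, f = 1.
received = [100, 101, 102, 10**9]
new_clock = fault_tolerant_midpoint(received, f=1)
```

The outlier is trimmed away and the adjusted clock, 101.5, stays inside the range of the correct readings. Note that this only helps when the correct clocks are already close to each other.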
real-time-Clock Synchronization impossibility result with no external time source
=> This works only if the clocks initially have close values
=> Which implies rCS cannot be solved when all clocks hold arbitrary times
=> Which means there is no self-stabilizing algorithm for rCS
I.e. if the clocks are initially far apart they cannot both synchronize AND estimate real time
=> Internal rCS algorithms assume the clocks are initially synchronized
FAB8 AIT WSR - ww16-17/2003
Main outages and issues: …. synchronize problem in VAX - On WW16.5
- Impact: 8 CW SC's unable to introduce lots for a period of 4 hr and 15 min (from 23:00 until 03:15).
- Root cause: the job which synchronizes the time between the VAXes failed at 22:00 (Thursday night) and created gaps between the machines' clocks. This gap caused the remotes which worked with CW* to get into loop status, with an FCM error message.
- Solution: Time synchronized. Helpdesk will get alerts when this job fails again.
A Distributed System according to Lamport
“A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable.”
Applicability of logical clocks
Many algorithms that depend on clock synchronization actually only require synchronized logical clocks
E.g.: TDMA, Kerberos tickets, DHCP leases, global snapshots, database time stamps and many others
Why then not use a self-stabilizing Byzantine clock synchronization algorithm that synchronizes logical time?
Self-Stabilizing Byzantine Clock Synchronization
Because the best previously known self-stabilizing Byzantine clock synchronization algorithm converges in expected (n−f)·n^(6(n−f)) time! (Dolev-Welch, 95)
The difficulty lies in the fact that:
– the initial clock values can differ arbitrarily
– there is no agreed time for exchanging the values and setting the clock according to the values received
– the clocks can wrap around
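The wrap-around difficulty can be made concrete: with clocks bounded by M, two readings that look far apart numerically may be close on the clock ring. A small sketch, where clock_distance is a hypothetical helper and not part of the algorithm in the talk:

```python
def clock_distance(c1, c2, M):
    """Shortest distance between two clock values on a ring of size M."""
    d = (c1 - c2) % M
    return min(d, M - d)

# With M = 100, clocks reading 2 and 98 are only 4 ticks apart,
# although their plain numeric difference is 96.
near_wrap = clock_distance(2, 98, 100)
half_ring = clock_distance(10, 60, 100)
```

Any comparison or averaging of bounded clocks has to account for this, otherwise a clock that has just wrapped looks like a badly wrong one.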
Clock Synchronization using synchronized pulses
We assume no outside source of real time
At every pulse, exchange clock values and apply some clock adjustment function to the received multiset
If the clocks were initially close to real time then they will stay close to real time
If not, then the clocks will proceed synchronously, close to logical time
This scheme yields a Byzantine self-stabilizing clock synchronization algorithm with convergence time, accuracy and precision on the order of existing rCS
The Byzantine Self-Stabilizing Clock Synchronization Algorithm
At “pulse” event
Begin
    Clock := ET;
    Wait for every correct node to invoke a pulse;
    ET := SS-Strong-Byz-Agreement((ET + cycle) mod M);
End
ET - Expected Time of next pulse
Cycle - Expected elapsed logical time until next pulse
M - Bound on clock value
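The pulse handler can be simulated in a fault-free setting to see why arbitrary initial clocks line up after two pulses. This is a sketch under loud assumptions: agree() is a hypothetical stand-in that returns the most common proposal and is NOT a Byzantine agreement protocol; the Node class, the initial values, cycle = 10 and M = 100 are all made up for the example.

```python
from collections import Counter

def agree(proposals):
    """Hypothetical stand-in for SS-Strong-Byz-Agreement (fault-free only).

    Returns the most common proposal, ties broken by the smallest value.
    If all nodes propose the same value, that value is returned.
    """
    counts = Counter(proposals)
    best = max(counts.values())
    return min(v for v, c in counts.items() if c == best)

class Node:
    def __init__(self, clock, et):
        self.clock = clock   # current clock value
        self.et = et         # ET: expected time of the next pulse

def on_pulse(nodes, cycle, M):
    """One synchronized pulse, executed by every node."""
    for node in nodes:
        node.clock = node.et                      # Clock := ET
    # "Wait for every correct node to invoke a pulse" is modelled by
    # finishing the loop above before anyone starts the agreement.
    proposals = [(node.et + cycle) % M for node in nodes]
    decided = agree(proposals)                    # one common decision
    for node in nodes:
        node.et = decided                         # ET := agreed value

# Arbitrary (clock, ET) pairs, as after a transient fault.
nodes = [Node(7, 3), Node(90, 55), Node(12, 55), Node(0, 99)]
on_pulse(nodes, cycle=10, M=100)   # clocks still differ, ETs now agree
on_pulse(nodes, cycle=10, M=100)   # clocks now agree as well
```

After the first pulse the nodes share a single ET (65 in this run); at the second pulse every clock is set to it, and from then on the clocks advance together by one cycle per pulse, modulo M.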
The Byzantine Self-Stabilizing Pulse Synchronization Algorithm
if (cycle_countdown = 0) then
    send “Propose-Pulse” message to all;
if (received f+1 distinct “Propose-Pulse” messages) then
    send “Propose-Pulse” message to all;
if (received n-f distinct “Propose-Pulse” messages) then
    invoke “pulse” event;
    cycle_countdown := cycle;
    flush “Propose-Pulse” message counter;
    ignore “Propose-Pulse” messages for 2d(1+) time;
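The three rules can be sketched as a per-node state machine. Several details below are assumptions, not taken from the talk: time advances in integer ticks, the length of the ignore period is passed in as a parameter (ignore_window), a node sends at most one proposal per cycle (the sent_proposal flag), and message delivery, including a node's own proposal, is left to the caller.

```python
class PulseNode:
    def __init__(self, n, f, cycle, ignore_window):
        self.n, self.f = n, f
        self.cycle = cycle
        self.ignore_window = ignore_window  # assumed parameter, in ticks
        self.cycle_countdown = cycle
        self.ignore_left = 0
        self.proposers = set()       # distinct senders of "Propose-Pulse"
        self.sent_proposal = False   # assumption: one proposal per cycle
        self.pulses = 0

    def tick(self):
        """Advance local time by one tick.

        Returns True if "Propose-Pulse" should be sent to all now.
        """
        if self.ignore_left > 0:
            self.ignore_left -= 1
        if self.cycle_countdown > 0:
            self.cycle_countdown -= 1
        if self.cycle_countdown == 0 and not self.sent_proposal:
            self.sent_proposal = True
            return True
        return False

    def receive(self, sender):
        """Handle one "Propose-Pulse" message; returns (broadcast, pulsed)."""
        if self.ignore_left > 0:
            return False, False          # still in the ignore period
        self.proposers.add(sender)
        broadcast = False
        if len(self.proposers) >= self.f + 1 and not self.sent_proposal:
            self.sent_proposal = True    # join the proposal after f+1
            broadcast = True
        pulsed = False
        if len(self.proposers) >= self.n - self.f:
            self.pulses += 1             # invoke "pulse" event
            self.cycle_countdown = self.cycle
            self.proposers.clear()       # flush the message counter
            self.sent_proposal = False
            self.ignore_left = self.ignore_window
            pulsed = True
        return broadcast, pulsed
```

With n = 4 and f = 1, a node relays the proposal after hearing from 2 distinct senders and pulses after hearing from 3. The f+1 threshold guarantees that at least one correct node proposed, so Byzantine nodes alone cannot trigger a relay.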
The Self-Stabilizing Byzantine Strong Agreement Algorithm
Any Strong Byzantine Agreement algorithm can be used
Agreement and validity are not ensured until the pulses synchronize
Self-stabilization is supported by counting recovering nodes as correct only following cycle+time-for-BA of correct behavior
We use a slightly modified version of the Toueg, Perry and Srikanth (1987) Strong Byzantine Agreement algorithm
It has the advantage of “early stopping”: if all correct nodes start with identical values then termination is within 2 rounds
Hence, during continuous correct system behavior clock synchronization is maintained with very little overhead
Comparison chart (* global pulse)
Algorithm   | Self-Stabilizing/Byzantine | Precision     | Accuracy     | Convergence Time | Messages
PULSE-CS    | SS+BYZ                     | 2d + O()      | 2d + O()     | cycle+2(2f+5)d   | O(nf²)
NOADJUST-CS | SS+BYZ                     | 2d + O()      | 2d + O()     | cycle            | O(n²)
DHSS        | BYZ                        | d + O()       | (f+1)d + O() | 2(f+1)d          | O(n²)
LL-APPROX   | BYZ                        | 5 + O()       | + O()        | d + O()          | O(n²)
DW-SYNCH*   | SS+BYZ                     | 0             | 0            | M·2^(2(n−f))     | n²·M·2^(2(n−f))
DW-BYZ-SS   | SS+BYZ                     | 4(n−f) + O()  | (n−f) + O()  | O(n^O(n))        | O(n^O(n))
PT-SYNC*    | SS                         | 0             | 0            | 4n²              | O(n²)
The teaching of Pythagoras
“Evolution is the law of life”
“Unity is the law of God”
“Number is the law of the universe”
Related Problems
Digital Clock Synchronization
– Agreement on pulse counters, with or without a global pulse
Clock Synchronization
– Common notion of real time, high precision and accuracy
Phase Clocks
– Agreement on pulse counters in asynchronous settings
Synchronized Rates
– Clocks progress at approximately the same rate, the times may differ
Firing Squad
– All nodes enter the same state in step k after a process has initialized fire
Pulse Synchronization
– Precise synchronization of regular pulses, slack linear envelope accuracy