Art of Disorderly Programming

Shripad Agashe, ThoughtWorks

Twitter: @shripadagashe

Blog: https://shripad-agashe.github.io/


Order

• The arrangement or disposition of people or things in relation to each other according to a particular sequence, pattern, or method.

• The sequence used is often time

Disorder

source: http://www.imdb.com/media/rm2988222976/tt0209144?ref_=ttmi_mi_all_prd_29

Life with ambiguous order

int i = 5;
i = (++i + ++i);
assert(i == 13);

Absence of sequencing leads to undefined behavior even in a single process: read as C or C++, the two increments of i are unsequenced, so the assertion may or may not hold.

Role of Time

• Time is a free monotonic function on every computer

• Programmers can relate to time more easily than to an arbitrary monotonic function

• But time is rarely wanted for its own sake; it is used for:

• Causal ordering of events

• Failure detection, i.e. knowing when the upper bound on message delivery is breached

• Consistent snapshots
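Causal ordering does not strictly need wall-clock time; any counter that only moves forward will do. The sketch below uses a Lamport logical clock, a well-known technique that the talk does not name, to show the idea. The class and method names are illustrative assumptions.

```python
class LamportClock:
    """Logical clock: a counter that only moves forward."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event: advance the counter by one.
        self.time += 1
        return self.time

    def send(self):
        # Sending is itself an event; the timestamp travels with the message.
        return self.tick()

    def receive(self, message_time):
        # Jump past the sender's timestamp so the receive is ordered
        # after the send, whatever the two wall clocks say.
        self.time = max(self.time, message_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
a.tick()                          # a local event on process a
sent_at = a.send()                # a sends a message to b
received_at = b.receive(sent_at)  # b receives it
```

Here `received_at` is always greater than `sent_at`, so the cause is ordered before its effect without either process consulting a physical clock.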

But there is no “Now”

• The time elapsed between when I said that sentence and when you hear it will be at least a couple of microseconds.

• Typical x86 clocks vary their speeds depending on all kinds of unpredictable environmental factors, such as load, heat, and power. Even the difference between the top and bottom of the same rack can lead to a variance in skew.

• If the best that can be done in a wildly expensive environment like Google's is to live with an uncertainty of several milliseconds, most of us should assume that our own clocks are off by much more than that.

Source: https://queue.acm.org/detail.cfm?id=2745385

Models of enforcing order

• Program order

• Total order

Possible end state = {3}

Possible end states = {5,3,7} or {3,5,7} or {7,5,3}…..

Sometimes a deterministic outcome is required
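When a single deterministic outcome is required, the usual tool is a lock that forces the conflicting steps into one total order. A minimal sketch, assuming a shared counter as the contended state (the function name `run_counter` is hypothetical):

```python
import threading


def run_counter(num_threads, increments_per_thread):
    """Increment a shared counter from several threads under one lock."""
    total = 0
    lock = threading.Lock()

    def worker():
        nonlocal total
        for _ in range(increments_per_thread):
            # The lock serialises the read-modify-write, so no update is lost.
            with lock:
                total += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


final_total = run_counter(4, 10_000)
```

With the lock in place the final value is always `num_threads * increments_per_thread`, at the price of running every increment one at a time.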

Downside of locking

Amdahl’s law takes over

[Figure: Amdahl’s law; labels σ and P]

More on Amdahl’s law
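Amdahl's law can be worked through numerically. The sketch below assumes σ denotes the serial fraction of the work, which is a reading of the slide's label rather than something the slide states; the function name is illustrative.

```python
def amdahl_speedup(serial_fraction, workers):
    """Speedup predicted by Amdahl's law: 1 / (sigma + (1 - sigma) / N)."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# With 5% serial work, adding workers gives diminishing returns and
# the speedup can never reach 1 / 0.05 = 20.
for workers in (1, 2, 8, 64, 1024):
    print(workers, round(amdahl_speedup(0.05, workers), 2))
```

The ceiling of `1 / serial_fraction` is what makes even a short locked section expensive at scale.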

JVM with Lock

Method                               Time (ms)
Single thread                              300
Single thread with lock                 10,000
Two threads with lock                  224,000
Single thread with CAS                   5,700
Two threads with CAS                    30,000
Single thread with volatile write        4,700

Source: http://lmax-exchange.github.io/disruptor/files/Disruptor-1.0.pdf

So we need to relax consistency

Shades of consistency

Eventual Consistency

Consistent Prefix

Monotonic Reads

Bounded Staleness

Read My Writes

Strong Consistency

Source: http://research.microsoft.com/pubs/157411/ConsistencyAndBaseballReport.pdf

But consistency guarantees are difficult

The logical answer

• The hidden cost of forfeiting consistency is the need to know the system’s invariants.

• The subtle beauty of a consistent system is that the invariants tend to hold even when the designer does not know what they are.

• In eventually consistent (EC) solutions, one must be explicit about all the invariants, which is both challenging and prone to error.

Is eventual consistency even applicable?

• Without formal techniques it is very difficult to reason about constraints in the presence of temporal nondeterminism

• Often EC will require a business process change, making it even more challenging

• The questions one needs to answer are:

• Will it ever be consistent?

• Will it ever break business constraints?

• What is the cost of breaking business constraints, if any?

CALM

• CALM stands for consistency as logical monotonicity.

• Operations which can be modeled as sets and expressed as selection, projection and join can be implemented as eventually consistent.

• Operations such as aggregation or anti-join can only be implemented via a blocking operation.

Let’s understand monotonic logic

• Monotonic logic works over an ever-increasing set of facts

• The emergence of a new fact does not invalidate earlier inferences

• As more facts emerge, the result tends to converge to a consistent state
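The three properties above can be checked on a small grow-only set. This is a minimal sketch with made-up facts and a hypothetical helper `replay`: facts arrive in every possible order, each intermediate view contains the previous one, and every order converges to the same final set.

```python
from itertools import permutations


def replay(arrivals):
    """Apply arrivals one at a time and record each intermediate view."""
    seen = set()
    views = []
    for fact in arrivals:
        seen = seen | {fact}  # union only ever adds facts
        views.append(frozenset(seen))
    return views


facts = [5, 3, 7]
final_views = set()
for order in permutations(facts):
    views = replay(order)
    # Nothing learned earlier is ever retracted by a later arrival.
    assert all(earlier <= later for earlier, later in zip(views, views[1:]))
    final_views.add(views[-1])
```

After the loop `final_views` holds a single set, because union does not care about arrival order. That order-insensitivity is what lets such operations run without coordination.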

Non-monotonic logic

• Anti-join, i.e. negation logic

• Absence of evidence is not evidence of absence

• Aggregation

• The entire input set has to be known to arrive at a consistent and accurate answer.
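A small sketch shows why these two operations need the whole input. The order and payment identifiers are made up for illustration, and `unpaid_orders` is a hypothetical helper: one late fact is enough to retract an earlier anti-join answer and to change an earlier aggregate.

```python
def unpaid_orders(orders, payments):
    """Anti-join: orders for which no payment has been seen."""
    return orders - payments


orders = {"o1", "o2"}
early_payments = {"o1"}
late_payments = early_payments | {"o2"}  # one more fact arrives later

early_answer = unpaid_orders(orders, early_payments)
late_answer = unpaid_orders(orders, late_payments)

# Aggregation has the same problem: a count over partial input is
# simply a different number from the count over the full input.
early_count = len(early_payments)
late_count = len(late_payments)
```

`early_answer` reported `o2` as unpaid only because its payment had not been seen yet, so a correct answer has to wait until the input is known to be complete.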

Workflow CALM Example

Possible end states

{5,3,7}

{3,5,7}

{7,5,_}

{7, _ , _ }

Polling
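A minimal sketch of client-side polling, assuming an in-memory stand-in for the server. `JobStore`, `poll`, the job id and the timing values are all hypothetical. The client submits work, then repeatedly asks for status instead of holding a blocking call open.

```python
import threading
import time


class JobStore:
    """Hypothetical stand-in for a server that runs jobs asynchronously."""

    def __init__(self):
        self._results = {}
        self._lock = threading.Lock()

    def submit(self, job_id, work):
        def run():
            value = work()
            with self._lock:
                self._results[job_id] = value

        threading.Thread(target=run, daemon=True).start()

    def status(self, job_id):
        # Non-blocking: answers immediately whether or not the job is done.
        with self._lock:
            if job_id in self._results:
                return ("done", self._results[job_id])
        return ("pending", None)


def poll(store, job_id, interval=0.01, timeout=2.0):
    """Ask for status until the job is done or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        state, value = store.status(job_id)
        if state == "done":
            return value
        time.sleep(interval)  # the client is free to do other work here
    raise TimeoutError(f"job {job_id} did not finish within {timeout}s")


store = JobStore()
store.submit("job-1", lambda: sum([5, 3, 7]))
result = poll(store, "job-1")
```

Between polls the client holds no lock and no open request, so a slow job does not tie up the caller.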

In conclusion

• Even the smallest serial path can limit scalability

• Polling rather than blocking on the client side works better

• Monotonic set-based operations open up the possibility of giving a variety of answers. We can choose the level of correctness in the system.