Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016
![Page 1: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/1.jpg)
Distributed Systems + NodeJS
Bruno Bossola
MILAN 25-26 NOVEMBER 2016
@bbossola
![Page 2: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/2.jpg)
Whoami
● Developer since 1988
● XP Coach 2000+
● Co-founder of JUG Torino
● Java Champion since 2005
● CTO @ EF (Education First)
I live in London, love the weather...
![Page 3: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/3.jpg)
Agenda
● Distributed programming
● How does it work, what does it mean
● The CAP theorem
● CAP explained with code
– CA system using two phase commit
– AP system using sloppy quorums
– CP system using majority quorums
● What next?
● Q&A
![Page 4: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/4.jpg)
Distributed programming
● Do we need it?
![Page 5: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/5.jpg)
Distributed programming
● Any system should deal with two tasks:
– Storage
– Computation
● How do we deal with scale?
● How do we use multiple computers to do what we used to do on one?
![Page 6: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/6.jpg)
What do we want to achieve?
● Scalability
● Availability
● Consistency
![Page 7: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/7.jpg)
Scalability
● The ability of a system/network/process to:
– handle a growing amount of work
– be enlarged to accommodate new growth
A scalable system continues to meet the needs of its users as the scale increases
clipart courtesy of openclipart.org
![Page 8: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/8.jpg)
Scalability flavours
● size:
– more nodes, more speed
– more nodes, more space
– more data, same latency
● geographic:
– more data centers, quicker response
● administrative:
– more machines, no additional work
![Page 9: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/9.jpg)
How do we scale? partitioning
● Slice the dataset into smaller independent sets
● reduces the impact of dataset growth
– improves performance by limiting the amount of data to be examined
– improves availability, because partitions can fail independently
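As a rough illustration of slicing a dataset into independent sets, here is a minimal sketch that assumes hash-based partitioning. The slides do not say how the data is sliced, so the hashing scheme and the names `hashKey`, `partitionFor` and `createStore` are hypothetical.

```javascript
// Hypothetical sketch: route each key to one of N independent partitions.

// Simple string hash, kept as an unsigned 32-bit integer (illustrative only).
function hashKey(key) {
  let h = 0;
  for (const ch of String(key)) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0;
  }
  return h;
}

// Map a key to a partition index in [0, partitionCount).
function partitionFor(key, partitionCount) {
  return hashKey(key) % partitionCount;
}

// Each partition is an independent Map: a lookup only touches one of them.
function createStore(partitionCount) {
  const partitions = Array.from({ length: partitionCount }, () => new Map());
  return {
    set(key, value) {
      partitions[partitionFor(key, partitionCount)].set(key, value);
    },
    get(key) {
      return partitions[partitionFor(key, partitionCount)].get(key);
    },
    // Number of entries held by each partition.
    sizes() {
      return partitions.map((p) => p.size);
    },
  };
}
```

Because a key always maps to the same partition, a read examines one slice of the data, and losing one partition leaves the keys in the others reachable.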
![Page 10: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/10.jpg)
How do we scale? partitioning
● But can also be a source of problems
– what happens if a partition becomes unavailable?
– what if it becomes slower?
– what if it becomes unresponsive?
![Page 11: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/11.jpg)
How do we scale? replication
● Copies of the same data on multiple machines
● Benefits:
– allows more servers to take part in the computation
– improves performance by making additional computing power and bandwidth available
– improves availability by creating copies of the data
![Page 12: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/12.jpg)
How do we scale? replication
● But it's also a source of problems
– there are independent copies of the data
– the copies need to be kept in sync across multiple machines
● Your system must follow a consistency model
[Diagram: replicas holding different versions of the same data item (v4, v5, v7, v8)]
![Page 13: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/13.jpg)
Availability
● The proportion of time a system is in a functioning condition
● The system is fault-tolerant
– the ability of your system to behave in a well-defined manner once a fault occurs
● All clients can always read and write
– In distributed systems this is achieved by redundancy
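The link between redundancy and availability can be made concrete with a little arithmetic. The sketch below assumes replicas fail independently, which is an assumption of this example and not a claim from the slides (real failures are often correlated). `combinedAvailability` is a hypothetical helper name.

```javascript
// Hypothetical helper: probability that at least one of `replicas` copies is up,
// assuming each copy is up with probability `single` and failures are independent.
function combinedAvailability(single, replicas) {
  // All replicas are down at once with probability (1 - single)^replicas.
  return 1 - Math.pow(1 - single, replicas);
}
```

Under that assumption, two replicas that are each up 90% of the time give roughly 99% combined availability, since both are down together only about 1% of the time.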
![Page 14: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/14.jpg)
Introducing: performance
● The amount of useful work accomplished compared to the time and resources used
● Basically:
– short response time for a unit of work
– high rate of processing
– low utilization of resources
![Page 15: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/15.jpg)
Introducing: latency
● The period between the initiation of something and its occurrence
● The time between when something happens and when it has an impact or becomes visible
● More high-level examples:
– how long until you become a zombie after a bite?
– how long until my post is visible to others?
clipart courtesy of cliparts.co
![Page 16: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/16.jpg)
Consistency
● Any read on a data item X returns a value corresponding to the result of the most recent write on X.
● Each client always has the same view of the data
● Also known as “Strong Consistency”
![Page 17: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/17.jpg)
Consistency flavours
● Strong consistency
– every replica sees every update in the same order.
– no two replicas may have different values at the same time.
● Weak consistency
– every replica will see every update, but possibly in different orders.
● Eventual consistency
– every replica will eventually see every update and will eventually agree on all values.
![Page 18: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/18.jpg)
The CAP theorem
CONSISTENCY AVAILABILITY
PARTITION TOLERANCE
![Page 19: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/19.jpg)
The CAP theorem
● You cannot have all :(
● You can select only two properties at once
Sorry, this has been mathematically proven and no, it has not been debunked.
![Page 20: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/20.jpg)
The CAP theorem
CA systems!
● You selected consistency and availability!
● Strict quorum protocols (two/multi phase commit)
● Most RDBMS
Hey! A network partition will f**k you up good!
![Page 21: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/21.jpg)
The CAP theorem
AP systems!
● You selected availability and partition tolerance!
● Sloppy quorums and conflict resolution protocols
● Amazon Dynamo, Riak, Cassandra
![Page 22: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/22.jpg)
The CAP theorem
CP systems!
● You selected consistency and partition tolerance!
● Majority quorum protocols (Paxos, Raft, ZAB)
● Apache ZooKeeper, Google Spanner
![Page 23: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/23.jpg)
NodeJS time!
● Let's write our brand new key value store
● We will code all three different flavours
● We will have many nodes, fully replicated
● No sharding
● We will kill servers!
● We will trigger network partitions!
– (no worries, it's a simulation!)
![Page 24: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/24.jpg)
General design
[Diagram: a Node app with a Storage API (GET(k), SET(k, v)) in front of a Storage module and its Database, plus a <proto> API and a <proto> Core exposing the protocol functions fX, fY, fZ, fK]
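The layering in the diagram can be sketched roughly as follows. This is an illustration, not the code from the talk: the class names `Storage`, `LocalCore` and `StorageApi` are hypothetical, the database is an in-memory `Map`, and the pluggable core here just writes locally where a real one would run a replication protocol.

```javascript
// In-memory stand-in for the "Storage" and "Database" boxes.
class Storage {
  constructor() {
    this.db = new Map();
  }
  get(key) {
    return this.db.get(key);
  }
  set(key, value) {
    this.db.set(key, value);
  }
}

// Hypothetical protocol core. A replicated core (2PC, quorum, Raft) would
// coordinate with other nodes here; this one only touches local storage.
class LocalCore {
  constructor(storage) {
    this.storage = storage;
  }
  read(key) {
    return this.storage.get(key);
  }
  write(key, value) {
    this.storage.set(key, value);
    return true;
  }
}

// Storage API: GET(k) and SET(k, v), delegating to whichever core is plugged in.
class StorageApi {
  constructor(core) {
    this.core = core;
  }
  get(key) {
    return this.core.read(key);
  }
  set(key, value) {
    return this.core.write(key, value);
  }
}
```

Keeping the API and storage fixed while swapping the core is what would let the same key-value store be rebuilt in the three CAP flavours.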
![Page 25: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/25.jpg)
CA key-value store
● Uses classic two-phase commit
● Works like a local system
● Not partition tolerant
![Page 26: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/26.jpg)
CA: two phase commit, simplified
[Diagram: a Node app with a Storage API (GET(k), SET(k, v)) in front of a Storage module and its Database, plus a 2PC API and a 2PC Core exposing propose(tx), commit(tx), rollback(tx)]
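The propose/commit/rollback flow can be sketched in a single process as follows. This is a simplified illustration and not the code from the talk: `Participant`, `twoPhaseSet` and the `reachable` flag are hypothetical, and network calls, timeouts and coordinator failure are left out.

```javascript
// One replica: stages a transaction on propose, applies it on commit.
class Participant {
  constructor(name) {
    this.name = name;
    this.data = new Map();    // committed key/value pairs
    this.pending = new Map(); // tx id -> staged write
    this.reachable = true;    // set to false to simulate a failure or partition
  }
  propose(tx) {
    if (!this.reachable) return false; // counts as a "no" vote
    this.pending.set(tx.id, tx);
    return true;
  }
  commit(tx) {
    const staged = this.pending.get(tx.id);
    if (staged) this.data.set(staged.key, staged.value);
    this.pending.delete(tx.id);
  }
  rollback(tx) {
    this.pending.delete(tx.id);
  }
}

let nextTxId = 1;

// Coordinator: phase 1 asks every participant, phase 2 commits only if all said yes.
function twoPhaseSet(participants, key, value) {
  const tx = { id: nextTxId++, key, value };
  const votes = participants.map((p) => p.propose(tx));
  if (votes.every(Boolean)) {
    participants.forEach((p) => p.commit(tx));
    return true;
  }
  participants.forEach((p) => p.rollback(tx));
  return false;
}
```

Because a write needs a yes from every participant, one unreachable node is enough to make all writes fail, which is why this design is not partition tolerant.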
![Page 27: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/27.jpg)
AP key-value store
● Eventually consistent design
● Prioritizes availability over consistency
![Page 28: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/28.jpg)
AP: sloppy quorums, simplified
[Diagram: a Node app with a Storage API (GET(k), SET(k, v)) in front of a Storage module and its Database, plus a QUORUM API and a QUORUM Core exposing (read), (repair), propose(tx), commit(tx), rollback(tx)]
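A rough sketch of quorum reads and writes with read repair follows. Several things here are assumptions of this example and do not come from the talk: conflicts are resolved by highest version number (last write wins), writes go to whichever replicas are reachable instead of being handed off to substitute nodes as in Dynamo, and the names `Replica`, `quorumSet` and `quorumGet` are hypothetical.

```javascript
// One replica holding versioned values.
class Replica {
  constructor() {
    this.data = new Map(); // key -> { value, version }
    this.reachable = true; // set to false to simulate a failure or partition
  }
}

// Write to every reachable replica, provided at least `w` of them are reachable.
function quorumSet(replicas, key, value, version, w) {
  const up = replicas.filter((n) => n.reachable);
  if (up.length < w) return false;
  up.forEach((n) => n.data.set(key, { value, version }));
  return true;
}

// Read from the reachable replicas (at least `r`), return the newest value,
// and repair any reachable replica that holds a stale or missing copy.
function quorumGet(replicas, key, r) {
  const up = replicas.filter((n) => n.reachable);
  if (up.length < r) return undefined;
  let newest;
  for (const n of up) {
    const entry = n.data.get(key);
    if (entry && (!newest || entry.version > newest.version)) newest = entry;
  }
  if (!newest) return undefined;
  for (const n of up) {
    const entry = n.data.get(key);
    if (!entry || entry.version < newest.version) {
      n.data.set(key, { value: newest.value, version: newest.version });
    }
  }
  return newest.value;
}
```

A replica that missed a write while unreachable keeps serving its old value until a later read repairs it, which is the eventually consistent behaviour this flavour accepts in exchange for availability.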
![Page 29: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/29.jpg)
CP key-value store
● Uses majority quorum (raft)
● Guarantees eventual consistency
![Page 30: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/30.jpg)
CP: majority quorums (raft, simplified)
[Diagram: a Node app with a Storage API (GET(k), SET(k, v)) in front of a Storage module and its Database, plus a RAFT API and a RAFT Core with beat, voteme and history]
Urgently needs refactoring!!!!
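The majority rule behind a `voteme`-style election can be sketched as below. This is a heavily simplified illustration, not Raft itself and not the code from the talk: heartbeats and log (history) comparison are left out, and `RaftNode`, `majority` and `runElection` are hypothetical names.

```javascript
// Smallest number of nodes that forms a majority of the whole cluster.
function majority(clusterSize) {
  return Math.floor(clusterSize / 2) + 1;
}

class RaftNode {
  constructor(id) {
    this.id = id;
    this.term = 0;
    this.votedFor = null;
  }
  // Vote request: a node grants at most one vote per term.
  voteme(candidateId, term) {
    if (term > this.term) {
      this.term = term;
      this.votedFor = null;
    }
    if (term === this.term && (this.votedFor === null || this.votedFor === candidateId)) {
      this.votedFor = candidateId;
      return true;
    }
    return false;
  }
}

// `reachable` lists the nodes the candidate can talk to, itself included.
// The candidate wins only with votes from a majority of the full cluster.
function runElection(candidate, reachable, clusterSize) {
  const term = candidate.term + 1;
  const votes = reachable.filter((n) => n.voteme(candidate.id, term)).length;
  return votes >= majority(clusterSize);
}
```

In a five-node cluster split three against two, only the side with three nodes can elect a leader, so the minority side has to refuse writes: consistency is kept at the cost of availability.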
![Page 31: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/31.jpg)
What about BASE?
● It's just a way to qualify eventually consistent systems
● BAsic Availability
– The database appears to work most of the time.
● Soft-state
– Stores don’t have to be write-consistent, nor do different replicas have to be mutually consistent all the time.
● Eventual consistency
– Stores exhibit consistency at some later point (e.g., lazily at read time).
![Page 32: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/32.jpg)
What about Lamport clocks?
● It's a mechanism to maintain a distributed notion of time
● Each process maintains a counter
– Whenever a process does work, increment the counter
– Whenever a process sends a message, include the counter
– When a message is received, set the counter to max(local_counter, received_counter) + 1
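The three rules above translate almost directly into code. A minimal sketch follows; the class and method names are hypothetical, and treating a send as a unit of work that increments the counter is an assumption of this example.

```javascript
class LamportClock {
  constructor() {
    this.counter = 0;
  }
  // Local work: increment the counter.
  tick() {
    this.counter += 1;
    return this.counter;
  }
  // Sending: count the send as work, then attach the counter to the message.
  send() {
    return this.tick();
  }
  // Receiving: jump past both the local and the received counter.
  receive(receivedCounter) {
    this.counter = Math.max(this.counter, receivedCounter) + 1;
    return this.counter;
  }
}
```

With two fresh clocks `a` and `b`, `a.tick()` followed by `b.receive(a.send())` leaves `b` at 3, so the receive is ordered after the send even though `b` had done no work of its own.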
![Page 33: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/33.jpg)
What about Vector clocks?
● Maintains an array of N Lamport clocks, one per node
● Whenever a process does work, increment the logical clock value of the node in the vector
● Whenever a process sends a message, include the full vector
● When a message is received:
– update each element in the vector to max(local, received)
– increment the logical clock of the current node in the vector
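The vector clock rules can be sketched in the same way. The names are hypothetical, nodes are identified by their index in the vector, and a send is counted as work (an assumption of this example). The `happenedBefore` and `concurrent` helpers are not on the slide; they show the standard comparison that vector clocks make possible.

```javascript
class VectorClock {
  constructor(nodeIndex, size) {
    this.nodeIndex = nodeIndex;             // position of this node in the vector
    this.vector = new Array(size).fill(0);  // one logical clock per node
  }
  // Local work: increment this node's own entry.
  tick() {
    this.vector[this.nodeIndex] += 1;
    return this.vector.slice();
  }
  // Sending: count the send as work, then attach a copy of the full vector.
  send() {
    return this.tick();
  }
  // Receiving: element-wise max, then increment this node's own entry.
  receive(received) {
    this.vector = this.vector.map((local, i) => Math.max(local, received[i]));
    this.vector[this.nodeIndex] += 1;
    return this.vector.slice();
  }
}

// a happened before b: every entry is <= and at least one is strictly <.
function happenedBefore(a, b) {
  return a.every((v, i) => v <= b[i]) && a.some((v, i) => v < b[i]);
}

// Two distinct timestamps are concurrent when neither happened before the other.
function concurrent(a, b) {
  const same = a.every((v, i) => v === b[i]);
  return !same && !happenedBefore(a, b) && !happenedBefore(b, a);
}
```

Unlike a single Lamport counter, two vectors can be incomparable, which is how a replica can tell that two updates were concurrent and need conflict resolution.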
![Page 34: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/34.jpg)
What next?
● Learn the lingo and the basics
● Do your homework
● Start playing with these concepts
● It's complicated, but not rocket science
● Be inspired!
![Page 35: Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016](https://reader031.fdocuments.net/reader031/viewer/2022030305/5870e04c1a28abcf288b48e1/html5/thumbnails/35.jpg)
Q&A
Amazon Dynamo: http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
The RAFT consensus algorithm: https://raft.github.io/ and http://thesecretlivesofdata.com/raft/
The code used in this presentation: https://github.com/bbossola/sysdist