Distributed Systems

scalability and high availability

Renato Lucindo - lucindo.github.com - @rlucindo

Distributed System Design

Renato Lucindo

Call me Lucindo (or Linus)
2002 - Bachelor in Computer Science
2007 - M.Sc. in Computer Science (Combinatorial Optimization)
7+ years developing Distributed Systems

My default answer: "I don't know."


Agenda

Scalability

High Availability

Problems

Tips and Tricks

Learning More


Distributed Systems

Multiple computers that interact with each other over a network to achieve a common goal

Purpose
  Scalability
  High availability

source: http://www.cnds.jhu.edu/


Scalability

A system's ability to gracefully handle a growing amount of work

Scale up (vertical)
  Add resources to a single node
  Improve existing code to handle more work

Scale out (horizontal)
  Add more nodes to a system
  Linear (or better) scalability

Scalability - Vertical

Add: CPU, Memory, Disks (bigger box)

Handling more simultaneous:
  Connections
  Operations
  Users

Choose a good I/O and concurrency model
  Non-blocking I/O
  Asynchronous I/O
  Threads (single, pool, per-connection)
  Event handling patterns (Reactor, Proactor, ...)

Memory model?
  STM
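A minimal sketch of the Reactor pattern with non-blocking I/O, using Python's standard `selectors` module. The `Reactor` class, the `echo_upper` handler and the `demo` function are hypothetical names for illustration; a `socketpair` stands in for a real client connection.

```python
import selectors
import socket


class Reactor:
    """Single-threaded event demultiplexer: one handler per readable socket."""

    def __init__(self):
        self.selector = selectors.DefaultSelector()

    def register(self, sock, handler):
        sock.setblocking(False)  # handlers must never block the loop
        self.selector.register(sock, selectors.EVENT_READ, handler)

    def unregister(self, sock):
        self.selector.unregister(sock)

    def poll(self, timeout):
        """Wait for ready sockets, run their handlers, return how many ran."""
        events = self.selector.select(timeout)
        for key, _mask in events:
            key.data(key.fileobj)  # key.data is the handler given to register()
        return len(events)


def echo_upper(sock):
    """Hypothetical handler: reply with the upper-cased request."""
    data = sock.recv(4096)
    if data:
        sock.sendall(data.upper())


def demo():
    client, server = socket.socketpair()  # stands in for a real TCP connection
    reactor = Reactor()
    reactor.register(server, echo_upper)
    client.sendall(b"ping")
    reactor.poll(timeout=1.0)
    reply = client.recv(4096)
    reactor.unregister(server)
    client.close()
    server.close()
    return reply
```

One thread can serve many connections this way because it only touches sockets that are ready; the cost is that every handler has to be short and non-blocking.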


Scalability - Vertical

Careful with numbers
  Requests per second
  # of Connections
  Simultaneous operations

Event handling
  Think front-end
  Slow connections/clients
  It's slower than other options
  In doubt, go async

Back-end
  Thread pool (thread per-connection)
  No events
  Process per-core


Scalability - Horizontal

Add nodes to handle more work

Front-end
  Straightforward
  Stateless

Back-end
  Master/Slave(s)
  Partitioning
    DHT
    Volatile Index



Master/Slave
  Write on single Master
  Read on Slaves (one or more)
  Scales reads
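A rough sketch of Master/Slave routing, assuming an in-memory store. The `ReplicatedStore` class is hypothetical, and replication is done synchronously here only to keep the example short; real systems often replicate asynchronously, so slaves may lag behind the master.

```python
import itertools


class ReplicatedStore:
    """Toy store: every write goes to the master, reads rotate over slaves."""

    def __init__(self, slave_count):
        self.master = {}
        self.slaves = [{} for _ in range(slave_count)]
        self.reads_per_slave = [0] * slave_count
        self._next_slave = itertools.cycle(range(slave_count))

    def write(self, key, value):
        self.master[key] = value
        # Assumption: synchronous replication to every slave.
        for slave in self.slaves:
            slave[key] = value

    def read(self, key):
        index = next(self._next_slave)  # round-robin spreads the read load
        self.reads_per_slave[index] += 1
        return self.slaves[index].get(key)
```

Adding slaves raises read capacity only; every write still goes through the single master.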



Partitioning (Sharding)
  Distribute data across nodes
  Generally involves data de-normalization

Where is some specific data?
  Master Index
  Hash (DHT, Consistent Hashing)
  Volatile Index

Joins done at the application level
NoSQL friendly
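A minimal sketch of consistent hashing for locating data in a partitioned back-end. The `HashRing` class, the virtual-node count and the choice of SHA-256 are assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(value):
    """Map a string to a position on the ring."""
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes  # virtual nodes per physical node, evens out load
        self._points = []     # sorted ring positions
        self._owner = {}      # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove_node(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def node_for(self, key):
        """Return the node owning the first ring position after the key."""
        if not self._points:
            raise LookupError("ring is empty")
        index = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[index]]
```

When a node leaves, only the keys it owned move to other nodes; with a plain `hash(key) % node_count` scheme, most keys would move.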



Volatile Index: build and maintain data index as cached information (all clients)
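One possible reading of the volatile index, sketched under assumptions: each client keeps a key-to-node map purely as a cache, rebuilds entries by probing nodes on a miss, and drops entries that turn out to be stale. The `VolatileIndexClient` class is hypothetical, and plain dicts stand in for storage nodes.

```python
class VolatileIndexClient:
    def __init__(self, nodes):
        self.nodes = nodes  # node name -> dict standing in for a storage node
        self.index = {}     # cached key -> node name; safe to lose at any time
        self.probes = 0     # how many nodes were asked "do you have this key?"

    def _locate(self, key):
        """Slow path: ask every node until one has the key."""
        for name, data in self.nodes.items():
            self.probes += 1
            if key in data:
                return name
        return None

    def get(self, key):
        name = self.index.get(key)
        if name is not None and key in self.nodes.get(name, {}):
            return self.nodes[name][key]  # fast path: index entry still valid
        self.index.pop(key, None)         # miss or stale entry: forget it
        name = self._locate(key)
        if name is None:
            return None
        self.index[key] = name
        return self.nodes[name][key]
```

Because the index is only cached information, losing it costs extra probes but never correctness.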


High Availability

"Processes, as well as people, die"

Handle hardware and software failures

Eliminate single point of failure
  Redundancy
  Failover
  Replicas
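A minimal sketch of client-side failover over redundant replicas. `call_with_failover` is a hypothetical helper; replicas are modeled as plain callables that raise `ConnectionError` when they are down.

```python
def call_with_failover(replicas, request):
    """Try each replica in order and return the first successful response."""
    errors = []
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError as exc:
            errors.append(exc)  # remember the failure and try the next replica
    raise ConnectionError(f"all {len(replicas)} replicas failed: {errors}")
```

As long as one replica answers, the caller never sees the failure; only when every replica is down does the error surface.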


High Availability - Failover/Redundancy


High Availability - Replicas

Two or more copies of same data

Replica granularity
  From node replica to "row" replica

Load balancing
Write concurrency
Replica updates
Key for high availability and root of several problems


Problems


Problems - CAP Theorem



Consistency: all operations (reads/writes) yield a global consistent state

Availability: all requests (on non-failed servers) must have a response

Partition Tolerance: nodes may not be able to communicate with each other.

Pick Two


C + A: network problems might stop the system

Examples:
  Oracle RAC, IBM DB2 Parallel
  RDBMS (Master/Slave)
  Google File System
  HDFS (Hadoop)



C + P: clients can't always perform operations

Examples:
  Distributed lock-systems: Chubby, ZooKeeper
  Paxos protocol (consensus)
  BigTable, HBase
  Hypertable
  MongoDB



A + P: clients may read inconsistent (old or undone) data

Examples:
  Amazon Dynamo
  Cassandra
  Voldemort
  CouchDB
  Riak
  Caches


Problem with CAP Theorem

In practice, C + A and C + P systems are the same.
  C + A: not tolerant of network partitions
  C + P: not available when a network partition occurs

Big problem: network partition
  Not so big (how often does it happen?)

Pick two
  Availability
  Consistency

The forgotten: Latency
  Or, how long does the system wait before considering the network partitioned?
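The latency question can be made concrete with a heartbeat-based failure detector. This is a sketch under assumptions: the `HeartbeatDetector` class is hypothetical, and timestamps are passed in explicitly so the behavior is deterministic.

```python
class HeartbeatDetector:
    """Suspect a node once it has been silent for longer than the timeout."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.last_seen = {}  # node name -> time of the last heartbeat

    def heartbeat(self, node, now):
        self.last_seen[node] = now

    def suspected(self, now):
        """Return the nodes considered unreachable at time `now`."""
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )
```

A short timeout detects partitions quickly but mistakes slow nodes for dead ones; a long timeout avoids false alarms but makes clients wait longer.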


Problems - Real World

Every component may fail:
  Network failure
  Hardware failure
  Electricity
  Natural disasters
  Code failure


Tips & Tricks


Tips & Tricks - Pyramid

Capacity (connections, operations, ...) Pyramid


Tips & Tricks - Reply Fast

FAIL Fast
Break complex requests into smaller ones
Use timeouts
No transactions
Be aware that a single slow operation or component can generate contention
Self-denial attack
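A minimal sketch of "use timeouts" and "fail fast", assuming calls to a slow dependency run in a thread pool. `call_with_timeout` and `slow_backend` are hypothetical names, and the pool size is arbitrary.

```python
import concurrent.futures
import time

_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def call_with_timeout(func, timeout_seconds, *args):
    """Run func(*args) in the pool; raise TimeoutError if no reply in time."""
    future = _pool.submit(func, *args)
    try:
        return future.result(timeout=timeout_seconds)
    except concurrent.futures.TimeoutError:
        future.cancel()  # only helps if the call has not started running yet
        raise TimeoutError(f"no reply within {timeout_seconds}s") from None


def slow_backend(delay_seconds):
    """Hypothetical slow dependency."""
    time.sleep(delay_seconds)
    return "done"
```

The caller gets an answer or an error within a bounded time. Note that the timed-out call still occupies a worker until it finishes, which is one way a single slow component generates contention.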


Tips & Tricks - Cache

Cache: component location, data, DNS lookups, previous requests, etc.
Use negative cache for failed requests (low expiration)
Don't rely on cache
  Your system must work with no cache
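A sketch of a cache with negative entries, assuming failed lookups are remembered for a much shorter time than successful ones. The `TTLCache` class and its default expirations are hypothetical.

```python
import time


class TTLCache:
    def __init__(self, ttl=60.0, negative_ttl=5.0, clock=time.monotonic):
        self.ttl = ttl                    # lifetime of successful lookups
        self.negative_ttl = negative_ttl  # low expiration for failures
        self.clock = clock
        self._entries = {}                # key -> (expires_at, ok, value)

    def get(self, key, loader):
        """Return the cached value, or call loader(key) and cache the outcome."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            _expires_at, ok, value = entry
            if ok:
                return value
            # Negative hit: fail immediately without touching the backend.
            raise LookupError(f"negative cache hit for {key!r}")
        try:
            value = loader(key)
        except Exception:
            self._entries[key] = (now + self.negative_ttl, False, None)
            raise
        self._entries[key] = (now + self.ttl, True, value)
        return value
```

The loader is always the source of truth, so the system still works if the cache is empty or dropped.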


Tips & Tricks - Queues

An easy way to add asynchronous processing and decouple your system.
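A minimal sketch of decoupling with a queue, assuming an in-process `queue.Queue` stands in for a real message queue. `start_worker` is a hypothetical helper, and `None` is used as the stop sentinel.

```python
import queue
import threading


def start_worker(jobs, handler):
    """Consume jobs in a background thread until a None sentinel arrives."""

    def run():
        while True:
            job = jobs.get()
            try:
                if job is None:  # sentinel: stop the worker
                    return
                handler(job)
            finally:
                jobs.task_done()

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread
```

The producer only pays the cost of `jobs.put(job)` and does not need to know how, when or where the work is done.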


Tips & Tricks - DNS


Tips & Tricks - Logs

Log everything
Use several log levels

On every log message
  User
  Request host
  Component involved
  Version
  Filename and line

If a log level is not enabled, do not process the log message
Avoid lookup calls (gettimeofday)
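A sketch of these logging tips with Python's standard `logging` module. The component name `orders`, the field values and the `expensive_summary` helper are hypothetical; the level check keeps costly message building off the hot path when the level is disabled.

```python
import logging

CALLS = {"summaries": 0}  # counts how often the costly formatting ran


def expensive_summary(payload):
    CALLS["summaries"] += 1
    return ",".join(sorted(payload))


def make_logger(level, user, host, version):
    base = logging.getLogger("orders")  # hypothetical component name
    base.setLevel(level)
    if not base.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter(
            "%(asctime)s %(levelname)s %(name)s %(filename)s:%(lineno)d "
            "user=%(user)s host=%(host)s version=%(version)s %(message)s"))
        base.addHandler(handler)
    # The adapter attaches user, request host and version to every message.
    return logging.LoggerAdapter(
        base, {"user": user, "host": host, "version": version})


def handle(log, payload):
    if log.isEnabledFor(logging.DEBUG):  # skip the work if DEBUG is off
        log.debug("payload: %s", expensive_summary(payload))
    log.info("handled %d items", len(payload))
```

Passing arguments separately (`"%d items", n`) instead of pre-formatting the string also defers formatting until a handler actually emits the record.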

Tips & Tricks - Domino Effect

Make sure your load balancer won't overload components

Use smart algorithms
  Load Balance
  Resource Allocation
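One way to keep a load balancer from overloading components, sketched under assumptions: each node declares a maximum number of in-flight requests, and the balancer sheds load rather than exceed it. The `CapacityAwareBalancer` class is hypothetical.

```python
class CapacityAwareBalancer:
    def __init__(self, capacity):
        self.capacity = dict(capacity)  # node name -> max in-flight requests
        self.in_flight = {node: 0 for node in self.capacity}

    def acquire(self):
        """Pick the least-utilized node with spare capacity, or None."""
        candidates = [
            node for node in self.capacity
            if self.in_flight[node] < self.capacity[node]
        ]
        if not candidates:
            return None  # shed load instead of piling onto saturated nodes
        node = min(
            candidates,
            key=lambda n: self.in_flight[n] / self.capacity[n],
        )
        self.in_flight[node] += 1
        return node

    def release(self, node):
        self.in_flight[node] -= 1
```

Without the cap, traffic from a failed node is redistributed to the survivors, which can push them over their limit one after another: the domino effect.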


Tips & Tricks - (Zero) Configuration

No configuration files
Use good defaults
Auto-discovery (multicast, gossip, ...)

Make everything configurable
  Administrative command
  No need to stop for changes

Automatically self-adjust when possible


Tips & Tricks - STOP Test

With your system under load: kill -STOP <component>
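A small, Unix-only sketch of the STOP test. A child process stands in for the component; `start_component`, `request` and `stop_test` are hypothetical helpers. The stopped process keeps its pipes open but never answers, so only a client-side timeout detects the problem.

```python
import os
import select
import signal
import subprocess
import sys

# Child: echo every input line back in upper case.
CHILD_SOURCE = (
    "import sys\n"
    "for line in iter(sys.stdin.readline, ''):\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n"
)


def start_component():
    return subprocess.Popen(
        [sys.executable, "-u", "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE)


def wait_reply(proc, timeout_seconds):
    """Return the next reply, or None if nothing arrives before the timeout."""
    ready, _, _ = select.select([proc.stdout], [], [], timeout_seconds)
    if not ready:
        return None
    return os.read(proc.stdout.fileno(), 4096).strip()


def request(proc, payload, timeout_seconds):
    proc.stdin.write(payload + b"\n")
    proc.stdin.flush()
    return wait_reply(proc, timeout_seconds)


def stop_test():
    proc = start_component()
    try:
        before = request(proc, b"ping", 5.0)
        os.kill(proc.pid, signal.SIGSTOP)   # same as: kill -STOP <pid>
        during = request(proc, b"ping", 0.3)
        os.kill(proc.pid, signal.SIGCONT)   # resume; queued request is served
        after = wait_reply(proc, 5.0)
        return before, during, after
    finally:
        proc.kill()
        proc.wait()
```

A crashed process closes its connections and is easy to detect; a stopped one looks alive to the network, which is why this test exposes missing timeouts.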


Tips & Tricks - Know your tools

load average (uptime)

stats tools
  vmstat
  iostat
  mpstat
  tcpstat, tcprstat, etc.

tcpdump, nc, netstat

tuning
  /proc/net/*
  ulimit
  sysctl

oprofile
debugging tools (gdb, valgrind)
...


Tips & Tricks - Count

Count everything
  Connections
  Operations
  Failures
  Successes
  Request times (granularity)
    Total, average, standard deviation

Monitor counters
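A sketch of a counter that tracks total, average and standard deviation of request times without storing every sample. It uses Welford's online algorithm, which is a choice made here and not something the slides prescribe; the `RunningStats` class is hypothetical.

```python
import math


class RunningStats:
    def __init__(self):
        self.count = 0
        self.total = 0.0
        self._mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, value):
        self.count += 1
        self.total += value
        delta = value - self._mean
        self._mean += delta / self.count
        self._m2 += delta * (value - self._mean)

    @property
    def average(self):
        return self._mean if self.count else 0.0

    @property
    def stddev(self):
        """Population standard deviation of the samples seen so far."""
        return math.sqrt(self._m2 / self.count) if self.count else 0.0
```

Each update is constant time and constant memory, so a counter like this can sit on every request path and be exported to monitoring.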


Tips & Tricks - Stability Patterns

Use Timeouts
Circuit Breaker
Bulkheads
Steady State
Fail Fast
Handshaking
Test Harness
Decoupling Middleware
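A minimal sketch of the Circuit Breaker pattern from the list above. The `CircuitBreaker` class, its thresholds and the `CircuitOpenError` name are hypothetical; production implementations usually add per-call timeouts and metrics.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker refuses a call without trying it."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after  # seconds to wait before a trial call
        self.clock = clock
        self.failures = 0
        self.opened_at = None           # None means the circuit is closed

    def call(self, func, *args):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            # Otherwise half-open: let one trial call through.
        try:
            result = func(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open, or re-open after a trial
            raise
        self.failures = 0
        self.opened_at = None  # success closes the circuit
        return result
```

While the circuit is open, callers fail immediately instead of queueing behind a broken dependency, which gives that dependency room to recover.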


Tips & Tricks - Don't Panic!


Learning More - Books

TCP/IP Illustrated, Vol. 1: The Protocols



Unix Network Programming, Vol. 1: The Sockets Networking API



Pattern Oriented Software Architecture, Vol. 2



Release It!


Learning More - Papers

The Google File System
Bigtable: A Distributed Storage System for Structured Data
Dynamo: Amazon's Highly Available Key-Value Store
PNUTS: Yahoo!’s Hosted Data Serving Platform
MapReduce: Simplified Data Processing on Large Clusters

Towards robust distributed systems
Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services
BASE: An Acid Alternative
Looking up data in P2P systems


Thanks!!! Questions?

lucindo.github.com - @rlucindo

