SDEC2011 Going by TACC
Transcript of SDEC2011 Going by TACC
Going by TACC: Beyond Key-Value to Fault-Tolerant Stores with Easily Customizable Semantics
Henk Goosen, CEO [email protected]
* Many applications only need primary key data access
* Examples: catalogs, shopping carts, web session state
* No need for the complexity, performance overhead, and lack of scalability of a full database
* Hence: Key-value stores are everywhere
  * Dynamo, CouchDB, Cassandra, Project Voldemort, Riak, Redis, memcached, MongoDB, …
Key-value stores rule the Web
OptumSoft, Inc. Proprietary and Confidential
Improving key-value stores is important
* Developing a key-value store from scratch using conventional languages is expensive:
  * scalability, performance, and fault tolerance
* Conventional solution: use existing key-value store
  * Layer on get() and put() semantics
  * Mismatches between application requirements and library: either accept or extensively modify library code
Key-value stores in practice
Applications are more complex, performance suffers
* Use a very high-level language to specify the key-value store
* Then customize the store, applying application-specific semantics
* Benefits:
  * Simplifies the application business logic
  * Improves the performance of both store and application
TACC provides a different model
TACC model is better!
* User-defined type: a list of attributes (nouns)
* Read or write attributes (there are no methods/verbs)
* Logic primarily implemented via constraints
  * imperative code is also supported
* Compact code
* First class high level data types (e.g., queues, hash tables)
* Several design patterns directly supported in language (e.g., observer pattern)
TACC is an object-oriented, strongly typed language
Compact code: fewer bugs, quicker to market
* Reduce development time by a factor of 2x to 3x
* Reduce lines of code by 10x or more
* Eliminate most synchronization and concurrency bugs
* High, predictable performance using optimized code generation
* Fault tolerance built into the model, and easy to implement
TACC: efficient development of distributed systems
TACC is a general purpose language, focused on distributed systems
Stateful remote proxy objects
* Proxy: local copy of data
* Writes are asynchronously copied to SysDB
* SysDB changes are copied to “interested” agents
* R/W access is local, fast
* No remote access exceptions
[Diagram: agents LR 1 and LR 2, SysDB, and a collection; label: "object added to collection"]
Simple semantics, and fast
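The proxy behaviour described on this slide can be sketched in Python (not TACC). The names `SysDB`, `Proxy`, `write`, `read`, and `pump` are hypothetical, and queues drained by hand stand in for the asynchronous transport; this is an illustration under those assumptions, not the TACC runtime.

```python
from collections import deque


class SysDB:
    """Hypothetical in-memory store that fans changes out to subscribed proxies."""

    def __init__(self):
        self.state = {}
        self.subscribers = []

    def apply(self, key, value, origin=None):
        self.state[key] = value
        # Copy the change to every "interested" proxy except the writer.
        for proxy in self.subscribers:
            if proxy is not origin:
                proxy.inbox.append((key, value))


class Proxy:
    """Local copy of the data: reads and writes never touch the network."""

    def __init__(self, sysdb):
        self.sysdb = sysdb
        self.local = dict(sysdb.state)  # start from the current SysDB state
        self.outbox = deque()           # writes waiting to be copied to SysDB
        self.inbox = deque()            # SysDB changes waiting to be applied
        sysdb.subscribers.append(self)

    def write(self, key, value):
        self.local[key] = value         # local, returns immediately
        self.outbox.append((key, value))

    def read(self, key):
        return self.local.get(key)      # local, no remote access exception

    def pump(self):
        """Stand-in for the asynchronous transport: drain both queues."""
        while self.outbox:
            key, value = self.outbox.popleft()
            self.sysdb.apply(key, value, origin=self)
        while self.inbox:
            key, value = self.inbox.popleft()
            self.local[key] = value


sysdb = SysDB()
agent_a = Proxy(sysdb)
agent_b = Proxy(sysdb)

agent_a.write("user:smith", "room-1")
local_read = agent_a.read("user:smith")     # visible to the writer at once
remote_before = agent_b.read("user:smith")  # not yet propagated
agent_a.pump()
agent_b.pump()
remote_after = agent_b.read("user:smith")   # propagated via SysDB
```

The writer sees its own update immediately, while the other agent sees it only after the queues are drained, which is the trade-off the slide summarizes as local, fast access with asynchronous copying.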
SysDB: a hierarchical in-memory object database
* Stores state (ideally no logic)
* Minimizes risk of program logic bugs, hence reliable
* Concise specification of user-defined types
* TACC compiler automatically generates all required code for remote access
* Agents receive automatic notification when values change
[Diagram: Agents and SysDB]
* SysDB defines and exports a hierarchical name space (similar to a distributed file system)
* Remote agents can “mount” remote directories into a local namespace
* Each object is instantiated into a directory; state is made available remotely via proxy objects
* Updates propagate asynchronously; notifications are delivered on changes
Distributed, hierarchical name space
Simple, powerful, proven way to provide large, structured name space
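As a rough Python illustration of "mounting" a remote directory, the sketch below shows only the path translation between a local prefix and a remote one. The `Namespace` class and its `put`, `mount`, and `get` methods are hypothetical names; the proxy copies and asynchronous propagation described on the slide are deliberately left out.

```python
class Namespace:
    """Hypothetical hierarchical name space keyed by '/'-separated paths."""

    def __init__(self):
        self.entries = {}  # full path -> value
        self.mounts = {}   # local prefix -> (remote namespace, remote prefix)

    def put(self, path, value):
        self.entries[path] = value

    def mount(self, local_prefix, remote, remote_prefix):
        """Make remote_prefix of another namespace visible under local_prefix."""
        self.mounts[local_prefix] = (remote, remote_prefix)

    def get(self, path):
        for local_prefix, (remote, remote_prefix) in self.mounts.items():
            if path == local_prefix or path.startswith(local_prefix + "/"):
                # Translate the local path into the remote namespace.
                return remote.get(remote_prefix + path[len(local_prefix):])
        return self.entries[path]


sysdb = Namespace()
sysdb.put("/users/smith/location", "room-1")

agent = Namespace()
agent.put("/config/name", "LR 1")
agent.mount("/mnt/users", sysdb, "/users")

mounted_value = agent.get("/mnt/users/smith/location")
local_value = agent.get("/config/name")
```

In this sketch a lookup under the mount point is forwarded on access; in the model the slides describe, the mounted state would instead be held locally in proxy objects and kept current by notifications.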
Fast recovery for high availability
Fault-tolerance is built in
* When an agent restarts, it recovers its state from SysDB
* Agents implement invariants, therefore can be restarted at any time, on any server
* Any number of backup SysDBs are supported
[Diagram: SP, SB, and agents A1, A2, A3, A4]
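The recovery idea can be sketched in Python as follows. `StateStore`, `Agent`, `add_backup`, and `set_location` are hypothetical names, and synchronous copying to the backup is a simplification assumed here; the slides do not describe the replication protocol.

```python
class StateStore:
    """Hypothetical SysDB stand-in: holds state and copies writes to backups."""

    def __init__(self):
        self.state = {}
        self.backups = []

    def add_backup(self, backup):
        backup.state = dict(self.state)  # bring the backup up to date first
        self.backups.append(backup)

    def write(self, key, value):
        self.state[key] = value
        for backup in self.backups:
            backup.state[key] = value


class Agent:
    """Keeps no durable state of its own; everything is rebuilt on start."""

    def __init__(self, store):
        self.store = store
        self.view = dict(store.state)  # recovery: reload state from the store

    def set_location(self, user, location):
        self.view[user] = location
        self.store.write(user, location)


primary = StateStore()
backup = StateStore()
primary.add_backup(backup)

agent = Agent(primary)
agent.set_location("smith", "room-1")
del agent                       # simulate an agent crash

restarted = Agent(primary)      # restart, possibly on another server
recovered_from_primary = restarted.view["smith"]

failover = Agent(backup)        # restart against a backup store instead
recovered_from_backup = failover.view["smith"]
```

Because the agent's working state is rebuilt entirely from the store, it does not matter where or when the replacement agent starts.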
* Application needs to track real-time location of user
* User allowed in only one location at a time
* Three operations:
  * ENTER <user id> <session id> <location id>
  * LEAVE <user id>
  * QUERY <user id>
* Throughput > 10,000 requests/sec, latency < 1 ms
Example: Location Service as customized key-value store
High throughput, low latency required
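A minimal Python sketch of the three operations, assuming that ENTER is rejected while the user already has a location (one reading of "only one location at a time"). The class name `LocationService` and the "NOT FOUND" reply are assumptions made for this example; "OK" and "NOT ALLOWED" are the replies shown later in the deck.

```python
class LocationService:
    """Single-process model of the ENTER / LEAVE / QUERY semantics."""

    def __init__(self):
        self.locations = {}  # user id -> (session id, location id)

    def enter(self, user_id, session_id, location_id):
        # Assumed rule: a user who is already somewhere must LEAVE first.
        if user_id in self.locations:
            return "NOT ALLOWED"
        self.locations[user_id] = (session_id, location_id)
        return "OK"

    def leave(self, user_id):
        if self.locations.pop(user_id, None) is None:
            return "NOT FOUND"  # hypothetical reply, not from the slides
        return "OK"

    def query(self, user_id):
        entry = self.locations.get(user_id)
        return entry[1] if entry else None


service = LocationService()
first_enter = service.enter("smith", "session-1", "room-1")
second_enter = service.enter("smith", "session-2", "room-2")
where = service.query("smith")
left = service.leave("smith")
after_leave = service.query("smith")
```

With plain get() and put(), the check and the update in `enter` would be two separate calls from the application; folding them into one store-side operation is the customization the talk argues for.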
Location Service Overview
* HTTP access to service
* Application (GS) contacts any LR server via load balancer
* LR servers replicated for scalability and for fault tolerance
[Diagram: six GS application instances send Enter, Leave, and Get location requests through a load balancer to four LR servers]
Challenge: ensure responses from multiple LR servers are handled correctly
Key-value store tracks location for each user
[Diagram: GS instances send "Enter Smith,1" and "Enter Smith,2" through the load balancer to LR servers; the LR servers call get() and put() on a key-value store with shards A-J, K-R, and S-Z; the Smith entry (Smith,1 / Smith,2) is annotated "Has to be atomic"]
* Each partition stores a unique subset of the user state
* We directly implement ENTER, LEAVE, and QUERY semantics, using a TACC Constrainer
* No locking or inter-agent synchronization required
* Requests and responses sent asynchronously
* High performance: there is no waiting or blocking
TACC allows easy customization of key-value update semantics
Specializing the key-value store semantics simplifies the application and improves performance
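The partitioning idea can be sketched in Python using the shard ranges shown in the deck's diagrams (A-J, K-R, S-Z). The `shard_for` function and `Shard` class are hypothetical, and a plain overwrite stands in for the application-specific ENTER/LEAVE logic; the point is only that each user's state has exactly one owner, so no lock is needed.

```python
from collections import deque

# Shard boundaries as labelled in the diagrams: A-J, K-R, S-Z.
SHARD_RANGES = [("A", "J"), ("K", "R"), ("S", "Z")]


def shard_for(user_id):
    """Return the index of the shard that owns this user's state."""
    first = user_id[0].upper()
    for index, (low, high) in enumerate(SHARD_RANGES):
        if low <= first <= high:
            return index
    raise ValueError(f"no shard for user id {user_id!r}")


class Shard:
    """Owns a disjoint subset of users; only this shard writes its state."""

    def __init__(self):
        self.requests = deque()  # filled by LR servers, drained by the shard
        self.locations = {}

    def drain(self):
        # Requests for one user are applied one at a time, in arrival order.
        while self.requests:
            user_id, location = self.requests.popleft()
            self.locations[user_id] = location


shards = [Shard() for _ in SHARD_RANGES]
incoming = [("Smith", 1), ("Jones", 4), ("Smith", 2), ("Kim", 7)]
for user_id, location in incoming:
    shards[shard_for(user_id)].requests.append((user_id, location))
for shard in shards:
    shard.drain()
```

Both Smith requests land on the same shard and are applied in order by that shard alone, so the conflicting updates are serialized without any inter-agent synchronization.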
Single-writer collections: no need for synchronization
[Diagram: LR servers and shards A-J and K-R, connected through Request Collections (R) and Response Collections (S)]
The Serializer Constrainer
[Diagram: a new entry in the Request Collection triggers a notification (Notify); the logic updates the user status and writes the result]

Request Collection       Response Collection
A   Enter U1, R5         A   OK
K   Enter U1, R5         K   NOT ALLOWED
D   Enter U8, R9         D   OK

Status Collection
U1   R5
U8   R9
Really simple!
* Code for the Serializer constrainer defines three collections:
  * Input collection: requests
  * Output collections: responses and user status
* A dependency constraint causes imperative code to be executed when a new request arrives from LR server
* The imperative code in the constrainer implements the application specific semantics
Details of Constrainer implementation
This code is a minor tweak to the put() implementation
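The constrainer structure described above (one input collection, two output collections, and code that runs when a request arrives) can be approximated in Python with a dictionary that notifies listeners on insert. This is not TACC code; `NotifyingDict`, `Serializer`, and `on_request` are hypothetical names, and the rejection rule for a repeated ENTER is inferred from the example in the deck's diagram.

```python
class NotifyingDict(dict):
    """dict that calls registered listeners after a key is set."""

    def __init__(self):
        super().__init__()
        self.listeners = []

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        for listener in self.listeners:
            listener(key, value)


class Serializer:
    def __init__(self):
        self.requests = NotifyingDict()  # input collection
        self.responses = {}              # output collection: request id -> result
        self.status = {}                 # output collection: user -> location
        # Stand-in for the dependency constraint: run code on each new request.
        self.requests.listeners.append(self.on_request)

    def on_request(self, request_id, request):
        operation, user, location = request
        if operation == "ENTER":
            if user in self.status:
                result = "NOT ALLOWED"
            else:
                self.status[user] = location
                result = "OK"
        elif operation == "LEAVE":
            self.status.pop(user, None)
            result = "OK"
        else:  # QUERY
            result = self.status.get(user)
        self.responses[request_id] = result


serializer = Serializer()
# The three requests from the diagram, keyed by request id.
serializer.requests["A"] = ("ENTER", "U1", "R5")
serializer.requests["K"] = ("ENTER", "U1", "R5")
serializer.requests["D"] = ("ENTER", "U8", "R9")
```

After these three inserts the response and status collections match the diagram: A and D succeed, K is rejected, and the status collection holds U1 in R5 and U8 in R9.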
* Constraint handling code automatically inserted by compiler
* No need to manually maintain invariants in many call sites
* User-defined types organize constraint handling code and protect against mistakes
* TACC coroutine further simplifies event handling
Constraints and strong typing improve event-handling code
TACC changes event-handling spaghetti into well-structured, type-safe code
* Stress Agent and SysDB instrumented to collect timestamps (stored in memory, I/O after test)
* tcpdump run on Stress Agent and SysDB servers
* Correlate timestamps with tcpdump
Instrumentation and Measurements
* Network and TCP behavior
  * Many TCP settings have a dramatic and non-linear performance impact
* Memory management
  * Memory allocation/deallocation
  * Avoid garbage collection
Low latency pitfalls to avoid
“The devil is in the details”
Zero-load Latency (μs)
SysDB                Step    Time    Latency
Receive request      3         0.0
Notification         4        42.3      42.3
Response enqueued    5        75.1      32.8
Response packet      6       108.5      33.4

End-to-end           Step    Time    Latency
Request created      1         0
Request packet       2        48        48
Response packet      7       248       200
Notification         8       288        40
Latencies are low and predictable
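The Latency column is the difference between consecutive Time values, which a few lines of Python can confirm. The last calculation assumes that steps 3 and 6 on the SysDB side correspond to the request and response packets seen end-to-end (steps 2 and 7); the slides do not state this, so the resulting figure is only indicative.

```python
def deltas(events):
    """Latency of each step relative to the previous one, in microseconds."""
    return [
        (name, round(time - previous_time, 1))
        for (_, previous_time), (name, time) in zip(events, events[1:])
    ]


sysdb_events = [
    ("Receive request", 0.0),
    ("Notification", 42.3),
    ("Response enqueued", 75.1),
    ("Response packet", 108.5),
]
end_to_end_events = [
    ("Request created", 0),
    ("Request packet", 48),
    ("Response packet", 248),
    ("Notification", 288),
]

sysdb_latencies = deltas(sysdb_events)
end_to_end_latencies = deltas(end_to_end_events)

# Time between request packet and response packet that is not accounted
# for by the SysDB-side measurements (assumption: steps 2-7 enclose 3-6).
outside_sysdb = round(200 - 108.5, 1)
```

Under that assumption, roughly 91.5 μs of the 200 μs between request and response packets is spent outside the SysDB receive-to-send path, which fits the earlier point that network and TCP behavior dominate low-latency tuning.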
Latency, throughput vs SysDBs
High scalability under strict latency bound
Latency converges to zero-load latency
* TACC enables developers to efficiently create predictably high-performance, scalable, fault-tolerant distributed applications
* Eliminates synchronization and locking bugs
* Fewer lines of code
* Faster to develop, shorter time to market
* Easier to maintain
* Fewer bugs
Summary
Contact me for more information about TACC and OptumSoft!