Hyperdex - A closer look


Transcript of Hyperdex - A closer look

Page 1: Hyperdex - A closer look
Page 2: Hyperdex - A closer look

HyperDex A Closer Look

Page 3: Hyperdex - A closer look

Dr. Stefan Schadwinkel Co-Founder / Analytics Engineer


–  DECK36 is a young spin-off from ICANS

–  Small team of 7 engineers

–  Longstanding expertise in designing, implementing and operating complex web systems

–  Developing own data intelligence-focused tools and web services

–  Offering our expert knowledge in:

–  Automation & Operations

–  Architecture & Engineering

–  Analytics & Data Logistics

Page 4: Hyperdex - A closer look

BACKGROUND *log: Storm-based Analytics RT

Page 5: Hyperdex - A closer look

BACKGROUND *log Our *log provides stream-based real-time analytics. We need a serious DB: it must focus on servicing each request, scale easily and fast, deliver consistent throughput, offer secondary indices, and make it possible to compute aggregations.

MongoDB, Cassandra, Riak, MariaDB

HyperDex: A Distributed, Searchable Key-Value Store. Robert Escriva, Bernard Wong and Emin Gün Sirer. In Proceedings of the SIGCOMM Conference, Helsinki, Finland, August 2012. http://hyperdex.org/papers/hyperdex.pdf

Page 6: Hyperdex - A closer look

WHY HYPERDEX? Next Generation K/V

Page 7: Hyperdex - A closer look

WHY HYPERDEX? Features.

CAP - Common Buzz: Consistent, Available, Partition-tolerant – Pick any two.

From http://hyperdex.org/FAQ/: “HyperDex is designed to withstand a threshold of failures desired by the application. The level of fault-tolerance is tunable by the system administrator. HyperDex guarantees consistency, availability in the presence of less than f faults, and partition tolerance for partitions that affect less than f nodes, where f is a user-tunable parameter.”

–  Fully linearizable. Every ‘get’ always returns the latest ‘put’.

–  Tolerates up to f failures.

–  Query secondary attributes almost as fast as the primary key.

–  Rich data types: Strings, Floats, Ints, Lists, Maps, Sets

–  Atomic, multi-key transactions. (Commercial)

Page 8: Hyperdex - A closer look

HOW? Hyperspace Hashing

Page 9: Hyperdex - A closer look

HYPERSPACE HASHING Mapping Data into Euclidean Space. Each object is mapped into space. Space is mapped onto servers. One hyperspace relates to one table. HyperDex can manage multiple independent hyperspaces.
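The mapping described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not HyperDex's actual implementation: the hash function, the two buckets per axis, the attribute names, the server names and the round-robin region-to-server assignment are all hypothetical.

```python
import hashlib
from itertools import product

def coordinate(value, buckets=2):
    """Hash one attribute value onto an axis split into `buckets` intervals."""
    digest = hashlib.sha1(repr(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets

def region_of(obj, attributes, buckets=2):
    """Region that stores `obj`: one coordinate per attribute (dimension)."""
    return tuple(coordinate(obj[name], buckets) for name in attributes)

def regions_for_search(query, attributes, buckets=2):
    """All regions that may hold a match for a partial query.

    An attribute fixed by the query pins its axis to a single coordinate;
    an unspecified attribute leaves the whole axis open.
    """
    axes = [
        [coordinate(query[name], buckets)] if name in query else list(range(buckets))
        for name in attributes
    ]
    return list(product(*axes))

def server_of(region, servers, buckets=2):
    """Assign a region to a server (hypothetical round-robin assignment)."""
    index = 0
    for c in region:
        index = index * buckets + c
    return servers[index % len(servers)]

# Hypothetical table layout and cluster.
ATTRIBUTES = ("first", "last", "phone")
SERVERS = ("node-a", "node-b", "node-c", "node-d")

person = {"first": "John", "last": "Smith", "phone": 6075551024}
home = region_of(person, ATTRIBUTES)
candidates = regions_for_search({"first": "John"}, ATTRIBUTES)
```

Fixing one of three attributes narrows the search from all 8 regions to 4, and fixing all three leaves exactly the object's home region.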

Page 10: Hyperdex - A closer look

HYPERSPACE HASHING So far, so good. Aww, wait! The curse of dimensionality. The volume of the resulting hyperspace grows exponentially in the number of dimensions/attributes. For instance, a table with 9 dimensions requires 2^9 = 512 regions if each dimension is split in two. That’s a minimum of 512 servers.

Page 11: Hyperdex - A closer look

HYPERSPACE HASHING Logarithms to the rescue! Subspaces. HyperDex splits the hyperspace into multiple lower-dimensional subspaces. Thus, the volume of the space only grows linearly. Not only does this reduce the number of machines required to store the data, search also becomes more efficient, because fewer machines need to be contacted. A key subspace is added to distinguish key lookup from single attribute searches. Each subspace stores a full copy of the object.
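The arithmetic behind the two slides can be checked with a short sketch. It assumes every axis is split into two intervals and, purely for illustration, that the nine attributes are grouped into three 3-dimensional subspaces plus a one-dimensional key subspace; the actual grouping is chosen per table.

```python
def regions_full_space(num_attributes, buckets=2):
    """Regions in a single hyperspace with one axis per attribute."""
    return buckets ** num_attributes

def regions_with_subspaces(subspace_sizes, buckets=2):
    """Total regions when the attributes are split into independent subspaces."""
    return sum(buckets ** size for size in subspace_sizes)

# One 9-dimensional space: 2**9 = 512 regions.
full = regions_full_space(9)

# Hypothetical split: key subspace (1 dimension) + three 3-dimensional subspaces.
# 2**1 + 3 * 2**3 = 26 regions.
split = regions_with_subspaces([1, 3, 3, 3])

# Doubling the attributes (six subspaces) adds only 24 more regions.
doubled = regions_with_subspaces([1] + [3] * 6)
```

With the subspace size held fixed, every additional group of attributes adds a constant number of regions, which is the linear growth the slide refers to.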

Page 12: Hyperdex - A closer look

HOW? Value-dependent Chaining

Page 13: Hyperdex - A closer look

VALUE-DEPENDENT CHAINING Consistency and Replication. We have copies of each object in each subspace. Value-dependent chaining keeps all copies consistent and provides strong consistency (linearizability) and fault tolerance in the presence of concurrent updates.

Page 14: Hyperdex - A closer look

VALUE-DEPENDENT CHAINING Consistency. HyperDex propagates each update deterministically to all relevant spaces.

Update u1: PUT (insert key)

–  h1, h2, h3

Chains are executed from the end. Head = point leader, which is the same for every update to a given key. The point leader knows all updates. Dependencies are embedded in the chain.

Page 15: Hyperdex - A closer look

VALUE-DEPENDENT CHAINING Replication. HyperDex inserts replicas for each region into the chain.

Consider Update u1:

–  h1, h2, h3

–  h1, h1', h2, h2', h3, h3'

–  h1, h1', h2', h2'', h3, h3'

Replicas are always updated first. Failures do not compromise strong consistency. Clients are only acknowledged after full replication is achieved.
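A toy model may help to picture how such a chain is assembled and committed. The sketch below is an assumption-laden simplification, not HyperDex code: the node identifiers, the hash function, the subspace layout and the `copies` parameter are hypothetical, and failures are not modelled. It builds one chain through all subspaces (including the old and the new region where an update relocates the object), passes the update forward from the head, and commits backward from the tail, so that the head commits last.

```python
import hashlib

def coord(value, buckets=2):
    """Hash one attribute value onto an axis with `buckets` intervals."""
    digest = hashlib.sha1(repr(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets

def region(obj, subspace):
    """Region of `obj` inside one subspace."""
    return tuple(coord(obj[name]) for name in subspace)

def build_chain(old, new, subspaces, copies=2):
    """Chain of (node_id, action) hops; the first subspace is the key subspace."""
    chain = []
    for index, subspace in enumerate(subspaces):
        hops = [(region(new, subspace), "store")]
        if old is not None and region(old, subspace) != hops[0][0]:
            # The value change relocates the object: visit the old region too.
            hops.insert(0, (region(old, subspace), "delete"))
        for reg, action in hops:
            for copy in range(copies):          # each region plus its replicas
                chain.append(((index, reg, copy), action))
    return chain

def run_chain(chain, key, new, nodes):
    """Propagate forward from the head, then commit backward from the tail."""
    for node_id, action in chain:
        node = nodes.setdefault(node_id, {"pending": {}, "committed": {}})
        node["pending"][key] = action
    commit_order = []
    for node_id, _ in reversed(chain):
        node = nodes[node_id]
        if node["pending"].pop(key) == "store":
            node["committed"][key] = dict(new)
        else:
            node["committed"].pop(key, None)
        commit_order.append(node_id)
    return commit_order  # the client is acknowledged once the head has committed

# Hypothetical table: key subspace first, then two attribute subspaces.
SUBSPACES = [("username",), ("first", "last"), ("phone",)]
nodes = {}
u1 = {"username": "jsmith1", "first": "John", "last": "Smith", "phone": 6075551024}
u2 = dict(u1, phone=6075559999)

chain1 = build_chain(None, u1, SUBSPACES)
order1 = run_chain(chain1, "jsmith1", u1, nodes)
chain2 = build_chain(u1, u2, SUBSPACES)
order2 = run_chain(chain2, "jsmith1", u2, nodes)
```

Because the key never changes, both chains start at the same head node, which is what lets the point leader order all updates to that key.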

Page 16: Hyperdex - A closer look

THE PARTS OF THE MACHINE HyperDex - Nuts and Bolts

Page 17: Hyperdex - A closer look

THE PARTS OF THE MACHINE The Slave Node. Everything is C++. The slave nodes are not particularly interesting.

Page 18: Hyperdex - A closer look

THE PARTS OF THE MACHINE The Coordinator & the Configuration. A logically centralized coordinator maintains global state.

–  The coordinator runs on its own replicated state machine, called “replicant”. This is the role ZooKeeper plays for Hadoop et al.

–  Global state is maintained as a configuration.

–  The coordinator holds no state about the stored objects, only mappings and servers.

–  Instance: IP, port, instance ID.

–  The coordinator creates new configurations based on changes and failures and distributes them to the clients.

Page 19: Hyperdex - A closer look

THE PARTS OF THE MACHINE The Client. The client is part of the whole system, not just a customer.

–  The client receives new configurations from the coordinator.

–  Switching to a new configuration is atomic.

–  The client only contacts relevant nodes. This is significant for performance.

–  Clients must be “intelligent”. No REST.

–  A load-balancing proxy layer could help, but there isn’t one.

–  Full support for C++ and Python.

–  Partial support for Java (uses the C++ driver through JNI), Node.js, Ruby.

–  Using layers skips features. The Java driver doesn’t support “count”.
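How a client might hold a configuration and switch to a new one atomically can be sketched as follows. This is an illustrative model only; the class names, fields and version numbering are assumptions and do not mirror the real client library.

```python
import threading
from dataclasses import dataclass

@dataclass(frozen=True)
class Configuration:
    """Immutable snapshot of global state (hypothetical shape)."""
    version: int
    servers: tuple           # tuple of (ip, port, instance_id)
    region_to_server: dict   # region -> index into `servers`; treated as read-only

class Client:
    def __init__(self, config):
        self._config = config
        self._lock = threading.Lock()

    def on_new_configuration(self, config):
        """Adopt a configuration pushed by the coordinator if it is newer."""
        with self._lock:
            if config.version > self._config.version:
                self._config = config   # one reference swap: old or new, never a mix
                return True
            return False

    def server_for(self, region):
        """Pick the single relevant server for a region."""
        config = self._config           # take one snapshot per request
        return config.servers[config.region_to_server[region]]

v1 = Configuration(1, (("10.0.0.1", 2022, 1), ("10.0.0.2", 2022, 2)),
                   {(0,): 0, (1,): 1})
# Server 1 failed: the coordinator remaps its region to a new instance.
v2 = Configuration(2, (("10.0.0.1", 2022, 1), ("10.0.0.3", 2022, 3)),
                   {(0,): 0, (1,): 1})

client = Client(v1)
before = client.server_for((1,))
switched = client.on_new_configuration(v2)
stale = client.on_new_configuration(v1)   # older version is ignored
after = client.server_for((1,))
```

Keeping each configuration immutable means a request in flight always sees one consistent mapping, which is one simple way to get the atomic switch the slide describes.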

Page 20: Hyperdex - A closer look

THE REAL WORLD™ HyperDex Tutorial

Page 21: Hyperdex - A closer look

THE REAL WORLD ™ Install.

Pre-built packages. Supports CentOS, Debian, Fedora, Ubuntu. But not all versions. And not everything. Read: “No package for the Java driver”.

Build from source. Good luck. Be super conscious of package versions. More on that in a minute.

Page 22: Hyperdex - A closer look

THE REAL WORLD ™ Start the Daemons.

Coordinator.

# hyperdex coordinator -f -l 127.0.0.1 -p 1982

Data Nodes.

# hyperdex daemon -f --listen=127.0.0.1 --listen-port=2022 \
    --coordinator=127.0.0.1 --coordinator-port=1982 \
    --data=./data0/

# hyperdex daemon -f --listen=127.0.0.1 --listen-port=2032 \
    --coordinator=127.0.0.1 --coordinator-port=1982 \
    --data=./data1/

Page 23: Hyperdex - A closer look

THE REAL WORLD ™ Client Demo The Python client is the HyperDex shell. Create Hyperspace. # python

Page 24: Hyperdex - A closer look

THE REAL WORLD ™ Client Demo Create a client. Basic PUT/GET. Uses Key subspace. # python

Page 25: Hyperdex - A closer look

THE REAL WORLD ™ Client Demo Search. Uses further subspaces.

Page 26: Hyperdex - A closer look

THE REAL WORLD ™ Client Demo Updates and Range Query/Search.

Page 27: Hyperdex - A closer look

THE REAL WORLD™ Performance

Page 28: Hyperdex - A closer look

THE REAL WORLD ™ Bashing the Prophetess & the Giant. The performance benchmarks use the Yahoo! Cloud Serving Benchmark (YCSB) against Cassandra and MongoDB. Dedicated cluster of 14 nodes in the VICCI cloud. Take it with a grain of salt. I’m missing Riak.

Page 29: Hyperdex - A closer look

THE REAL WORLD ™ Throughput.

Page 30: Hyperdex - A closer look

THE REAL WORLD ™ Latency.

Page 31: Hyperdex - A closer look

THE REAL WORLD ™ Latency.

Page 32: Hyperdex - A closer look

THE REAL WORLD ™ Scaling.

Page 33: Hyperdex - A closer look

THE REAL WORLD™ Experiences & Findings

Page 34: Hyperdex - A closer look

THE REAL WORLD ™ Experiences. Findings. Minor versions are incompatible.

–  hyperdex-1.0.rc4 vs. hyperdex-1.0.rc5

–  import hyperclient vs. hyperdex.admin, hyperdex.client

–  (hyperdisk) vs. leveldb vs. hyperleveldb

–  There goes my dream of using the PHP driver on GitHub.

–  Migration? No idea.

–  Compile? Use VM to go.

Page 35: Hyperdex - A closer look

THE REAL WORLD ™ Experiences. Findings. It’s just a K/V store.

–  No methods to do distributed computations. Python map/reduce is on the agenda.

No Dynamo Ring. But a chain to rule them all.

–  Fault-tolerance with f dedicated nodes is fine, but what about multiple datacenters?

It’s quite a young project with few committers. Important internals change between minor versions. Not much sleep for them. How about your DevOps?

Page 36: Hyperdex - A closer look

REMEMBER? *log: Storm-based Analytics RT

Page 37: Hyperdex - A closer look

WHAT ABOUT? *log We chose Riak.

–  Excellent Java driver.

–  We don’t need transactions.

–  During development, our schema will change often.

–  Operational ease, easy to scale, excellent feedback.

–  Map/reduce in Erlang and JS. Can use the result of a secondary index query.

–  Solr integration with Riak Search. Not at the moment, but we deal with content.

We like HyperDex.

–  Really interesting concepts and advancements, but at the moment not the perfect fit.

–  Implemented a storage backend abstraction layer. Easy to switch to HyperDex once it’s more mature.
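A storage backend abstraction layer of the kind mentioned above could look roughly like the sketch below. The slides do not show the actual layer, so the interface, the method names and the in-memory stand-in are all hypothetical; a Riak-backed or HyperDex-backed class would implement the same three methods.

```python
from abc import ABC, abstractmethod

class StorageBackend(ABC):
    """Minimal interface the application codes against (hypothetical)."""

    @abstractmethod
    def put(self, table, key, attributes): ...

    @abstractmethod
    def get(self, table, key): ...

    @abstractmethod
    def search(self, table, predicates): ...

class InMemoryBackend(StorageBackend):
    """Stand-in implementation used here so the sketch runs without a database."""

    def __init__(self):
        self._tables = {}

    def put(self, table, key, attributes):
        self._tables.setdefault(table, {})[key] = dict(attributes)

    def get(self, table, key):
        row = self._tables.get(table, {}).get(key)
        return dict(row) if row is not None else None

    def search(self, table, predicates):
        # Equality match on every attribute named in `predicates`.
        return [
            dict(row)
            for row in self._tables.get(table, {}).values()
            if all(row.get(name) == value for name, value in predicates.items())
        ]

def record_event(backend, event_id, attributes):
    """Application code depends only on the interface, not on the store."""
    backend.put("events", event_id, attributes)

backend = InMemoryBackend()
record_event(backend, "e1", {"type": "click", "user": "alice"})
record_event(backend, "e2", {"type": "view", "user": "alice"})
clicks = backend.search("events", {"type": "click"})
```

Swapping stores then means writing one new subclass and changing the line that constructs the backend.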

Page 38: Hyperdex - A closer look

Thank You.