Simplifying Data Management with DataStax & Robin Cloud Platform (RCP) - White Paper


EXECUTIVE SUMMARY

Enterprises across industries struggle to manage massive quantities of data and data entering systems at high velocity. While NoSQL has emerged as a solution in some cases, many applications still rely on a relational database management system (RDBMS). To further complicate the management of these systems, the NoSQL space has changed rapidly, with many offerings emerging. While data scientists, architects, and developers choose the system that best matches their use cases, it is the administrators who are forced to manage all of these complex systems and meet business SLAs.

Robin Systems has teamed up with DataStax so that administrators can deploy DataStax Enterprise (DSE) on Robin’s application-aware infrastructure software, which is optimized for container technologies. DataStax Enterprise is the best distribution of Apache Cassandra™ and also includes developer tooling, administration and monitoring, search, operational analytics, and graph, all in a unified, always-on data platform. By working with Robin Cloud Platform (RCP), an administrator can now also achieve:

» Productivity improvement through simplified operations and user experience

» Cost reduction through guaranteed performance, even in shared multi-tenant environments, to enable hardware consolidation

» Risk reduction through repeatable and automated processes such as 1-click cluster provisioning and patch/upgrade

» Agility optimization with full Application & Data Lifecycle Management with significantly reduced time and storage

ROBIN SYSTEMS | 224 AIRPORT PKWY SAN JOSE CA 95110 | (408) 216-0769 | ROBINSYSTEMS.COM


The power behind the moment.


A PARTNERSHIP TO ENABLE EFFICIENT MANAGEMENT OF DSE

RCP is a container-based, application-centric server and storage virtualization software platform that turns commodity hardware into a high-performance, elastic, and agile big data and database consolidation platform. RCP is designed to cater not just to stateless applications, but also to performance-sensitive and data-centric applications such as DataStax Enterprise clusters. DataStax administrators face the following challenges:

» Low Server Utilization - Underlying hardware has to be sized for peak workloads, leaving large amounts of spare capacity and idle hardware because load profiles vary.

» Sizing Production Workloads During Development - To size environments accurately, an organization must estimate the read and write performance expected from the designed configuration. This requires testing and experimentation, and even then the infrastructure might be over- or under-provisioned.

» Availability Planning - While Cassandra is designed to withstand temporary node failures, permanent node failures must be resolved by adding replacement nodes, which puts additional load on the remaining active nodes.

» Cloning Data for Dev/Test Environments - A subset of the production data is typically required for debugging, performance and stress testing, splitting read workload across multiple clusters, etc. DataStax does not have an automated way to clone a subset of production data.

» Scaling Out vs Up - Administrators need the ability to simply add and remove resources (scale up or down) in their cluster dynamically, in real time, to deal with temporary load variations.

» Patch & Upgrade Automation - Administrators have to periodically orchestrate updates across all Cassandra nodes without any downtime, and be able to rapidly revert changes in case of failures.

RCP dramatically simplifies application and data lifecycle management with features such as one-click database deploy, snapshot, clone, time travel, dynamic IOPS control, upgrade, and performance guarantees.

[Figure: RCP architecture. SQL & NoSQL databases, Big Data, and stateful distributed apps run on ROBIN Application-Aware Compute and ROBIN Application-Aware Storage and Data, coordinated by the ROBIN Fabric Controller (application orchestration, QoS guarantee, agile data lifecycle management), on existing commodity hardware with cloud-extended compute and storage. Callouts: consolidate with bare metal performance; deliver guaranteed QoS and elastic scaling; enable agile, simple application management.]


ROBIN CLOUD PLATFORM (RCP)

RCP - THREE KEY COMPONENTS

Container-Based Agile Compute

RCP’s container-based virtualization technology helps consolidate applications with complete runtime isolation and zero performance impact. RCP achieves this by turning physical servers, either on premises or in the cloud, into a container-based compute plane that can easily grow based on demand.

Just as a hypervisor abstracts operating systems from the underlying hardware, RCP’s compute plane abstracts applications from the OS and everything underneath, leading to simplified application deployment and seamless portability for all types of enterprise applications, including highly performance-sensitive workloads such as databases and Big Data.

Container-Aware Scale-Out Storage

RCP’s container-aware, software-defined block storage is designed from the ground up to support agile, sub-second volume creation for containers. RCP converts any commodity hardware with HDD, SSD, or NVMe disks into an enterprise-class storage plane, which can easily grow with demand. It delivers core services like thin provisioning, compression, encryption, data protection, simplified data lifecycle management via snapshots and thin clones, and rapid application restores.

Application-Aware Fabric Controller

The application-aware fabric controller is the “brain” of the RCP platform. Treating the application as the payload, the fabric controller automatically decides the placement, provisions containers and storage for each application component, and configures the application, thus enabling single-click deployment of even the most complex applications. It also continuously monitors the entire application and infrastructure stack to automatically recover failed nodes and disks, fail over applications, and ensure that each application dynamically gets adequate disk IO and network bandwidth to deliver the Application-to-Spindle QoS guarantee.


THE UNIFIED SOLUTION: DATASTAX ENTERPRISE ON RCP

Together, DataStax and RCP provide advanced functionality designed to accelerate your ability to create intelligent and compelling applications, using powerful indexing, search, analytics, and graph functionality, coupled with the smart infrastructure platform required to deploy and manage all your application and data components.

THE UNIFIED SOLUTION PROVIDES

1. Improved utilization and predictable performance

2. Simplified cluster lifecycle management and improved availability

3. Agile data management

4. Comprehensive scaling strategies


Improved Utilization and Predictable Performance

RCP uses containers to provide 1-click, rapid, self-service deployment of Cassandra and DataStax clusters. While containers provide process isolation, RCP’s App-to-Spindle Quality of Service feature, combined with container technology, provides complete performance isolation. This means that only RCP allows multiple applications to run on the same server and storage without impacting each other, which increases average hardware utilization and enables significantly larger consolidation ratios. Typically, customers see over 40% reduction in hardware by adopting Robin Cloud Platform.
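The consolidation argument above can be illustrated with a small sizing sketch. It compares giving every application hardware for its own peak against sizing one shared pool for the peak of the combined load. The application names, load samples, and per-server capacity are invented for illustration and are not taken from this paper, so the resulting ratio does not reproduce the 40% figure; the comparison also assumes that performance isolation makes it safe for the applications to share servers.

```python
import math


def servers_needed(load, per_server_capacity):
    """Smallest number of servers whose combined capacity covers `load`."""
    return math.ceil(load / per_server_capacity)


def dedicated_vs_shared(load_samples, per_server_capacity):
    """Compare per-application peak sizing with sizing one shared pool.

    load_samples maps an application name to a list of load samples,
    with the same time slots for every application.
    """
    # Dedicated: every application gets hardware for its own peak.
    dedicated = sum(
        servers_needed(max(samples), per_server_capacity)
        for samples in load_samples.values()
    )
    # Shared: the pool only has to cover the peak of the combined load.
    combined = [sum(slot) for slot in zip(*load_samples.values())]
    shared = servers_needed(max(combined), per_server_capacity)
    return dedicated, shared


# Hypothetical load profiles (thousands of ops/s) whose peaks do not coincide.
loads = {
    "orders": [10, 40, 10, 10],
    "analytics": [10, 10, 10, 40],
    "search": [10, 10, 40, 10],
}
dedicated, shared = dedicated_vs_shared(loads, per_server_capacity=20)
print(dedicated, shared)
```

The saving comes entirely from peaks that do not line up in time; if every application peaked in the same slot, both sizing methods would ask for the same number of servers.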

RCP is designed to provide the benefits of virtualization without sacrificing performance. The graph below shows the results of the seven YCSB tests on a 3-node Cassandra cluster running first on bare metal servers with local storage, and then on the Robin Cloud Platform. The results show that introducing RCP into the environment had negligible impact on the overall performance of the Cassandra cluster.

[Figure: Cassandra YCSB Benchmark for 1 Billion Records, Bare Metal vs Robin vs Virtual Machines. Chart over YCSB workloads A (LOAD), A (RUN), B, C, D, E, and F, with the vertical axis running from 0 to 120,000. Series: Bare Metal (No containers + Local Storage), Robin (Containers + Robin Storage), VM (KVM + Local Storage).]

Simplified Cluster Lifecycle Management and Improved Availability

RCP’s orchestration capabilities, combined with DataStax OpsCenter, make deployment of large and complex clusters a breeze, while the App-to-Spindle Quality of Service control removes the pressure of right-sizing the cluster for the desired load and performance profile during initial deployment. Administrators can now change the CPU, memory, and read and write IOPS assigned to individual clusters dynamically and in real time.

RCP makes permanent node failures a thing of the past, and even temporary failures are reduced to a matter of a few seconds. RCP can seamlessly relocate containers from failed nodes to healthy ones while retaining the same volumes and IP addresses. This means applications see only brief downtime, and administrators are spared the overhead of adding replacement nodes and rebalancing data.

Agile Data Management

RCP allows unlimited snapshots (point-in-time copies) of the complete database cluster, including OS, DB binary, configuration, schema, and data. RCP snapshots are space-efficient, online, and capture only the delta changes since the last snapshot. Hence they can be used as an effective replacement for traditional database backups. These snapshots can be used to restore or refresh a database to the desired point in time without having to move large backup files on and off the database cluster nodes.

Snapshots can also be used to create thin clones of the entire database cluster. Like snapshots, the clones are space-efficient, and hence can be used to create rapid copies of large production clusters for development and testing purposes, with no performance penalties.

Thus, RCP offers cluster-wide automated backup, restore, and cloning, resulting in greatly simplified storage planning and operations.
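To make the idea of delta-only snapshots concrete, here is a minimal toy model in Python. It is a sketch of the general technique, not RCP's implementation: the `SnapshotVolume` class and its methods are hypothetical names, blocks are plain dictionary entries, and block deletion is not modeled. Each snapshot stores only the blocks written since the previous one, and a point-in-time view is rebuilt by replaying the deltas in order.

```python
class SnapshotVolume:
    """Toy block volume whose snapshots store only changed blocks."""

    def __init__(self):
        self.blocks = {}      # live data: block id -> contents
        self.dirty = set()    # block ids written since the last snapshot
        self.snapshots = []   # one delta dict per snapshot, oldest first

    def write(self, block_id, data):
        self.blocks[block_id] = data
        self.dirty.add(block_id)

    def snapshot(self):
        """Record only the blocks changed since the previous snapshot."""
        delta = {b: self.blocks[b] for b in self.dirty}
        self.snapshots.append(delta)
        self.dirty.clear()
        return len(self.snapshots) - 1   # index of the new snapshot

    def view(self, index):
        """Rebuild the volume contents as of snapshot `index`."""
        state = {}
        for delta in self.snapshots[: index + 1]:
            state.update(delta)
        return state

    def restore(self, index):
        """Roll the live volume back to snapshot `index`.

        Later snapshots are discarded so the delta chain stays consistent.
        """
        self.blocks = self.view(index)
        del self.snapshots[index + 1:]
        self.dirty.clear()


vol = SnapshotVolume()
vol.write(0, "schema v1")
vol.write(1, "row A")
s0 = vol.snapshot()                    # delta holds blocks 0 and 1
vol.write(1, "row A (edited)")
s1 = vol.snapshot()                    # delta holds only block 1
delta_sizes = [len(d) for d in vol.snapshots]
vol.write(0, "corrupted")
vol.restore(s0)                        # back to the first point in time
print(delta_sizes, vol.blocks)
```

The second snapshot costs one block rather than two, which is the space saving the text describes, and the restore touches no external backup files.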

[Figure: RCP QoS controls with sliders for CPU Cores, Memory, Read IOPs, Write IOPs, and Priority, and a chart of Read IOPs (4KB) before applying QoS and after applying a QoS setting that guaranteed 20K IOPs.]


Comprehensive Scaling Strategies

As discussed earlier, scaling typically means incremental addition of nodes. While RCP greatly simplifies this activity by implicitly provisioning the infrastructure along with the software, it also adds the new paradigm of scaling clusters up and down. This means you can change the resource allocation of individual clusters dynamically and in real time, which makes it easy to cater to transient spikes in workload due to expected or unexpected changes in application usage.
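One way to picture scaling a cluster up or down inside a shared pool is as an admission check: a new allocation is accepted only if the pool can still honor every cluster's guarantee. The sketch below assumes exactly that rule. The `Allocation` and `SharedPool` names, the cluster names, and all capacity figures are hypothetical and do not describe RCP's actual API or behavior.

```python
from dataclasses import dataclass


@dataclass
class Allocation:
    cpu_cores: int
    memory_gb: int
    read_iops: int
    write_iops: int


class SharedPool:
    """Hypothetical pool that tracks per-cluster resource guarantees."""

    FIELDS = ("cpu_cores", "memory_gb", "read_iops", "write_iops")

    def __init__(self, capacity):
        self.capacity = capacity
        self.clusters = {}   # cluster name -> Allocation

    def _total(self, field, skip=None):
        """Sum one resource over all clusters except `skip`."""
        return sum(
            getattr(alloc, field)
            for name, alloc in self.clusters.items()
            if name != skip
        )

    def resize(self, name, new):
        """Set or change a cluster's allocation.

        Returns True if the pool can still honor every guarantee,
        otherwise leaves all allocations untouched and returns False.
        """
        for field in self.FIELDS:
            needed = self._total(field, skip=name) + getattr(new, field)
            if needed > getattr(self.capacity, field):
                return False
        self.clusters[name] = new
        return True


pool = SharedPool(Allocation(cpu_cores=32, memory_gb=128,
                             read_iops=50_000, write_iops=50_000))
ok1 = pool.resize("dse-prod", Allocation(8, 32, 20_000, 10_000))
ok2 = pool.resize("dse-test", Allocation(8, 32, 20_000, 10_000))
# Scale dse-prod up for a transient spike: 30K + 20K read IOPS still fits.
ok3 = pool.resize("dse-prod", Allocation(10, 32, 30_000, 10_000))
# This request would push guaranteed read IOPS to 55K, so it is refused.
ok4 = pool.resize("dse-test", Allocation(8, 32, 25_000, 10_000))
print(ok1, ok2, ok3, ok4)
```

Scaling back down after the spike is the same call with smaller numbers, and it always succeeds because it can only free capacity.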

BENEFITS

The Robin Systems platform is architected from the ground up to deliver a complete shared platform for hosting all of an enterprise’s data and data-driven distributed applications. Some of the key benefits are described below.

Operational Agility & Simplicity

» Single-click provisioning of clusters and complex distributed applications

» Push-button cluster extend, application cloning, and snapshots

» Improved application uptime with auto-failover

Lower Costs

» CAPEX reduction: potential savings of up to 40% with a lower hardware footprint

» Lower software licensing cost through application consolidation on shared hardware

Better, Predictable Application Performance

» Application consolidation with bare metal performance

» Automatic Application-to-Spindle performance SLA enforcement

SUMMARY

Many companies have experimented with Docker and other container technologies, but they have discovered that these tools, along with their basic storage support in volume plugins, solve only the problem of deployment and scale. They do not address the challenges of container failover, data and performance management, and handling transient workloads, which are critical for distributed platforms like DataStax Enterprise.

Robin Systems, together with DataStax, provides the complete package for data management, where data administrators and consumers can focus on their use cases while the tedious tasks of deployment, backup, restore, cloning, scaling, and performance management are completely automated. This greatly improves IT productivity and enables IT teams to support and deliver on the promise of agility.

DataStax is a registered trademark of DataStax, Inc. and its subsidiaries in the United States and/or other countries. Apache Cassandra, and Cassandra are trademarks of the Apache Software Foundation or its subsidiaries in Canada, the United States and/or other countries.
