
White Paper

Improving Analytics Economics with Cray
Comparisons to the Cray Urika-GX Agile Analytics Platform

By Nik Rouda, ESG Senior Analyst, and Mike Leone, ESG Senior Lab Analyst
August 2016

This ESG White Paper was commissioned by Cray Inc. and is distributed under license from ESG.

Enterprise Strategy Group | Getting to the bigger truth.™

© 2016 by The Enterprise Strategy Group, Inc. All Rights Reserved.


Contents

A New Focus on the Power of Analytics

Analytics Initiatives Are Necessarily Interdisciplinary

Finding the Best Fit Model for Analytics in the Data Center

Modeling Costs of a Big Data Infrastructure with Hadoop

The Cost of a Big Data Infrastructure Using a Do-it-yourself, Build-your-own Approach

Lowering Capital and Operational Expenses with the Cray Urika-GX Platform

Comparing the Cray Urika-GX Platform with a DIY, Build-your-own Approach

The Bigger Truth


A New Focus on the Power of Analytics

Anyone paying attention to recent IT trends is well aware that analytics have become a top-tier priority. Indeed, much has been written in the technology industry and even the popular media about the power of big data and analytics, with truly amazing applications and outcomes spanning all industries and lines of business. It is now possible to understand the world at the macro and micro levels, in real time and over decades, in ways that simply weren't feasible or affordable before. New technologies are changing the rules, with data environments like Hadoop, Spark, Mesos, and graph analytics rapidly growing in popularity.

Sadly, many organizations also now realize that achieving their own ambitions for analytics is often harder (and costlier) than expected. While there is massive potential to glean more actionable insights from more data than ever before, the sheer scope of the effort can be daunting. ESG research has shown that 77% of those responsible for their organizations' big data and analytics strategies and new projects believe it will typically take more than six months before meaningful business value is seen.1 Finding ways to shorten this delay and have an impact sooner is imperative to satisfy the needs of the business. Otherwise, big data analytics is likely to be seen as over-hyped and under-productive, and investment will be withdrawn from worthy initiatives.

By leveraging a pre-built, -integrated, and -tested big data analytics platform, organizations can significantly shorten the time to insight by reducing the time and cost of researching, testing, procuring, deploying, and managing a complete analytics solution, though that convenience usually comes at a price. With that in mind, ESG quantified the economic advantages of the Cray Urika-GX analytics platform when compared to a do-it-yourself solution over a three-year period.

Analytics Initiatives Are Necessarily Interdisciplinary

A significant challenge in this space is that the analytics itself is only one piece of a much broader environment. The applications that a business analyst or data scientist will directly employ in their work are intricately dependent on the underlying technology stack. And because the set of available tools keeps expanding, everything listed below will continue to change and grow. There will be no static state, which points to the need for openness and versatility in a solution. The stack spans a range of technologies often including, but not limited to:

Big data platforms (e.g., Hadoop and Spark), analytics engines, and programming languages (like Python and R).

Business intelligence (BI), visualization, and reporting applications.

Data warehouses and databases (for example, Cassandra).

Data ingestion, pipeline, integration, ETL, and governance software (including Kafka).

Security frameworks.

Virtualization and containerization of workloads (with Mesos, OpenStack, and Docker).

Servers, storage, and networking infrastructure.

The problem is that this list covers a surprisingly wide span of IT disciplines, domains, and skills, meaning that a very diverse team must collaborate for analytics initiatives to be successful. This is reflected in ESG survey results showing how many different teams must be involved to build a whole solution; as shown in Figure 1, seven different areas of competence were seen as crucial or important to have engaged. Just getting this many people into a conference room or on a call isn't easy, and it's even harder to work through each group's unique concerns and interests to arrive at a consensus. Making matters worse, every area covered will likely have to go through its own process of defining specific requirements, identifying possible vendors, evaluating products, negotiating price, and receiving, deploying, and integrating the entire system. This leads to an increased likelihood of component mismatches, if not outright functional gaps or compatibility conflicts, unless the entire process is managed very deliberately with careful attention to detail. No wonder most new analytics initiatives get bogged down and take longer than six months to show results, as previously noted.

1 Source: ESG Research Report, Enterprise Big Data, Business Intelligence, and Analytics Trends: Redux, July 2016. All ESG research references and charts in this white paper have been taken from this research report.


Figure 1. Many IT Disciplines Required for a Successful Analytics Initiative

How important is the involvement of the following IT disciplines for new initiatives and projects in the area of big data and analytics to be successful? (Percent of respondents, N=475)

Storage team: 29% crucial, 48% important, 18% nice-to-have but not required, 3% completely unnecessary, 2% don't know/no opinion.
Applications team: 29% crucial, 53% important, 12% nice-to-have but not required, 4% completely unnecessary, 1% don't know/no opinion.
Networking team: 32% crucial, 47% important, 13% nice-to-have but not required, 6% completely unnecessary, 1% don't know/no opinion.
Infrastructure/cloud architects: 33% crucial, 46% important, 15% nice-to-have but not required, 5% completely unnecessary, 1% don't know/no opinion.
Server/virtualization team: 35% crucial, 48% important, 13% nice-to-have but not required, 2% completely unnecessary, 2% don't know/no opinion.
Database/BI/analytics team: 43% crucial, 42% important, 11% nice-to-have but not required, 3% completely unnecessary, 1% don't know/no opinion.
Security/risk/governance team: 45% crucial, 40% important, 10% nice-to-have but not required, 3% completely unnecessary, 1% don't know/no opinion.

Source: Enterprise Strategy Group, 2016

One potential shortcut is the use of public cloud infrastructure-as-a-service (IaaS) offerings, which make sense, for example, when applications and data are already hosted in the cloud, or when resource elasticity is critical. Yet clouds don't always meet enterprise requirements for predictability and performance. Many select cloud-based analytics hoping to reduce the effort of provisioning a hardware environment, yet there are still demands on the wide-area network, security requirements, and a complete software stack to be managed. While the elasticity sounds promising, the full costs of cloud services are often less predictable, particularly for large-scale big data and analytics environments. In many regulated businesses, using cloud may also make it harder to meet externally defined requirements for security, privacy, and governance. Accordingly, current ESG research shows that less than 20% of organizations expect cloud to be their primary deployment model for analytics.2 For the large majority, the question will be how to efficiently design and deploy a capable and economical on-premises solution.

Finding the Best Fit Model for Analytics in the Data Center

Many interested in analytics might not see why the infrastructure matters all that much. Surely, tuning the database and the analytics models will be sufficient to improve performance, right? The reality is that a number of factors are relevant here. While poorly written analytics will definitely slow response, increase resource demands, and reduce concurrency, so will poorly matched hardware limitations around system processors, memory, and storage. And that is just performance; there are additional requirements for enterprise operational qualities like scalability, availability, reliability, recoverability, and supportability. Most organizations will define "success" as meeting needs in all of these areas. Cost itself has just as many considerations: the capital cost of acquisition, the opportunity cost of delays, the manpower cost of effort, and the ongoing costs of operation. The complete technology stack, hardware and software alike, is going to define the overall outcomes.

2 Ibid.



While commodity hardware and open source software may sound like an effective strategy for preserving choice and controlling the cost of the combined infrastructure, they do nothing to reduce its inherent complexity. In fact, the total cost of ownership may be higher than with vendor "proprietary" alternatives that offer simplified deployment, management, and support. This is reflected in big data buying preferences; for example, only 24% of organizations expect to use purely open source Apache Hadoop, with a majority using at least some vendor-backed distributions for the additional advantages they bring.

Given that 1) analytics is a top priority, 2) time-to-value is generally too long, 3) quality will depend on having a well-defined and tightly integrated stack, and 4) significant effort and expense may be incurred, how should enterprises proceed? One popular answer is to explore the possibilities of a pre-integrated analytics platform (or "engineered system," if you prefer the term). Nearly a quarter of enterprises (23%) are indeed planning to use purpose-built, pre-integrated systems as their primary deployment.3

There are a number of motivations for this practice, but one negative assumption about this approach is worth deeper exploration here: the widespread impression that appliances are too expensive and therefore only intended for the most intensive analytics at the biggest companies and government labs. Another common perception is that appliances are "locked down" to a vendor-defined set of software and configurations, and therefore not adaptable for anything but a specific need. That belief in itself may drive people to DIY, so they can customize and keep customizing as needs change.

To explore this belief, ESG will now examine two approaches: developing your own environment versus selecting a ready-made system. For the purpose of this comparison, we'll look at the Cray Urika-GX platform versus an equivalent commodity kit.

Modeling Costs of a Big Data Infrastructure with Hadoop

When modeling the cost of a big data infrastructure over a three-year period, both capital expenses (CapEx) and operational expenses (OpEx) should be examined. A large portion of CapEx falls in the first year, due to the initial cost of acquisition, which includes paying for all of the hardware and software required to make up a big data infrastructure. This covers not only compute blades (CPU, memory), storage, and networking, but also software licensing and infrastructure support. Within the licensing and support category, subcategories must be included to factor in costs for hardware support, core software licensing and support (OS and management software), and big data analytics software licensing and support. For years two and three of the modeling exercise, additional capital expenses must be accounted for to address continued licensing and support requirements across the whole infrastructure, including hardware, core software, and analytics software.

For OpEx, there are two modeling phases. The first focuses on preparation, which includes technology research, shopping, evaluating, procuring, and testing. This phase is difficult to model because the quantitative variables related to time are numerous and fall in a wide range; available technology, personnel expertise and competency, and budget are just a few of the factors that influence it. The second phase focuses on deployment and management of the system. Hard costs can be assigned to infrastructure deployment times, based on full-time employee salaries and expected deployment and integration times for hardware, core software, and big data software. The same full-time employee salaries can then be applied to management and maintenance costs of the big data infrastructure, scaled to the overall size of the infrastructure.
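To make the structure of this model concrete, the sketch below expresses the CapEx and OpEx categories described above as a simple three-year calculation. Every input value is an illustrative placeholder, not a figure from ESG's or Cray's actual modeling.

    # Illustrative three-year TCO sketch following the cost categories described above.
    # All input values are hypothetical placeholders, not the figures used in ESG's model.
    def three_year_tco(hardware, core_license_support_per_year,
                       analytics_license_support_per_year, deployment,
                       mgmt_maint_per_year, years=3):
        """Sum CapEx and OpEx over the modeled period.

        CapEx: hardware acquisition (year one) plus annual licensing and support for
        hardware, core software, and big data analytics software.
        OpEx:  one-time deployment plus annual management and maintenance.
        """
        capex = hardware + years * (core_license_support_per_year +
                                    analytics_license_support_per_year)
        opex = deployment + years * mgmt_maint_per_year
        return capex + opex

    # Placeholder example for a small DIY-style configuration.
    print(three_year_tco(hardware=400_000, core_license_support_per_year=30_000,
                         analytics_license_support_per_year=60_000,
                         deployment=80_000, mgmt_maint_per_year=70_000))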

General Configuration Details and Assumptions

ESG completed a three-year, total cost of ownership (TCO) analysis of a Cray Urika-GX platform and compared it to a similarly configured infrastructure using a do-it-yourself (DIY), build-your-own approach. Both CapEx and OpEx costs were factored into the model, excluding the preparation phase (research, shop, evaluate, procure, test). The model of the Cray Urika-GX platform was completed using internal pricing provided by Cray, while DIY pricing was determined by averaging the cost of industry-leading vendor offerings at a component level, configured to match the Cray Urika-GX offering. This not only included the cost of core components, but also accounted for common discount pricing and associated support and licensing costs over three years from leading vendors.

3 Ibid.


Figure 2. General Configuration Details and Assumptions

Source: Enterprise Strategy Group, 2016

Both a small (16-node) and a large (48-node) configuration were modeled. In both cases, two nodes were assigned as login nodes and two nodes were assigned as I/O nodes to mimic the Cray Urika-GX hardware architecture, with the remaining nodes in each configuration serving as compute nodes. Each compute node contained two Intel Xeon E5-2600 v4 18-core processors, eight 32GB DIMMs (256GB), one 0.8TB SSD, and two 2TB HDDs. It should be noted that these are considered base configurations; higher memory and storage configurations are available. For networking, the Cray Urika-GX platform leverages Cray's proprietary supercomputing interconnect, called Aries, across all nodes, while both DIY configurations were modeled with InfiniBand switches and interconnect in an attempt to mimic Cray's performance.
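As a quick back-of-the-envelope check on what those base configurations add up to, the short sketch below totals cores, memory, and local storage across the compute nodes from the per-node specifications listed above; the totals are simple derived arithmetic, not vendor-published capacity figures.

    # Back-of-the-envelope totals for the modeled base configurations.
    # Per-node specs come from the text above; the totals are derived arithmetic only.
    CORES_PER_NODE = 2 * 18        # two Intel Xeon E5-2600 v4 18-core processors
    MEMORY_GB_PER_NODE = 8 * 32    # eight 32GB DIMMs
    SSD_TB_PER_NODE = 0.8          # one 0.8TB SSD
    HDD_TB_PER_NODE = 2 * 2.0      # two 2TB HDDs

    for total_nodes in (16, 48):
        compute_nodes = total_nodes - 2 - 2  # minus two login nodes and two I/O nodes
        print(f"{total_nodes}-node configuration: {compute_nodes} compute nodes, "
              f"{compute_nodes * CORES_PER_NODE} cores, "
              f"{compute_nodes * MEMORY_GB_PER_NODE} GB memory, "
              f"{compute_nodes * (SSD_TB_PER_NODE + HDD_TB_PER_NODE):.1f} TB local storage")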

Support and licensing costs were divided into two categories: core and analytics. Core support and licensing covered the hardware and core software, including the operating system and infrastructure management software for all nodes in the infrastructure. Analytics licensing and support covered just the compute nodes running Hadoop, Spark, and other big data ecosystem software projects. Finally, deployment, management, and maintenance were factored in based on estimated deployment time and on management and maintenance frequency. Though deployment is normally a one-time expense, management and maintenance have the potential to grow over time, especially as more people look to use the infrastructure and IT administrators look to update and optimize the underlying technology.

The Cost of a Big Data Infrastructure Using a Do-it-yourself, Build-your-own Approach

It is commonly thought that since big data is at the forefront of open source technology offerings, costs will be correspondingly low. Unfortunately, this is not always the case. After factoring in preparation time, hardware, support, and full-time employees with big data expertise for deployment and integration, costs can quickly add up. In terms of time, in some instances, organizations have taken as long as six months just to decide on an approach to address their big data initiatives, never mind the actual time to get everything up and running. The timeframe can balloon to more than a year before big data insights start providing value.

The DIY approach priced servers configured with the same processing power, memory, and storage capacity as a Cray Urika-GX node. One variation was the need to factor in the cost of an InfiniBand interconnect, which served as the high-performance networking component in both DIY configurations. It should be noted that racks and cables were set as a fixed cost, with all components assumed to fit in a single rack for the smaller configuration and two racks for the larger configuration.


With hardware support often coming in three-year term options, these costs were spread across all three years. The modeled hardware support provided next-day, onsite service with 5x9 technical support. Core software licensing (OS, management) for compute and supporting nodes was the same year over year for three years and consisted of a Red Hat Enterprise Linux premium license per node. DIY analytics licensing and support costs were directly related to the number of compute nodes. Top Hadoop distribution providers such as Cloudera, Hortonworks, and MapR have licensing packages that make it easy to license per node and cover all projects in the open source community. In terms of support, Hadoop, Spark, and all other open source projects are usually covered through a single support cost factored into the license. The average licensing and support cost from the top Hadoop providers was used to calculate an annual cost.

Based on ongoing customer conversations, ESG estimated the time required to deploy a DIY solution at 8-10 man-weeks for the small configuration and 16-20 man-weeks for the large configuration, broken into two clear phases: core infrastructure and analytics. The first phase focused on infrastructure deployment, including the time to rack, stack, cable, install, and configure hardware, the core OS, and management software. Two options exist for this phase, deployment services from hardware vendors and full-time employees with expertise in hardware deployment, and their costs were averaged to give a projection that balances time to deploy against cost. The second phase focused on the deployment of big data technology, including the time to install Hadoop and Spark, as well as to fully integrate all supporting big data ecosystem projects. The average salary of experienced big data personnel was used to calculate the cost of this phase, and the same salaries were used to model the costs for ongoing management and maintenance of the overall system. It should be noted that daily operations were set as the same for both DIY and Cray, while maintenance costs were assumed to be slightly higher for the DIY approach due to the greater risk of issues arising, as opposed to the lower risk associated with a system pre-built and pre-tested by Cray's experts.
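To illustrate how an effort estimate like that translates into deployment cost, the sketch below converts man-weeks into dollars using a fully loaded salary. The salary and working-week values are hypothetical placeholders, not ESG's modeled inputs, and the averaging against vendor deployment-services quotes described above is omitted.

    # Illustrative conversion of estimated deployment effort into dollars.
    # The salary and working-week figures are hypothetical placeholders.
    def deployment_cost(man_weeks, annual_salary=150_000, working_weeks=48):
        """Cost of a deployment phase staffed by full-time employees."""
        return man_weeks * annual_salary / working_weeks

    small = (deployment_cost(8), deployment_cost(10))    # 8-10 man-weeks, small configuration
    large = (deployment_cost(16), deployment_cost(20))   # 16-20 man-weeks, large configuration
    print([round(cost) for cost in small + large])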

Figure 3. 3-year TCO of a DIY Infrastructure

Source: Enterprise Strategy Group, 2016

Figure 3 highlights the costs of both a 16-node and a 48-node big data infrastructure using a DIY, build-your-own approach. In both cases, the first-year costs are significantly higher, due in large part to the initial hardware purchase; deployment was the other first-year-only cost. Ongoing costs for each of the next two years were in the 45-65% range, coming from annual support, licensing, and ongoing management and maintenance. The total cost over three years was $998,537 for the 16-node configuration and $2,312,458 for the 48-node configuration.


Lowering Capital and Operational Expenses with the Cray Urika-GX Platform

With Cray, organizations gain several decades' worth of experience from an industry-leading supercomputing company with expertise in handling massive data sets from some of the largest producers and analyzers of data in the world. Its systems rank among the fastest supercomputers the world has ever seen and handle the most compute-intensive problems in the world: keeping people safe by analyzing government intelligence, facilitating lifesaving discoveries in the life sciences, and modeling the universe piece by piece.

The Cray Urika-GX platform combines Cray's supercomputing expertise with open, enterprise-class big data standards to deliver an agile, performance-centric analytics environment. The solution pairs a pre-built, pre-tested big data appliance with pre-integrated management and open source analytics software, enabling a quick onramp so customers can begin using it within days of procurement. Further, customers gain agility and flexibility through its open framework: the platform can be customized to address even the strictest SLAs for scale, performance, and security, and customers can run additional tools that are not pre-shipped with it. Though not every added tool will be supported, the underlying platform and software will continue to be supported.

ESG leveraged Cray-provided pricing of a Cray Urika-GX system to model the CapEx and OpEx costs over three years, which allows for as close to an apples-to-apples comparison with the DIY pricing as possible. It should be noted that one important advantage of the Cray Urika-GX system is the unique Cray Graph Engine software that comes standard with all systems to support graph analytics, the iterative discovery of patterns and relationships within mixed data sets. Because this specialized software is not available for DIY approaches, it was not part of the modeling exercise, but it should be considered an additional advantage for Cray, as its performance is far beyond that of most graph solutions available today.

First, from a CapEx standpoint, is the hardware. The appliance comes in a 42U rack and includes powerful compute nodes with SSDs and the Cray Aries supercomputing interconnect for unparalleled levels of performance across nodes. Small (16-node), medium (32-node), and large (48-node) configurations are available; for this exercise, ESG focused on the small and large configurations. Because the system is pre-integrated prior to shipping, hardware and core software support costs are combined. Though a premium support package is available, ESG used pricing for the basic support package, which is a cost-effective option that provides next-business-day, on-site responses and includes updates to the operating system. For remote technical coverage, the basic option offers 5x9 support, with the option to extend coverage to 24x7.

From a software standpoint, the Urika-GX platform comes with CentOS and Cray System Management Workstation software based on OpenStack technologies for core system management. A comprehensive analytics framework covers batch analytics with the Hortonworks Data Platform (HDP), iterative and interactive analytics with Apache Spark, relationship discovery with the Cray Graph Engine, resource management (Apache Mesos and Marathon), an analytics programming environment, productivity tools like Jupyter Notebooks, and application monitoring software. All of this technology comes pre-integrated in the Urika-GX platform.

Figure 4. Support Comparison

Source: Enterprise Strategy Group, 2016


It should be noted that licensing and support of the analytics technology is provided directly by Cray, rather than requiring users to go separately to the analytics vendor. This is an important detail. If a customer were to go through Hortonworks for licensing and support, it would cover all tools in HDP. As shown in Figure 4, the Urika-GX platform currently supports a subset of the HDP tools, and the cost of licensing and support through Cray is therefore reduced. This will of course change over time as Cray looks to support even more open source projects in the coming year.

For OpEx costs, the Urika-GX platform eliminates customer deployment costs altogether. Cray experts pre-assemble, -integrate, and -test all hardware and software prior to shipment. Cray experts also go to the customer site to handle all procurement and deployment, not only to ensure customer success with the platform, but to accelerate time to value by enabling a fast ramp-up. For management and maintenance, less overall time is expected to be required to complete most tasks, given the level of expertise applied in building and deploying the system. Though personnel with big data analytics experience will make the customer transition easier, an expert-level big data engineer is not required.

Figure 5 shows the CapEx and OpEx costs of the Cray Urika-GX platform. As previously explained, hardware and core software licensing and support are combined into a single, fixed cost over three years, while the deployment cost is eliminated entirely.

Figure 5. 3-year TCO of a Cray Urika-GX Platform

Source: Enterprise Strategy Group, 2016

One area not included in the model is the cost of floor space, power, and cooling. The general assumption is not only that this cost is a wash between the two configurations, but that the cost to power and cool these systems would be so low in comparison to the rest of the costs that it would not serve as a deterrent to purchasing one of these systems. This holds fully for the smaller configuration. The larger configuration proves to have additional savings for Cray, specifically in relation to floor space. The 48-node configuration from Cray comes in a single rack, as opposed to a DIY, build-your-own approach, which would most likely be spread across two racks at that size. This not only doubles the cost of floor space, but also doubles the physical footprint, which could lead to management complexities and performance impacts, especially in data centers with limited floor space. As data is constantly moved in and out of the system, a second rack could very well land in a different location within the same building, or even in a different geographic location if space is a constraint, incurring still more costs for networking hardware.


Comparing the Cray Urika-GX Platform with a DIY, Build-your-own Approach

ESG witnessed a 21% TCO savings with the 16-node Cray Urika-GX platform when compared to a DIY, build-your-own approach over three years. The savings increased to 24% with the 48-node Cray Urika-GX platform. Though hardware costs with Cray were slightly higher (6% with the 16-node configuration and 18% with the 48-node configuration), ESG feels this difference is more than justified by the proprietary Aries interconnect, which delivers extremely high levels of performance for concurrent analytic workloads across the entire cluster.
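For reference, those headline percentages follow directly from the three-year totals. The sketch below shows the arithmetic, using the DIY totals reported in Figure 3; the Cray totals shown are approximations implied by the stated 21% and 24% savings, not separately published figures.

    # TCO savings = (DIY total - Cray total) / DIY total, over three years.
    # DIY totals come from Figure 3; the Cray totals below are approximations implied by
    # the reported 21% and 24% savings, not separately published figures.
    def savings_pct(diy_tco, cray_tco):
        return 100 * (diy_tco - cray_tco) / diy_tco

    print(round(savings_pct(998_537, 789_000)))      # 16-node configuration: ~21%
    print(round(savings_pct(2_312_458, 1_757_000)))  # 48-node configuration: ~24%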

For hardware support and core software licensing and support, the 16-node Cray configuration cost 5% more than DIY, while the 48-node Cray configuration yielded a savings of 11%. For analytics licensing and support, Cray offers a savings of over 55%, mostly due to the support structure that organizations leverage through Cray: when something goes wrong in the analytics software environment, Cray's expert staff is responsible for support. With a DIY approach, organizations pay for support through the big data vendor (e.g., Hortonworks, Cloudera, or MapR), and that support covers additional projects in the open source big data ecosystem that are not in the Cray Urika-GX platform's support matrix. It should be noted that, though the Cray Graph Engine comes free, optional support for it is an additional annual fee and was not included in this modeling exercise.

With the Urika-GX platform deployed by Cray professionals at no additional cost, large savings are achieved compared to DIY: organizations can save over $80,000 with the 16-node configuration, and almost $200,000 with the 48-node configuration, in the first year. Further, organizations can minimize risk with the Urika-GX platform since it is pre-configured, -integrated, and -tested before shipping, so ongoing management and maintenance costs are reduced by over 5% in year one, and by upwards of 25% and 35% in the following two years. These savings come from the fact that the Urika-GX platform empowers big data personnel with limited experience to handle more of the management and maintenance tasks, and that the potential for issues is reduced knowing Cray experts handled the initial configuration and deployment. The DIY, build-your-own approach not only requires experts in deployment and integration, but is also susceptible to a larger number of issues arising over the life of the environment.

The Bigger Truth

The worlds of supercomputing, big data, and analytics are starting to converge, and what was once the lofty domain of a few research labs (with impressive budgets) is now becoming accessible and affordable for any enterprise. The trick is choosing an analytics environment that effectively balances goals around performance, flexibility, and total cost of ownership. Not only can purpose-built platforms like the Cray Urika-GX system be powerful platforms for big data and analytics, but they can also be more economical than self-assembled clusters.

ESG compared the small and large configurations of the Cray Urika-GX system with a similarly configured, build-your-own infrastructure using a do-it-yourself approach. Significant savings of as much as 24% were seen with the Cray solution over a three-year period, including both CapEx and OpEx. With the Cray Urika-GX platform, CapEx costs were lower in hardware support, core software licensing and support, and analytics software licensing and support. For OpEx costs, Cray eliminated the deployment cost altogether by handling it for customers, while management and maintenance proved to be lower over the three years.

Cray is merging its long experience in supercomputing with the best of the latest open source big data software to offer a uniquely compelling value proposition. The Urika-GX platform combines the convenience and performance of a tuned appliance with the flexibility to choose among different options rather than being locked into a single approach or type of software stack. The simplicity of Cray's Urika-GX system helps each stakeholder group focus on its primary job: data scientists build analytics models, the IT operations team facilitates the infrastructure, and DevOps can be nimble in creating new application capabilities, as opposed to everyone needing to be involved in designing, integrating, and maintaining a jumble of hardware and software products from different vendors. While there is no single "best choice" for every business, those looking to gain real big data advantages would do well to reconsider their assumptions about deployment models and evaluate how Cray could be the right offering to meet their needs.


All trademark names are property of their respective companies. Information contained in this publication has been obtained by sources The Enterprise Strategy Group (ESG) considers to be reliable but is not warranted by ESG. This publication may contain opinions of ESG, which are subject to change from time to time. This publication is copyrighted by The Enterprise Strategy Group, Inc. Any reproduction or redistribution of this publication, in whole or in part, whether in hard-copy format, electronically, or otherwise to persons not authorized to receive it, without the express consent of The Enterprise Strategy Group, Inc., is in violation of U.S. copyright law and will be subject to an action for civil damages and, if applicable, criminal prosecution. Should you have any questions, please contact ESG Client Relations at 508.482.0188.

www.esg-global.com [email protected] P. 508.482.0188

Enterprise Strategy Group is an IT analyst, research, validation, and strategy firm that provides actionable insight and intelligence to the global IT community.

© 2016 by The Enterprise Strategy Group, Inc. All Rights Reserved.