IT & DATA MANAGEMENT RESEARCH, INDUSTRY ANALYSIS & CONSULTING

Evolution to Revolution: Big Data 2.0

An ENTERPRISE MANAGEMENT ASSOCIATES® (EMA™) White Paper
Prepared for Actian

March 2014


©2014 Enterprise Management Associates, Inc. All Rights Reserved. | www.enterprisemanagement.com

Table of Contents

Executive Summary
Big Data is Maturing Fast
    Drivers of Change
Evolution to Revolution
Hybrid Data Ecosystems and Big Data 2.0
Orchestration and Integration
EMA Perspective
About Actian


Executive Summary

Big Data is evolving quickly, and the innovation surrounding it is accelerating. Industry research indicates that a new level of sophistication is required to keep pace. Big Data 2.0 has arrived, and early adopters of Big Data 1.0 strategies are challenged by poorly integrated traditional systems that are inflexible and difficult to manage. The Big Data landscape continues to shift towards more sophisticated workloads that go beyond simple analytics towards operational processes that drive deep business value. Diverse data sources and real-time demands are changing traditional architectures to include an array of purpose-built platforms, presenting new opportunities and challenges.

Big Data is Maturing Fast

Innovation is a constant in data management and analytics, from the 1970s, when E. F. Codd created relational databases, all the way to the innovative team at Yahoo that more recently brought us Hadoop. It seems that in the blink of an eye technological advancements are driving our Big Data and analytics strategies further and faster than we initially imagined. This evolution is driven by a variety of trends, all of which create a perfect storm of challenges and opportunities for innovative companies.

Drivers of Change

Big Data adoption is spurred on by four major trends that are causing the industry to evolve at a faster rate than many of us believed possible. These four trends are moving technology forward while opening the door for greater insight and innovation around enterprise data.

• Maturing User Communities – A maturing user base has created demand for more sophisticated and complex use of enterprise data. Highly complex workloads are the norm, and traditional systems and architectures are challenged to meet these evolving needs. The democratization of data-driven insights is empowering a wider user base by including line of business executives in the discussion and value proposition surrounding Big Data. Finance, Marketing and Sales are sponsoring Big Data projects nearly as fast as IT organizations.

• New Technologies – Innovative technologies such as MPP environments, columnar databases, flash drives, in-memory computing, Hadoop and NoSQL databases are all contributing to the technology surge that is powering Big Data and its possibilities. Technology is allowing us to execute workloads that were once impractical from a time and resources standpoint.

• Economics – The capital cost of working with vast data sets has dropped significantly over the past few years. Many areas of our analytic infrastructure are benefiting from commoditization. Servers, memory and disks are all less expensive than ever, allowing us to do more with less. Many of the new Big Data frameworks are based on open source technology, creating a lower financial barrier to adoption.

• Valuable Data Types – New and valuable data types have caught the imagination of companies that see a competitive edge in leveraging machine, sensor, appstream and social data to open new avenues of insight and execution. The Internet of Things is driving innovation and sending a flood of new data into our businesses. At the same time, Big Data is supplying us with the tools to tap into unstructured enterprise information we were once forced to ignore due to cost or lack of technology. As Big Data resources evolve, companies are addressing the opportunity that these data types can deliver.


Evolution to Revolution

These four trends acted as catalysts for early adoption of Big Data projects. EMA's 2012 research report, Big Data Comes of Age¹, illustrated how early projects were being implemented. Early adopters of Big Data ranked access to internal and external multi-structured data sets as their number one technical driver for implementing projects, while 51% of respondents stated that their primary use case for Big Data was online archiving. Both of these data points illustrate how the early stages of Big Data strategies were focused on wrangling information and working to leverage it. 45% of respondents ranked staging structured data as the second most popular use case. EMA research also shows that analytic workloads are a primary goal of companies looking to leverage Big Data and execute sophisticated analysis. Complex operational workloads are quickly becoming the norm as Big Data strategies mature.

Early-stage projects opened the door for companies to experiment and address entry-level Big Data opportunities. These projects faced challenges from multiple directions. 41% of EMA research respondents indicated that a lack of skills to manage multi-structured data platforms such as Hadoop was a leading deterrent to their overall success. 44% of respondents planned to address the skill gap through internal training of staff, a time-consuming and expensive task. Adding new platforms to an already complex data management landscape makes it difficult to orchestrate data and workloads. Implementing projects across these platforms demands a higher level of integration between solutions than most Big Data 1.0 ecosystems have. Overcoming a skill gap and adopting new technologies is difficult under the best of circumstances. As early projects gave way to next-level initiatives, new challenges surfaced for companies adopting Big Data.

Significant trends emerge from one year to the next as Big Data 1.0 projects accelerate towards a more sophisticated set of requirements. In the 2013 EMA Big Data research, Operationalizing the Buzz: Big Data 2013², it became clear that a shift is taking place in the Big Data landscape, and several themes have emerged that are driving Big Data to the next level.

• Complex operational workloads are driving greater value in Big Data projects.

• Real-time data demands have overshadowed batch-style data.

• Sophisticated Big Data projects require diverse data sources.

• Companies are utilizing multiple platforms to execute complex workloads.

In short, Big Data has evolved into a mission-critical technology for enterprises. Data from the 2013 EMA research demonstrates this shift in multiple ways. Across the 600 active Big Data projects surveyed, the most popular workloads are Fraud Analysis/Risk Management, CRM and Asset Optimization. Each of these project types is operational in nature, complex, real-time driven, includes diverse data assets, and reaches beyond a Hadoop-only environment to leverage traditional platforms.

1 Big Data Comes of Age, EMA and 9Sight Consulting, November 2012. http://www.enterprisemanagement.com/research/asset.php/2409/Big-Data-Comes-of-Age

2 Operationalizing the Buzz: Big Data 2013, EMA and 9Sight Consulting, November 2012. http://www.enterprisemanagement.com/research/asset.php/2641/Operationalizing-the-Buzz:-Big-Data-2013


2013 Project Challenge (percentage of projects)

• Fraud Analysis, Liquidity Risk Assessment (e.g., risk management): 13.1%
• Customer Relations Management (e.g., ad-hoc operational queries): 12.6%
• Staff Scheduling, Logistical Asset Planning (e.g., asset optimization): 11.7%
• Billing, Rating (e.g., operational event and policy processing): 11.2%
• Campaign Optimization, Market Basket Analysis, Cross-sell/Up-sell Recommendation: 10.6%
• Grouping and Relationship Analysis, Geographic Optimization (e.g., clustering, social graph): 10.1%
• Point of Sale, Customer Care (e.g., operational transaction processing): 9.9%
• Sentiment Analysis, Opinion Mining (e.g., natural language processing, text analytics): 7.5%
• Social Brand Management Analysis (e.g., event processing with text analytics): 7.2%
• Path Analysis, Customer Churn (e.g., behavioral analysis): 6.2%

Figure 1: Big Data projects by type from EMA Operationalizing the Buzz: Big Data 2013 research.

To further make the case for maturity in Big Data, EMA research identified a new focus on speed requirements among the 2013 research respondents. Technical and business drivers behind Big Data projects aligned on this topic. Respondents identified requirements for faster analytical or transactional processing of structured and unstructured data sets (54%), along with the need to react faster to real-time streaming data sources (51%), as the top technical drivers for Big Data projects. At the same time, respondents selected faster response time for operational and analytical workloads as the primary business driver behind Big Data projects. It’s not often that IT/technical drivers and business drivers align this well. The need for greater speed supports the findings that operational workloads are gaining prominence and overall project complexity is growing.

Hybrid Data Ecosystems and Big Data 2.0

As Big Data 1.0 gives way to Big Data 2.0, organizations are faced with new data, new users, new workloads and new complex strategies. At the core of these strategies, or best practices for Big Data, is a paradigm shift away from a centralized enterprise data warehouse as the central data source for business intelligence and analytics and towards a more diverse landscape of data-driven platforms. This Hybrid Data Ecosystem (HDE) is focused on matching data types and workloads with the best possible platform to meet the needs of the enterprise or a specific project. Every company’s ecosystem will be somewhat unique in makeup, but it will share a commonality of requirements, management, integration, platforms, workloads and users.


[Figure: EMA Hybrid Data Ecosystem. Requirements: Load, Response, Structure, Complex Workload, Economics. Platforms: Operational Systems, Enterprise Data Warehouse (EDW), Data Mart (DM), Analytical Platform (ADBMS), Discovery Platform, NoSQL, Hadoop, Cloud Data. Supporting layers: Information Management, Data Integration. Workloads: Operational Processing, Analytics, Operational Analytics, Exploration. Users: Line of Business Executives, BI Analysts, Business Analysts, Data Scientists, Developers, External Users, IT Analysts.]

Hybrid Data Ecosystems add power and agility to a company’s analytic landscape. At the same time, they can add complexity and new challenges. When choosing platforms, it is important to investigate how well they will integrate and work with the other solutions your company has invested in. Leading vendors in this space are working to add orchestration and integration between solutions to abstract away the complexity and leverage the power of a Hybrid Data Architecture.

The movement towards Hybrid Data Ecosystems, especially in support of Big Data initiatives, has been underway for several years. EMA research has tracked this paradigm shift via our 2012 and 2013 Big Data research studies. The 2013 findings illustrate that 60% of Big Data projects are utilizing two or three of the eight HDE platforms.


2013 Hybrid Data Ecosystem Platform Distribution (percentage of Big Data projects)

• One Platform: 28.2%
• Two Platforms: 32.1%
• Three Platforms: 27.8%
• Four Platforms: 4.3%
• Five Platforms: 3.5%
• Six Platforms: 1.5%
• Eight Platforms: 2.3%

Over 11% of Big Data projects rely on four to eight individual platforms to execute sophisticated workloads. Utilizing the best possible platform within a Hybrid Data Ecosystem creates several value propositions not generally available with traditional environments. Platform-specific workloads allow end users to align applications and optimize their performance on the supporting platforms. A new level of agility is delivered as well, providing flexibility in how applications and work processes are delivered. Aligning to the proper platform increases performance and addresses the demands of real-time insights and operational workloads, allowing the system to support the speed of the business.
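The two headline figures quoted from the 2013 platform distribution (roughly 60% of projects on two or three platforms, and over 11% on four to eight) can be checked by summing the chart slices. The short Python sketch below does only that; the dictionary and the helper name `share_between` are illustrative and not part of the EMA research, and the chart as reproduced here shows no "seven platforms" slice, so none is assumed.

```python
# Shares of Big Data projects by number of HDE platforms used, as shown in the
# 2013 Hybrid Data Ecosystem Platform Distribution chart (percent of projects).
# The chart has no "seven platforms" slice, so that key is intentionally absent.
platform_share = {
    1: 28.2,
    2: 32.1,
    3: 27.8,
    4: 4.3,
    5: 3.5,
    6: 1.5,
    8: 2.3,
}


def share_between(low: int, high: int) -> float:
    """Sum the shares of projects using between low and high platforms, inclusive."""
    total = sum(pct for count, pct in platform_share.items() if low <= count <= high)
    return round(total, 1)  # one decimal place, matching the chart's precision


two_or_three = share_between(2, 3)
four_to_eight = share_between(4, 8)
more_than_one = share_between(2, 8)

print(f"Two or three platforms: {two_or_three}%")
print(f"Four to eight platforms: {four_to_eight}%")
print(f"More than one platform: {more_than_one}%")
```

The sums come to 59.9% and 11.6%, which agree with the rounded figures quoted in the text, and they imply that 71.5% of the surveyed projects span more than one platform.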

Each Platform in a Hybrid Data Ecosystem delivers unique value and abilities. They include:

• Operational systems: Business support systems such as website order entry applications, Point Of Sale (POS), Customer Relationship Management (CRM) or Supply Chain Management (SCM) applications. These platforms contain increasingly fine-grained information on transactions and demographics.

• Enterprise data warehouse: Centralized analytical environments where corporate-level, reconciled and historical information of an organization is stored. These platforms have structured data organizations (schemas) based on time rather than present information.

• Data mart: Often distributed analytical environments where a particular subject area or department level data set is stored for historical or other analysis. These platforms often have similar data organization to the enterprise data warehouse, but serve smaller user groups.


• Analytical platforms: Specifically architected and configured environments for providing rapid response times for analytical queries. These platforms are generally developed to support high-end analysis via tuned data structures like columnar data storage or indexing.

• Discovery platform: Data discovery platforms support both standard SQL and programmatic API interfaces for iterative and exploratory analytics.

• NoSQL: NoSQL data stores use non-traditional organizational structures such as key-value, wide-column, graph or document storage structures. These data stores support programming APIs and limited SQL variants for data access.

• Hadoop: A specific variant of the NoSQL platform based on the Apache Hadoop Open Source project and its associated sub-projects. These platforms are based on Hadoop’s Distributed File System (HDFS) storage and the evolving MapReduce (MRv2 or YARN) processing framework.

• Cloud: Cloud data sources and computing platforms make information available via standardized interfaces (APIs) and bulk data transfers. Adoption of Big Data in the cloud is growing fast, driven by lower capital costs and fast project implementation cycles.

As mentioned above, Big Data 2.0 workloads are complex, generally require an element of speed, incorporate multiple data sources and rely on a variety of platforms to execute the work. The 2013 EMA research identified analytic databases as the most used platform in the 600 active projects surveyed. The chart below illustrates the diversity required to meet Big Data workloads. It is interesting to see that Analytical Platforms are at the top of the list at 42% utilization and Hadoop is utilized in only 16% of the projects.

[Figure: 2013 Platforms Used in Big Data Ecosystem. Horizontal bar chart, percentage of responses by platform: Analytical database platforms/appliances (42.1%); Operational data stores; Cloud-based data solutions; Enterprise or federated data warehouse; Data marts; NoSQL data store platforms; Data Discovery platforms; Hadoop and its subprojects (16.2%); Other (0.4%). The remaining bar values shown are 39.4%, 39.0%, 33.6%, 30.1%, 21.6% and 18.1%.]

Selecting the platforms that are right for your needs can be confusing. The EMA Hybrid Data Ecosystem references five requirements to assist in making this decision.

• Structure – It’s critical to understand the structure of the data to be utilized and how that data will be organized. Schema flexibility is key to the agility you can gain from a Hybrid Data Ecosystem. Exploring the structure of the data will assist you in determining the best platform.


• Load – Most complex Big Data workloads leverage diverse data sources. The mix of data, along with its velocity, will determine the best platform. Batch versus real-time is a critical decision point when exploring the best platform alternatives.

• Economics – Big Data is enabled by economic factors. Many of the more innovative data-driven processes companies are researching would have been economically prohibitive in the past. Selecting cost-effective platforms is very important when researching solutions for a hybrid environment. Unified platforms that combine multiple solutions in a single offering can positively impact the economic side of these decisions.

• Analytics – Complexity of workload is one of the most important requirements of a platform in a Hybrid Data Ecosystem. Operational processing, operational analytics, advanced data exploration and standard analytic needs must be taken into consideration when choosing the best platforms.

• Response – Operating at the speed of business is critical to any application or operational process. Choosing a platform that matches the necessary speed to insight is non-negotiable when creating a responsive and agile Hybrid Data Ecosystem.
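One way to put the five requirements to work is a simple weighted scoring matrix. The Python sketch below is an illustration under stated assumptions, not an EMA method: the weights, the 1 to 5 fit scores, the names "Platform A" and "Platform B", and the `rank` helper are all hypothetical and would be replaced by an organization's own assessment.

```python
from dataclasses import dataclass

# The five requirements named in the EMA Hybrid Data Ecosystem.
REQUIREMENTS = ("structure", "load", "economics", "analytics", "response")


@dataclass(frozen=True)
class Candidate:
    name: str
    fit: dict[str, int]  # hypothetical 1-5 fit score per requirement


def rank(candidates, weights):
    """Return (name, weighted average fit) pairs, best fit first."""
    total_weight = sum(weights[r] for r in REQUIREMENTS)
    scored = [
        (c.name, sum(weights[r] * c.fit[r] for r in REQUIREMENTS) / total_weight)
        for c in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical project: a real-time operational workload, so response time
# carries the most weight. These weights are assumptions for illustration.
weights = {"structure": 1, "load": 2, "economics": 1, "analytics": 2, "response": 3}

# Illustrative scores only; they are not EMA findings or vendor benchmarks.
candidates = [
    Candidate("Platform A", {"structure": 3, "load": 4, "economics": 3,
                             "analytics": 5, "response": 5}),
    Candidate("Platform B", {"structure": 5, "load": 4, "economics": 5,
                             "analytics": 3, "response": 2}),
]

ranking = rank(candidates, weights)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

With these made-up inputs, the platform that is stronger on response and analytics ranks first even though the other is cheaper and more flexible on schema. A different weighting would reverse the result.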

Orchestration and Integration

Applying the requirements of a Hybrid Data Ecosystem to select the proper platforms to fit your needs is important, but at the same time building an ecosystem that is easily managed can be extremely difficult. The vendor community has recognized this gap and has started to deliver unified platforms that incorporate multiple platforms under a single solution stack. These unified offerings are highly integrated and can be more easily managed than systems that are cobbled together. These systems are adept at orchestrating Big Data workloads, operational processing, operational analytics, standard analytic workloads and many enable advanced data exploration features.

EMA Perspective

It’s clear that a significant shift is underway in the area of Big Data. Early opportunities to leverage new data types have fostered new levels of innovation making Big Data a critical component of enterprise strategies. As the technologies evolve, mature companies will need to invest in solutions that are designed to meet these new demands. To meet present and future needs consider the following when building your strategy around Big Data.

• Look to unified architectures that deliver the platform functionality required while including highly orchestrated data and management features.

• Systems that support collaboration and reuse will save time and allow you to be more agile.

• Ensure that your vendor partners can deliver enterprise level service including domain expertise to enable greater value from your Big Data investment.

• Investigate your present and future needs for Big Data speed of execution. Both business and IT are struggling to meet this new Big Data 2.0 challenge.

• Leading platforms will go beyond these features to include automated workload management and easy embedding of Big Data into applications and workflow processes.


About Actian

The Actian Analytics Platform accelerates the entire analytics value chain from connecting to massive amounts of raw big data all the way to running sophisticated analytics in real-time. The entire platform is built to bring convergence to a Hybrid Data Ecosystem:

• Connect any data or platform for greater precision
• Prepare and enrich all data for increasing value
• Share computing and data at runtime for real-time accuracy
• Choose from hundreds of analytic building blocks
• Rapidly assemble and reuse analytic workflows
• Optimize response to events with lower latency
• Continually increase the precision of automated decisions
• Deliver real-time insight to anyone, anywhere

The current shift to Big Data 2.0 creates an opportunity to release the $15 trillion still trapped in enterprise data. The race is on to provide affordable access to the 88% of enterprise data that has proven impractical to leverage in the past. To move forward to Big Data 2.0, six next-generation capabilities of the Actian Analytics Platform help companies accelerate and stay ahead of the curve in the fast paced Big Data market:

1. Cooperative processing delivers faster time to value and better price performance

2. Analytic building blocks provide accessibility for non-skilled and less skilled workers

3. Moving processing to where the data lives operationalizes big data and pushes toward real-time

4. Combining non-relational and relational data enables a richer set of analytics

5. Service layers abstract away the complexity of underlying infrastructure

6. A unified platform provides modular approaches for entry points anywhere along the analytic process


About Enterprise Management Associates, Inc.

Founded in 1996, Enterprise Management Associates (EMA) is a leading industry analyst firm that provides deep insight across the full spectrum of IT and data management technologies. EMA analysts leverage a unique combination of practical experience, insight into industry best practices, and in-depth knowledge of current and planned vendor solutions to help its clients achieve their goals. Learn more about EMA research, analysis, and consulting services for enterprise line of business users, IT professionals and IT vendors at www.enterprisemanagement.com or blogs.enterprisemanagement.com. You can also follow EMA on Twitter or Facebook.

This report in whole or in part may not be duplicated, reproduced, stored in a retrieval system or retransmitted without prior written permission of Enterprise Management Associates, Inc. All opinions and estimates herein constitute our judgement as of this date and are subject to change without notice. Product names mentioned herein may be trademarks and/or registered trademarks of their respective companies. “EMA” and “Enterprise Management Associates” are trademarks of Enterprise Management Associates, Inc. in the United States and other countries.

©2014 Enterprise Management Associates, Inc. All Rights Reserved. EMA™, ENTERPRISE MANAGEMENT ASSOCIATES®, and the mobius symbol are registered trademarks or common-law trademarks of Enterprise Management Associates, Inc.

Corporate Headquarters:
1995 North 57th Court, Suite 120
Boulder, CO 80301
Phone: +1 303.543.9500
Fax: +1 303.543.7687
www.enterprisemanagement.com
2859.031014