Performing big data analytics using Hadoop: its complex ecosystem is limiting CSPs’ adoption
© Analysys Mason Limited 2016
RESEARCH STRATEGY REPORT
analysysmason.com
PERFORMING BIG DATA ANALYTICS USING HADOOP: ITS
COMPLEX ECOSYSTEM IS LIMITING CSPs’ ADOPTION
JUSTIN VAN DER LANDE
About this report
This report analyses the increasing use of the Apache Hadoop
technology by communications service providers (CSPs). Although
Hadoop is not the only big data analytics infrastructure available,
it illustrates the shift that CSPs and vendors are making towards
the adoption of new, low-cost and more powerful technology to
enable them to store, compute and analyse big data sets.
Hadoop is not yet the dominant technology in the data
infrastructure market, but CSPs and vendors increasingly consider
it to be production-ready for telecoms applications. In addition,
Hadoop has the support of a strong and active set of vendors, as
well as the wider open-source community, and its highly active
development teams can ensure that the technology’s future
development will be supported.
This report provides recommendations for vendors that use
Hadoop-based technology for analytics, and for CSPs that use
Hadoop as part of their big data infrastructure.
The report is based on several sources:
• Analysys Mason’s internal research, which draws on vendor engagements
• interviews of stakeholders in the data infrastructure market.
KEY QUESTIONS ANSWERED IN THIS REPORT
• How should Hadoop and related technologies be used to support big data analytics use cases for CSPs?
• Which core technologies does Hadoop use?
• Which companies supply, distribute and support the technology, and what do they provide?
• Where have CSPs deployed Hadoop within their organisations?
• Which business cases is Hadoop being used to address?
WHO NEEDS TO READ THIS REPORT
• Vendors that are active in the provision of big data analytics systems, and need to understand their market.
• Vendors that provide systems to CSPs that may need to integrate Hadoop-based systems to support their current applications.
• CSPs considering big data systems and technology for analytics use cases.
CONTENTS
EXECUTIVE SUMMARY
HADOOP ARCHITECTURE AND COMPONENTS
KEY VENDOR SOLUTIONS
HADOOP IMPLEMENTATIONS
ABOUT THE AUTHOR AND ANALYSYS MASON
Executive summary
CSPs should increase their use of Hadoop to support specific
data infrastructure use cases, but they must ensure that the
components that they select are suitable. In addition, vendors
can help CSPs to unlock Hadoop’s potential by productising
Hadoop integrations with their applications.
CSPs have been performing big data analytics for many years, and
Hadoop-based solutions have recently been deployed by CSPs
because of their low cost and scalability. However, Hadoop is only
suitable for certain business cases, and compared with more-
established and more-mature solutions, it requires additional
resources to support its complex components and rapidly
changing ecosystem.
As a result, CSPs hesitate to deploy Hadoop – or they limit its use
to only a part of their data infrastructure. Vendors’ Hadoop-based
solutions are therefore being delayed by CSPs’ slow deployments.
This report provides:
• an understanding of the Hadoop technology and why it is difficult for CSPs to adopt
• an overview of how vendors have incorporated Hadoop technology into their solutions to address CSPs’ issues
• a discussion of how – and why – CSPs are using Hadoop.
Figure 1: Flow diagram indicating how Hadoop is increasingly being used as part of CSPs’
data infrastructure, in conjunction with traditional data warehouse technology [Source: Analysys Mason]
Diagram elements: OSS/BSS application data stores; Hadoop cluster; enterprise data warehouse; data consolidation.
Diagram annotations: “Operational data is increasingly stored in Hadoop and diverted from the Enterprise Data Warehouse (EDW)”; “Refinement and enhancement of current and new functionality and insights, driven by application vendors with close coupling of data”.
Hadoop adoption has been slow amongst CSPs because it is a complex
technology and has a fragmented ecosystem of over 120 projects
CSPs must store and analyse large volumes of data to remain
competitive, and Hadoop can help to address these needs with
its powerful and low-cost technology. However, Hadoop is
supplied as open-source software within a fragmented
ecosystem that is not considered to be as robust as traditional
technology.
Hadoop’s ecosystem consists of dozens of components that
complement and compete with each other. The different
requirements of different CSPs dictate which types of components
are required, and commercial considerations inform the selection
of the supplier that is used.
Vendors and CSPs should avoid selecting combinations of Hadoop
components that may effectively become proprietary and would
therefore restrict their ability to develop, support or purchase
solutions that run on these components.
Vendors understand that they need to help support and integrate
their solutions with CSPs’ changing data infrastructure. The
different permutations and combinations of Hadoop components
can potentially create a fragmented data infrastructure, which
would need to be supported by software solutions. As a result,
development and implementation costs would increase because
each variation requires porting and testing.
Figure 2: The shift to a hybrid data infrastructure to encompass Hadoop technology, driven by
new data types and business requirements, is complex and slow [Source: Analysys Mason]
Applications: embedded analytics tools and packaged solutions; standalone analytics tools.
Data infrastructure: NoSQL and Hadoop-based data stores; traditional data stores.
Data sources: new data sources (web logs, email, clickstream, social media, sensor data); traditional data sources (RDBMS, OLTP, OLAP).
CSPs and vendors must understand the Hadoop ecosystem in order to
implement the technology successfully
The Hadoop ecosystem can support any business case need, but
CSPs and vendors must understand the development history
and the functionality associated with each component in order
to create a combination of components that best supports their
business needs.
In February 2015, the Open Data Platform initiative (ODPi) was set
up to address the challenge of Hadoop’s fractured ecosystem, and
to create a certification programme to test the conformity of new
components. However, not all vendors participate in the initiative,
including some of the main distributors.
This report examines Hadoop’s utility from three perspectives:
• We provide an overview of Hadoop, including the creation and evolution of its ecosystem through different development projects, as well as its open-software method of going to market. We also explain the different components that make up its ecosystem and their functionality.
• We examine different vendor approaches to using Hadoop to provide big data analytics systems. The report covers the three main distributors of the software, and provides examples of telecoms-specific vendors that use them.
• We discuss how CSPs are using Hadoop within their organisations through a series of case studies, and how they have installed, purchased and integrated the technology.
Figure 3: Hadoop is an ecosystem of different projects that relate to each other [Source: Analysys Mason]
Hadoop Core: HDFS; MapReduce; YARN.
Related projects: Sqoop; Flume; Chukwa; Pig; Hive; HCatalog; Lucene; Crunch; Avro; Thrift; Mahout; Ambari; ZooKeeper; Hama; Oozie; Cassandra.
Recommendations
1 Vendors that use Hadoop need to decide which of its components and distributions they will use for their
solutions in order to reduce development efforts and to provide the support required for a stable platform.
Vendors that adopt Hadoop as part of their product set must make informed decisions about which Hadoop
components to select. This will enable them to provide a stable and consistent development environment for
building, testing and deploying their product solutions. The choice of components must reflect the data
requirements and needs of their target use cases and be acceptable to the CSPs that they are targeting.
2 Vendors should only adopt established Hadoop components that are available from all of the major
distributions in order to ensure that they can address the widest possible market.
Many of Hadoop’s components are offered by multiple distributors. Some components are only available from a
single distributor, and this is particularly the case for newer components that address near-real-time data
requirements. Vendors should only select components that are available from all of the major distributions,
ensuring that their solutions can be installed on data infrastructure for the largest number of potential customers.
3 CSPs should establish their own Hadoop architecture to encourage vendors to meet CSPs’ requirements,
including products that integrate legacy data components into Hadoop.
Where CSPs have not established their own Hadoop architecture, a different approach will be deployed with every
vendor’s application or solution. This haphazard approach creates a complex environment that is difficult to
support and is less predictable in the way it performs when supporting different business requirements.
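Recommendation 2 amounts to a set intersection: a component qualifies only if every major distribution ships it. The following is a minimal sketch of that check in Python. The distribution names and the component lists assigned to them are hypothetical placeholders chosen for illustration; they are not data from this report and do not describe any real distributor's catalogue.

```python
# Hypothetical catalogue: distribution name -> components it ships.
# Names and contents are illustrative assumptions only.
DISTRIBUTIONS = {
    "distribution_a": {"HDFS", "YARN", "MapReduce", "Hive", "Pig", "Sqoop"},
    "distribution_b": {"HDFS", "YARN", "MapReduce", "Hive", "Pig", "Flume"},
    "distribution_c": {"HDFS", "YARN", "MapReduce", "Hive", "Oozie", "Sqoop"},
}


def common_components(distributions):
    """Return the components shipped by every distribution."""
    if not distributions:
        return set()
    return set.intersection(*distributions.values())


def unsupported(required, distributions):
    """Map each required component to the distributions that lack it.

    Components available everywhere are omitted from the result.
    """
    gaps = {}
    for component in sorted(required):
        missing = sorted(
            name
            for name, shipped in distributions.items()
            if component not in shipped
        )
        if missing:
            gaps[component] = missing
    return gaps


if __name__ == "__main__":
    print(sorted(common_components(DISTRIBUTIONS)))
    print(unsupported({"Hive", "Sqoop"}, DISTRIBUTIONS))
```

A vendor could run a check of this kind against its own list of required components before committing to a design; any component reported by `unsupported` would narrow the addressable market to the distributions that ship it.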
About the author
Justin van der Lande (Principal Analyst) leads the Analytics, Customer Experience Management and CSP IT Strategies research programmes,
which are part of Analysys Mason’s Telecoms Software research stream. He specialises in business intelligence and analytics tools, the
functionality of which cuts across all of the research programmes in this area. He also provides project management for large-scale projects
within our Telecoms Software research. Justin has more than 20 years’ experience in the communications industry in software development,
marketing and research. He has held senior positions at NCR/AT&T, Micromuse (IBM), Granite Systems (Telcordia) and at the TM Forum. Justin
holds a BSc in Management Science and Computer Studies from the University of Wales.
About Analysys Mason
Knowing what’s going on is one thing. Understanding how to take advantage of events is quite another. Our ability to understand the
complex workings of telecoms, media and technology (TMT) industries and draw practical conclusions, based on the specialist
knowledge of our people, is what sets Analysys Mason apart. We deliver our key services via two channels: consulting and research.
Consulting
Our focus is exclusively on TMT.
We support multi-billion dollar investments,
advise clients on regulatory matters,
provide spectrum valuation and auction
support, and advise on operational
performance, business planning
and strategy.
We have developed rigorous
methodologies that deliver tangible
results for clients around the world.
For more information, please visit
www.analysysmason.com/consulting
Research
We analyse, track and forecast the different
services accessed by consumers and
enterprises, as well as the software,
infrastructure and technology
delivering those services.
Research clients benefit from
regular and timely intelligence
in addition to direct access to
our team of expert analysts.
Our dedicated Custom Research
team undertakes specialised
and bespoke projects for clients.
For more information, please visit
www.analysysmason.com/research
Research areas: Consumer and SME services; Digital economy; Regional markets; Network technologies; Telecoms software
Consulting areas: Strategy and planning; Transaction support; Performance improvement; Regulation and policy
Research from Analysys Mason
We provide dedicated coverage of developments in the telecoms, media and technology (TMT) sectors, through a
range of research programmes that focus on different services and regions of the world.
To find out more, please visit www.analysysmason.com/research
Research portfolio
Telecoms software programmes: Service Assurance; Customer Experience Management; Customer Care; Revenue Management; Analytics; Network Orchestration; Software-Controlled Networking; Service Delivery Platforms; Service Fulfilment; Telecoms Software Market Shares; Telecoms Software Forecasts
Digital economy programmes: Digital Economy Strategies; Digital Economy Platforms; Future Comms and Media; IoT and M2M Solutions
Consumer and SME services programmes: Mobile Services; Mobile Devices; Fixed Broadband and Multi-Play; SME Strategies
Network technologies programmes: Fixed Networks; Wireless Networks; Spectrum
Regional markets programmes: Global Telecoms Forecasts; Asia–Pacific; The Middle East and Africa; European Country Reports; European Core Forecasts; European Telecoms Market Matrix
Consulting from Analysys Mason
For 30 years, our consultants have been bringing the benefits of applied intelligence to enable clients around the
world to make the most of their opportunities.
To find out more, please visit www.analysysmason.com/consulting
Consulting portfolio
Strategy and planning
Transaction support
EXPERTISE
Commercial due diligence
Regulatory due diligence
Technical due diligence
Regulation
EXPERTISE
Policy development and response
Margin squeeze tests
Analysing regulatory accounts
Expert legal support
Media regulation
Postal sector costing, pricing and regulation
Regulatory economic costing
Net cost of universal service
Performance improvement
EXPERTISE
Market research
Market analysis
Business strategy and planning
Market sizing and forecasting
Benchmarking and best practice
National and regional broadband strategy and implementation
EXPERTISE
Performance analysis
Technology optimisation
Commercial excellence
Transformation services
EXPERTISE
Radio spectrum auction support
Radio spectrum management
Spectrum policy and auction support
PUBLISHED BY ANALYSYS MASON LIMITED IN FEBRUARY 2016
Bush House • North West Wing • Aldwych • London • WC2B 4PJ • UK
Tel: +44 (0)20 7395 9000 • Email: [email protected] • www.analysysmason.com/research • Registered in England No. 5177472
© Analysys Mason Limited 2016. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means – electronic,
mechanical, photocopying, recording or otherwise – without the prior written permission of the publisher.
Figures and projections contained in this report are based on publicly available information only and are produced by the Research Division of Analysys Mason Limited independently of any
client-specific work within Analysys Mason Limited. The opinions expressed are those of the stated authors only.
Analysys Mason Limited recognises that many terms appearing in this report are proprietary; all such trademarks are acknowledged and every effort has been made to indicate them by the
normal UK publishing practice of capitalisation. However, the presence of a term, in whatever form, does not affect its legal status as a trademark.
Analysys Mason Limited maintains that all reasonable care and skill have been used in the compilation of this publication. However, Analysys Mason Limited shall not be under any liability for
loss or damage (including consequential loss) whatsoever or howsoever arising as a result of the use of this publication by the customer, his servants, agents or any third party.