Blue Hill Research (Driven Product Review)


Copyright © 2016 Blue Hill Research Page 1  

ANATOMY OF A DECISION

Choosing Driven for Smarter Big Data Application Performance Management and Development

Published: February 2016 Report Number: A0214

Analyst: James Haight, Principal Analyst

What You Need To Know

As a fundamental catalyst to the Big Data revolution, Hadoop has deservedly captured significant mindshare among organizations looking to further their data initiatives. As a result, Hadoop has seen notable enterprise adoption and rapid innovation within its surrounding ecosystem.

Many organizations have been eager to adopt Hadoop as a means to store the overwhelming volumes of data from clickstreams, customer transactions, IoT sensors, and other now-standard enterprise data sources. As Hadoop use cases mature, organizations are beginning to understand the challenges associated with moving Hadoop applications from proofs-of-concept to mission-critical production. Many organizations struggle, or are on the cusp of struggling, with standard Hadoop ecosystem tools that are inadequate for the level of performance monitoring, troubleshooting, and visibility required for enterprise production applications.

To build awareness and provide guidance on such challenges, Blue Hill Research partnered with Driven, Inc., a California-based software provider for Big Data application performance management, to investigate common challenges associated with maturing Big Data applications. Blue Hill Research held in-depth qualitative interviews with three organizations and offers these findings so that business decision-makers can understand the challenges and opportunities of peer organizations in pursuit of optimizing their own Big Data environment.

AT A GLANCE

Business Challenges

As businesses mature in their Big Data initiatives, they must transition from merely collecting and processing data to building production enterprise applications. However, while existing toolsets may provide visibility at the cluster level, they are inadequate for providing the visibility necessary to build proactive and efficient application performance management processes.

Evaluated Software Solutions

• Driven

Decision Points

Studied organizations identified the following factors as salient decision points for choosing Driven:

• Superior visibility into Big Data application performance factors, presenting opportunity for more efficient application development, diagnostics, and optimization.

• Meaningful layers of abstraction from technical processes that allowed for a broader base of users to contribute to Big Data application management and development.

About the Subjects and Driven, Inc.

In conducting this study, Blue Hill held in-depth qualitative interviews with three organizations: Hotels.com, a major online lodging marketplace, and two companies that wished to remain anonymous, an international commercial data provider and a consumer financial services organization with over $20 billion in annual revenue.

Driven, Inc. is a California-based software provider that offers solutions for Big Data application development and performance management. Driven, Inc.’s primary product offering is Driven, an application performance management solution for Big Data applications. Each of the studied organizations has invested time and resources in Driven within the last 24 months.

Business Drivers: Challenges of Bringing Big Data Applications into Production

Initially, the ability to produce data, whether from social feeds, click streams, or machine sensors, outpaced the ability of organizations to collect and meaningfully process this data. As such, many organizations adopted Hadoop and other Big Data technologies in an effort to collect and store the vast amounts of new and unstructured data that they believed would prove valuable in the long run. Now, these organizations have undertaken the challenges of turning this data into actionable insights and information. Doing so requires processing billions of rows of data, normalizing and contextualizing the data, and performing complex analysis at scale. This, too, has become increasingly accessible. However, integrating such processes into mission-critical enterprise applications presents a unique set of challenges given the standards of availability and resiliency required for real-time decision support impacting core business outcomes.

The research participants are characterized by enormous data environments and highlight the associated maturation process that organizations experience. In the case of the financial services provider, the organization not only analyzes customer transaction data but also processes a far greater pool of customer interactions. This level of analysis demands that the organization routinely process petabytes of data to generate its desired insights. In addition, Hotels.com collects and processes clickstream feeds from its websites, which generate over 1 million rows of data every hour.

At this level of processing, the organizations view Hadoop as a competitive necessity and a core tenet to success. However, each quickly recognized the challenges associated with bringing Big Data applications from a proof-of-concept phase to production environments. More generally, those that wished to provide Hadoop as a shared service across the organization encountered additional challenges with meeting service level agreements and tracking usage amongst constituents.

“Driven was the first data pipeline-centric performance management system where I received true enthusiasm and understanding from our business team. The main differentiator to other solutions is that Driven clearly visually shows how the data pipeline is progressing using intuitive graphical representation.

The other tools in the same space provide a lot of information, but they do not roll them up to a data pipeline view. Driven is great because it presents things in the way you think about them logically. For instance, you can view jobs by business function rather than just the job name. It really helps us communicate.”

Performance Architect, International Commercial Data Provider

Prior to their investment in Driven, the studied organizations noted considerable challenges in managing their applications with existing tools. While there were opportunities to monitor performance at the Hadoop cluster level, there was no easy way to gain visibility into the health or inner workings at the application level. Most notably, the organizations found it extremely time-intensive and inefficient to run diagnostic tests and understand root causes of application failure or slow performance. Those chartered with ensuring the success of such processes needed to go “log diving,” or use otherwise inadequate job managers. As a consequence, only the most technically-savvy operators had the ability to run diagnostics on these Big Data processes. This was a challenge that each organization recognized as an important bottleneck to overcome.

The excessive costs in time and efficiency that arise when only a small group of highly-specialized employees have the opportunity to manage large-scale initiatives quickly became apparent to the studied organizations. Further, there is also a two-fold challenge in building Big Data applications on the Hadoop ecosystem that interface with broader enterprise-wide applications. Not only is it exceptionally hard to find talent for writing directly in Hadoop-specific languages (as compared to more universally accessible languages such as Java), but the pace of innovation in the ecosystem with the advent of new technologies and new compute fabrics leaves organizations in a challenging position that requires constant evolution and adaptation. Without some means of abstraction to make application development and performance management more accessible to a broader base of business-oriented users, these organizations saw significant challenges in realizing the full potential of their Big Data initiatives.

Figure 1: Performance Management Challenges Through the Big Data Maturity Cycle

Stage: Data Collection at Scale

• Overcoming storage capacity and data management constraints through investments in infrastructure and talent.

Stage: Processing & Application Development

• Stitching together multiple data sources and processes across compute engines and programming languages to build applications.

• Applying overlaying business logic, data management, and visibility into core development processes.

Stage: Production Applications

• Scaling internal resources to meet ongoing monitoring and optimization requirements associated with service level and performance demands of production applications.

• Providing tools and digestible information that can be disseminated across less-technical constituents to ensure alignment of business objectives and support long-term success.

Blue Hill Research observes that the studied organizations represent sophisticated Big Data operations. The challenges inherent in operating within the Hadoop ecosystem are not unique to the studied organizations, but rather are natural obstacles that organizations encounter as they progress along the Big Data maturity curve.

Choosing Driven

The value that the studied organizations derive from their Big Data initiatives was too important to allow the bottlenecks they were encountering to persist. They identified serious needs to bolster both their application development and their application performance management capabilities.

In assessing the viable solutions, the organizations first considered expanding their hiring to bring on more highly-specialized human resources. Ultimately, however, they recognized that such a decision was not scalable without first addressing the underlying factors responsible for the resource bottlenecks.

Second, the organizations evaluated the various existing methods of expanding their visibility into their Big Data applications. Blue Hill observes that in doing so, the organizations found that existing job manager software could not provide a readily-digestible information feed to less-specialized employees. In addition, out-of-the-box standard tools tended to focus on cluster performance rather than on the individual applications themselves. As such, these organizations chose Driven for monitoring their applications because of its ability to abstract the challenges into a common framework for easier management, and because of the visual accessibility of its user interface and presentation of information.

Superior Visibility into Application Performance: Participants consistently cited the superior level of visibility into Big Data applications as their primary decision point. Specifically, they chose Driven because it presented performance characteristics of applications in a manner that was readily accessible to teams who were not as specialized in the underlying frameworks such as Spark, Hive, or MapReduce. Users can drill into dashboards to produce visual displays of data flows, usage rates, and individual processes, giving the appropriate level of detail to the relevant constituents. The organizations saw Driven as an opportunity to greatly enhance their ability to resolve issues when applications fail or lose performance. Because Driven provides an easier way to isolate cause and effect through capabilities such as metadata search and analysis of trends over time, it presents a more efficient process than indiscriminately digging through log files. Participants noted that Driven was compelling because it provided a dual capability of offering high-level accessibility while still allowing for a granular level of control.
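The difference between a job-level view and a pipeline-level view with trend analysis can be illustrated with a small sketch. This is not Driven's API or data model; the record fields (`pipeline`, `step`, `day`, `seconds`), the sample figures, and the 1.5x threshold are all hypothetical and chosen only to show the idea of rolling individual step timings up to a pipeline and comparing the latest run against its own history.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical step-level run records (illustrative only, not Driven's schema).
RUNS = [
    {"pipeline": "clickstream", "step": "parse", "day": 1, "seconds": 100},
    {"pipeline": "clickstream", "step": "parse", "day": 2, "seconds": 110},
    {"pipeline": "clickstream", "step": "parse", "day": 3, "seconds": 105},
    {"pipeline": "clickstream", "step": "join", "day": 1, "seconds": 200},
    {"pipeline": "clickstream", "step": "join", "day": 2, "seconds": 210},
    {"pipeline": "clickstream", "step": "join", "day": 3, "seconds": 600},
    {"pipeline": "billing", "step": "load", "day": 1, "seconds": 50},
    {"pipeline": "billing", "step": "load", "day": 2, "seconds": 50},
    {"pipeline": "billing", "step": "load", "day": 3, "seconds": 55},
]


def rollup_by_pipeline(runs):
    """Sum step durations per (pipeline, day) to get a pipeline-level view."""
    totals = defaultdict(float)
    for run in runs:
        totals[(run["pipeline"], run["day"])] += run["seconds"]
    return dict(totals)


def slow_steps(runs, latest_day, factor=1.5):
    """Return (pipeline, step) pairs whose latest duration exceeds
    `factor` times the mean duration of that step's earlier runs."""
    history = defaultdict(list)
    latest = {}
    for run in runs:
        key = (run["pipeline"], run["step"])
        if run["day"] == latest_day:
            latest[key] = run["seconds"]
        else:
            history[key].append(run["seconds"])
    return sorted(
        key
        for key, seconds in latest.items()
        if history[key] and seconds > factor * mean(history[key])
    )


if __name__ == "__main__":
    print(rollup_by_pipeline(RUNS))
    print(slow_steps(RUNS, latest_day=3))
```

In this toy data set, the pipeline-level rollup shows that the clickstream pipeline slowed down on day 3, and the trend comparison narrows the cause to a single step, which is the kind of narrowing that would otherwise require reading through log files by hand.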

“We found that when something ran on a small cluster it worked fine, but when we moved it to the full scale, things would run way too slowly. The stock Hadoop interface doesn’t give visibility to quickly solve the issues. Driven let us see the bottlenecks and see where the problems actually were. Now instead of complaining about why things are a certain way, we have the data to know why and to go and fix it.”

Technical Team Manager, Hotels.com

Abstraction and Data Management: The studied organizations noted that Driven presented a layer of abstraction and management that is otherwise unavailable for Big Data applications. Driven introduces metadata concerning data flows and processes that allows teams to provision data access and controls in a manner more consistent with standard best practices. This abstraction also allows for superior analysis of individual operations, and enhances the opportunity for sharing insights and collaboration between team members. Of paramount importance to the studied organizations was the abstraction that Driven could provide regarding visibility into the entire business function of Big Data applications. Traditionally, this is a difficult challenge because applications span a number of inconsistent languages and frameworks, meaning that addressing performance issues requires a spectrum of expertise that is typically distributed among various specialized groups throughout an enterprise. Organizations saw Driven as a way to centralize the troubleshooting process into a common framework and empower IT teams regardless of the underlying technologies that make up the Big Data applications. The organizations came to see this as a central differentiator for Driven.

Resulting Business Outcomes

The studied organizations realized a number of meaningful improvements to their core processes that resulted in substantial savings of time and operational efficiencies. Driven directly impacted the efficiency of the studied organizations’ overall management of their Big Data environments. Most notably, Driven provided a level of visibility that significantly reduced efforts of identifying breakdowns within data flows and the running of applications. In this sense, Driven provided a two-fold benefit in that it eliminated the opportunity cost associated with costly man-hours of highly-specialized resources troubleshooting applications, and it also allowed for greater uptime and confidence in running mission-critical applications in production environments.

Case in Point: Hotels.com

As a primary online destination, Hotels.com and its broader portfolio of operations garner truly enormous volumes of data. To achieve performance at this scale, the company operates several Hadoop clusters. The main cluster holds multiple petabytes of data, on top of which numerous business functions are based. The organization has five teams developing Hadoop applications and a central team to ensure efficiency and efficacy between the teams.

Hotels.com encountered challenges as applications that ran smoothly on small clusters experienced performance issues when moved into full-scale production. Existing tools provided performance metrics at the cluster level, but failed to provide insight into the individual processes of the applications themselves. Furthermore, applications were often developed in different languages, and had high degrees of customization, both of which made identifying the root causes of performance failure painfully time-consuming. This prompted Hotels.com to introduce Driven, which provided both high-level visibility and granular access into the applications’ performance. From this, Hotels.com was able to shift resources away from time-intensive diagnostic exercises and move them towards proactive management of application development and optimization.

The high-level visibility that Driven provides and the ability to allow for greater communication with the overall team paired well with broader application development efforts. Ultimately, the abstraction of highly-technical and fast-evolving environments to more commonly understood and readily-accessible interfaces allowed for significant streamlining of building and managing Big Data initiatives. In this way, Driven is able to become a catalyst for organizations as they develop their Big Data applications and create an environment where they accelerate into production faster. While the nature of the solution makes it difficult to draw direct parallels to top-line revenue and bottom-line cost savings, Blue Hill identifies these as key downstream implications of investments in time and resources in such areas. Participants noted the impact of personnel cost as well as the benefits of investing in more efficient underlying systems that created more scalable ways to augment the output of available internal resources.

Conclusions and Recommendations

The voracious appetite of today’s companies for data as a source for increasing competitive differentiation has pushed innovations in the world of Big Data at a blistering pace. Organizations wishing to make the most of this opportunity have invested heavily in the infrastructure necessary to collect and process vast amounts of data. However, bringing Big Data initiatives to the next phase of impacting critical business processes requires an additional layer of investment into the development and management capabilities needed to ensure success.

Each of the studied organizations should be seen as an exemplary showcase in embracing Big Data and deriving value from their efforts. Organizations that have undertaken this step and are moving Big Data applications from proof-of-concept to production environments would do well to learn from the studied experiences, as the challenges that they encountered are common hurdles in the Big Data maturation process. Businesses that find themselves struggling with visibility into the drivers of their Big Data application performance, or that are spending more time trying to identify root causes and possible factors of application failure than proactively optimizing future initiatives, should consider investments in Driven. Standard, out-of-the-box tools are inadequate for the scale and complexity that Big Data application management demands. The rigidity of typical approaches produces time and resource pain-points that are exacerbated by the service and performance level demands associated with moving applications into production.

Given the value proposition associated with extending Big Data application management to a broader set of users, Blue Hill Research recommends Driven, especially for organizations running Big Data applications across a spectrum of languages and compute fabrics. Blue Hill Research identifies Driven as excelling in presenting digestible information about data flows and pipelines in a way that can be shared and acted upon across an IT organization, even by those who are not highly specialized in the specific underlying workings driving various Big Data applications.

As organizations progress along the Big Data maturity curve, they should strive to manage application performance and development with the same efficacy as already-established best practices, as opposed to relying on fragmented knowledge of disparate programming languages unevenly dispersed throughout the organization. In this way, Driven allows IT teams to move from reactive agents of unanticipated problems to proactive managers who optimize business outcomes.

ABOUT THE AUTHOR

James Haight Research Analyst

James Haight is a research analyst at Blue Hill Research, focusing on analytics and emerging enterprise technologies. His primary research includes exploring the business case development and solution assessment for data warehousing, data integration, advanced analytics, and business intelligence applications. He also hosts Blue Hill’s Emerging Tech Roundup Podcast, which features interviews with industry leaders and CEOs on the forefront of a variety of emerging technologies. Prior to Blue Hill Research, James worked in Radford Consulting’s Executive and Board of Director Compensation practice, specializing in the high tech and life sciences industries. Currently, he serves on the strategic advisory board of the Bentley Microfinance Group, a 501(c)(3) non-profit organization dedicated to community development through funding and consulting entrepreneurs in the Greater Boston area.

CONNECT ON SOCIAL MEDIA

@James_Haight

linkedin.com/in/jamesthaight

bluehillresearch.com/author/james-haight/

For further information or questions, please contact us:

Phone: +1 (617) 624-3400 Fax: +1 (617) 367-4210

Twitter: @BlueHillBoston LinkedIn: linkedin.com/company/blue-hill-research Contact Research: [email protected]

Blue Hill Research offers independent research and advisory services for the enterprise technology market. Our domain expertise helps end users procure the right technologies to optimize business outcomes, technology vendors design product and marketing strategy to achieve greater client value, and private investors conduct due diligence and make better informed investments.

Unless otherwise noted, the contents of this publication are copyrighted by Blue Hill Research and may not be hosted, archived, transmitted, or reproduced in any form or by any means without prior permission from Blue Hill Research.

Copyright © 2016 Blue Hill Research