Introduction to Hortonworks Data Platform

© Hortonworks Inc. 2012 Introduction to the Hortonworks Data Platform Ari Zilka, Chief Products Officer June 20, 2012 Page 1

Transcript of Introduction to Hortonworks Data Platform

Page 1: Introduction to Hortonworks Data Platform

© Hortonworks Inc. 2012

Introduction to the Hortonworks Data Platform

Ari Zilka, Chief Products Officer June 20, 2012

Page 2: Introduction to Hortonworks Data Platform

Who is Ari

Ari Zilka, Chief Products Officer

•  Bicoastal

•  Motorcycles

•  Technology

Page 3: Introduction to Hortonworks Data Platform

•  Simplify deployment to get started quickly and easily

•  Monitor, manage any size cluster with familiar console and tools

•  Only platform to include data integration services to interact with any data source

•  Metadata services open the platform for integration with existing applications

•  Dependable high availability architecture

Hortonworks Data Platform

Delivers enterprise-grade functionality on a proven Apache Hadoop distribution to ease management, simplify use and ease integration into the enterprise

The only 100% open source data platform for Apache Hadoop

Page 4: Introduction to Hortonworks Data Platform

Enabling Hadoop as Enterprise Big Data Platform

DEVELOPER Data Platform Services & Open APIs

Hortonworks Data Platform

Applications, Business Tools, Development Tools, Data Movement & Integration, Data Management Systems, Systems Management, Infrastructure

Usability, Installation & Configuration, Administration, Monitoring, Data Extract & Load, Metadata, Indexing, Search, Security, Management, HA, DR, Replication, Multi-tenancy, ...

Page 5: Introduction to Hortonworks Data Platform

Management & Monitoring Svcs

Hortonworks Management Center

– View the health of cluster operations, server utilization and performance levels

– Customizable dashboards

– APIs for integration into 3rd party monitoring tools

– 100% open source management & monitoring, powered by Apache Ambari, Puppet, Nagios and Ganglia

– Simple wizard-based installation, configuration & provisioning of any size Hadoop cluster

Optimize performance for your Hadoop cluster

Simplify Installation and provisioning

Page 6: Introduction to Hortonworks Data Platform

Simple wizard-based installation, configuration & provisioning of any size Hadoop cluster

Simple Installation

•  Step-by-step install across multiple nodes

•  Automated compatibility and dependency checks

•  Analyzes/recommends optimal services configuration

•  Automatically configures mount points in the cluster

Page 7: Introduction to Hortonworks Data Platform

HMC Architecture

Page 8: Introduction to Hortonworks Data Platform

Demonstration

•  Hortonworks Management Center

•  HCatalog & Data Integration Services

•  High Availability

Hortonworks Data Platform

Page 9: Introduction to Hortonworks Data Platform

HCatalog

Table access, aligned metadata, REST API

•  Raw Hadoop data

•  Inconsistent, unknown

•  Tool-specific access

Apache HCatalog provides flexible metadata services across tools and external access

Metadata Services

•  Consistency of metadata and data models across tools (MapReduce, Pig, HBase and Hive)

•  Accessibility: share data as tables in and out of HDFS

•  Availability: enables flexible, thin-client access via REST API

Shared table and schema management opens the platform
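
To make the "thin-client access via REST API" point concrete, here is a minimal sketch of how a client might build a request URL to describe a shared table. It assumes the HCatalog REST layer follows the WebHCat (formerly Templeton) URL layout under /templeton/v1/ddl with a user.name query parameter. The host name, port, database, table and user shown are placeholders, not values from the slides.

```python
from urllib.parse import quote, urlencode

# Assumption: HCatalog DDL is exposed under the WebHCat/Templeton path below.
BASE_PATH = "/templeton/v1/ddl"


def describe_table_url(host, database, table, user, port=50111):
    """Build the URL a thin client would GET to describe one table."""
    path = f"{BASE_PATH}/database/{quote(database)}/table/{quote(table)}"
    query = urlencode({"user.name": user})
    return f"http://{host}:{port}{path}?{query}"


def list_tables_url(host, database, user, port=50111):
    """Build the URL a thin client would GET to list tables in a database."""
    path = f"{BASE_PATH}/database/{quote(database)}/table"
    query = urlencode({"user.name": user})
    return f"http://{host}:{port}{path}?{query}"


if __name__ == "__main__":
    # Placeholder host and names, for illustration only.
    print(describe_table_url("hcat.example.com", "default", "web_logs", "analyst"))
```

Because the client only issues HTTP requests, it needs no Hadoop libraries installed locally, which is the sense of "thin client" here.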

Page 10: Introduction to Hortonworks Data Platform

Data Integration Services

•  Intuitive graphical data integration tools for HDFS, Hive, HBase, HCatalog and Pig

•  Oozie scheduling allows you to manage and stage jobs

•  Connectors for any database, business application or system

•  Integrated HCatalog storage

Bridge the gap between legacy data & Hadoop

Simplify and speed development

Page 11: Introduction to Hortonworks Data Platform

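[Diagram: HCatalog metadata services linking a Hadoop cluster to existing infrastructure. Labels: Hadoop Cluster, Existing Infrastructure, Metadata Services, metastore, Pig, HBase, Hive, applications, data stores, visualization, REST (DDL, DML), DML, HCatalog (create, describe)]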

Page 12: Introduction to Hortonworks Data Platform

Demonstration

•  Hortonworks Management Center

•  HCatalog & Data Integration Services

•  High Availability

Hortonworks Data Platform

Page 13: Introduction to Hortonworks Data Platform

HA Cluster

Full Stack High Availability

•  Failover and restart for

– NameNode

– JobTracker

– Other services to come…

•  Open API allows use of proven HA solutions from multiple vendors

•  Minimized changes to clients and configuration

•  Server & Operating System failure detection and VM restart

•  Smart resource management ensures sufficient resources are available to restart VMs

Built on a stable, proven Apache Hadoop release

Complementary to Hadoop 2.0 HA efforts
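
One way to read "minimized changes to clients" is that a client simply retries while a failed service is restarted. The sketch below illustrates that idea with exponential backoff. It is not HDP client code: the function name, the retry limits, and the assumption that the restarted service comes back at the same address are all hypothetical.

```python
import time


def call_with_retry(operation, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call `operation`, retrying with exponential backoff on ConnectionError.

    `operation` is any zero-argument callable, for example a request to a
    service that may be restarting. The last failure is re-raised once all
    attempts are used up.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise
            # Wait 0.5s, 1s, 2s, ... before trying again.
            sleep(base_delay * 2 ** attempt)


if __name__ == "__main__":
    # Simulate a service that is unreachable for the first two calls.
    state = {"calls": 0}

    def flaky_request():
        state["calls"] += 1
        if state["calls"] < 3:
            raise ConnectionError("service restarting")
        return "ok"

    print(call_with_retry(flaky_request, base_delay=0.01))
```

The `sleep` parameter is injectable so the backoff schedule can be checked without real waiting.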

Page 14: Introduction to Hortonworks Data Platform

Demonstration

•  Hortonworks Management Center

•  HCatalog & Data Integration Services

•  High Availability

Hortonworks Data Platform

Page 15: Introduction to Hortonworks Data Platform

What next?

1 Download Hortonworks Data Platform: hortonworks.com/download

2 Use the getting started guide: hortonworks.com/get-started

3 Learn more… get support

•  Expert role-based training

•  Courses for admins, developers and operators

•  Certification program

•  Custom onsite options

hortonworks.com/training

Hortonworks Support

•  Full lifecycle technical support across four service levels

•  Delivered by Apache Hadoop Experts/Committers

•  Forward-compatible

hortonworks.com/support

Page 16: Introduction to Hortonworks Data Platform

Hortonworks Support Subscriptions

Objective: help organizations to successfully develop and deploy solutions based upon Apache Hadoop

• Full-lifecycle technical support available

– Developer support for design, development and POCs

– Production support for staging and production environments

– Up to 24x7 with 1-hour response times

• Delivered by the Apache Hadoop experts

– Backed by development team that has released every major version of Apache Hadoop since 0.1

• Forward-compatibility

– Hortonworks’ leadership role helps ensure bug fixes and patches can be included in future versions of Hadoop projects

Page 17: Introduction to Hortonworks Data Platform

Hortonworks Training

Objective: help organizations overcome Hadoop knowledge gaps

• Expert role-based training for developers, administrators & data analysts

– Heavy emphasis on hands-on labs

– Extensive schedule of public training courses available (hortonworks.com/training)

• Comprehensive certification programs

• Customized, on-site courses available

Page 18: Introduction to Hortonworks Data Platform

Questions & Answers

TRY download at hortonworks.com

LEARN Hortonworks University

FOLLOW Twitter: @hortonworks, Facebook: facebook.com/hortonworks

MORE EVENTS hortonworks.com/events

Further questions & comments: [email protected]