Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters


description

Apache Ambari is a single framework for IT administrators to provision, manage and monitor a Hadoop cluster. Hortonworks Data Platform 2.1 includes Apache Ambari 1.6.1. This presentation gives an overview of Apache Ambari and discusses recent Ambari innovations for operating at scale and extensibility, including:

• Maintenance mode

• Rolling restarts

• Ambari Blueprints

• Ambari Stacks

Transcript of Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters

Page 1: Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters

Page 1 © Hortonworks Inc. 2014

Discover HDP 2.1 Using Apache Ambari to Manage Hadoop Clusters

Hortonworks. We do Hadoop.

Page 2: Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters

Speakers

Justin Sears

Hortonworks Product Marketing Manager

Jeff Sposetti

Hortonworks Senior Director of Product Management and Committer for Apache Ambari

Mahadev Konar

Hortonworks Co-Founder, Committer and PMC Member for Apache Hadoop, Apache Ambari & Apache ZooKeeper

Page 3: Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters

Agenda

• Overview of Apache Ambari

• New Ambari Features

• Demo

• Q & A

We’ll move quickly:

• Attendee phone lines are muted

• Text any questions to Mahadev Konar using Webex chat

• Questions will be answered at the end of the presentation

• Unanswered questions will be answered in an upcoming FAQ/blog post

Page 4: Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters

A Modern Data Architecture

[Diagram]

• APPLICATIONS: Business Analytics, Custom Applications, Packaged Applications

• DEV & DATA TOOLS: Build & Test

• OPERATIONS TOOLS: Provision, Manage & Monitor

• DATA SYSTEM, REPOSITORIES: RDBMS, EDW, MPP

• DATA SYSTEM, ENTERPRISE HADOOP: Governance & Integration, Security, Operations, Data Access, Data Management

• SOURCES: OLTP, ERP, CRM Systems; Documents, Emails; Web Logs, Click Streams; Social Networks; Machine Generated; Sensor Data; Geolocation Data

Page 5: Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters

HDP 2.1: Enterprise Hadoop

[Diagram: HDP 2.1, Hortonworks Data Platform]

• GOVERNANCE & INTEGRATION: Data Workflow, Lifecycle & Governance (Falcon, Sqoop, Flume, NFS, WebHDFS)

• DATA ACCESS: Batch (Map Reduce); Script (Pig); SQL (Hive/Tez, HCatalog); NoSQL (HBase, Accumulo); Stream (Storm); Search (Solr); Others (In-Memory Analytics, ISV engines)

• YARN: Data Operating System

• DATA MANAGEMENT: HDFS (Hadoop Distributed File System), nodes 1 to N

• SECURITY: Authentication, Authorization, Accounting, Data Protection (Storage: HDFS; Resources: YARN; Access: Hive, …; Pipeline: Falcon; Cluster: Knox)

• OPERATIONS: Provision, Manage & Monitor (Ambari, ZooKeeper); Scheduling (Oozie)

Page 6: Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters

HDP 2.1: Enterprise Hadoop

[Diagram: the HDP 2.1 Hortonworks Data Platform stack from Page 5, shown again with the OPERATIONS block set apart: Provision, Manage & Monitor (Ambari, ZooKeeper).]

Page 7: Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters

Agenda

• Ambari Overview

• New Ambari Features

• Demo

• Q & A

Page 8: Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters

Driving Themes for Apache Ambari

Operate Hadoop at Scale

Integrate with the Enterprise

Extend for the Ecosystem

Page 9: Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters

Apache Ambari

Apache Ambari is a 100% open source framework for provisioning, managing and monitoring Apache Hadoop clusters

[Diagram labels: AMBARI WEB; Others; AMBARI REST API; AMBARI SERVER: PROVISION | MANAGE | MONITOR; AMBARI STACKS; compute & storage nodes. Callouts: OPERATE; EXTEND; Integration With Existing Operations Tools]

Page 10: Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters

100% Apache Open Source

• December 2013: Apache Ambari Graduates to Top Level Project

• April 2014: Apache Ambari 1.5 Released. Adds support for Hortonworks Data Platform 2.1

• June 2014: Apache Ambari 1.6 Released. Adds new operational and extensibility capabilities

Page 11: Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters

Agenda

• Ambari Overview

• New Ambari Features

• Demo

• Q & A

Page 12: Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters

New Ambari Features

Operating at Scale

• Maintenance Mode

• Rolling Restarts

• Ambari Blueprints

Extensibility

• Ambari Stacks

Page 13: Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters

Maintenance Mode

• Silence alerts for services and hosts when performing maintenance

• Ability to put Service or Host “Out of Service”

• Retain full operational control during maintenance period
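
As a rough illustration of the bullets above, maintenance mode can also be toggled through the Ambari REST API rather than Ambari Web. The sketch below only builds the HTTP request and does not send it. The server URL, cluster name `c1`, and the `maintenance_request` helper are placeholders introduced here, and the `maintenance_state` payload shape is an assumption based on the Ambari 1.x REST API, so check it against the documentation for your Ambari version. Credentials are deliberately left out.

```python
import json
import urllib.request


def maintenance_request(base_url, cluster, state, host=None, service=None):
    """Build (but do not send) a PUT request that sets maintenance mode.

    Exactly one of `host` or `service` must be given. All values used in
    the example call below are placeholders, not real endpoints.
    """
    if (host is None) == (service is None):
        raise ValueError("pass exactly one of host= or service=")
    if state not in ("ON", "OFF"):
        raise ValueError("state must be 'ON' or 'OFF'")

    if host is not None:
        # Host-level maintenance: silences alerts for everything on the host.
        path = f"/api/v1/clusters/{cluster}/hosts/{host}"
        body = {"Hosts": {"maintenance_state": state}}
    else:
        # Service-level maintenance: silences alerts for one service.
        path = f"/api/v1/clusters/{cluster}/services/{service}"
        body = {"ServiceInfo": {"maintenance_state": state}}

    payload = {
        "RequestInfo": {"context": f"Turn {state} Maintenance Mode"},
        "Body": body,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-Requested-By": "ambari"},
        method="PUT",
    )


req = maintenance_request("http://ambari.example.com:8080", "c1", "ON", service="HDFS")
print(req.get_method(), req.full_url)
```

Sending the request would additionally need authentication and a call such as `urllib.request.urlopen(req)` against a live Ambari Server.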

Page 14: Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters

Rolling Restarts

• Minimize cluster downtime + service impact when making changes

• Ability to initiate a "rolling restart" of components across many hosts

• Optionally include only hosts with configuration changes
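
The idea behind these bullets can be sketched as a batching plan: restart a few hosts at a time so the service as a whole stays available, and optionally skip hosts whose configuration has not changed. This is a generic illustration, not Ambari's implementation; the `plan_rolling_restart` helper, the `stale_config` flag, and the host names are all hypothetical.

```python
def plan_rolling_restart(hosts, batch_size, only_stale=False):
    """Split hosts into ordered restart batches.

    hosts: list of dicts like {"name": str, "stale_config": bool}
    batch_size: how many hosts may be restarted at the same time
    only_stale: if True, skip hosts whose configuration has not changed
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    selected = [h["name"] for h in hosts if h["stale_config"] or not only_stale]
    return [selected[i:i + batch_size] for i in range(0, len(selected), batch_size)]


hosts = [
    {"name": "dn1.example.com", "stale_config": True},
    {"name": "dn2.example.com", "stale_config": False},
    {"name": "dn3.example.com", "stale_config": True},
    {"name": "dn4.example.com", "stale_config": True},
]

for batch in plan_rolling_restart(hosts, batch_size=2, only_stale=True):
    print("restart together, then wait before the next batch:", batch)
```

A smaller batch size lowers the impact on running workloads at the cost of a longer overall restart.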

Page 15: Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters

Ambari Blueprints

• API-driven method for consistent and rapid creation of clusters

• Enables automation for dev, test and short-lived cluster use cases

• Encapsulates “Best Practices” for cluster layout and configuration

[Diagram labels: STACK DEFINITION; LAYOUT & CONFIGS; BLUEPRINT; INSTANTIATE; CLUSTER]
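
To make the blueprint idea concrete, here is a minimal sketch of the two JSON documents involved: a blueprint (stack definition plus layout) and a cluster creation template that maps real hosts onto the blueprint's host groups. The field names follow the Ambari Blueprints API as I understand it and should be checked against your Ambari version. The blueprint name `small-hdp`, the host names, the component layout, and the `unmapped_groups` helper are hypothetical, and the layout is deliberately incomplete for a real cluster.

```python
import json

# Blueprint: which stack to use and which components go in each host group.
blueprint = {
    "Blueprints": {"stack_name": "HDP", "stack_version": "2.1"},
    "host_groups": [
        {
            "name": "master",
            "cardinality": "1",
            "components": [{"name": "NAMENODE"}, {"name": "RESOURCEMANAGER"}],
        },
        {
            "name": "workers",
            "cardinality": "2",
            "components": [{"name": "DATANODE"}, {"name": "NODEMANAGER"}],
        },
    ],
}

# Cluster creation template: maps real hosts onto the blueprint's host groups.
cluster_template = {
    "blueprint": "small-hdp",
    "host_groups": [
        {"name": "master", "hosts": [{"fqdn": "m1.example.com"}]},
        {
            "name": "workers",
            "hosts": [{"fqdn": "w1.example.com"}, {"fqdn": "w2.example.com"}],
        },
    ],
}


def unmapped_groups(blueprint, template):
    """Return blueprint host groups that the template does not map to any host."""
    defined = {g["name"] for g in blueprint["host_groups"]}
    mapped = {g["name"] for g in template["host_groups"] if g.get("hosts")}
    return sorted(defined - mapped)


print(json.dumps(blueprint, indent=2))
print("unmapped host groups:", unmapped_groups(blueprint, cluster_template))
```

In practice the blueprint would be registered with the Ambari Server first and the template submitted afterwards to instantiate the cluster; because both are plain JSON, the same pair can be reused for repeatable dev and test clusters.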

Page 16: Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters

Ambari Stacks

• Ambari operational control is dynamically driven by “Stacks”

• Defines a consistent lifecycle management interface

• Dynamically extend a “Stack”, bring complementary Services to Ambari

[Diagram labels: AMBARI SERVER; Stacks: Command Scripts (python), Service Definitions (xml); AMBARI AGENT/S; Repos]
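
The "consistent lifecycle management interface" can be pictured as a fixed set of commands that every service component implements. The sketch below is a self-contained illustration of that pattern only: the `Script` base class here is a stand-in written for this example, not Ambari's real library, and `SampleServiceMaster`, `run_command`, and the status strings are hypothetical.

```python
class Script:
    """Stand-in lifecycle interface; NOT Ambari's real base class."""

    def install(self, env):
        raise NotImplementedError

    def configure(self, env):
        raise NotImplementedError

    def start(self, env):
        raise NotImplementedError

    def stop(self, env):
        raise NotImplementedError

    def status(self, env):
        raise NotImplementedError


class SampleServiceMaster(Script):
    """Hypothetical component that records lifecycle commands instead of running them."""

    def __init__(self):
        self.running = False
        self.log = []

    def install(self, env):
        self.log.append("install packages")
        self.configure(env)

    def configure(self, env):
        self.log.append("write config for " + env["cluster"])

    def start(self, env):
        # Re-apply configuration before every start so changes take effect.
        self.configure(env)
        self.running = True
        self.log.append("start")

    def stop(self, env):
        self.running = False
        self.log.append("stop")

    def status(self, env):
        return "STARTED" if self.running else "INSTALLED"


def run_command(component, command, env):
    """Dispatch a lifecycle command name to the matching method."""
    if command not in ("install", "configure", "start", "stop", "status"):
        raise ValueError("unknown command: " + command)
    return getattr(component, command)(env)


master = SampleServiceMaster()
env = {"cluster": "c1"}
for cmd in ("install", "start"):
    run_command(master, cmd, env)
print(run_command(master, "status", env))
```

Because every component answers the same commands, the server side can manage a newly added service without special-casing it, which is the extensibility point the slide describes.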

Page 17: Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters

Agenda

• Ambari Overview

• New Ambari Features

• Demo

• Q & A

Page 18: Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters

Agenda

• Ambari Overview

• New Ambari Features

• Demo

• Q & A

Page 19: Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters

Learn More About Hadoop Cluster Operations

Hortonworks.com/labs/operations/

• Learn About It: ambari.apache.org or cwiki.apache.org/confluence/display/AMBARI/Ambari

• Get It: hortonworks.com/hdp/downloads/

• Try It with Hortonworks Sandbox: hortonworks.com/products/hortonworks-sandbox/

Page 20: Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters

Thank you!