Cloudera User Group Chicago - Cloudera Manager: APIs & Extensibility


This presentation was delivered by Cloudera systems engineer Matt Harris at the Chicago Cloudera User Group (CUG) meeting on December 3, 2013.

Transcript of Cloudera User Group Chicago - Cloudera Manager: APIs & Extensibility

Page 1: Cloudera User Group Chicago - Cloudera Manager: APIs & Extensibility

Cloudera Manager – APIs & Extensibility
Matt Harris – Systems Engineer
December 2013

CONFIDENTIAL - RESTRICTED

Page 2: Cloudera User Group Chicago - Cloudera Manager: APIs & Extensibility

Cloudera Manager

End-to-End Administration for CDH

1. Manage: Easily deploy, configure & optimize clusters
2. Monitor: Maintain a central view of all activity
3. Diagnose: Easily identify and resolve issues
4. Integrate: Use Cloudera Manager with existing tools

©2013 Cloudera, Inc. All Rights Reserved.

Page 3: Cloudera User Group Chicago - Cloudera Manager: APIs & Extensibility

Integrating with your IT Mgmt tools

[Diagram: Cloudera Manager (Hadoop Operations) alongside Datacenter Operations tools: installation/deployment tools (e.g. Chef, Puppet etc.), monitoring tools (e.g. Orion, Tivoli, BMC etc.) and alerting tools (e.g. Nagios, SNMP etc.)]

Various options for integrating Cloudera Manager into your existing Datacenter Operations/Tools
• Cloudera Manager API
    • Introduced in CM4 (June 2012)
    • Installation & deployment
    • Monitoring
• SNMP Alerts
    • Introduced in CM4.5 (Feb 2013)
• And more…
    • Monitoring 'tsquery' (Feb 2013)
    • User-defined triggers/alarms (new for C5!)
    • Service extensibility (new for C5!)

Page 4: Cloudera User Group Chicago - Cloudera Manager: APIs & Extensibility

Cloudera Manager (CM) API

• API access was a new feature introduced in Cloudera Manager 4.0, providing programmatic access to cluster operations (such as configuration and restart) and monitoring information (such as health and metrics).
• The CM API is an HTTP REST API, using JSON serialization. The API is served on the same host and port as the CM web UI, and does not require an extra process or extra configuration. API users have the same privileges as they do in the web UI.
• Docs & Examples
    http://cloudera.github.io/cm_api/
    https://github.com/cloudera/cm_api
• Java/Python clients
    http://blog.cloudera.com/blog/2013/05/how-to-automate-your-hadoop-cluster-from-java/
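As a rough illustration of the "HTTP REST API, using JSON serialization" point, the sketch below builds an authenticated request against the CM host and extracts cluster names from a JSON response. It is only a sketch under stated assumptions: the host name, the credentials, port 7180, the `v1` version prefix and the `items` list in the response are assumptions to verify against the API docs linked above, and the helper names are hypothetical.

```python
import base64
import json
import urllib.request


def build_request(host, path, user, password, port=7180, version="v1"):
    """Build a GET request for the CM API using HTTP basic auth.

    Assumption: the API lives under /api/<version>/ on the web UI port.
    """
    url = f"http://{host}:{port}/api/{version}/{path.lstrip('/')}"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(url)
    req.add_header("Authorization", f"Basic {token}")
    return req


def cluster_names(payload):
    """Extract cluster names from a JSON body shaped like {"items": [...]}."""
    return [c["name"] for c in json.loads(payload).get("items", [])]


def list_clusters(host, user, password):
    """Fetch and return cluster names (requires a reachable CM server)."""
    req = build_request(host, "clusters", user, password)
    with urllib.request.urlopen(req) as resp:
        return cluster_names(resp.read().decode("utf-8"))
```

Because the API shares the web UI's users, the credentials passed here would be an ordinary CM login; the Java and Python clients linked above wrap the same calls.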

Page 5: Cloudera User Group Chicago - Cloudera Manager: APIs & Extensibility

Examples of integration with CM API

• Installation & Deployment
    • Chef
    • Puppet
    • Dell Crowbar
        • http://blog.cloudera.com/blog/2013/08/how-to-deploy-hadoop-clusters-automatically-with-dell-crowbar-and-cloudera-manager/
    • StackIQ
        • http://web.stackiq.com/blog/bid/312064/StackIQ-Cluster-Manager-now-integrated-with-Cloudera
    • WANdisco – non-stop NN setup
    • Several other customers/partners leveraging the APIs as part of their install & deployment process
• Monitoring & Alerting
    • Oracle Enterprise Manager (via Big Data Appliance)
    • Nagios
        • https://github.com/cloudera/cm_api/tree/master/nagios
        • https://github.com/harisekhon/nagios-plugins/blob/master/check_hadoop_cloudera_manager_metrics.pl
    • SNMP alerts integration with IBM Netcool

Develop & Contribute your plug-ins using Cloudera Manager API
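To give a feel for what a small monitoring plug-in might do, here is a minimal sketch that maps a service's health, as reported by the CM API, to a Nagios-style exit code. It is not taken from either of the Nagios projects linked above. The `healthSummary` field name and the particular mapping are assumptions made for illustration; the exit codes follow the usual Nagios convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
# Conventional Nagios plug-in exit codes.
NAGIOS_OK, NAGIOS_WARNING, NAGIOS_CRITICAL, NAGIOS_UNKNOWN = 0, 1, 2, 3
LABELS = ["OK", "WARNING", "CRITICAL", "UNKNOWN"]

# Assumed mapping from CM health values to Nagios states; anything
# not listed here (e.g. disabled or unavailable) is reported as UNKNOWN.
HEALTH_TO_NAGIOS = {
    "GOOD": NAGIOS_OK,
    "CONCERNING": NAGIOS_WARNING,
    "BAD": NAGIOS_CRITICAL,
}


def nagios_status(service):
    """Return (exit_code, message) for a service dict from the CM API."""
    health = service.get("healthSummary", "NOT_AVAILABLE")
    code = HEALTH_TO_NAGIOS.get(health, NAGIOS_UNKNOWN)
    name = service.get("name", "?")
    return code, f"{LABELS[code]} - {name} health is {health}"
```

A real check would print the message and call `sys.exit(code)` so that Nagios picks up the state.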

Page 6: Cloudera User Group Chicago - Cloudera Manager: APIs & Extensibility

Cloudera Manager – Monitoring via 'tsquery'

• Introduced as part of CM4.5 release (Feb 2013)
• Great way to add interesting charts (above & beyond what is provided by default) and monitor metrics that are relevant to your clusters
• The tsquery language is used to specify statements for retrieving time-series data from the Cloudera Manager time-series data store
• Example: How do I compare all disk IO for all the DataNodes that belong to a specific HDFS service?
    select bytes_read, bytes_written where roleType=DATANODE and serviceName=hdfs1
• Retrieved time-series data can be plotted via various options – line, bar, scatter, heat maps, table list etc.
• Extending this concept to create user-defined triggers/alarms (new for C5!).
• More details
    • http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM5/latest/Cloudera-Manager-Diagnostics-Guide/cm5dg_chart_time_series_data.html
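The DataNode example above can also be assembled programmatically, which helps when the same chart is needed for several services. The sketch below composes the statement from a metric list and a set of filters, then URL-encodes it as a `query` parameter. The `/timeseries` path and the `query` parameter name are assumptions to check against the API documentation for your CM version, and both function names are hypothetical.

```python
from urllib.parse import urlencode


def build_tsquery(metrics, filters):
    """Compose 'select <metrics> where <k=v and ...>' from plain values."""
    select = ", ".join(metrics)
    where = " and ".join(f"{key}={value}" for key, value in filters.items())
    return f"select {select} where {where}" if where else f"select {select}"


def timeseries_url(base_url, query):
    """Attach the tsquery to an (assumed) time-series endpoint."""
    return f"{base_url}/timeseries?{urlencode({'query': query})}"


query = build_tsquery(
    ["bytes_read", "bytes_written"],
    {"roleType": "DATANODE", "serviceName": "hdfs1"},
)
```

Here `query` reproduces the statement from the slide; changing `serviceName` gives the same comparison for another HDFS service.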

Page 7: Cloudera User Group Chicago - Cloudera Manager: APIs & Extensibility

Examples of Cloudera Manager 'tsquery'

Example 1: How do I track the aggregate Cluster Disk IO?
    select dt0(read_bytes_disk_sum), dt0(write_bytes_disk_sum) where category = CLUSTER and clusterId = $CLUSTERID

Example 2: How do I compare CPU usage across hosts?
    select dt0(total_cpu_user) / getHostFact(numCores, 1) * 100, dt0(total_cpu_system) / getHostFact(numCores, 1) * 100, dt0(total_cpu_nice) / getHostFact(numCores, 1) * 100, dt0(total_cpu_iowait) / getHostFact(numCores, 1) * 100, dt0(total_cpu_irq) / getHostFact(numCores, 1) * 100, dt0(total_cpu_soft_irq) / getHostFact(numCores, 1) * 100

Create & Contribute your 'tsqueries'!
https://github.com/cloudera/cm_charting_scrapbook
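Example 2 repeats one expression per CPU metric, so it is easy to mistype by hand. As a small sketch, the query can be generated from the list of metric names taken from the slide; the helper name is hypothetical, and nothing here validates the metrics against a live Cloudera Manager.

```python
# Metric names exactly as they appear in Example 2.
CPU_METRICS = [
    "total_cpu_user",
    "total_cpu_system",
    "total_cpu_nice",
    "total_cpu_iowait",
    "total_cpu_irq",
    "total_cpu_soft_irq",
]


def cpu_percent_query(metrics=CPU_METRICS):
    """Build the per-core CPU percentage tsquery, one term per metric."""
    terms = [f"dt0({m}) / getHostFact(numCores, 1) * 100" for m in metrics]
    return "select " + ", ".join(terms)
```

Calling `cpu_percent_query()` yields the full statement from Example 2, and passing a shorter list gives a narrower chart.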

Page 8: Cloudera User Group Chicago - Cloudera Manager: APIs & Extensibility

Cloudera Manager – Service Extensibility

• Introduced in C5
• Still in Beta!
• Some aspects (especially Parcel mgmt) available in CM4.x
• Example: Collaboration with Syncsort to deploy DMX-h libraries
• Single management console for CDH, non-CDH services and ISV applications
• Similar look and feel as existing services
• Easy to write (Java-free!)
• Flexible
• Independent release cycle

Page 9: Cloudera User Group Chicago - Cloudera Manager: APIs & Extensibility

Analogy from Operating Systems (OS) world

[Diagram: ISV's view of OS. Core OS kernel with Package Mgmt, Process/Resource Mgmt, Security Mgmt, Data Access Mgmt and Systems Management]

Page 10: Cloudera User Group Chicago - Cloudera Manager: APIs & Extensibility

Bringing ISV Apps to CDH

[Diagram: ISV's view of Hadoop. Core Hadoop/CDH kernel with Parcels, Resource Mgmt, Security Mgmt, CDK APIs and Cloudera Manager]

Page 11: Cloudera User Group Chicago - Cloudera Manager: APIs & Extensibility

Integrating into the Cloudera Product Portfolio

Features / Description / Examples

Package Mgmt
    Description: Ability to easily package and distribute binaries/jars via "Parcels"
    Examples: Informatica, Syncsort

Resource Mgmt
    Description: Ability to deploy applications as stand-alone processes or via YARN* on the Hadoop grid; resource isolation of cluster resources
    Examples: SAS, 0xData, Accumulo

Security Mgmt
    Description: Support for Kerberos Mgmt; role-based access control for Tables/Views in Hive/Impala via Sentry

Data Access Mgmt
    Description: HDFS and HBase API abstraction and simplification

Systems Mgmt (Cloudera Manager)
    Manage: Deploy and upgrade (rolling) services and pkgs; manage configurations
    Monitor: Proactive health checks; track resource utilization; custom metrics charts
    Diagnose: Distributed log collection and searching; tag and track key events
    Integrate: Access operational tools via API; surface overall cluster metrics to ISV dashboard

Non-CDH Apps: ISVs; Accumulo, Spark, Giraph etc.

* Support for YARN planned as part of CM5.x in FY14

Page 12: Cloudera User Group Chicago - Cloudera Manager: APIs & Extensibility

So.. How does it work?

• A JSON file that describes your service
• Set of control scripts
• Packaged as a JAR file
• As promised, Java-free
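Since a JAR is an ordinary ZIP archive, the packaging step above needs no Java tooling. The sketch below bundles a JSON descriptor and a control script into an in-memory JAR. The entry name `descriptor/service.json` is hypothetical (the slides do not give the expected layout), while `scripts/control.sh` follows the path used in the Spark example; the SDK documentation should be treated as the authority on the real layout.

```python
import io
import zipfile


def build_extension_jar(descriptor_json, control_script):
    """Return the bytes of a JAR (ZIP) holding a descriptor and a script."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as jar:
        # Hypothetical location for the JSON service description.
        jar.writestr("descriptor/service.json", descriptor_json)
        # Mark the control script as executable (rwxr-xr-x).
        info = zipfile.ZipInfo("scripts/control.sh")
        info.external_attr = 0o755 << 16
        jar.writestr(info, control_script)
    return buf.getvalue()


jar_bytes = build_extension_jar('{"name": "spark"}', "#!/bin/bash\necho hello\n")
```

Writing `jar_bytes` to a file ending in `.jar` gives an archive that standard ZIP and JAR tools can list and extract.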

Page 13: Cloudera User Group Chicago - Cloudera Manager: APIs & Extensibility

Example: Cloudera Manager Extensions - Spark

Page 14: Cloudera User Group Chicago - Cloudera Manager: APIs & Extensibility

Cloudera Manager Extensions

Page 15: Cloudera User Group Chicago - Cloudera Manager: APIs & Extensibility

Cloudera Manager Extensions: Spark

Page 16: Cloudera User Group Chicago - Cloudera Manager: APIs & Extensibility

Cloudera Manager Extensions: Spark

Page 17: Cloudera User Group Chicago - Cloudera Manager: APIs & Extensibility

Cloudera Manager Extensions: Spark

Page 18: Cloudera User Group Chicago - Cloudera Manager: APIs & Extensibility

The Code

Control script (scripts/control.sh):

    #!/bin/bash
    # First argument selects the command to run.
    CMD=$1
    MASTER_PORT=<read in from ./params.properties>

    case $CMD in
      (start_master)
        # Replace this shell with the Spark master process.
        exec $SPARK_HOME/scripts/spark-start.sh master
        ;;
      (*)
        echo "$timestamp Don't understand [$CMD]"
        ;;
    esac

Service descriptor (JSON):

    {
      "name" : "spark",
      "roles" : [{
        "name" : "master",
        "startRunner" : {
          "program" : "scripts/control.sh",
          "args" : [ "start_master", "./params.properties" ]
        },
        "parameters" : [{
          "name" : "master_port",
          "type" : "port",
          "default" : 7077
        }],
        "configWriter" : {
          "generators" : [{
            "filename" : "params.properties"
          }]
        }
      }]
    }
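To make the link between the descriptor and the control script more concrete, the sketch below shows one way the declared `master_port` parameter could end up in `params.properties` and be read back. This is an illustration only: the `key=value` file format, both helper names and the idea that defaults are written verbatim are assumptions, not a description of how Cloudera Manager's configWriter actually behaves.

```python
import json

# A trimmed copy of the descriptor from the slide, with quoted keys.
DESCRIPTOR = """
{
  "name": "spark",
  "roles": [{
    "name": "master",
    "parameters": [{"name": "master_port", "type": "port", "default": 7077}],
    "configWriter": {"generators": [{"filename": "params.properties"}]}
  }]
}
"""


def render_properties(role):
    """Render a role's parameter defaults as key=value lines (assumed format)."""
    lines = [f"{p['name']}={p['default']}" for p in role.get("parameters", [])]
    return "\n".join(lines) + "\n"


def parse_properties(text):
    """Parse key=value lines, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


role = json.loads(DESCRIPTOR)["roles"][0]
properties_text = render_properties(role)
master_port = parse_properties(properties_text)["master_port"]
```

In the slide's flow, the control script would do the reading step itself in bash; the Python round trip here only shows the data path from parameter definition to script input.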

Page 19: Cloudera User Group Chicago - Cloudera Manager: APIs & Extensibility

Next Steps

• Documentation & SDK as part of C5 Beta2 or later (definitely before GA!)
• Working with select ISVs (SAS, Syncsort, 0xData etc.) as part of Beta to further fine-tune this feature

Develop & Contribute your Cloudera Manager service extensibility plug-ins!

Page 20: Cloudera User Group Chicago - Cloudera Manager: APIs & Extensibility

Vision of CM Extensibility

[Diagram: CDH and CM, with axes labelled "Horizontal Extension", "Vertical Extension" and "Service Extensibility". Other labels: API; SNMP; Chef/Puppet; Oracle OEM; Dell; Nagios; Syncsort; Informatica; Security ISVs; 0xData; SAS; Revolution; Spark; Giraph; Accumulo; Ops Apps; Capacity Mgr; SLA Mgr; Cost Optimizer]

Page 21: Cloudera User Group Chicago - Cloudera Manager: APIs & Extensibility

Q&A  
