Using Vagrant, Puppet, Testing & Hadoop

Hadoop in a Box: From Playground to Production, Using Vagrant, Puppet, Testing and Hadoop.


From PuppetCamp Southeast Asia 2012 in Kuala Lumpur, Malaysia. Hadoop in a box: from playground to production. How Vagrant, Puppet and other tools can be used to move your manifest from test bed to production.

Transcript of Using Vagrant, Puppet, Testing & Hadoop

Page 1: Using Vagrant, Puppet, Testing & Hadoop

Hadoop in a Box

From Playground to Production: Using Vagrant, Puppet, Testing and Hadoop.

Page 2: Using Vagrant, Puppet, Testing & Hadoop

Who am I?

• Dennis Matotek
  Technical Lead, Platforms, Experian Hitwise
  Co-Author: Pro Linux System Administration: Turnbull, Lieverdink, Matotek, Apress 2009
  Technical Reviewer: Pulling Strings with Puppet: Turnbull, Apress 2008

Page 3: Using Vagrant, Puppet, Testing & Hadoop

What are we solving?

• We have a group of developers...

Page 4: Using Vagrant, Puppet, Testing & Hadoop

They want to build something cool!

Page 5: Using Vagrant, Puppet, Testing & Hadoop

We don’t want to end up with this...

Page 6: Using Vagrant, Puppet, Testing & Hadoop

So let’s get together early in the design

Page 7: Using Vagrant, Puppet, Testing & Hadoop

How can we help?

• Don’t put implementation plans at the end of a project.
• Everyone gets involved in writing infrastructure code.
• Infrastructure code should be included in the development build pipeline and must pass tests.
• Push infrastructure code from playground to production. Design, test and deploy your infrastructure code like your application code.

Page 8: Using Vagrant, Puppet, Testing & Hadoop

How can we do it?

• As administrators we can help build the development environment for projects.
• Infrastructure on the desktop
  – A lot of the conception-phase coding work can be done on the desktop.
    • What packages are needed for the project?
    • What configuration should they be in?
    • How can you share your ideas?

Page 9: Using Vagrant, Puppet, Testing & Hadoop

Choices

• Virtualization technologies to choose from
  – VirtualBox
  – LXC
  – KVM/Xen
• Configuration management tools
  – Puppet
  – Chef
  – SaltStack
• Testing tools
  – Cucumber-puppet
  – RSpec
• CI tools
  – Jenkins

Page 10: Using Vagrant, Puppet, Testing & Hadoop

Let’s look at Vagrant

Page 11: Using Vagrant, Puppet, Testing & Hadoop

What’s it about?

• A project on GitHub written in Ruby to manage Oracle’s VirtualBox virtual machines (originators: Mitchell Hashimoto and John Bender, 2010).
• You can build and distribute projects amongst teams or colleagues.
• Download ‘boxes’ and build project environments that are the same.
• Boxes are reusable testbeds. When you are ready, push your development code environment to others.
• Take those environments and run them against Jenkins or other CI tools.
• Sandbox, develop, test and push your infrastructure code into production.
• How easy is it?

    $ vagrant box add base http://files.vagrantup.com/lucid32.box
    $ vagrant init
    $ vagrant up

Page 12: Using Vagrant, Puppet, Testing & Hadoop

Vagrant boxes

• What’s a box?
• Boxes come from standard VirtualBox instances, with the specific configuration that Vagrant requires.
• Whatever VirtualBox supports, so does Vagrant.
• Boxes are basically a tar of an exported VirtualBox machine.
• Configured hard disks, CPU, RAM, networks.
• You can create them yourself or use ones that others have created and distributed.
• How to build a box is documented here: http://vagrantup.com/docs/base_boxes.html

Page 13: Using Vagrant, Puppet, Testing & Hadoop

Launching a box

Install VirtualBox, install Ruby, install Vagrant. Create your own box or find one that is already distributed.

    $ mkdir project ; cd project
    $ vagrant box add <box_name> <url or file_path>

This adds the box and makes it available to the vagrant init command.

    $ vagrant box list
    hadoop_in_a_box
    $ vagrant init <box_name>

You will now have the default Vagrantfile created in your directory.

    $ ls
    Vagrantfile
    $ vagrant up
    $ vagrant ssh

Page 14: Using Vagrant, Puppet, Testing & Hadoop

Vagrantfile

    Vagrant::Config.run do |config|
      # All Vagrant configuration is done here. The most common configuration
      # options are documented and commented below. For a complete reference,
      # please see the online documentation at vagrantup.com.

      # Every Vagrant virtual environment requires a box to build off of.
      config.vm.box = "hadoop_in_a_box"
      config.ssh.private_key_path = "./.ssh/vagrant.key"

      # shared_folders - this folder must exist in your project directory
      config.vm.share_folder("shared_folder", "/shared", "./shared_folder")
    end

Page 15: Using Vagrant, Puppet, Testing & Hadoop

    Vagrant::Config.run do |config|
      # general settings:
      # config.vm.boot_mode = :gui
      config.vm.customize [ "modifyvm", :id,
                            "--memory", "512"
                          ]

      # ssh settings:
      # Set the following to point to your ssh key
      config.ssh.private_key_path = "./.ssh/vagrant.key"

      # Change these to suit; sometimes it takes a while for the virtual box to respond
      config.ssh.max_tries = 25
      config.ssh.timeout = 3

      # shared_folders - this folder must exist in your project directory
      config.vm.share_folder("shared_folder", "/shared", "./shared_folder")

      # Below is an example of a multi-VM setup
      config.vm.define :node1 do |base_config|
        base_config.vm.box = "my_base"
        base_config.vm.forward_port 22, 2102
        base_config.vm.network :hostonly, "192.168.222.10"
      end

      # config.vm.define :node2 do |base_config|
      #   base_config.vm.box = "my_base"
      #   base_config.vm.forward_port 22, 2103
      #   base_config.vm.network :hostonly, "192.168.222.11"
      # end
    end
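The multi-VM example repeats one define block per node, changing only the name, the forwarded SSH port and the host-only IP. A minimal sketch of generating those per-node values from a single list follows. The helper name node_settings and its parameters are hypothetical, not part of Vagrant; the port and IP defaults simply mirror the node1/node2 values in the Vagrantfile.

```ruby
# Hypothetical helper (not a Vagrant API): derive per-node settings
# instead of copy-pasting one config.vm.define block per VM.
def node_settings(count, base_port: 2102, subnet: "192.168.222", first_host: 10)
  (1..count).map do |i|
    {
      name:     "node#{i}".to_sym,                    # :node1, :node2, ...
      ssh_port: base_port + (i - 1),                  # 2102, 2103, ...
      ip:       "#{subnet}.#{first_host + (i - 1)}"   # .10, .11, ...
    }
  end
end

nodes = node_settings(2)

# Inside Vagrant::Config.run the list could drive the definitions:
#
#   nodes.each do |node|
#     config.vm.define node[:name] do |base_config|
#       base_config.vm.box = "my_base"
#       base_config.vm.forward_port 22, node[:ssh_port]
#       base_config.vm.network :hostonly, node[:ip]
#     end
#   end

nodes.each { |n| puts "#{n[:name]} ssh=#{n[:ssh_port]} ip=#{n[:ip]}" }
```

Because a Vagrantfile is plain Ruby, a loop like this keeps node definitions in one place when the cluster grows beyond a couple of machines.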

Page 16: Using Vagrant, Puppet, Testing & Hadoop

Provisioning Your Box

• Ruby plugins for Vagrant
  – Build your own specific plugins that make provisioning easy for you
• Shell provisioning
  – Bash shell scripts or commands

    base_config.vm.provision :shell do |shell|
      shell.inline = "hostname $1"
      shell.args   = "node1"
    end

• Chef Solo/Chef Server

    config.vm.provision :chef_solo do |chef|
      chef.add_recipe("apache")
      chef.add_recipe("php")
    end

• Puppet/Puppet Server

    config.vm.provision :puppet do |puppet|
      puppet.manifests_path = "manifests"
      puppet.manifest_file = "default.pp"
    end

Page 17: Using Vagrant, Puppet, Testing & Hadoop

Provision with Puppet

Page 18: Using Vagrant, Puppet, Testing & Hadoop

Vagrant and Puppet

• You can use a Puppet Master or locally apply Puppet modules and manifests to provision your Vagrant nodes.
  – Locally applied Puppet modules and manifests:

    base_config.vm.provision :puppet,
                             :module_path => ["puppet_modules", "puppet_modules_private"],
                             :options => "--verbose" do |basepuppet|
      basepuppet.manifests_path = "puppet_manifests"
      basepuppet.manifest_file = "default.pp"
      basepuppet.pp_path = "/tmp/vagrant-puppet"
    end

Page 19: Using Vagrant, Puppet, Testing & Hadoop

Vagrant and Puppet, cont’d

  – Using Puppet Master to provision:
  – Point your configuration at your local Puppet Master

    Vagrant::Config.run do |config|
      .... <snip> ....
      base_config.vm.provision :puppet_server do |puppet|
        puppet.puppet_server = "puppet.yourdomain.com"
      end
    end

Page 20: Using Vagrant, Puppet, Testing & Hadoop

Puppet Manifest design

• The basic manifest is made up of the following components:

    /etc/puppet
      - manifests/site.pp
      - manifests/nodes.pp
      - modules/<module_name>/manifests
      - modules/<module_name>/files
      - modules/<module_name>/templates
      - modules/<module_name>/lib

Page 21: Using Vagrant, Puppet, Testing & Hadoop

Think about using ‘environments’

• Puppet allows you to use environments. Environments are separate namespaces where you can run and test your code on the same Puppet master.
  – Namespaces like production, staging, testing, etc.
• The puppet.conf file needs the following:

    modulepath = /etc/puppet/environments/$environment

Page 22: Using Vagrant, Puppet, Testing & Hadoop

Environments cont’d

• Allows you to check out code under /etc/puppet/environments/<checkout> and then pass the following to the client:

    $ puppet agent --test --noop --environment <checkout>

• Test changes against systems before pushing code to production.

Page 23: Using Vagrant, Puppet, Testing & Hadoop

Things to think about in Module Design

• Puppet modules are a collection of resources to install, configure and manage a specific application or perform some kind of function.
  – E.g., install and configure the httpd service for your applications.
• Keep modules separate. Don’t have httpd resources being managed from your postgresql module.
• Keep data separate from code.
  – Have a separate class that contains your data ( class modname::data { } )
  – Use an external node classifier (ENC). That is a CMDB-like service that Puppet can extract and build configurations from.
• Keep an ear on the Puppet Users list, as many design questions are asked and answered there.

Page 24: Using Vagrant, Puppet, Testing & Hadoop

Manage nodes?

• Nodes are tedious to manage.

    nodes.pp

    node base {
      include yum
    }
    node node1 inherits base {
      include httpd
    }

• Just have this:

    node default {
      include roles
    }

• Group nodes based on Facts or other data.

Page 25: Using Vagrant, Puppet, Testing & Hadoop

Roles, everything has a role

• If it doesn’t have a role, it has a default role.
• Roles decide what the node has.
  – Easier to manage than nodes, and doesn’t rely on ‘inheritance’.
    • Puppet node inheritance commonly does not behave like inheritance in programming languages.
  – Roles with Hiera.

    class roles {
      $my_role = hiera('my_role')
      if $my_role == 'webservice' {
        include roles::webservices
      }
    }
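The roles class above handles only the 'webservice' case, while the first bullet says a node without a role gets a default one. The lookup-with-fallback idea can be sketched in plain Ruby. This is not Puppet or Hiera code; the function name classes_for and the role table contents are assumptions made up for illustration.

```ruby
# Illustrative only: role names and class lists are invented for this sketch.
# Returns the classes for a role, falling back to the "default" entry when
# the role is missing (nil) or unknown.
def classes_for(role, role_table)
  role_table.fetch(role) { role_table.fetch("default") }
end

role_table = {
  "default"    => ["roles::base"],
  "webservice" => ["roles::base", "roles::webservices"]
}

puts classes_for("webservice", role_table).inspect
puts classes_for(nil, role_table).inspect
```

In Puppet itself the same effect would come from giving the Hiera lookup a default value or adding an else branch to the class.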

Page 26: Using Vagrant, Puppet, Testing & Hadoop

Testing Modules

Page 27: Using Vagrant, Puppet, Testing & Hadoop

Testing, phhhuu!

• Why test?
  – As your module complexity grows you need to make sure that it will work.
  – Puppet is CONSTANTLY changing
    • Ensure your code is keeping up with new Puppet versions
  – Your infrastructure code is code, so why not test it?
  – Test-driven code is better code; it helps you think about what the outcome should be.

Page 28: Using Vagrant, Puppet, Testing & Hadoop

Introducing the tools

• RSpec
  – http://rspec-puppet.com/
• Cucumber-Puppet
  – https://github.com/nistude/cucumber-puppet
• Both tools serve the same purpose and are based on common testing frameworks.
• Both tools support Behaviour Driven Development
• How do I use it?
  – RSpec tests the modules
  – Cucumber tests the manifests as a whole

Page 29: Using Vagrant, Puppet, Testing & Hadoop

RSpec

    class hadoop::namenode::config {
      require hadoop::config
      include hadoop::install::namenode
      include hadoop::namenode::cluster_config_files

      # realise the user and group and the config files
      Group <| tag == 'hadoop_node' |> -> <snip>
      file { '/usr/lib/hadoop-0.20/logs/SecurityAuth.audit':
        ensure  => present, <snip>
        require => Package['hadoop-0.20-namenode']
      } ->
      Exec <| tag == 'common_execs' |> ->
      hadoop::namenode::create_namenode_dirs { $hadoop::config::hadoop_default_dirs: } ->
      class { "hadoop::namenode::namenode_format": }
    } # end class

Page 30: Using Vagrant, Puppet, Testing & Hadoop

RSpec

    require 'spec_helper'

    describe 'hadoop::namenode::config' do
      let(:facts) { { :hostname => 'node2', :hadoop_node => 'namenode', :role => 'hadoop_namenode' } }
      let(:title) { 'config' }

      it { should include_class('hadoop::install::namenode') }
      it { should contain_file('/usr/lib/hadoop-0.20/logs/SecurityAuth.audit') }
      it { should contain_file('/etc/hadoop-0.20/conf.default/core-site.xml') }
      it { should contain_service('hadoop-0.20-namenode').with_ensure('present') }
    end

Page 31: Using Vagrant, Puppet, Testing & Hadoop

Cucumber-Puppet

• Does the catalog compile for your nodes?
  – Tests run on the master (or alternative)
  – When nodes check in, Puppet creates a YAML file in /var/lib/puppet/yaml/node
  – cucumber-puppet uses the output, the node cache file, from the last Puppet run
  – In Puppet v3 this changes somewhat, as you can use the puppet node find interface to retrieve the same information.

Page 32: Using Vagrant, Puppet, Testing & Hadoop

Cucumber Basics

    Feature: General policy for all catalogs
      In order to ensure applicability of a host's catalog
      As a manifest developer
      I want all catalogs to obey some general rules

      Scenario Outline: Compile and verify catalog
        Given a node specified by "features/yaml/<hostname>.mylocal.yaml"
        When I compile its catalog
        Then compilation should succeed
        And all resource dependencies should resolve

        Examples:
          | hostname |
          | puppet   |
          | node2    |
          | node3    |
          | node4    |

Page 33: Using Vagrant, Puppet, Testing & Hadoop

Cucumber-Puppet

    Then /^service "([^\"]*)" should be "([^\"]*)"$/ do |name, state|
      steps %Q{
        Then there should be a resource "Service[#{name}]"
      }
      if state == "disabled"
        steps %Q{
          Then the service should have "enable" set to "false"
        }
      elsif state == "running"
        steps %Q{
          Then the state should be "#{state}"
        }
      end
    end
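The step definition relies on Cucumber handing the two regular-expression capture groups to the block as name and state. A small plain-Ruby sketch of that matching follows, without Cucumber itself. The helper parse_service_step and the sample step text are hypothetical; only the pattern is taken from the step definition.

```ruby
# The same pattern as the step definition, as a plain Ruby Regexp.
step_pattern = /^service "([^\"]*)" should be "([^\"]*)"$/

# Hypothetical helper: returns [name, state] when the line matches the
# pattern, or nil when it does not.
def parse_service_step(pattern, line)
  match = pattern.match(line)
  match && match.captures
end

# Sample step text, invented for this sketch.
name, state = parse_service_step(step_pattern, 'service "httpd" should be "running"')
puts "resource: Service[#{name}], expected state: #{state}"
```

A feature line that does not fit the pattern yields nil here; in Cucumber it would be reported as an undefined step.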

 

Page 34: Using Vagrant, Puppet, Testing & Hadoop

Automating Testing

Page 35: Using Vagrant, Puppet, Testing & Hadoop

Jenkins

• Helps maintain build pipelines
• Push your infrastructure into the software project pipelines.
• Continuous integration is used mainly by software projects, not often by infrastructure
• Get greater certainty about your infrastructure deployments.

Page 36: Using Vagrant, Puppet, Testing & Hadoop

Useful plugins

Page 37: Using Vagrant, Puppet, Testing & Hadoop

Demonstration

Page 38: Using Vagrant, Puppet, Testing & Hadoop

Success