ganetimgr A platform to simplify management of Ganeti instances



Page 1: ganetimgr A platform to simplify management of Ganeti instances

http://www.noc.grnet.gr

ganetimgr: A platform to simplify management of Ganeti instances

George Kargiotakis – System Administrator ([email protected])

Leonidas Poulopoulos – Developer ([email protected])

Page 2: ganetimgr A platform to simplify management of Ganeti instances

GRNET NOC?

GanetiCon 2013, Athens ganetimgr – GRNET NOC

Servers Team aka SysAdmins
• Managing Servers + Services
• Depend on Virtualization (Ganeti)
• Currently:
  • 2 platforms (ViMa/ganetimgr & ~okeanos/synnefo)
  • > 20 Ganeti clusters
  • > 230 HW nodes
  • > 6000 VMs
• NOC & Client VMs run exclusively on Ganeti
• Debian 6/7
• Extensive use of puppet

Page 3: ganetimgr A platform to simplify management of Ganeti instances

Some history...

Ganeti has a REST API??? Neat!! Let’s develop a web platform

• February 2010, GRNET NOC: ganetimgr
• September 2010, OSU OSL: Ganeti Web Manager
• End of 2010, GRNET: ~okeanos

https://code.grnet.gr/ganetimgr

Page 4: ganetimgr A platform to simplify management of Ganeti instances


Our Motivation

“Clients should be able to apply for instances and manage them through a simple environment”

+

“KISS Principle”

Page 5: ganetimgr A platform to simplify management of Ganeti instances


Ganetimgr @ GRNET NOC

Our deployment is called ViMa (VirtualMachines): https://vima.grnet.gr

Our target audience is mainly our clients, not us! Don’t expose unneeded info to them; they get confused (and ask questions you have to answer...)

Managing through CLI is always faster for us... But some visualization is always nice to have :)

Page 6: ganetimgr A platform to simplify management of Ganeti instances


Development + Clients

Development:
• Demand driven development process
• Add features as clients ask for them

Clients:
• University NOCs/Labs
• Research institutions
• Governmental organizations
• Ministries
• European Projects
• Ourselves

Our clients need a VPS service that:
• is very stable
• provides long-running VMs
• is simple to use
• caters to different needs (science/services)

Page 7: ganetimgr A platform to simplify management of Ganeti instances

Commit Habits

Love to commit on Thursdays right before or after lunch :)

HG commits before 04-2011

Page 8: ganetimgr A platform to simplify management of Ganeti instances

Milestones

Mar 2010
• Simple Web GUI (instance info)
• Multi-cluster support
• Shutdown, Reboot, Console
• HTTP boot for instances
• RAPI calls via urllib

Feb 2011
• GUI redesign, support for mobile view

Began as Internal Admin Tool

Backend developer: @apoikos
Frontend developer: @leopoul

Page 9: ganetimgr A platform to simplify management of Ganeti instances


Milestones

Summer 2011
• Convert RAPI calls to Ganeti’s native client
• Switch to Django auth
• User Registration, User Profile
• Collect user instances from multiple clusters (user instance listing)
• Redis Caching – Cache cluster state and user access rights on Redis
• South support/migrations
• Async notifications for start/stop/reboot via beanstalk
• SSH key management
• Instance Applications
• Multi-network (link) support for clusters
• i18n support

HOT SUMMER OF CODE

Backend developer: @apoikos
Frontend developer: @leopoul

Page 10: ganetimgr A platform to simplify management of Ganeti instances

Milestones

Sep 2011 – Dec 2012
• Usability Fixes
• Code cleanup
• Minor UI Enhancements

Heavily used in production by our clients

Apr 2013
• Multi-Layered Caching mechanism (7-8x faster!)

Backend developers: @faidonl, alex
Frontend developer: @leopoul

Page 11: ganetimgr A platform to simplify management of Ganeti instances


Milestones

Summer 2013
• UI: Bootstrap Theme
• New instance actions: Reinstall, Destroy, Rename (via email confirmation)
• Per Instance CPU and Network graphs (via collectd)
• Statistics for: Users, Clusters, Nodes, Instances
• Information on Cluster Nodes (mem/disk usage, #VMs, role)
• Email Notifications mechanism
• Idle accounts management
• Modify Instance owners through UI (tagging)
• Admins can lock instance state (tagging)
• Integration with Jira + Internal Server Hardware informational tool (ServerMon)

HOT SUMMER OF CODE

Whip holder: @kargig*
Developer (aka slave): @leopoul

Page 12: ganetimgr A platform to simplify management of Ganeti instances


Interaction with Ganeti-Devel

• Several patches were sent upstream
• Merged:
  • Shared block & file storage
  • Cluster-wide default iallocator
• Passed-down to ~okeanos and merged upstream:
  • gnt-network support
  • IP Pool management
• Still Unmerged:
  • Boot from HTTP

Page 13: ganetimgr A platform to simplify management of Ganeti instances

Stateless Architecture

[Architecture diagram: Web GUI, django, cache, Ganeti RAPI client, ganetimgr-watcher, beanstalk, DB, Ganeti clusters]

DB: Clusters, Users, Groups, Applications, Networks

NO Instance info stored in DB!

Very few components -> Easy monitoring -> Stability
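Since no instance info is stored in the DB, every instance lookup has to be answered either by the cache or by Ganeti itself. A minimal sketch of that cache-aside lookup is shown below. `TTLCache`, `get_instance`, the key layout and `fake_rapi` are hypothetical stand-ins invented for illustration; they are not ganetimgr's code and not the redis/memcache or Ganeti RAPI client APIs.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory stand-in for redis/memcache (hypothetical helper)."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._data[key]
            return None
        return value

    def set(self, key: str, value: Any, ttl: float = 60.0) -> None:
        self._data[key] = (time.monotonic() + ttl, value)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


def get_instance(cache: TTLCache,
                 fetch_from_rapi: Callable[[str, str], dict],
                 cluster: str, name: str) -> dict:
    """Return instance info from the cache, asking the cluster only on a miss."""
    key = f"instance:{cluster}:{name}"  # assumed key layout
    info = cache.get(key)
    if info is None:
        info = fetch_from_rapi(cluster, name)
        cache.set(key, info)
    return info


# Usage with a fake RAPI call that records how often it is hit.
calls = []


def fake_rapi(cluster: str, name: str) -> dict:
    calls.append((cluster, name))
    return {"name": name, "status": "running"}


cache = TTLCache()
first = get_instance(cache, fake_rapi, "cluster1", "vm1.example.org")
second = get_instance(cache, fake_rapi, "cluster1", "vm1.example.org")
```

The second lookup is served from the cache, so the cluster is queried only once until the key expires or is deleted.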

Page 14: ganetimgr A platform to simplify management of Ganeti instances

Key Components

Cache (redis/memcache):
• Cluster instances
• User instances
• Instance locks

Beanstalk:
• Put/Get jobs for every instance action
• Async notifications

Clear cluster/user cache on job completion via watcher

Usage of instance tags to determine user rights
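The idea of deriving user rights from instance tags can be sketched as follows. The slides do not give the tag format, so the `example:user:` and `example:group:` prefixes, the function names and the sample data are assumptions made purely for illustration.

```python
from typing import Dict, Iterable, List

# Hypothetical tag prefixes; the real format used by ganetimgr is not
# stated in the slides.
USER_TAG_PREFIX = "example:user:"
GROUP_TAG_PREFIX = "example:group:"


def user_can_access(tags: Iterable[str], username: str,
                    groups: Iterable[str] = ()) -> bool:
    """True if any instance tag names the user or one of the user's groups."""
    wanted = {USER_TAG_PREFIX + username}
    wanted.update(GROUP_TAG_PREFIX + group for group in groups)
    return any(tag in wanted for tag in tags)


def instances_for_user(instances: List[Dict], username: str,
                       groups: Iterable[str] = ()) -> List[str]:
    """Build the 'user instance listing' from instance tags alone."""
    groups = list(groups)  # allow generators to be reused across instances
    return [inst["name"] for inst in instances
            if user_can_access(inst.get("tags", []), username, groups)]


# Sample data shaped like a simplified instance list (assumed structure).
instances = [
    {"name": "vm1", "tags": ["example:user:alice"]},
    {"name": "vm2", "tags": ["example:group:physics"]},
    {"name": "vm3", "tags": []},
]
visible = instances_for_user(instances, "alice", groups=["physics"])
```

Keeping ownership in tags means the information lives on the Ganeti side, which fits the goal of storing no instance info in the DB.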

Page 15: ganetimgr A platform to simplify management of Ganeti instances

Instance Lifecycle

• New user registration/Login
• Apply for a new instance
• Mail sent to admins for instance creation approval/rejection
• Approve/Create Instance (admin chooses resources)
• Instance appears in ‘user instance listing’
• Instance View: Info, Stats, Actions (Start, Shutdown, connect via Console, Reboot, Rename, Destroy, Reinstall)

Add SSH key on first login to web page

Administrators can perform every action (BOFH mode)

Helpdesk View: can view all instances but can perform no action

Page 16: ganetimgr A platform to simplify management of Ganeti instances

WorkFlow Example

Action: Shutdown

UI:
• Cache: clear user instances
• Cache: delete instance key
• Ganeti RAPI: Send ShutdownInstance, Get JobID
• Cache: set instance lock
• Beanstalk: put JobId, instance details
• Instance View page: Poll instance (fetch from cache or RAPI, if not in cache)
• Status updated via Ajax (Instance View page)
• If instance has ‘locked’ set in cache then disable further actions (user cannot perform actions while shutting down)

Watcher:
• Beanstalk: Get JobId
• Poll cluster with increasing intervals
• Job Ends
• Cache: del instance lock
• Cache: clear cluster instances
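The shutdown workflow can be sketched end to end with in-memory stand-ins. This is an illustration under stated assumptions, not ganetimgr's implementation: a plain dict plays the cache, `queue.Queue` plays beanstalk, and `FakeRapi`, the cache key names and the backoff parameters are all invented for the example.

```python
import queue
from typing import Callable, Dict, Iterator


class FakeRapi:
    """Stand-in for the Ganeti RAPI client; a job succeeds after a few polls."""

    def __init__(self, polls_until_done: int = 3) -> None:
        self._polls_left: Dict[int, int] = {}
        self._next_id = 1
        self._polls_until_done = polls_until_done

    def shutdown_instance(self, name: str) -> int:
        job_id = self._next_id
        self._next_id += 1
        self._polls_left[job_id] = self._polls_until_done
        return job_id

    def job_status(self, job_id: int) -> str:
        self._polls_left[job_id] -= 1
        return "success" if self._polls_left[job_id] <= 0 else "running"


def request_shutdown(cache: dict, jobs: queue.Queue, rapi: FakeRapi,
                     cluster: str, user: str, instance: str) -> int:
    """UI side: invalidate caches, submit the job, lock the instance, enqueue."""
    cache.pop(f"user:{user}:instances", None)         # clear user instances
    cache.pop(f"instance:{cluster}:{instance}", None)  # delete instance key
    job_id = rapi.shutdown_instance(instance)          # send shutdown, get job id
    cache[f"lock:{cluster}:{instance}"] = job_id       # set instance lock
    jobs.put({"job_id": job_id, "cluster": cluster, "instance": instance})
    return job_id


def backoff(start: float = 1.0, factor: float = 2.0,
            maximum: float = 30.0) -> Iterator[float]:
    """Yield increasing poll intervals, capped at `maximum` seconds."""
    delay = start
    while True:
        yield delay
        delay = min(delay * factor, maximum)


def watch_one_job(cache: dict, jobs: queue.Queue, rapi: FakeRapi,
                  sleep: Callable[[float], None]) -> int:
    """Watcher side: poll until the job ends, then unlock and clear caches."""
    job = jobs.get_nowait()
    delays = backoff()
    while rapi.job_status(job["job_id"]) == "running":
        sleep(next(delays))
    cache.pop(f"lock:{job['cluster']}:{job['instance']}", None)  # del lock
    cache.pop(f"cluster:{job['cluster']}:instances", None)       # clear cluster
    return job["job_id"]


cache = {"user:alice:instances": ["vm1"], "cluster:c1:instances": ["vm1"]}
jobs: queue.Queue = queue.Queue()
rapi = FakeRapi()
waits = []  # records the requested sleeps instead of actually sleeping

job_id = request_shutdown(cache, jobs, rapi, "c1", "alice", "vm1")
locked_during_job = "lock:c1:vm1" in cache  # the UI would disable actions here
watch_one_job(cache, jobs, rapi, sleep=waits.append)
```

While the lock key is present the UI can refuse further actions on the instance. A real watcher would also have to handle failed or cancelled jobs, which this sketch leaves out.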

Page 17: ganetimgr A platform to simplify management of Ganeti instances

ViMa usage

Page 18: ganetimgr A platform to simplify management of Ganeti instances

ViMa usage

We’ve got ~1100 VMs on it, and we’ve tried adding clusters of more than 1000 VMs. No slowdown observed.

Page 19: ganetimgr A platform to simplify management of Ganeti instances


Features
• Support
  • Ganeti versions 2.4, 2.5, 2.6 (2.7+ still untested)
• Multi-cluster instance listing
• Notifications
• Statistics
• User Actions
• Blah blah blah...

Yay!! Demo Time!!

Page 20: ganetimgr A platform to simplify management of Ganeti instances


Future Development
• Instance Network Lockdown (almost done)
• Improve search, add filters (Q3 2013)
• NoVNC (Q3 2013)
• Selectable CDROM images to boot from (Q3 2013)
• Ownership transfer (through tags) (Q3 2013)
• Selectable OS on reinstall (Q4 2013)
• Add/Remove/Modify Instance NICs (Q4 2013)
• Batch actions on selected instances (Q4 2013)
• Resource quotas (Q1 2014)
• User editable VM resources following quotas (Q1 2014)
• Cluster classes based on storage backend, mem/cpu (Q1 2014)
• Custom KVM settings (Q1 2014)
• Multiple storage backends per cluster (Q2 2014)
• Admin view cluster status: cluster details & node status (Q2 2014)
• API access (Q2 2014)

Hopefully before Q3 2014

Page 21: ganetimgr A platform to simplify management of Ganeti instances


Desirable Features from Ganeti
• OS params in instance (RAPI)
  • Would help us remain stateless (needed: img_id)
• Tag add/delete hook
  • Would ease network lockdown (abuse reports)

Page 22: ganetimgr A platform to simplify management of Ganeti instances

The end

Questions?

Page 23: ganetimgr A platform to simplify management of Ganeti instances

Thank you (efcharistó)

Get the code / Report problems / Ask for features: https://code.grnet.gr/projects/ganetimgr

George Kargiotakis
https://void.gr/kargig/[email protected]

GRNET NOC