Advanced Auto-Scaling And Deployment Tools For...

© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Aharon Twizer, Co-Founder & CTO

Spotinst

Advanced Auto-Scaling And Deployment Tools For K8S


Agenda

• Spotinst Introduction
• From zero to a running K8S cluster
• Anatomy of k8s
• “Heterogeneous” or “Tetris” scaling
• Scaling K8S - the old school way
• The problems of old school scaling
• k8s autoscaler concepts and implementation


Spotinst Snapshot

Founded ~3 years ago

San Francisco, Tel Aviv, London, New York

$17M raised

Intel Capital

90+ Employees

2000+ Customers


Spotinst is a Virtual Cloud Infrastructure Provider which enables companies to run and manage mission-critical applications at a 60-80% discount on all types of Compute resources

Virtual Machines, Containers, Functions


~ 45% of Current Cloud Computing Capacity is Unused

What is ‘Spare Capacity’?

• No guarantee on availability
• Prices are 80-90% less


Spotinst uses predictive analytics, algorithms and historical data to identify and predict Spot Instances that are about to be “interrupted”.

Spotinst Technology


Spotinst Elastigroup


The More Efficient We Become In Consuming a Resource, The More Of That Resource We Consume

Spotinst Effect


From Zero To A Running Kubernetes Cluster


KUBERNETES, MASTERS, NODES, CONTROLLER, KUBELET, SCHEDULER, PROXY, API SERVERS, SERVICE, REPLICASET, NAMESPACE, POD, DEPLOYMENT


IT’S COMPLICATED

Too many buzzwords!


KOPS: Kubernetes Operations


What is KOPS?

• Like kubectl, but for clusters
  • $ kubectl get pods
  • $ kops get clusters
• Automates the provisioning
• Deploys Highly Available (HA) clusters
• Generates Terraform configuration
• Supports dry-run mode


What Can We Do With KOPS?

• CREATE production-grade clusters
  • kops create cluster
• MAINTAIN production-grade clusters
  • kops edit ig --name $NAME nodes
• UPGRADE production-grade clusters
  • kops upgrade cluster
• DESTROY existing clusters
  • kops delete cluster


Where Can We Use KOPS?

• Any major Cloud Provider
  • AWS
  • GCP
  • Azure
• Including Spotinst :)
  • Utilizing spot instances on top of AWS


OK, Now I Have a Cluster. How Do I Scale It According to Its Needs?


K8S Anatomy

[Diagram: an Ingress Controller and the Kubernetes Scheduler alongside three Nodes; each Node runs two Pods, and each Pod holds a Container.]


2 Layers Heterogeneous Autoscaling

[Diagram: a distributed cluster drawn as two layers, Containers on top of Infrastructure (c3.large, c3.2xlarge, m3.med). Labels: Service auto scaling, Infrastructure auto scaling.]


How do you scale (Infrastructure Layer) today?

• When total Memory / CPU Reservation & Utilization meet a certain threshold
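The threshold approach can be sketched in a few lines of Python. This is only an illustration: the `Node` fields, the 80% default threshold, and the sample numbers are assumptions made for the example, not values from the talk.

```python
from dataclasses import dataclass


@dataclass
class Node:
    cpu_capacity: float  # vCPUs
    cpu_reserved: float  # vCPUs reserved by running containers
    mem_capacity: float  # MiB
    mem_reserved: float  # MiB reserved by running containers


def cluster_reservation(nodes):
    """Return (cpu_fraction, mem_fraction) reserved across the whole cluster."""
    total_cpu = sum(n.cpu_capacity for n in nodes)
    total_mem = sum(n.mem_capacity for n in nodes)
    cpu = sum(n.cpu_reserved for n in nodes) / total_cpu
    mem = sum(n.mem_reserved for n in nodes) / total_mem
    return cpu, mem


def should_scale_up(nodes, threshold=0.8):
    """Old-school rule: add capacity once either total crosses the threshold."""
    cpu, mem = cluster_reservation(nodes)
    return cpu >= threshold or mem >= threshold


# Hypothetical two-node cluster: CPU is 85% reserved, memory about 53%.
nodes = [Node(2, 1.5, 3840, 2048), Node(8, 7, 15360, 8192)]
print(should_scale_up(nodes))
```

Note that the rule only looks at cluster-wide totals; it knows nothing about which node the free capacity sits on.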


Scenario 1 - Deceiving Reservation

[Diagram: a distributed cluster with Containers on top of Infrastructure (c3.large, c3.2xlarge, m3.med). Legend: one block = 1 vCPU & 512 MB RAM.]
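One reading of this scenario is that cluster-wide totals can show enough free capacity while no single node can actually host the next container. The Python sketch below illustrates that reading; the free-capacity numbers and the pod size are invented for the example and are not taken from the slide.

```python
def aggregate_free(nodes):
    """Sum free CPU and memory over all nodes, as a threshold policy would."""
    return (
        sum(n["cpu_free"] for n in nodes),
        sum(n["mem_free"] for n in nodes),
    )


def fits_on_some_node(pod, nodes):
    """A pod must fit entirely on one node; free capacity cannot be pooled."""
    return any(
        n["cpu_free"] >= pod["cpu"] and n["mem_free"] >= pod["mem"]
        for n in nodes
    )


# Hypothetical leftovers per node (vCPUs, MB).
nodes = [
    {"name": "c3.large", "cpu_free": 1, "mem_free": 512},
    {"name": "c3.2xlarge", "cpu_free": 1, "mem_free": 1024},
    {"name": "m3.medium", "cpu_free": 0, "mem_free": 512},
]
pod = {"cpu": 2, "mem": 1024}

free_cpu, free_mem = aggregate_free(nodes)
looks_schedulable = free_cpu >= pod["cpu"] and free_mem >= pod["mem"]
actually_schedulable = fits_on_some_node(pod, nodes)
print(looks_schedulable, actually_schedulable)
```

The totals say the pod fits, yet the per-node check fails, so a reservation threshold would not trigger a scale-up while the pod stays pending.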


Scenario 2 - Wrong Instance Type

[Diagram: a distributed cluster with Containers on top of Infrastructure (c3.large, c3.2xlarge), and an m3.med instance shown separately.]


Scenario 3 - Wrong Scale Down

[Diagram: a distributed cluster with Containers on top of Infrastructure (m3.med, m3.med, c3.2xlarge).]


k8s auto-scaler

• No scaling policies required
• Scale according to cluster needs (events)
• Fast scaling - don’t wait for thresholds, satisfy your tasks’ needs
• Scale down when instances are fragmented (But!)
• Keep headroom for incoming pods
  • Headroom is not reservation, but rather units of work


Catching the right events

No nodes are available that match all of the following predicates:: Insufficient memory, PodToleratesNodeTaints.

No nodes are available that match all of the following predicates:: Insufficient cpu, PodToleratesNodeTaints.

• Insufficient Memory

• Insufficient CPU
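A minimal Python sketch of how such messages could be classified. It assumes the event message has already been fetched as a plain string (for example from the pod's events) and only looks for the two reasons listed above; the function name and the regular expressions are illustrative, not part of any Kubernetes or Spotinst API.

```python
import re

# Matches the scheduler failure text shown above and captures the reasons.
PREDICATE_FAILURE = re.compile(
    r"No nodes are available that match all of the following predicates::\s*(.*)"
)


def insufficient_resources(message):
    """Return the set of resources ('cpu', 'memory') reported as insufficient."""
    match = PREDICATE_FAILURE.search(message)
    if not match:
        return set()
    reasons = match.group(1)
    found = re.findall(r"Insufficient (cpu|memory)", reasons, flags=re.IGNORECASE)
    return {resource.lower() for resource in found}


msg = (
    "No nodes are available that match all of the following predicates:: "
    "Insufficient memory, PodToleratesNodeTaints."
)
print(insufficient_resources(msg))
```

Any other message yields an empty set, so unrelated events do not trigger scaling.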


Scale Up - Logic

According to the pending Pods, sum the required amount of resources - CPU and memory (e.g. 10 cores, 28 GB RAM)

Determine the instance type that can handle the most resource-consuming task
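The two steps can be sketched in Python as follows. The `Resources` type, the function names, and the three-entry catalogue are assumptions for illustration; the catalogue sizes are approximate and only meant to make the example concrete. "Most resource-consuming" is interpreted here as the per-dimension maximum over the pending pods.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cpu: float  # cores
    mem: float  # GB


# Hypothetical catalogue of instance sizes (illustrative values).
CATALOGUE = {
    "m3.medium": Resources(1, 3.75),
    "c3.large": Resources(2, 3.75),
    "c3.2xlarge": Resources(8, 15),
}


def total_demand(pending):
    """Step 1: sum the CPU and memory requested by all pending pods."""
    return Resources(sum(p.cpu for p in pending), sum(p.mem for p in pending))


def largest_request(pending):
    """Largest CPU and largest memory request among the pending pods."""
    return Resources(max(p.cpu for p in pending), max(p.mem for p in pending))


def candidate_types(pending, catalogue):
    """Step 2: instance types big enough for the most demanding pod."""
    need = largest_request(pending)
    return sorted(
        name
        for name, cap in catalogue.items()
        if cap.cpu >= need.cpu and cap.mem >= need.mem
    )


pending = [Resources(4, 8), Resources(4, 12), Resources(2, 8)]
print(total_demand(pending))
print(candidate_types(pending, CATALOGUE))
```

With these pending pods the total demand is 10 cores and 28 GB RAM, and only the largest catalogue entry can host the biggest pod.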


Scale down - Defragmentation

• Look for idle instances - CPU & memory below 40%
  • 3 consecutive periods of 1 min
• Make sure the pods running on this instance can run on other instances
  • CPU
  • RAM
  • All pods’ ports are available on other instances
• Drain the instance
• Wait for the pods to be rescheduled on other instances
• Terminate the instance
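The checks above can be sketched in Python. The data shapes (dictionaries with `samples`, `pods`, `cpu_free`, `mem_free`, `used_ports`) and the first-fit placement are assumptions made for this example; the talk does not specify how the relocation check is implemented.

```python
IDLE_THRESHOLD = 0.40  # CPU and memory must both be below 40%
IDLE_PERIODS = 3       # for 3 consecutive 1-minute periods


def is_idle(samples):
    """samples: list of (cpu_util, mem_util) per 1-minute period, oldest first."""
    recent = samples[-IDLE_PERIODS:]
    return len(recent) == IDLE_PERIODS and all(
        cpu < IDLE_THRESHOLD and mem < IDLE_THRESHOLD for cpu, mem in recent
    )


def can_relocate(pods, others):
    """First-fit check: every pod needs CPU, RAM and free ports on another node."""
    free = [
        {"cpu": o["cpu_free"], "mem": o["mem_free"], "ports": set(o["used_ports"])}
        for o in others
    ]
    for pod in sorted(pods, key=lambda p: (p["cpu"], p["mem"]), reverse=True):
        for node in free:
            ports = set(pod["ports"])
            if (node["cpu"] >= pod["cpu"] and node["mem"] >= pod["mem"]
                    and not node["ports"] & ports):
                node["cpu"] -= pod["cpu"]
                node["mem"] -= pod["mem"]
                node["ports"] |= ports
                break
        else:
            return False  # this pod fits nowhere else
    return True


def scale_down_plan(node, others):
    """Return the ordered actions for a node, or [] if it must stay."""
    if is_idle(node["samples"]) and can_relocate(node["pods"], others):
        return ["drain", "wait_for_reschedule", "terminate"]
    return []


node = {
    "samples": [(0.5, 0.3), (0.2, 0.3), (0.1, 0.2), (0.3, 0.1)],
    "pods": [{"cpu": 1, "mem": 512, "ports": [80]}],
}
others = [
    {"cpu_free": 2, "mem_free": 1024, "used_ports": [80]},
    {"cpu_free": 1, "mem_free": 512, "used_ports": []},
]
print(scale_down_plan(node, others))
```

In the sample data the first candidate node is rejected because port 80 is already taken, and the pod lands on the second one.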




Headroom

• A buffer of spare capacity that makes sure that when we want to scale more tasks, we don't have to wait for new instances.
• Headroom is defined as follows:
  • Unit: CPU and RAM.
  • Number of units.
• Automatic headroom - detect the most scheduled task and set the headroom to that task’s resources.
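Automatic headroom could be approximated as below. This is a sketch under the assumption that "most scheduled task" means the CPU/RAM shape that appears most often among recently scheduled tasks; the field names `cpu_units` and `mem_mb` are invented for the example.

```python
from collections import Counter


def automatic_headroom_unit(scheduled_tasks):
    """Return the (cpu_units, mem_mb) shape scheduled most often, or None."""
    if not scheduled_tasks:
        return None
    counts = Counter((t["cpu_units"], t["mem_mb"]) for t in scheduled_tasks)
    shape, _count = counts.most_common(1)[0]
    return shape


history = [
    {"cpu_units": 1024, "mem_mb": 512},
    {"cpu_units": 256, "mem_mb": 128},
    {"cpu_units": 1024, "mem_mb": 512},
    {"cpu_units": 1024, "mem_mb": 512},
]
print(automatic_headroom_unit(history))
```

The returned shape would then serve as the headroom unit, with the number of units chosen separately.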


Headroom - Example

• Unit: 1024 CPU units & 512 MB RAM
• Requested units: 6

[Diagram: a distributed cluster with Containers on top of Infrastructure (m3.med, m3.med, c3.2xlarge), plus an additional m3.med.]
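The arithmetic behind this example can be checked with a short Python sketch. Because headroom is counted in units of work, each unit has to fit whole on a single node, so the free units are counted per node rather than from cluster totals. The unit size and the requested count come from the slide; the per-node free capacity is made up for illustration.

```python
UNIT_CPU = 1024      # CPU units per headroom unit (from the slide)
UNIT_MEM = 512       # MB RAM per headroom unit (from the slide)
REQUESTED_UNITS = 6  # requested units (from the slide)


def total_headroom(units, unit_cpu, unit_mem):
    """Total capacity the requested headroom represents."""
    return units * unit_cpu, units * unit_mem


def available_units(nodes, unit_cpu, unit_mem):
    """Count whole units that fit on each node, then add them up."""
    return sum(
        min(n["cpu_free"] // unit_cpu, n["mem_free"] // unit_mem) for n in nodes
    )


# Hypothetical free capacity per node (CPU units, MB).
nodes = [
    {"name": "m3.medium-a", "cpu_free": 0, "mem_free": 1024},
    {"name": "m3.medium-b", "cpu_free": 1024, "mem_free": 2048},
    {"name": "c3.2xlarge", "cpu_free": 4096, "mem_free": 8192},
]

have = available_units(nodes, UNIT_CPU, UNIT_MEM)
missing = max(0, REQUESTED_UNITS - have)
print(total_headroom(REQUESTED_UNITS, UNIT_CPU, UNIT_MEM), have, missing)
```

Six units amount to 6144 CPU units and 3072 MB RAM in total. With the invented numbers above, five units are already free, so capacity for one more unit would have to be added.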


Utilized Cluster


Thank You

[email protected]
