Advanced Auto-Scaling And Deployment Tools For...
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Aharon Twizer, Co-Founder & CTO
Spotinst
Advanced Auto-Scaling And Deployment Tools For K8S
• Spotinst Introduction
• From zero to a running K8S cluster
• Anatomy of K8S
• “Heterogeneous” or “Tetris” scaling
• Scaling K8S - the old school way
• The problems of old school scaling
• K8S autoscaler concepts and implementation
Agenda
Spotinst Snapshot
• Founded ~3 years ago
• San Francisco, Tel Aviv, London, New York
• $17M raised
• Intel Capital
• 90+ employees
• 2000+ customers
Spotinst is a Virtual Cloud Infrastructure Provider that enables companies to run and manage mission-critical applications
at a 60-80% discount on all types of compute resources:
Virtual Machines, Containers, Functions
~ 45% of Current Cloud Computing Capacity is Unused
What is ‘Spare Capacity’?
• No guarantee on availability
• Prices are 80-90% less
Spotinst uses predictive analytics, algorithms and historical data to identify and predict Spot Instances that are about to be “interrupted”.
Spotinst Technology
Spotinst Elastigroup
The More Efficient We Become In Consuming a Resource, The More Of That Resource We Consume
Spotinst Effect
From Zero To A Running Kubernetes Cluster
[Word cloud: KUBERNETES, MASTERS, NODES, CONTROLLER, KUBELET, SCHEDULER, PROXY, API SERVERS, SERVICE, REPLICASET, NAMESPACE, POD, DEPLOYMENT]
IT’S COMPLICATED
Too many buzzwords!
KOPS: Kubernetes Operations
• Like kubectl, just for clusters
  • $ kubectl get pods
  • $ kops get clusters
• Automates the provisioning
• Deploys Highly Available (HA) clusters
• Generates Terraform configuration
• Supports dry-run mode
What is KOPS?
• CREATE production-grade clusters
• kops create cluster
• MAINTAIN production-grade clusters
• kops edit ig --name $NAME nodes
• UPGRADE production-grade clusters
• kops upgrade cluster
• DESTROY existing clusters
• kops delete cluster
What Can We Do With KOPS?
• Any major Cloud Provider
• AWS
• GCP
• Azure
• Including Spotinst :)
• Utilizing spot instances on top of AWS
Where Can We Use KOPS?
Ok, Now I Have a Cluster. How Do I Scale it according to its needs?
K8S Anatomy
[Diagram: an Ingress Controller and the Kubernetes Scheduler, with three Nodes; each Node runs two Pods, and each Pod holds a Container]
2-Layer Heterogeneous Autoscaling
[Diagram: containers running on a distributed cluster of c3.large, c3.2xlarge and m3.med instances]
• Containers layer: service auto scaling
• Infrastructure layer: infrastructure auto scaling
• When total Memory / CPU Reservation & Utilization meet a certain threshold
How do you scale (Infrastructure Layer) today?
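As a rough illustration of the threshold approach above, the sketch below sums CPU and memory reservations across the cluster and scales up only when the aggregate crosses a threshold. The `Node` type, the 80% threshold and the node sizes are assumptions made up for this example, not values from the talk. It also shows one way an aggregate number can mislead: the cluster sits below the threshold, yet no single node has room for a 1 vCPU / 512 MB container.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_capacity: float  # vCPUs
    mem_capacity: int    # MB
    cpu_reserved: float  # vCPUs already reserved by containers
    mem_reserved: int    # MB already reserved by containers


def cluster_reservation(nodes):
    """Return (cpu_fraction, mem_fraction) reserved across the whole cluster."""
    cpu = sum(n.cpu_reserved for n in nodes) / sum(n.cpu_capacity for n in nodes)
    mem = sum(n.mem_reserved for n in nodes) / sum(n.mem_capacity for n in nodes)
    return cpu, mem


def should_scale_up(nodes, threshold=0.8):
    """Threshold rule: scale up when aggregate CPU or memory reservation is high."""
    cpu, mem = cluster_reservation(nodes)
    return cpu >= threshold or mem >= threshold


def fits_somewhere(nodes, cpu, mem):
    """True if at least one node has enough free CPU and memory for the request."""
    return any(
        n.cpu_capacity - n.cpu_reserved >= cpu
        and n.mem_capacity - n.mem_reserved >= mem
        for n in nodes
    )


# Hypothetical cluster: every node has only 0.5 vCPU free.
nodes = [
    Node("node-a", 2, 4096, 1.5, 1024),
    Node("node-b", 2, 4096, 1.5, 1024),
    Node("node-c", 2, 4096, 1.5, 1024),
]

print(cluster_reservation(nodes))       # 75% CPU and 25% memory reserved
print(should_scale_up(nodes))           # below the 80% threshold, so no scale-up
print(fits_somewhere(nodes, 1, 512))    # yet a 1 vCPU / 512 MB container fits nowhere
```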
Scenario 1 - Deceiving Reservation
[Diagram: containers on a distributed cluster of c3.large, c3.2xlarge and m3.med instances; legend: one container block = 1 vCPU & 512 MB RAM]
Scenario 2 - Wrong Instance Type
[Diagram: containers on a distributed cluster of c3.large, c3.2xlarge and m3.med instances; legend: one container block = 1 vCPU & 512 MB RAM]
Scenario 3 - Wrong Scale Down
[Diagram: containers on a distributed cluster of two m3.med instances and one c3.2xlarge instance; legend: one container block = 1 vCPU & 512 MB RAM]
• No scaling policies required
• Scale according to cluster needs (events)
• Fast scaling - don’t wait for thresholds, satisfy your tasks’ needs
• Scale down when instances are fragmented (But!)
• Keep headroom for incoming pods
  • Headroom is not reservation, but rather units of work
k8s auto-scaler
Catching the right events
No nodes are available that match all of the following predicates:: Insufficient memory, PodToleratesNodeTaints.
No nodes are available that match all of the following predicates:: Insufficient cpu, PodToleratesNodeTaints.
• Insufficient Memory
• Insufficient CPU
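A minimal sketch of how such messages might be classified once they have been collected. How the events are fetched from the cluster is out of scope here; the function name and the regular expression are assumptions for illustration, and the sample strings are the two messages quoted above.

```python
import re

# Matches the "Insufficient cpu" / "Insufficient memory" reasons in a scheduler message.
INSUFFICIENT = re.compile(r"Insufficient (cpu|memory)", re.IGNORECASE)


def insufficient_resources(message):
    """Return the set of resources ('cpu', 'memory') reported as insufficient."""
    return {match.lower() for match in INSUFFICIENT.findall(message)}


memory_event = (
    "No nodes are available that match all of the following predicates:: "
    "Insufficient memory, PodToleratesNodeTaints."
)
cpu_event = (
    "No nodes are available that match all of the following predicates:: "
    "Insufficient cpu, PodToleratesNodeTaints."
)

print(insufficient_resources(memory_event))  # {'memory'}
print(insufficient_resources(cpu_event))     # {'cpu'}
```

A message that mentions neither reason yields an empty set, so it would not trigger a scale-up in this sketch.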
• According to the pending Pods, sum the required amount of resources - CPU and memory (e.g. 10 cores, 28 GB RAM)
• Determine the instance type that can handle the most resource-consuming task
Scale Up - Logic
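The two steps above can be sketched as follows. This is an illustration under stated assumptions, not the Spotinst implementation: the `Resources` type, the instance catalog and its sizes, and the pending pod list are all made up for the example, and "most resource-consuming" is interpreted here as the largest CPU request and the largest memory request taken separately.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cpu: float     # cores
    mem_gb: float  # GB of RAM


# Illustrative catalog; the sizes are assumptions for this example.
INSTANCE_TYPES = {
    "m3.medium": Resources(1, 3.75),
    "c3.large": Resources(2, 3.75),
    "c3.2xlarge": Resources(8, 15),
}


def total_required(pending):
    """Step 1: sum the CPU and memory requested by all pending pods."""
    return Resources(sum(p.cpu for p in pending), sum(p.mem_gb for p in pending))


def largest_request(pending):
    """Largest CPU and largest memory request among the pending pods."""
    return Resources(max(p.cpu for p in pending), max(p.mem_gb for p in pending))


def candidate_types(pending, catalog=INSTANCE_TYPES):
    """Step 2: instance types big enough to host the most demanding pod."""
    need = largest_request(pending)
    return [
        name
        for name, size in catalog.items()
        if size.cpu >= need.cpu and size.mem_gb >= need.mem_gb
    ]


# Hypothetical pending pods adding up to the slide's example of 10 cores, 28 GB RAM.
pending = [Resources(4, 12), Resources(2, 8), Resources(2, 4), Resources(2, 4)]

print(total_required(pending))   # Resources(cpu=10, mem_gb=28)
print(candidate_types(pending))  # ['c3.2xlarge']
```

How many instances of the chosen type to launch is a packing problem that this sketch deliberately leaves out.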
• Look for idle instances - CPU & memory below 40%
  • 3 consecutive periods of 1 min
• Make sure the pods running on this instance can run on other instances
  • CPU
  • RAM
  • All pods’ ports are available on other instances
• Drain the instance
• Wait for the pods to be rescheduled on other instances
• Terminate the instance
Scale down - Defragmentation
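The first two checks above can be sketched as below; draining and terminating the instance are left out. The `Pod` and `Node` types, the first-fit placement order and all numbers are assumptions for illustration, while the 40% threshold and the three 1-minute periods come from the list above.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    cpu: float                            # requested vCPUs
    mem: int                              # requested MB
    host_ports: frozenset = frozenset()   # ports the pod needs on its node


@dataclass
class Node:
    name: str
    free_cpu: float
    free_mem: int
    used_ports: set = field(default_factory=set)


def is_idle(samples, threshold=0.40, periods=3):
    """samples: (cpu_util, mem_util) per 1-minute period, newest last."""
    recent = samples[-periods:]
    return len(recent) == periods and all(
        cpu < threshold and mem < threshold for cpu, mem in recent
    )


def can_relocate(pods, others):
    """First-fit simulation: can every pod be placed on one of the other nodes?"""
    # Simulate on copies so the real node state is never modified.
    sim = [Node(n.name, n.free_cpu, n.free_mem, set(n.used_ports)) for n in others]
    for pod in sorted(pods, key=lambda p: (p.cpu, p.mem), reverse=True):
        for node in sim:
            if (
                node.free_cpu >= pod.cpu
                and node.free_mem >= pod.mem
                and not (pod.host_ports & node.used_ports)
            ):
                node.free_cpu -= pod.cpu
                node.free_mem -= pod.mem
                node.used_ports |= pod.host_ports
                break
        else:
            return False  # this pod fits on no other node
    return True


samples = [(0.35, 0.20), (0.30, 0.25), (0.10, 0.22)]
pods = [Pod(1, 512, frozenset({8080})), Pod(0.5, 256)]
others = [Node("node-b", 1.0, 1024, {8080}), Node("node-c", 2.0, 2048)]

print(is_idle(samples))                   # True
print(can_relocate(pods, others))         # True
print(can_relocate(pods, [others[0]]))    # False: port 8080 is taken on node-b
```

Only when both checks pass would the instance be drained, its pods rescheduled, and the instance terminated.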
• A buffer of spare capacity that ensures that when we want to scale out more tasks, we don't have to wait for new instances.
• Headroom is defined as follows:
  • Unit: CPU and RAM.
  • Number of units.
• Automatic headroom - detect the most frequently scheduled task and set the headroom unit to that task’s resources.
Headroom
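One possible reading of automatic headroom, sketched with a simple frequency count. The function name and the sample history are assumptions for illustration; the talk does not say how the most frequently scheduled task is actually detected.

```python
from collections import Counter


def automatic_headroom_unit(scheduled):
    """Return the (cpu_units, mem_mb) request seen most often.

    scheduled: iterable of (cpu_units, mem_mb) tuples, one per scheduled task.
    """
    counts = Counter(scheduled)
    unit, _ = counts.most_common(1)[0]
    return unit


# Hypothetical history of recently scheduled task requests.
history = [(1024, 512), (256, 128), (1024, 512), (512, 1024), (1024, 512)]

print(automatic_headroom_unit(history))  # (1024, 512)
```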
Headroom - Example
[Diagram: containers on a distributed cluster of m3.med, m3.med, c3.2xlarge and m3.med instances]
• Unit: 1024 CPU units & 512 MB RAM
• Requested units: 6
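With the figures above, six units amount to 6 × 1024 = 6144 CPU units and 6 × 512 = 3072 MB RAM. Because headroom is counted in whole units of work, each unit has to fit on a single instance. The sketch below counts how many whole units fit in each instance's free capacity; the free-capacity numbers and function names are assumptions for illustration.

```python
def units_that_fit(free_cpu, free_mem, unit_cpu=1024, unit_mem=512):
    """Whole headroom units that fit in one instance's free CPU units and MB."""
    return min(free_cpu // unit_cpu, free_mem // unit_mem)


def headroom_deficit(nodes_free, requested_units=6, unit_cpu=1024, unit_mem=512):
    """Units still missing after counting what fits on the existing instances."""
    available = sum(
        units_that_fit(cpu, mem, unit_cpu, unit_mem) for cpu, mem in nodes_free
    )
    return max(0, requested_units - available)


# Hypothetical free capacity per instance: (CPU units, MB RAM).
nodes_free = [(512, 2048), (1024, 1024), (4096, 3072)]

print([units_that_fit(cpu, mem) for cpu, mem in nodes_free])  # [0, 1, 4]
print(headroom_deficit(nodes_free))                           # 1
```

In this example five of the six requested units already fit, so capacity for one more unit would have to be added to satisfy the headroom.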
Utilized Cluster
Thank You