Anynines - Running Cloud Foundry for 12 months - An experience report

Post on 22-Nov-2014

anynines runs a public PaaS, based on Cloud Foundry, in a German datacenter. In more than 12 months of running a Cloud Foundry PaaS, many lessons have been learned about security, high availability, OpenStack, and other exciting topics. See how Bosh can be used and how it shouldn't be used, learn how to perform Cloud Foundry upgrades, and read how to harden Cloud Foundry by adding more fault tolerance with Pacemaker.

Running Cloud Foundry: An Experience Report

About this talk

• An opinionated look at running Cloud Foundry (CF)

• How to shoot yourself in the foot with CF over-commitment settings

• How to perform CF updates

• How to harden CF

• Wise words about CF services

Introduction

about.me/fischerjulian

Running a public Cloud Foundry for more than a year.

It works.

In order to run Cloud Foundry smoothly …

… refer to the package leaflet for risks and side effects and consult Pivotal, CloudCredo, or anynines.

The details

The anynines Stack

Hardware

OpenStack

Cloud Foundry

VMware

We migrated from rented VMware to self-hosted OpenStack.

For more details on this: http://rh.gd/a9vmw2sos

Proof point made…

Cloud Foundry protects investments in software development by being infrastructure-agnostic.

Running Cloud Foundry: What happened

Security Issues

• Pivotal informs partners early about issues

• Usually along with fixes

OpenStack Issues

• Ext4 vs. Ext3

• DEA MTU

• rsyslogd command not found

CF Gotchas

DEA evacuation & Bosh timeout race condition

• Removing a DEA → apps will be evacuated → the DEA will be stopped

• Bosh deployment will fail when evacuation takes longer than the Bosh timeout

• Set your Bosh timeout accordingly!
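The timeout in question lives in the deployment manifest's update block. A sketch of what that could look like (job names omitted; all values are illustrative, times are millisecond ranges):

```yaml
# Bosh deployment manifest "update" block (sketch).
# The upper bound of the watch-time ranges must exceed the
# worst-case DEA evacuation time, or the deployment fails mid-evacuation.
update:
  canaries: 1
  max_in_flight: 1
  canary_watch_time: 30000-600000   # 30 s to 10 min
  update_watch_time: 30000-600000
```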

DEA over-commitment

Default over-commitment factor = 4

RAM peaks may cause random errors

• Failures during staging

• Random application crashes

• No meaningful log information

Reducing over-commitment

• Native strategy

• Reduce over-commitment factor

• Bosh deploy

• 8 GB VM, OC factor 4 → Announces 32 GB (V)RAM

• 8 GB VM, OC factor 2 → Announces 16 GB (V)RAM

• When evacuating a 32 GB (V)RAM host, another 32 GB (V)RAM host will be preferred (more free space)
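The announced-capacity arithmetic behind these numbers is simple enough to sketch:

```python
def announced_ram_mb(physical_mb, oc_factor):
    """RAM a DEA advertises to the placement logic:
    physical RAM multiplied by the over-commitment factor."""
    return physical_mb * oc_factor

# The two cases from above:
print(announced_ram_mb(8 * 1024, 4))  # 32768 -> announces "32 GB"
print(announced_ram_mb(8 * 1024, 2))  # 16384 -> announces "16 GB"
```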

Evacuation Wave

Four 1 GB apps evacuated in one wave = maximum impact on running apps!

New DEAs (OC 2) will receive apps when old DEAs (OC 4) have been stopped.
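Why the wave hits the remaining heavily over-committed hosts first can be seen in a simplified model of the placement preference (not the actual Cloud Controller code, just the "most free announced RAM wins" idea):

```python
def pick_dea(deas, app_mb):
    """Simplified DEA placement model: among DEAs with enough free
    announced RAM, prefer the one with the most free space."""
    fitting = [d for d in deas if d["free_mb"] >= app_mb]
    return max(fitting, key=lambda d: d["free_mb"])

deas = [
    {"name": "old-oc4", "free_mb": 20000},  # announces 32 GB (factor 4)
    {"name": "new-oc2", "free_mb": 9000},   # announces 16 GB (factor 2)
]
# Evacuated 1 GB apps all land on the old, more over-committed DEA:
print(pick_dea(deas, 1024)["name"])  # old-oc4
```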

Hints

• Create 2nd resource pool for new DEAs

• Deploy and start the 2nd resource pool before stopping the old DEAs

• (-) Needs more resources

• (+) Smoother transition
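Running old and new DEAs side by side amounts to a second resource pool in the deployment manifest; a hedged sketch (pool names, stemcell, and instance types are placeholders):

```yaml
# Illustrative Bosh v1 manifest fragment: two DEA pools side by side.
resource_pools:
- name: dea            # existing DEAs, over-commitment factor 4
  network: cf
  stemcell: {name: bosh-openstack-kvm-ubuntu, version: latest}
  cloud_properties: {instance_type: m1.xlarge}
- name: dea-oc2        # second pool for the re-deployed DEAs, factor 2
  network: cf
  stemcell: {name: bosh-openstack-kvm-ubuntu, version: latest}
  cloud_properties: {instance_type: m1.xlarge}
```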

Updating Cloud Foundry

Required: Staging System

• Structurally identical

• Fewer VMs

1. Determine new features since the last release

2. Study deployment manifest changes

3. Apply deployment manifest changes

4. First staging attempt

5. Debug and fix it!

6. Simulate the live upgrade

7. Schedule maintenance on status.anynines.com

8. Perform the upgrade and cross fingers.
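Steps 2 and 3 are mostly a structural diff between the old and new deployment manifests. A small sketch of that idea, operating on manifests already parsed into dicts (the example values are made up):

```python
def manifest_changes(old, new, path=""):
    """List keys added, removed, or changed between two parsed
    deployment manifests."""
    changes = []
    for key in sorted(set(old) | set(new)):
        here = f"{path}/{key}"
        if key not in old:
            changes.append(f"added   {here}")
        elif key not in new:
            changes.append(f"removed {here}")
        elif isinstance(old[key], dict) and isinstance(new[key], dict):
            changes.extend(manifest_changes(old[key], new[key], here))
        elif old[key] != new[key]:
            changes.append(f"changed {here}")
    return changes

old = {"properties": {"dea": {"memory_mb": 32768}}}
new = {"properties": {"dea": {"memory_mb": 16384, "stacks": ["lucid64"]}}}
print(manifest_changes(old, new))
# ['changed /properties/dea/memory_mb', 'added   /properties/dea/stacks']
```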

CF Hardening

Accept that VMs are ephemeral

VM Failover Strategies

Resurrect

• Monitor VM

• Re-Build VMs automatically

• e.g. using Cloud Foundry Bosh

• + Easy

• - Takes long (minutes, not seconds)

• - OpenStack doesn’t release persistent disks automatically
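Stripped of the IaaS plumbing, resurrection is just a monitoring sweep that decides which VMs to re-build (the actual re-build through the IaaS is what takes minutes, not seconds):

```python
def resurrection_pass(vms, is_healthy):
    """One monitoring sweep: return the VMs that must be re-built.
    Hypothetical helper names; the real work happens in the IaaS calls."""
    return [vm for vm in vms if not is_healthy(vm)]

vms = ["uaa/0", "cc/0", "dea/3"]
print(resurrection_pass(vms, lambda vm: vm != "dea/3"))  # ['dea/3']
```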

Failover to Standby VM

Distribute CF components across availability zones

• Build disjoint networks, racks, etc.

• Each disjoint zone = availability zone

• Tell your IaaS about availability zones

• On provision choose the AZ

• Build Bosh releases accordingly

• Provide stand-by VM

• Monitor VM and perform failover

• IP failover using Pacemaker

• + Fast failover (seconds)

• - Pacemaker not easy to use (& boshify)

• - Increased resource usage by standby VM(s)

• 2 * UAA

• 2 * CC

• 2 * n * DEAs

• 2 * Health Manager

• …
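The Pacemaker side of such an IP failover can be sketched in crm-shell syntax. Resource names, the address, and the "uaa" LSB init script below are assumptions, not the actual anynines configuration:

```text
# crm configure sketch: a virtual IP that follows the UAA process.
primitive cf_vip ocf:heartbeat:IPaddr2 \
    params ip=10.0.1.10 cidr_netmask=24 \
    op monitor interval=10s
primitive cf_uaa lsb:uaa \
    op monitor interval=15s
colocation uaa_with_vip inf: cf_uaa cf_vip
order vip_before_uaa inf: cf_vip cf_uaa
```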

UAA & CC DB = SPOF

HA Postgres

• UAA and Cloud Controller database

• Single point of failure for Cloud Foundry

• Postgres is not inherently clusterable → failover with a standby VM

• Master/slave replication

• Pacemaker/Corosync

• IP-Failover using NIC-reattachment
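The master/slave replication part boils down to a handful of PostgreSQL settings (9.x-era streaming replication; host, user, and values are illustrative):

```text
# Master postgresql.conf
wal_level = hot_standby
max_wal_senders = 3
wal_keep_segments = 64

# Standby recovery.conf
standby_mode = 'on'
primary_conninfo = 'host=10.0.1.11 port=5432 user=replicator'
```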

That’s half way towards a PostgreSQL CF Service

• Add a V2 Service Broker

• Add a provisioning logic

• Provision a 2-node DB cluster on "cf create-service postgres medium-cluster"
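The broker's side of that command starts with the V2 catalog endpoint. A minimal sketch of what GET /v2/catalog could return; the IDs and descriptions are made up, only the "medium-cluster" plan name comes from the slide:

```python
def catalog():
    """Response body a minimal V2 service broker returns on GET /v2/catalog."""
    return {
        "services": [{
            "id": "a9-postgresql",                    # illustrative ID
            "name": "postgres",
            "description": "Dedicated PostgreSQL clusters",
            "bindable": True,
            "plans": [{
                "id": "a9-postgresql-medium-cluster",  # illustrative ID
                "name": "medium-cluster",
                "description": "2-node streaming-replication cluster",
            }],
        }]
    }

print(catalog()["services"][0]["plans"][0]["name"])  # medium-cluster
```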

Services

“The best way to find yourself is to lose yourself in the service of others.”

― Mahatma Gandhi

Wardenized services (community services) are cute for pet projects.

Not suitable for production.

• Implementations are outdated

• One size doesn’t fit all!

No production CF without high quality services.

CF Service Design

• Use clusterable services if possible

• Implement automatic failover if not

• Autoprovisioning using Bosh

• Organize self-healing

• (Semi-)Automatic recovery from degraded mode

Summary

• Bosh & the CF release are powerful, yet you can cut yourself.

• HA services are essential.

• CF is ready to be used in production.

Questions?

Thank you!