OpenStack & OpenContrail in Production

OpenStack & OpenContrail in Production: Best Practices from a SaaS leader

Transcript of OpenStack & OpenContrail in Production

Page 1: OpenStack & OpenContrail in Production

OpenStack & OpenContrail in Production: Best Practices from a SaaS leader

Page 2: OpenStack & OpenContrail in Production

Introductions

Edgar Magana, PhD
Workday, Inc.
Cloud Operations Architect
Twitter: @emaganap

- Member since April 2011 – Santa Clara Summit
- OpenStack Board of Directors
- OpenStack Foundation User Committee
- OpenContrail Advisory Group
- Former Neutron core member and founder (Quantum)
- Enthusiastic open-source developer!
- Futbol Club Barcelona fan!

Page 3: OpenStack & OpenContrail in Production

Outline

Operations Challenges

Architecture Overview

CI – Pipeline

CI – Environments

– Desktop

– Virtual Machines (OoO)

– Bare-Metal

HA Design

OpenContrail & Containers

Q & A

Page 4: OpenStack & OpenContrail in Production

OpenStack Logical Architecture

Three levels:
– API
– Messaging Bus
– Backend

Source: OpenStack Docs
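The three levels can be illustrated with a minimal Python sketch. It is not OpenStack code: an in-process `queue.Queue` stands in for the messaging bus (RabbitMQ in a real deployment), and the function names and the `action` field are hypothetical.

```python
import queue
import threading


def api_layer(bus: queue.Queue, request: dict) -> None:
    """API level: validate the request, then publish it on the bus."""
    if "action" not in request:
        raise ValueError("request needs an 'action' field")
    bus.put(request)


def backend_worker(bus: queue.Queue, results: list) -> None:
    """Backend level: consume messages until a None sentinel arrives."""
    while True:
        message = bus.get()
        if message is None:
            break
        results.append(f"done:{message['action']}")


def run_demo() -> list:
    """Wire the three levels together and process two requests."""
    bus: queue.Queue = queue.Queue()  # messaging bus level (stand-in)
    results: list = []
    worker = threading.Thread(target=backend_worker, args=(bus, results))
    worker.start()
    api_layer(bus, {"action": "boot-vm"})
    api_layer(bus, {"action": "create-network"})
    bus.put(None)  # tell the worker to stop
    worker.join()
    return results
```

The API never calls the backend directly; it only publishes to the bus, which is what lets each level scale and fail independently.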

Page 5: OpenStack & OpenContrail in Production

OpenStack Components

Source: OpenStack Docs – Training Guide

Page 6: OpenStack & OpenContrail in Production

OpenStack Reference Architecture

Source: OpenStack Docs – Networking Guide

Page 7: OpenStack & OpenContrail in Production

Let’s do it!

Page 8: OpenStack & OpenContrail in Production

Workday Production Requirements

Automation

Idempotent

Scalable

Secure
– SSL on End Points
– SSL on RabbitMQ
– SSL on MySQL
– IPTables

Stable

Production Readiness
– Logging
– Monitoring

Bonded physical interfaces per server

Multi-tenant

High Availability
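The "Idempotent" requirement can be sketched as follows: applying the same desired state twice must change nothing the second time. This is an illustration in Python rather than Chef, and the setting names are hypothetical.

```python
def ensure_setting(config: dict, key: str, value: str) -> bool:
    """Set config[key] only when it differs. Returns True if a change was made."""
    if config.get(key) == value:
        return False
    config[key] = value
    return True


def converge(config: dict, desired: dict) -> int:
    """Apply every desired setting and return how many changes were needed."""
    return sum(ensure_setting(config, key, value) for key, value in desired.items())


# Hypothetical desired state, loosely modelled on the "Secure" items above.
DESIRED = {"rabbitmq_ssl": "enabled", "mysql_ssl": "enabled"}
```

A first `converge({}, DESIRED)` reports two changes; a second run against the same dictionary reports zero, which is the property an automated deployment relies on.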

Page 9: OpenStack & OpenContrail in Production

Let me take one more look!

Source: OpenStack Docs – Training Guide

Page 10: OpenStack & OpenContrail in Production

Source: http://www.opentcpcloud.org/en/documentation/reference-architecture/

TCP Cloud – HA Reference Architecture

Page 11: OpenStack & OpenContrail in Production

Source: https://docs.mirantis.com/openstack/fuel/fuel-7.0/reference-architecture.html

Mirantis – HA Reference Architecture

Page 12: OpenStack & OpenContrail in Production

Source: Internet - Independent

Page 13: OpenStack & OpenContrail in Production

Let’s … Plan it better!

Page 14: OpenStack & OpenContrail in Production

OpenStack @ Workday - Now

Page 15: OpenStack & OpenContrail in Production

OpenStack @ Workday - Tomorrow

Page 16: OpenStack & OpenContrail in Production

CI/CD @ Workday for OpenStack

Page 17: OpenStack & OpenContrail in Production

How it started: Prototype #1

Controller Compute Tempest

Build once and reuse

SDN

NOTE: https://github.com/openstack/openstack-chef-repo

Rally

Page 18: OpenStack & OpenContrail in Production

Drivers for Containers

Lightweight – many containers on a single VM

Re-usable – Build once and deploy

Shareable – share common components (Chef server, Tempest, etc.)

Page 19: OpenStack & OpenContrail in Production

Chef Development Framework

Host (Fedora 20)

Virtual Machine

DNS

LDAP

Controller, SDN, Compute, Tempest

Docker Engine

Page 20: OpenStack & OpenContrail in Production

Iteration #1: With Neutron

Development Workflow

Page 21: OpenStack & OpenContrail in Production

Development Workflow

Iteration #2: (OpenContrail)

Page 22: OpenStack & OpenContrail in Production

Development Environment - Network Diagram

Page 23: OpenStack & OpenContrail in Production

Building CI/CD on OpenStack and OpenContrail

Page 24: OpenStack & OpenContrail in Production

OpenStack @ Workday Environments

CI on Virtual Machines (OoO)
– Reproducible, disposable test environment integrated with Workday’s Jenkins/Gerrit build pipeline
– Runs on OpenStack Icehouse
– Ruby Fog Library

Page 25: OpenStack & OpenContrail in Production

CI Environment on Virtual Machines Workflow

[Workflow diagram with steps numbered 1 to 6]

Page 26: OpenStack & OpenContrail in Production

OoO - CI Workflow

Jenkins, OpenStack Controller, Git repo, Chef

– Launch Chef Server
– Fetch Chef artifacts
– Create OS Controller, SDN, Compute, Tempest
– Run Chef Clients
– Run Tempest
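The OoO CI workflow is an ordered sequence of steps run against a disposable environment. A minimal sketch, assuming each step is a plain callable and that teardown must happen even when a step fails; the step names mirror the slide, but `run_ci` and `build_steps` are hypothetical helpers, not Workday's Jenkins job.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None]]

STEP_NAMES = [
    "launch_chef_server",
    "fetch_chef_artifacts",
    "create_vms",  # OS Controller, SDN, Compute, Tempest
    "run_chef_clients",
    "run_tempest",
]


def run_ci(steps: List[Step], teardown: Callable[[], None]) -> List[str]:
    """Run steps in order, stop at the first failure, always tear down."""
    completed: List[str] = []
    try:
        for name, action in steps:
            action()
            completed.append(name)
    except Exception as exc:
        print(f"CI stopped after {completed}: {exc}")
    finally:
        teardown()  # the environment is disposable by design
    return completed


def build_steps(log: List[str]) -> List[Step]:
    """Placeholder steps that only record their own name."""
    return [(name, lambda n=name: log.append(n)) for name in STEP_NAMES]
```

Returning the list of completed steps makes it easy to report which stage a failed build reached.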

Page 27: OpenStack & OpenContrail in Production

Road to production

Dev
• Build and test

Virtual
• Create Gerrit review
• VM CI passes

Bare Metal
• Promote cookbooks to production
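The road to production reads as a gated promotion: a change moves to the next environment only when the current gate passes. A small sketch under that assumption; the stage identifiers and the `promote` function are hypothetical.

```python
# Stage order follows the slide; "production" is added as the final target.
STAGES = ["dev", "virtual", "bare_metal", "production"]


def promote(stage: str, gate_passed: bool) -> str:
    """Return the next stage when the gate passed, otherwise stay in place."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage}")
    if not gate_passed:
        return stage
    index = STAGES.index(stage)
    return STAGES[min(index + 1, len(STAGES) - 1)]
```

For example, `promote("virtual", gate_passed=False)` keeps a change in the virtual environment until the VM CI passes.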

Page 28: OpenStack & OpenContrail in Production

Visibility

Nagios checks
– Over 200 in total
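A Nagios check is a small program that prints one status line and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN). The sketch below is not one of the Workday checks; it is a hypothetical TCP reachability check for an API endpoint, and the host and port in `main` are placeholders.

```python
import socket

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_tcp(host: str, port: int, timeout: float = 3.0) -> tuple:
    """Return (exit_code, status_line) for a TCP connect to host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return OK, f"OK - {host}:{port} is accepting connections"
    except OSError as exc:
        return CRITICAL, f"CRITICAL - {host}:{port} unreachable ({exc})"


def main(host: str = "127.0.0.1", port: int = 5000) -> int:
    """Print the status line and return the Nagios exit code."""
    code, message = check_tcp(host, port)
    print(message)
    return code
```

To use it as a plugin, a wrapper script would call `sys.exit(main(...))` so that Nagios reads the exit code.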

Page 29: OpenStack & OpenContrail in Production

Visibility

Wavefront integration
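The slide does not show how metrics reach Wavefront. As a hedged illustration, Wavefront accepts plain-text points of the form `<metric> <value> [<timestamp>] source=<source> [tags]`; the helper below only formats such a line, and the metric name, source, and tag are made up.

```python
import time
from typing import Optional


def wavefront_line(metric: str, value: float, source: str,
                   tags: Optional[dict] = None,
                   timestamp: Optional[int] = None) -> str:
    """Format one point in Wavefront's plain-text data format."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    parts = [metric, str(value), str(ts), f"source={source}"]
    # Point tags are key="value" pairs; sorting keeps the output stable.
    for key, val in sorted((tags or {}).items()):
        parts.append(f'{key}="{val}"')
    return " ".join(parts)
```

A call such as `wavefront_line("openstack.api.up", 1, "controller-1", {"env": "ci"})` yields a single line that could be written to a Wavefront proxy.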

Page 30: OpenStack & OpenContrail in Production

Workday Production Requirements

Automation

Idempotent

Scalable

Secure
– SSL on End Points
– SSL on RabbitMQ
– SSL on MySQL
– IPTables

Stable

Production Readiness
– Logging
– Monitoring

Bonded physical interfaces per server

Multi-tenant

High Availability

Page 31: OpenStack & OpenContrail in Production

OpenContrail Data Plane

Source: Internet - OpenContrail

Page 32: OpenStack & OpenContrail in Production

Key Takeaways

It took a number of iterations

Docker on Vagrant proved to be a very powerful Chef development environment
– Rapid development and prototyping
– Containers are very lightweight
– You can share container images across teams

Increased development agility by building CI on OpenStack

Predictable deployment outcome

Super User 2016 Finalist:

http://superuser.openstack.org/articles/austin-superuser-awards-finalist-workday-inc

Page 33: OpenStack & OpenContrail in Production

How it all started...

– Build OpenStack cloud with community cookbooks and packages
– Started with openstack-chef, a community project
– Realized its limitations
– Consistent and repeatable environment for developers and operations
– Share common components
– Test framework
– Continuous integration framework
– Scalability & Benchmarking tests (Rally)

NOTE: https://github.com/openstack/openstack-chef-repo

Page 34: OpenStack & OpenContrail in Production

Container Networking Roadmap

Product Management, OpenContrail

DP

Page 35: OpenStack & OpenContrail in Production

PRODUCT ROADMAP FEATURE AREAS

– Routing & Switching (IPv4, v6)
– Network Services (IPAM, DNS, DHCP, SNAT, FIP, BGPaaS, QoS)
– Load Balancing (customizable ECMP, LBaaS)
– Security & Policies (Policy Enf., Distributed FW, Sec Grp, XMPP Encryp.)
– Perf & Scale (DPDK / SRIOV, Smart NIC, Infra. scale)
– Gateway Services (L2, L3, vCenter GW)
– Rich Analytics (Alerts, Overlay-Underlay Correlation, multi-region)
– Service Chaining (PNF, VNF, containers, v6, 3rd party / TAP, Health-check, failover, policy-based)
– HA, Upgrades (Infra Failover, ISSU)
– API Services (multi-vendor Orch., Global Controller, OpenStack, K8s, vCenter)

Source: OpenContrail/Juniper

Page 36: OpenStack & OpenContrail in Production

K8S COMPONENTS & TERMINOLOGY

[Diagram: Namespace-A and Namespace-B; other labels: Minion / Node, Repl. Ctrl]

PODs (POD 1, POD 2, POD 5, POD 6)
– Each POD holds containers (C1, C2)
– POD 1: POD1-IP allocated by Contrail

Service (VIP, Port)
– Service IP allocated by K8s IPAM
– External IP

Service – S1
– Application 1 (load balancing across multiple PODs done using ECMP-LB)

Service – S2
– Application 2 (load balancing across multiple PODs using ECMP-LB)

Service – S3
– Does not have any PODs
– Accessing an end-point outside of the cluster

Source: OpenContrail/Juniper
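ECMP-style load balancing across the PODs behind a service can be pictured as hashing a flow's 5-tuple and using the hash to pick one backend, so every packet of a flow lands on the same POD. The sketch below only illustrates that idea; it is not the vRouter's hashing algorithm, and the flow fields and POD names are hypothetical.

```python
import hashlib
from typing import List, Tuple

# (source IP, source port, destination IP, destination port, protocol)
Flow = Tuple[str, int, str, int, str]


def pick_pod(flow: Flow, pods: List[str]) -> str:
    """Map a flow to one POD deterministically via a hash of its 5-tuple."""
    if not pods:
        raise ValueError("service has no PODs")
    digest = hashlib.sha256(repr(flow).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(pods)
    return pods[index]
```

Because the choice depends only on the flow, repeated calls with the same 5-tuple return the same POD, while different client ports spread across the set.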

Page 37: OpenStack & OpenContrail in Production

DIFFERENT LEVELS OF ISOLATION

[Diagram: six namespaces, each with two services and several PODs]
– Namespace-A: S1, S2; POD 1 … POD 5 …
– Namespace-B: S3, S4; POD 9 … POD 13 …
– Namespace-C: S5, S6; POD 17 … POD 21 …
– Namespace-D: S7, S8; POD 25 … POD 29 …
– Namespace-E: S9, S10; POD 33 … POD 37 …
– Namespace-F: S11, S12; POD 41 … POD 45 …

DEFAULT CLUSTER MODE
– This is how K8s networking works today
– Flat subnet where any workload can talk to any other workload

NAMESPACE ISOLATION
– In addition to the default cluster, the operator can add isolation to different namespaces, transparent to the developer

POD / SERVICE ISOLATION
– In this mode, PODs are isolated from one another

Note that all three modes can co-exist

Source: OpenContrail/Juniper
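The three isolation levels can be read as a reachability rule. The sketch below is one interpretation under stated assumptions, not Contrail's policy engine: an isolated namespace blocks traffic that crosses its boundary, an isolated POD blocks traffic to and from any other POD, and everything else falls back to the flat default. The `Pod` class and `can_talk` function are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Pod:
    name: str
    namespace: str


def can_talk(src: Pod, dst: Pod,
             isolated_namespaces: FrozenSet[str] = frozenset(),
             isolated_pods: FrozenSet[str] = frozenset()) -> bool:
    """Decide reachability when the three modes co-exist in one cluster."""
    if src == dst:
        return True
    # POD / service isolation: an isolated POD talks to no other POD.
    if src.name in isolated_pods or dst.name in isolated_pods:
        return False
    # Namespace isolation: block traffic crossing an isolated namespace.
    crosses = src.namespace != dst.namespace
    if crosses and (src.namespace in isolated_namespaces
                    or dst.namespace in isolated_namespaces):
        return False
    # Default cluster mode: flat network, anything reaches anything.
    return True
```

With no isolation configured, every pair of PODs is reachable; adding a namespace or POD to the isolated sets narrows reachability without changing the PODs themselves.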

Page 38: OpenStack & OpenContrail in Production

We Are Hiring!