High Availability for Enterprise Clouds
High Availability for Enterprise Clouds: Oracle Solaris Cluster and OpenStack
Eve Kleinknecht Principal Product Manager
Thorsten Früauf Principal Software Engineer
November 18, 2015
Copyright © 2015, Oracle and/or its affiliates. All rights reserved. |
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
Agenda
1. OpenStack on Oracle Solaris
2. Oracle Solaris Cluster for OpenStack
3. HA for OpenStack cloud controller on Oracle Solaris
  – two main topologies to achieve HA
    • fine grained approach
    • blackbox approach
  – pros/cons for those topologies
4. Discussion - Q / A
OpenStack Overview
What is OpenStack?
• Open source cloud software
  – Generic solution for IaaS, PaaS and SaaS
• Oracle OpenStack optimized for
  – Database as a Service, Java as a Service
• Combines compute, network and storage resources
  – Self-service dashboard
  – Services exposed through REST APIs
[Figure: a single management pane over virtualized data center resources (VMs)]
OpenStack Services
Overview of Core Components
Component   Description
Nova        Compute virtualization
Cinder      Block storage
Neutron     Software defined networking
Keystone    Authentication between cloud services
Horizon     Web based dashboard
Glance      Image management and deployment
Swift       Object storage
Heat        Application and VM orchestration
Murano      Application catalog
Trove       Database as a Service
OpenStack Across Oracle’s Portfolio
Built into the Infrastructure
• Horizon: Centralized Cloud Management
• Nova / Ironic: Self-Service Compute and Bare Metal (Zones and Kernel Zones)
• Neutron: Software Defined Networking (Elastic Virtual Switch and Open vSwitch)
• Cinder / Swift: Cloud Scale Storage (ZFS File System)
• Heat / Glance / Murano / Trove: Platform as a Service (Unified Archives)
Benefits of Running OpenStack on Oracle Solaris
OS. Virtualization. SDN. OpenStack. Complete.
• Engineered for security and compliance
  – Minimal privileges for cloud services
  – Lock down infrastructure with immutability
• Assured reliability and scale
  – Automatic service restart and node dependencies
  – Guaranteed data integrity
• Seamless upgrade, instant roll-back
Agenda
1. OpenStack on Oracle Solaris
2. Oracle Solaris Cluster for OpenStack
3. HA for OpenStack cloud controller on Oracle Solaris
4. Discussion - Q / A
Mission-Critical Cloud Requirements
If you need:
• Mission-critical service level
• Minimal downtime for maintenance
• Business Continuity
Oracle Solaris Cluster delivers:
• Local, fast, automatic failover for application and services
• Managed switchover of applications and resources among servers or sites
• Safe, reliable, orchestrated recovery from site failure
Oracle Solaris Cluster Functions
• Monitor health of all cluster components:
  – Servers, storage, network, OS, virtual machines, applications
• Deliver resiliency to failures through
  – Hardware redundancy
  – Robust cluster protection algorithms
  – Policy-based cluster infrastructure and applications recovery procedures
• Enable low-impact maintenance
Oracle Solaris Cluster Services
• Data services: failover, scalable
• Storage services: global file system, failover, scalable
• Network services: logical hostname, load balancing
• Dependencies management
• Monitoring services
Applications High Availability
• Built-in application agents
• Fine-grained control of application: specific start, stop and probing procedures
• Do not require any change in application
• Fully tested in physical and virtualized environment
• Build-your-own agent toolkit for easy creation of custom agents
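The start, stop and probe procedures named in the bullets above can be pictured with a small toy model. This is only an illustrative sketch, not the Oracle Solaris Cluster agent API: the `Agent` class, the `handle_probe` function and the "restart first, then fail over" policy are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Agent:
    """Hypothetical stand-in for an application agent (not a real cluster API)."""
    name: str
    start: Callable[[], None]   # brings the application up
    stop: Callable[[], None]    # shuts the application down
    probe: Callable[[], bool]   # returns True when the application is healthy


def handle_probe(agent: Agent, restarts_left: int) -> str:
    """Return the action an assumed simple policy takes after one probe.

    "ok"       : the probe succeeded, nothing to do
    "restart"  : the probe failed and a local restart was attempted
    "failover" : the probe failed and no local restarts remain
    """
    if agent.probe():
        return "ok"
    if restarts_left > 0:
        agent.stop()
        agent.start()
        return "restart"
    return "failover"
```

The point of the sketch is that the application itself stays unchanged: the agent only wraps it with three callbacks, and the recovery decision lives outside the application.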
Oracle Solaris Cluster and Virtualization
• Choice of VM or application centric model
• Choice of technology: Oracle VM for SPARC domain or zone
• Built-in asset optimization with load balancing, affinity and dependency management at application or VM level
• Application Failover: fine-grained control of application (app, web, db) inside zone or domain
• Workload Failover: zone or domain (VM) is blackbox
Failover Zones: VM HA
• Managed zone switchover with cold, warm or live migration (kernel zone)
• Automatic zone restart or zone failover upon node failure
• No modification of workload
• Dependencies and load management at zone level
• Planned Maintenance: workload migration
• Unplanned Outage: immediate workload restart or failover
Zone Clusters: Application HA with Virtualization
• Application specific protection: policy based management and fault isolation
• Ease of use: configuration and administration across virtual cluster
• Security isolation: delegated administration and security model extended across cluster
• Dependencies and load management at application level
[Figure: app, db and web tiers running in three zone clusters on Solaris 11 nodes]
Agenda
1. OpenStack on Oracle Solaris
2. Oracle Solaris Cluster for OpenStack
3. HA for OpenStack cloud controller on Oracle Solaris
4. Discussion - Q / A
HA approaches for the OpenStack cloud controller
A) fine grained control over OpenStack services by Solaris Cluster
  ● best practices as found in other Oracle Optimized Solutions for multi-tiered applications and the approach taken on Linux (OpenStack HA guide)
  ● published white paper describes this approach with specific example
  ● prioritize fast failure detection and recovery time of individual services
B) blackbox approach by using HA failover kernel zones
  ● prioritize simplicity of administration
  ● Solaris Cluster manages the kernel zones to protect against global node failures
Example HA node deployment
• Example HA OpenStack node deployment:
  – Clustered cloud controller nodes with Oracle Solaris Cluster (OSC)
  – Clustered Oracle ZFS Storage Appliance (ZFS SA)
    • shared storage for OSC
    • quorum device for OSC
    • Cinder driver for iSCSI targets provided to nova compute
  – Swift storage nodes (optional)
    • configure HA Swift ring
HA for OpenStack cloud controller
fine grained approach (white paper)
• all OpenStack cloud controller components are under cluster control (start, stop, probe)
• IP addresses and shared file systems used by services under cluster control
• usage of the cluster load balancer for scalable services
• define inter-component dependencies on the specific service level
  – orchestration of service start/stop across zones
  – fast failure detection and failover times
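To make the idea of service-level dependencies concrete, the following sketch derives a start order and a stop order from a dependency map. It is a toy illustration and not how Oracle Solaris Cluster computes ordering internally: the dependency map and the service names in it are hypothetical examples, and the ordering is done with Python's standard `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map (an assumption for illustration):
# each service lists the services that must be running before it starts.
DEPENDS_ON = {
    "mysql": [],
    "rabbitmq": [],
    "keystone": ["mysql"],
    "glance-api": ["keystone", "mysql"],
    "nova-api": ["keystone", "mysql", "rabbitmq"],
    "horizon": ["keystone"],
}


def start_order(depends_on):
    """Return a start order in which every service follows its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


def stop_order(depends_on):
    """Stop services in the reverse of the start order."""
    return list(reversed(start_order(depends_on)))


if __name__ == "__main__":
    print("start:", start_order(DEPENDS_ON))
    print("stop: ", stop_order(DEPENDS_ON))
```

Because the dependencies are declared per service rather than per zone, the same ordering applies regardless of which zone or node each service runs in.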
HA for OpenStack cloud controller - HA SMF proxy (1)
• The HA SMF proxy data service is a central component for HA OpenStack in the fine grained topology:
  – implements a dedicated cluster SMF restarter
  – enables/disables SMF services on behalf of cluster
  – ability to specify resource dependencies to other cluster services running in different resource groups, within different zones or nodes for orchestration
  – comes in three flavors: failover, multi-master and scalable
HA for OpenStack cloud controller - HA SMF proxy (2)
• OpenStack components are deeply integrated with SMF on Solaris
  – get started as dedicated non-root UNIX users
  – some with an additional or reduced set of privileges configured
  – some making use of a variety of SMF method tokens, to expand SMF properties as option variables for the method script
  – OpenStack components are implemented in Python
    • even the Python method scripts import SMF functions, and thus need to be started within an SMF context
    • SMF is also used to capture the sometimes verbose Python messages and stack traces in the dedicated SMF service log file
HA for OpenStack cloud controller - HA SMF proxy (3)
• Generic approach to provide HA for OpenStack SMF services:
  – failover services (stateful active/passive)
    • configure HAStoragePlus/ScalMountPoint resource to store dynamic FS content
    • configure SUNW.LogicalHostname resource for service endpoint
    • configure SUNW.Proxy_SMF_failover resource for SMF service
  – scalable services (stateless active/active)
    • ensure static content is identical across nodes/zones
    • configure failover RG with SUNW.SharedAddress resource for service endpoint
    • configure scalable RG with SUNW.Proxy_SMF_scalable resource for SMF service
• OpenStack service configuration specifies the corresponding IP address and storage managed by the cluster
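The two recipes above can be summarized as a lookup from service kind to the resource groups and resource types that need to be configured. The sketch below only restates the slide as data; it does not call any cluster command. The function name `resource_plan` and the tuple layout are assumptions for the example, and HAStoragePlus is picked as the storage resource although the slide also allows ScalMountPoint.

```python
from typing import List, Tuple

# (resource group kind, resource type, purpose), restating the slide as data.
Plan = List[Tuple[str, str, str]]


def resource_plan(kind: str) -> Plan:
    """Return the resources to configure for a 'failover' or 'scalable' service."""
    if kind == "failover":
        # Stateful active/passive: storage, endpoint and SMF proxy share one group.
        return [
            ("failover RG", "HAStoragePlus", "dynamic file system content"),
            ("failover RG", "SUNW.LogicalHostname", "service endpoint"),
            ("failover RG", "SUNW.Proxy_SMF_failover", "SMF service"),
        ]
    if kind == "scalable":
        # Stateless active/active: the shared address lives in its own failover
        # group, while the SMF proxy runs in a scalable group.
        return [
            ("failover RG", "SUNW.SharedAddress", "service endpoint"),
            ("scalable RG", "SUNW.Proxy_SMF_scalable", "SMF service"),
        ]
    raise ValueError(f"unknown service kind: {kind!r}")


if __name__ == "__main__":
    for kind in ("failover", "scalable"):
        print(kind)
        for group, rtype, purpose in resource_plan(kind):
            print(f"  {group:12} {rtype:26} {purpose}")
```

Note that the scalable case carries no storage resource: the slide instead requires static content to be identical across nodes or zones.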
Fine grained approach - pros and cons
• Pro:
  – fast failure detection per service
    • option to further improve by adding OpenStack service specific probes
  – fast takeover time in case of unplanned outages
  – the cluster load balancer allows stateless services to be configured in a scalable way out of the box (RabbitMQ, OpenStack API, Horizon, etc.)
  – matches the industry-wide approach to provide HA for OpenStack on Linux
• Con:
  – integration with the OpenStack installation is more involved
    • order of install and some pre-setup and post-setup tasks required for cluster
  – small changes in administration
    • svcadm vs. clrs for OpenStack services
    • zone cluster
  – strict change management required
    • OpenStack upgrade procedure
    • configuration files to be kept in sync across cluster nodes
  – not easy to apply to already existing non-HA OpenStack deployments
HA for OpenStack cloud controller
blackbox approach with failover zones
• cluster manages (start, stop, probe) only the failover kernel zones
  – optional monitoring of the storage URIs (suri) used in the KZ config
• individual OpenStack services and IP addresses are not managed by cluster
• inter-component dependencies can only be configured at kernel zone granularity
  – though there is an option with sczsmf
• ability to distribute kernel zones across global cluster nodes
Blackbox approach - pros and cons
• Pro:
  – separation of cluster and OpenStack installation and upgrade
  – administration and upgrade of OpenStack services near identical to non-HA setup
  – from S11.3 onwards, live migration can be used for failover kernel zones to reduce planned downtime considerably
• Con:
  – longer takeover time after node failure (KZ boot in addition)
  – individual OpenStack service failure can't trigger failover
    • rely purely on SMF to detect service failures
    • in case sczsmf is used, conflict with live migration
  – scalability of services requires extra external HA load balancer (hardware or software)
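The takeover-time difference between the two topologies is simple addition: after a node failure, the blackbox approach has to boot the kernel zone before the OpenStack services can start. The sketch below spells that out. The function name and every number in it are hypothetical placeholders, not measurements from the talk or the white paper.

```python
def takeover_seconds(detect: float, service_start: float, kz_boot: float = 0.0) -> float:
    """Rough takeover time after a node failure, as a sum of phases.

    detect        : time for the cluster to notice the node failure
    service_start : time to start the OpenStack services on the surviving node
    kz_boot       : extra kernel zone boot time (blackbox approach only)
    """
    return detect + kz_boot + service_start


# Hypothetical placeholder values in seconds, chosen only to show the shape.
DETECT, SERVICE_START, KZ_BOOT = 10.0, 30.0, 60.0

fine_grained = takeover_seconds(DETECT, SERVICE_START)
blackbox = takeover_seconds(DETECT, SERVICE_START, kz_boot=KZ_BOOT)

if __name__ == "__main__":
    print(f"fine grained: {fine_grained:.0f}s, blackbox: {blackbox:.0f}s")
```

Whatever the real figures are in a given deployment, the blackbox total exceeds the fine grained one by the kernel zone boot time, which is the trade-off named in the first con above.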
Flexibility through mix and match of topologies
• HA approaches are not either-or - they can be combined
  – start out with blackbox HA
  – separation in tiers allows each tier to be adapted as required
  – ability to use e.g. MySQL cluster within a zone cluster without changing the overall architecture
• both topologies have security isolation between tiers by design
• scalability can be addressed by component as needed by specific use cases
  – some need to scale Horizon as users bang on the BUI
  – some may not require the BUI, instead focusing on usage of the OpenStack CLI or Heat
  – option to use the cluster load balancer, but also to switch to a hardware load balancer
Discussion - Q / A
References
• Oracle OpenStack for Oracle Solaris
  http://www.oracle.com/technetwork/server-storage/solaris11/technologies/openstack-2135773.html
• Oracle Solaris Cluster
  http://www.oracle.com/technetwork/server-storage/solaris-cluster/overview/index.html
• Oracle Solaris Cluster technical resources
  http://www.oracle.com/technetwork/server-storage/solaris-cluster/documentation/cluster-how-to-1389544.html
• White Paper: Providing High Availability to the OpenStack Cloud Controller on Oracle Solaris with Oracle Solaris Cluster
  http://www.oracle.com/technetwork/server-storage/solaris-cluster/documentation/ha-for-openstack-cloud-2537455.pdf