Conducting a HANA Symphony
SAP HANA Scale-Out Automation

Fabian Herschel, Senior Architect SAP LinuxLab
Peter Schinagl, Senior Architect
Agenda
SUSE Linux Enterprise Server Overview
Automate SAP HANA System Replication
SAPHanaSR Scale-Out
Scenarios and Use-Cases
SUSE Linux Enterprise Server for SAP Applications
- Evolution -

• 1999 – SLES: SAP's development platform; certified; SAP-specific packages; Priority Support for SAP
• 2006 – SLES for BAiO Fast Start: Installation Wizard; hardware bundles
• 2008 – SLES for SAP Enterprise Search: Installation Framework with cluster support; ESPOS; SLES 10 SP3
• 2010 – SLES for SAP Applications: Installation Framework supports generic SAP installations; SLES 11 SP1
• 2011 – SLES for SAP Applications: SAP HA certification; ClamSAP; SLES 11 SP2
• 2012 – SLES for SAP Applications: SAP Business One on HANA; SLES 11 SP3
• 2013/2014 – SLES for SAP Applications: SAP HANA System Replication automation; SAP HANA security
Simplify Linux for SAP Workloads
SUSE Linux Enterprise Server for SAP Applications 11

• Reliable, scalable and secure operating system: SUSE Linux Enterprise Server
• High availability for SAP NetWeaver and SAP HANA, including the SAP HANA HA resource agent
• Page cache management
• Antivirus: ClamSAP
• SAP HANA security
• Simplified operations management: Installation Wizard for faster installation
• Extended Service Pack Support (ESPOS): 18-month grace period
• 24x7 Priority Support for SAP
SUSE Linux Enterprise Server
The recommended and supported operating system for SAP HANA
Example Hardware Evolution

• 2 sockets, 2 cores each, no Hyper-Threading = 4 CPUs
• 4 sockets, up to 8 cores each, with Hyper-Threading = 32 CPUs (64 HT CPUs)
• 8 sockets, up to 15 cores each, with Hyper-Threading = 120 CPUs (240 HT CPUs)
#1 Platform for SAP HANA
SUSE Linux Enterprise Server for SAP Applications
Automate SAP HANA System Replication
SAP HANA Business Continuity

• High Availability (HA) per datacenter: SAP HANA Host Auto-Failover (scale-out with standby) and SAP HANA System Replication
• Disaster Recovery (DR) between datacenters: SAP HANA System Replication and SAP HANA Storage Replication
SAP HANA System Replication: “sr_takeover” is a manual process. The SUSE High Availability solution automates “sr_takeover” and thereby improves the Service Level Agreement.
SAP HANA System Replication
Powered by SUSE High Availability Solution

Node 1 runs the HANA PR1 primary, node 2 the HANA PR1 secondary with memory preload; the two nodes form an active/active pair, system replication keeps the secondary in sync, and the cluster performs the resource failover.

Performance optimized: the secondary system is completely used for the preparation of a possible takeover; resources are used for data pre-load on the secondary, so takeovers and the performance ramp-up are shortened maximally.
From Concept to Implementation
SUSE High Availability Solution for SAP HANA

Nodes suse01 and suse02 host the SAP HANA primary and secondary. A SAPHanaTopology clone resource runs on both nodes, a SAPHana master/slave resource controls the primary (master) and secondary (slave), and a virtual IP address (vIP) follows the master. Cluster communication and fencing complete the setup.
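The resource layout above can be sketched in crm shell syntax. SID PR1, instance number 10, the IP address and the timeout values are illustrative; the authoritative parameter set is documented in the SAPHanaSR setup guide:

```
# Topology agent on both nodes (clone)
primitive rsc_SAPHanaTopology_PR1_HDB10 ocf:suse:SAPHanaTopology \
        op monitor interval="10" timeout="600" \
        params SID="PR1" InstanceNumber="10"
clone cln_SAPHanaTopology_PR1_HDB10 rsc_SAPHanaTopology_PR1_HDB10 \
        meta interleave="true"

# SAPHana master/slave resource controlling primary and secondary
primitive rsc_SAPHana_PR1_HDB10 ocf:suse:SAPHana \
        op monitor interval="60" role="Master" timeout="700" \
        op monitor interval="61" role="Slave" timeout="700" \
        params SID="PR1" InstanceNumber="10" PREFER_SITE_TAKEOVER="true" \
        DUPLICATE_PRIMARY_TIMEOUT="7200" AUTOMATED_REGISTER="false"
ms msl_SAPHana_PR1_HDB10 rsc_SAPHana_PR1_HDB10 \
        meta clone-max="2" clone-node-max="1" interleave="true"

# Virtual IP that follows the master (primary)
primitive rsc_ip_PR1_HDB10 ocf:heartbeat:IPaddr2 \
        params ip="192.168.1.20"
colocation col_ip_with_master 2000: rsc_ip_PR1_HDB10:Started msl_SAPHana_PR1_HDB10:Master
order ord_topology_first 2000: cln_SAPHanaTopology_PR1_HDB10 msl_SAPHana_PR1_HDB10
```

The colocation constraint keeps the vIP on the node currently running the primary, and the order constraint makes sure the topology information is gathered before SAPHana acts on it.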
Four Steps to Install and Configure

1. Install SAP HANA
2. Configure SAP HANA System Replication
3. Install and initialize the SUSE cluster
4. Configure SR automation using the HAWK wizard
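Steps 2 and 3 can be sketched as a command walk-through. The host names (suse01, suse02), site names and instance number 10 are example values, and exact syntax varies by SAP HANA revision, so check the setup guide before use:

```
# Step 2 - on the primary (suse01), as <sid>adm, enable system replication:
hdbnsutil -sr_enable --name=SITEA

# Step 2 - on the secondary (suse02), as <sid>adm, stop HANA, then register it:
hdbnsutil -sr_register --remoteHost=suse01 --remoteInstance=10 \
          --mode=sync --name=SITEB

# Step 3 - initialize the SUSE cluster on the first node, join from the second:
sleha-init              # on suse01
sleha-join -c suse01    # on suse02
```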
SAPHanaSR HAWK Wizard
What is the Delivery?
SUSE Linux Enterprise Server for SAP Applications

The package SAPHanaSR contains:
• the two resource agents SAPHanaTopology and SAPHana
• the HAWK setup wizard (as technical preview)

The package SAPHanaSR-doc contains the setup guide.
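On SLES for SAP Applications both packages can be pulled in from the product channels; a minimal installation sketch:

```
zypper install SAPHanaSR SAPHanaSR-doc
```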
Use-Cases

Allowed scenarios:
• Scale-Up performance-optimized (synchronous, =>): A => B
• Scale-Up in a chain or multi-tier setup (asynchronous, ->): A => B -> C
• Scale-Up in a cost-optimized scenario (+): A => B + Q
• Scale-Up in a mixed scenario: A => B -> C + Q
• Now all with multi-tenancy (%), here cost-optimized: %A => %B + %Q
Single-tier System Replication

Pacemaker manages node 1 (SAP HANA PR1 primary, holding the vIP) and node 2 (SAP HANA PR1 secondary); system replication runs between the two systems.

Performance optimized (A => B):
• Secondary system completely used for the preparation of a possible takeover
• Resources used for data pre-load on the secondary
• Takeovers and performance ramp-up shortened maximally

Starting with version 0.149.
Single-tier System Replication and DEV / QAS

Pacemaker manages node 1 (SAP HANA PR1 primary, holding the vIP) and node 2 (SAP HANA PR1 secondary plus a DEV system); system replication runs between the two PR1 systems.

Cost optimized (A => B + Q):
• Non-prod systems operate on the secondary
• Resources freed (no data pre-load) are offered to one or more non-prod installations
• During takeover the non-prod operation has to be ended
• Takeover performance is similar to a cold start-up
• Needs another disk stack for the non-prod usage load
Multi-Tier System Replication – Cascading Systems (A => B -> C)

Three cascading systems span two datacenters: the production system replicates synchronously to a local standby with data preload, which in turn replicates asynchronously to a remote standby system with or without preload (mixed usage with non-prod is possible).

Available since SAP HANA SPS7.
Multi-Tenancy (MCOD)
Synchronizing Multiple Databases within One System Replication

Multiple Components One Database (MCOD):
• Performance optimized: %A => %B
• Multi-tier: %A => %B -> %C
• Cost optimized: %A => %B + %Q

Pacemaker manages node 1 (SAP HANA PR1 primary with tenants A and B, holding the vIP) and node 2 (SAP HANA PR1 secondary); system replication runs between them. Beginning with version 0.151.

Tenants are databases within the SAP HANA database system; system replication only replicates the complete database system.

Current development: available with SP1.
SAP HANA Scale-Out

Pacemaker manages two sites (Cluster 1 and Cluster 2) with system replication between them and a vIP for client access.
SAPHanaSR for Scale-Out
SAP HANA Scale-Out Explained
Worker and Standby Nodes

A SAP HANA scale-out database consists of multiple nodes and SAP HANA instances. Each worker node has its own data partition. Standby nodes do not have a data partition.
SAP HANA Scale-Out Explained
Master and Slave Nodes

A SAP HANA scale-out database consists of several services, such as the master nameserver (M). The active master nameserver takes all client connections and redirects each client to the proper worker node; it always holds data partition 1. Master candidates can be worker or standby nodes. Typically there are three nodes which could become the active master nameserver.
SAP HANA Scale-Out – Worker Failure
Failing Worker Node or Instance

If a normal worker node fails, clients can still connect to the SAP HANA database. However, queries that need data from the failed node cannot be processed. The built-in SAP HA tries to repair this situation using a standby node.
SAP HANA Scale-Out – Worker Failure
Failing Worker Node

First of all, the SAP HANA HA storage API must guarantee that the old node no longer has access to the data (SAP STONITH). Only after the data partition is “free” can the failover proceed.
SAP HANA Scale-Out – Worker Failure
Failing Worker Node

Any available standby node can take over the “lost” data partition. The standby node then becomes a worker node and loads the data. The active master nameserver now redirects clients to the new node. The old worker, once available again, becomes a standby.
SAP HANA Scale-Out – Worker Failure
Summary

SAPHanaSR detects all failovers of worker nodes, checks the overall landscape status of the SAP HANA database, “follows” the decision of the built-in SAP HA, and checks whether the failover was successful.
SAP HANA Scale-Out – Master Failure
Failing Master Node

The active master nameserver fails; all client connections are blocked. As the active master nameserver is also a worker node, SAP HA needs to fail over the active master role including the worker part.
SAP HANA Scale-Out – Master Failure
Failing Master Node

Data partition 1 needs to be released (SAP STONITH). One of the master nameserver candidates then tries to take over the active master nameserver role. Ideally this is a standby node, because otherwise its own data partition would also need to fail over.
SAP HANA Scale-Out – Master Failure
Failing Master Node

One of the master nameserver candidates wins, mounts data partition 1 and loads the data. In the SAP HANA landscape this new node is shown as the active master nameserver.
SAP HANA Scale-Out – Master Failure
Summary

SAPHanaSR detects the failover of the active master nameserver and migrates the virtual IP address to that node. Clients can therefore reconnect transparently and do not need to be configured for multiple access addresses. This also enables high availability for software which is not able to connect to different IP addresses.
SAP HANA Scale-Out – Standby Failure
Failing Standby Node or Instance

A SAP HANA standby fails. It could be either a master nameserver candidate or a “plain” standby. SAP HA typically does not repair this situation. The running SAP HANA database is not directly affected, but the HA capacity of the site is degraded.
SAP HANA Scale-Out – Standby Failure
Summary

SAPHanaSR detects the outage of the SAP HANA standby node or instance and checks whether the situation allows a restart of the standby. If the node is still part of the Pacemaker cluster or is rejoining it, SAPHanaSR restarts the failed standby instance. This restores the SAP HA failover capacity and thus increases the built-in SAP high availability.
SAP HANA Scale-Out System Replication
Scale-Out with System Replication (SR)

A scale-out SR scenario consists of two SAP HANA scale-out database systems.
SAP HANA Scale-Out System Replication
Nodes and Services

Synchronization of a scale-out system is done pairwise by all worker nodes and services, including tenants. System replication status: SOK.
SAP HANA Scale-Out System Replication
Failing Synchronization

Each single replication could fail. SAPHanaSR detects such failures and excludes the secondary from site takeover. System replication status: SFAIL.
SAP HANA Scale-Out System Replication
Failing Primary (SOK)

SAPHanaSR detects the failing primary. Depending on the configuration and the system replication status, a takeover is processed.
SAP HANA Scale-Out System Replication
Failing Primary (SOK → SFAIL)

SAPHanaSR processes the takeover to the secondary site and switches the virtual IP address so that clients can transparently reconnect.
SAP HANA Scale-Out System Replication
Failing Primary (SFAIL → SOK)

Depending on the configuration, SAPHanaSR can register the failed primary as the new secondary, and it checks whether the new SR pair gets in sync.
SAP HANA Scale-Out System Replication
Failing Secondary (SOK → SFAIL)

SAPHanaSR detects failing secondary sites and tracks the system replication status to prevent sub-optimal takeovers.
SAP HANA Scale-Out System Replication
Failing Secondary (SFAIL → SOK)

SAPHanaSR processes the restart of the secondary site and checks the system replication status to allow optimal takeovers.
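The takeover gating described on the last few slides can be sketched as a small decision function. The names are hypothetical; SAPHanaSR itself implements this logic through Pacemaker node attributes and resource-agent scoring, not through this function:

```python
def takeover_allowed(primary_up: bool, sr_status: str,
                     prefer_site_takeover: bool = True) -> str:
    """Sketch of the cluster reaction when the primary site fails.

    sr_status is the tracked system replication status:
    "SOK" (secondary in sync) or "SFAIL" (replication failed).
    """
    if primary_up:
        return "none"                    # nothing to do
    if sr_status == "SOK" and prefer_site_takeover:
        return "takeover-to-secondary"   # secondary in sync: safe takeover
    return "restart-primary"             # stale secondary: restart failed primary

# Failing primary with a healthy (SOK) secondary triggers a takeover:
print(takeover_allowed(primary_up=False, sr_status="SOK"))
# Failing primary with a failed (SFAIL) secondary leads to a local restart:
print(takeover_allowed(primary_up=False, sr_status="SFAIL"))
```

This is exactly why SAPHanaSR tracks SOK/SFAIL: a takeover to an out-of-sync secondary would lose data, so the status gate prevents sub-optimal takeovers.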
SAPHanaSR Scale-Out Conducting
Typical Failures and Reactions

Failure: Worker fails (node or instance)
Reaction: SAP HA processes the failover. If SAP HA fails, SAPHanaSR processes a takeover or restart.

Failure: Active master nameserver fails (node or instance)
Reaction: As for a worker failure; in addition, SAPHanaSR migrates the virtual IP address to the new active master nameserver.

Failure: Standby fails (node or instance)
Reaction: SAPHanaSR processes an instance restart to re-establish the full SAP HA capacity.

Failure: Primary site fails
Reaction: SAPHanaSR processes a takeover on the secondary or a restart of the failed primary, depending on configuration and system replication status.

Failure: Secondary site fails
Reaction: SAPHanaSR processes a database system restart to re-establish SAP HANA system replication.
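The failure/reaction table can be condensed into a small lookup; this is an illustrative summary only, with hypothetical keys, not SAPHanaSR's actual implementation:

```python
# Hypothetical condensed lookup of the SAPHanaSR reactions listed above.
REACTIONS = {
    "worker": "SAP HA failover; SAPHanaSR takeover or restart if SAP HA fails",
    "master-nameserver": "as worker, plus migration of the virtual IP address",
    "standby": "instance restart to re-establish full SAP HA capacity",
    "primary-site": "takeover on secondary or restart of failed primary",
    "secondary-site": "database system restart to re-establish system replication",
}

def reaction(failure_class: str) -> str:
    """Return the documented SAPHanaSR reaction for a failure class."""
    return REACTIONS.get(failure_class, "unknown failure class")

print(reaction("master-nameserver"))
# prints: as worker, plus migration of the virtual IP address
```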
SUSE SAPHanaSR in 3 Facts

Reduces complexity:
• provides a wizard for easy configuration with just SID, instance number and IP address
• automates the sr_takeover and the IP failover (“bind”)

Reduces risk:
• always includes a consistent picture of the SAP HANA topology
• provides a choice of automatic registration and site-takeover preference

Increases reliability:
• provides short takeover times, especially for table-preload scenarios
• includes monitoring of the system replication status to increase data consistency
Thank you.
Start your SAPHanaSR project today and visit www.suse.com/products/sles-for-sap/
Corporate Headquarters
Maxfeldstrasse 5
90409 Nuremberg
Germany
+49 911 740 53 0 (Worldwide)
www.suse.com

Join us on: www.opensuse.org
Unpublished Work of SUSE LLC. All Rights Reserved.This work is an unpublished work and contains confidential, proprietary and trade secret information of SUSE LLC. Access to this work is restricted to SUSE employees who have a need to know to perform tasks within the scope of their assignments. No part of this work may be practiced, performed, copied, distributed, revised, modified, translated, abridged, condensed, expanded, collected, or adapted without the prior written consent of SUSE. Any use or exploitation of this work without authorization could subject the perpetrator to criminal and civil liability.
General DisclaimerThis document is not to be construed as a promise by any participating company to develop, deliver, or market a product. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. SUSE makes no representations or warranties with respect to the contents of this document, and specifically disclaims any express or implied warranties of merchantability or fitness for any particular purpose. The development, release, and timing of features or functionality described for SUSE products remains at the sole discretion of SUSE. Further, SUSE reserves the right to revise this document and to make changes to its content, at any time, without obligation to notify any person or entity of such revisions or changes. All SUSE marks referenced in this presentation are trademarks or registered trademarks of Novell, Inc. in the United States and other countries. All third-party trademarks are the property of their respective owners.