Post on 22-May-2015
DB2 pureScale
DB2 pureScale: Installation & Instance Management
October 2010
Jessica Rockwood (jessicae@ca.ibm.com)
© 2010 IBM Corporation
Agenda
• Easy installation of DB2 pureScale
• Starting and stopping the pureScale cluster
• Maintaining the pureScale cluster
– Managing the clustered file system
– Growing/shrinking the cluster
– System maintenance
• Configuring the cluster caching facilities and buffer pools
• Monitoring tools
Typical DB2 ESE installation preparation
• A system meeting OS, hardware, software and communications requirements
– Standard OS image & pre-requisite components
– Local (internal) storage for installation image
– Filesystems on external storage for database
– Ethernet network
– Instance and fenced userid
– Root access for installation
Diagram labels: System A (/opt, /home/db2inst1), /db2autostoragepaths, /db2logs
DB2 pureScale installation preparation
• Multiple systems meeting OS, hardware, software and communications requirements
– Standard OS image & pre-requisite components
– Local (internal) storage for installation image
– Direct SAN connected disk storage with shared disk space (for instance shared directory, database data/logs, and tiebreaker disk)
– Ethernet and InfiniBand network
– Instance and fenced userid with same uid, gid, home directory on all systems
– Root access for installation
– Passwordless ssh access for root userid across all systems
Diagram labels: Systems A, B, C and D (each with /opt and /home/db2sdin1); shared disks /dev/hdisk1, /dev/hdisk2, /dev/hdisk3, …
Installation & instance setup
• DB2 pureScale installation will automatically install:
– DB2 pureScale feature
– Reliable Scalable Cluster Technology (RSCT)
– Tivoli System Automation for Multiplatforms (SA MP)
– General Parallel File System (GPFS)
• Instance setup will automatically:
– Create and configure GPFS cluster and instance shared filesystem on all hosts
– Create and configure SA MP peer domain on all hosts
– If required, create the instance user on all hosts
– Set up instance environment on all hosts ($HOME/sqllib)
– Set up ssh passwordless access between all hosts for the instance user
– Create DB2 pureScale instance and define cluster caching facilities (CFs) and members
Installation & instance setup options
• db2setup utility
– Launched from the install-initiating host (IIH), which can be any system that will be part of the DB2 pureScale cluster
– With a single invocation:
  – Installs DB2 pureScale on all hosts in the cluster
  – Creates the DB2 pureScale instance with all CFs and members
– Can be run either in GUI mode (as a wizard) or with a response file
  – GUI: db2setup -l /tmp/db2setup.log -t /tmp/db2setup.trc
  – Response file: db2setup -r db2dsf.rsp -l /tmp/db2setup.log -t /tmp/db2setup.trc
db2setup wizard – DB2 Cluster File System
db2setup wizard – Host List
Installation & instance setup options
• Manual installation & instance creation
– The db2_install utility needs to be run on the install-initiating host (IIH) to install DB2 pureScale
db2_install -p ese_dsf -b /opt/IBM/db2/V9.8 -t /tmp/db2_install.trc -l /tmp/db2_install.log
– Create the DB2 pureScale instance with the primary CF and first member (run from host 1 or host 2)
db2icrt -d -instance_shared_dev /dev/hdisk1 -tbdev /dev/hdisk2 -cf host1:host1_ib -m host2:host2_ib -u db2sdin1 db2sdin1
– Add the secondary CF
db2iupdt -add -cf host4:host4_ib -d db2sdin1
– Add additional members
db2iupdt -add -m host3:host3_ib -d db2sdin1
Starting and stopping DB2 pureScale
• Starting the cluster
– Global DB2 pureScale start will start the CFs first and then the members
db2start
– Individual CF or member start
db2start member 0 or db2start cf 128
– Note: If starting a member and no CFs are started, the CFs will be started before the member is started
• Stopping the cluster
– Global DB2 pureScale stop will stop the members first and then the CFs
db2stop [force]
– Individual CF or member stop
db2stop [cf|member] [identifier] [force] [quiesce [timeout_in_min]]
Instance and Host Status
Diagram labels: Clients, Single Database View, DB2 members (coralpib132, coralpib133), CF (coralpib131), Shared Data
Instance status
> db2start
08/24/2008 00:52:59 0 0 SQL1063N DB2START processing was successful.
08/24/2008 00:53:00 1 0 SQL1063N DB2START processing was successful.
SQL1063N DB2START processing was successful.
> db2instance -list
ID  TYPE   STATE   HOME_HOST   CURRENT_HOST ALERT
--  ----   -----   ---------   ------------ -----
0   MEMBER STARTED coralpib132 coralpib132  NO
1   MEMBER STARTED coralpib133 coralpib133  NO
128 CF     PRIMARY coralpib131 coralpib131  NO
db2nodes.cfg
0 coralpib132 0 coralpib132_ib0 - MEMBER
1 coralpib133 0 coralpib133_ib0 - MEMBER
128 coralpib131 0 coralpib131_ib0 - CF
Host status
HOST_NAME   STATE  INSTANCE_STOPPED ALERT
---------   -----  ---------------- -----
coralpib131 ACTIVE NO               NO
coralpib132 ACTIVE NO               NO
coralpib133 ACTIVE NO               NO
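The instance status listing on this slide lends itself to simple scripted checks. The following is a minimal Python sketch, assuming the six-column layout shown here (a real system may print additional columns, which this parser would not handle); the function names are illustrative and not part of DB2.

```python
def parse_db2instance_list(text):
    """Parse member/CF rows of the six-column `db2instance -list` layout."""
    columns = ["id", "type", "state", "home_host", "current_host", "alert"]
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with a numeric ID and carry one value per column;
        # the header and the dashed separator line are skipped by this test.
        if len(fields) == len(columns) and fields[0].isdigit():
            rows.append(dict(zip(columns, fields)))
    return rows


def needs_attention(rows):
    """Return IDs with an alert raised or running away from their home host."""
    return [
        int(r["id"])
        for r in rows
        if r["alert"] != "NO" or r["home_host"] != r["current_host"]
    ]


# Sample text copied from the slide output.
SAMPLE = """\
ID TYPE STATE HOME_HOST CURRENT_HOST ALERT
-- ---- ----- --------- ------------ -----
0 MEMBER STARTED coralpib132 coralpib132 NO
1 MEMBER STARTED coralpib133 coralpib133 NO
128 CF PRIMARY coralpib131 coralpib131 NO
"""

rows = parse_db2instance_list(SAMPLE)
print([(r["id"], r["type"], r["state"]) for r in rows])
print(needs_attention(rows))
```

For the healthy cluster shown on the slide, no ID is flagged. A member that had been moved to another host, or any row with ALERT set to something other than NO, would appear in the returned list.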
Automatic restart of members and CFs
• If a member fails, the cluster services will automatically restart the member
– If it cannot restart on its home host, cluster services will move the member to another host for restart light
• If a CF fails, the cluster services will automatically restart the CF
– If it is a primary CF that fails, the secondary CF will take over as primary as long as it was in Peer state
• If cluster services cannot automatically restart a member or CF, they will raise an alert
– Alert state is visible in the ALERT column of 'db2instance -list' output
– Alerts can be viewed with the 'db2cluster -list -alert' command
– Alerts must be cleared with 'db2cluster -clear -alert' before cluster services can retry automatic restart
Managing the clustered file systems
• Add disk to an existing filesystem (and optionally rebalance)
db2cluster -cfs -add -filesystem db2fs1 -disk /dev/hdisk3
db2cluster -cfs -rebalance -filesystem db2fs1
• Create a new clustered file system
db2cluster -cfs -create -filesystem db2fs2 -disk /dev/hdisk3 -mount /db2_logs
• Remove a filesystem and its disk
db2cluster -cfs -remove -filesystem db2fs2 -disk /dev/hdisk3
Managing the clustered file systems
• List clustered file systems
db2cluster -cfs -list -filesystem
FILE SYSTEM NAME           MOUNT_POINT
-------------------------- -------------------------
db2fs1                     /db2sd_20100223191453
• List disks in a given clustered file system
db2cluster -cfs -list -filesystem db2fs1 -disk
PATH ON LOCAL HOST        OTHER KNOWN PATHS
------------------------- -------------------------
(*) /dev/hdisk2
Growing/shrinking the DB2 pureScale cluster
• A maximum of two CFs and 128 members can be defined in a DB2 pureScale cluster
db2iupdt -add -cf host4:host4_ib db2sdin1
db2iupdt -add -m host3:host3_ib db2sdin1
• To shrink the DB2 pureScale cluster by removing members or removing the secondary CF
db2iupdt -drop -m host3 db2sdin1
db2iupdt -drop -cf host4 db2sdin1
System maintenance in DB2 pureScale
• Stop the individual CF or member on the host where maintenance will be applied
db2stop member 1 quiesce 1
– Note: If the primary CF is stopped, then the secondary CF will take over
• Stop the instance on that same host to mark the host as unavailable for failover
db2stop instance on host3
• Apply the system maintenance
• Restart the instance on the host, then restart the member or CF on that host
db2start instance on host3
db2start member 1
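The maintenance steps on this slide follow a fixed order: stop the member, stop the instance on the host, apply maintenance, then start both again in reverse. As a small Python sketch of that ordering (the helper name and its parameters are hypothetical; only the command strings come from the slide), the sequence for one member host could be generated like this:

```python
def maintenance_plan(member_id, host, quiesce_minutes=1):
    """Return (before, after) command lists for maintenance on one member host.

    The command strings mirror the slide; this helper only assembles them
    and does not execute anything.
    """
    before = [
        f"db2stop member {member_id} quiesce {quiesce_minutes}",
        f"db2stop instance on {host}",
    ]
    after = [
        f"db2start instance on {host}",
        f"db2start member {member_id}",
    ]
    return before, after


before, after = maintenance_plan(1, "host3")
for command in before + ["# ... apply the system maintenance ..."] + after:
    print(command)
```

Keeping the two halves separate makes it harder to restart the member before the instance on the host is back, which is the ordering the slide calls for.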
Configuring cluster caching facilities (CF)
• In the database manager configuration:
CF Server Configuration:
Memory size (4KB) (CF_MEM_SZ) = AUTOMATIC(158720)
Number of worker threads (CF_NUM_WORKERS) = AUTOMATIC(1)
Number of connections (CF_NUM_CONNS) = AUTOMATIC(14)
Diagnostic error capture level (CF_DIAGLEVEL) = 2
Diagnostic data directory path (CF_DIAGPATH) =
• In the database configuration:
CF Resource Configuration:
CF database memory size (4KB) (CF_DB_MEM_SZ) = AUTOMATIC(4616448)
Group buffer pool size (4KB) (CF_GBP_SZ) = AUTOMATIC(3673856)
Global lock memory size (4KB) (CF_LOCK_SZ) = AUTOMATIC(692480)
Shared communication area size (4KB) (CF_SCA_SZ) = AUTOMATIC(248320)
Catchup target for secondary CF (mins) (CF_CATCHUP_TRGT) = AUTOMATIC(15)
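The CF memory parameters above are all expressed in 4 KB pages, which can be hard to read at a glance. The short Python sketch below converts the values shown on this slide to MiB; the helper name is illustrative only.

```python
PAGE_SIZE_BYTES = 4 * 1024  # CF memory parameters are given in 4 KB pages


def pages_to_mib(pages):
    """Convert a count of 4 KB pages to mebibytes."""
    return pages * PAGE_SIZE_BYTES / (1024 * 1024)


# Values copied from the configuration output on the slide.
cf_settings = {
    "CF_MEM_SZ": 158720,
    "CF_DB_MEM_SZ": 4616448,
    "CF_GBP_SZ": 3673856,
    "CF_LOCK_SZ": 692480,
    "CF_SCA_SZ": 248320,
}

for name, pages in cf_settings.items():
    print(f"{name}: {pages} pages = {pages_to_mib(pages):.0f} MiB")
```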
Configuring buffer pools
• Group buffer pool (GBP) is always sized in 4 KB pages, regardless of the local buffer pool (LBP) page size
– Note that the GBP by default has 20% of memory set aside for managing LBP page registrations
• GBP size is generally larger than a single LBP size
– Making LBP size too large can actually decrease GBP hit ratio, since each LBP page consumes some GBP memory which can't then be used for data pages
• GBP size = 35-40% of (sum of all LBP sizes)
– e.g. LBP size = 1M 4 KB pages, 4 members => GBP = ~1.5M pages
– The heavier the write activity of the workload, the more modified pages the GBP will contain, so the more benefit from a larger GBP
– This rule of thumb predominantly applies in clusters of 3 or more members
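The sizing rule of thumb above can be worked through in a few lines of Python. This is a sketch only: the function name is illustrative, and it assumes every member has the same LBP size, already expressed in 4 KB pages as in the slide's example.

```python
def gbp_size_range(lbp_pages_per_member, members, low=0.35, high=0.40):
    """Return the (low, high) GBP size in 4 KB pages per the 35-40% rule of thumb."""
    total_lbp_pages = lbp_pages_per_member * members
    return round(total_lbp_pages * low), round(total_lbp_pages * high)


# Slide example: 1M-page LBP on each of 4 members.
low_pages, high_pages = gbp_size_range(1_000_000, 4)
print(low_pages, high_pages)
```

With four members at 1M pages each, the sum of LBP sizes is 4M pages, so the rule gives a range of 1.4M to 1.6M pages, which matches the slide's figure of roughly 1.5M. A write-heavy workload would argue for the upper end of the range.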
DB2 pureScale
DB2 pureScale: Monitoring
October 2010
Andrew Murchison
© 2010 IBM Corporation
Monitoring Overview
• Expands on V9.7 monitoring functionality
• Workload table functions – column and XML (_DETAILS suffix)
– MON_GET_UNIT_OF_WORK & _DETAILS
– MON_GET_CONNECTION & _DETAILS
– MON_GET_WORKLOAD & _DETAILS
– MON_GET_SERVICE_SUBCLASS & _DETAILS
– MON_GET_PKG_CACHE_STMT & _DETAILS
– MON_GET_ACTIVITY_DETAILS
• Bufferpool table functions – column
– MON_GET_TABLESPACE
– MON_GET_BUFFERPOOL
• Event monitors
– Activity
– Statistics
– Package Cache
• pureScale specific table functions
– MON_GET_CF
• Useful summary views
Monitoring - Locking
• pureScale-specific locking metrics for lock contention across members
• Workload metrics
– Available in the workload table functions and the event monitors
– LOCK_WAITS_GLOBAL, LOCK_WAIT_TIME_GLOBAL
  – Number of waits, and milliseconds spent waiting, for locks held by applications on other members
  – Are component elements of the existing LOCK_WAITS and LOCK_WAIT_TIME elements
– LOCK_TIMEOUTS_GLOBAL
  – Number of times a timeout occurred while waiting for a lock held by an application on another member
– LOCK_ESCALS_MAXLOCKS, LOCK_ESCALS_LOCKLIST, LOCK_ESCALS_GLOBAL
  – Number of lock escalations occurring due to exceeding the maxlocks, locklist, and cf_lock_sz database configuration parameters, respectively
  – All sum to equal the existing LOCK_ESCALS element
• Locking event monitor
– No new information, but adapted for cross-member lock wait and timeout events
• MON_GET_APPL_LOCK, MON_GET_APPL_LOCKWAIT
– Report on current locks on the system
– Both adapted to work correctly in pureScale environments
– No new columns
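Because the global lock elements are components of the existing totals, two quick derived checks are possible once the counters have been read from a workload table function. The Python sketch below is illustrative: the function names and the sample numbers are hypothetical, and only the element relationships come from the slide.

```python
def global_lock_wait_share(lock_wait_time, lock_wait_time_global):
    """Fraction of total lock wait time spent on locks held by other members."""
    if lock_wait_time == 0:
        return 0.0
    return lock_wait_time_global / lock_wait_time


def escalations_consistent(lock_escals, escals_maxlocks, escals_locklist, escals_global):
    """Check that the three escalation components sum to LOCK_ESCALS."""
    return lock_escals == escals_maxlocks + escals_locklist + escals_global


# Hypothetical sample values, not taken from a real system.
share = global_lock_wait_share(lock_wait_time=2000, lock_wait_time_global=500)
print(f"{share:.0%} of lock wait time is cross-member")
print(escalations_consistent(7, 3, 2, 2))
```

A high cross-member share suggests that contention comes from applications on different members touching the same data, rather than from contention local to one member.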
Monitoring – Database Objects
• Global page statistics
– Available in the workload table functions and the event monitor
– reclaim_wait_time, spacemappage_reclaim_wait_time
  – Distinct values describing time spent performing page reclaims from other members
  – A reclaim means a page has been updated by another member and must be refreshed on the local member
  – A space map page is a special data page describing where other pages are located
• Cluster caching facility communication
– Available in the workload table functions and the event monitor
– cf_waits, cf_wait_time
  – The number of, and milliseconds spent in, communications with the CFs
  – Time is from when the connection is opened to when it is closed
  – Only includes time to send until acknowledgment of receipt, not time to fulfill any request (i.e., lock_wait_time_global)
Monitoring – Group Bufferpool
• Monitoring of the group bufferpool activity
– Independent of related local bufferpool metrics
– Example: One POOL_DATA_L_READS will result in 0 or more POOL_DATA_GBP_L_READS
• Workload metrics
– Included in the workload and bufferpool table functions, and the event monitors
– NOTE: XDA (i.e., XML) elements are new for V9.8 FP3
– POOL_DATA_GBP_L_READS, POOL_INDEX_GBP_L_READS, POOL_XDA_GBP_L_READS
  – Number of requests for the given page type from the group bufferpool (logical reads)
  – The page in the local bufferpool was either not present or invalid
– POOL_DATA_GBP_P_READS, POOL_INDEX_GBP_P_READS, POOL_XDA_GBP_P_READS
  – Number of times a disk access was required due to the page not being present in the group bufferpool (physical reads)
  – Less than or equal to the number of logical reads
– POOL_DATA_LBP_PAGES_FOUND, POOL_INDEX_LBP_PAGES_FOUND, POOL_XDA_LBP_PAGES_FOUND
  – Number of times a page was found in the local bufferpool, either valid or invalid
  – Always less than or equal to the number of logical reads
  – Example: POOL_DATA_LBP_PAGES_FOUND <= POOL_DATA_L_READS
– POOL_DATA_GBP_INVALID_PAGES, POOL_INDEX_GBP_INVALID_PAGES, POOL_XDA_GBP_INVALID_PAGES
  – Number of times an invalid page was found in the local bufferpool, requiring a logical read from the group bufferpool
  – May occur after an otherwise successful group bufferpool logical read
• Bufferpool metrics for prefetchers
– Included in the bufferpool table functions
– Same as above, but with ASYNC qualifier. Example: POOL_ASYNC_DATA_GBP_L_READS
– Are components of the non-async total in the bufferpool table functions
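The logical and physical read counters described on this slide are commonly combined into ratios. The Python sketch below shows one way to do that for data pages; the function names and the sample counter values are hypothetical, and the formulas are an assumption based on the relationships stated above (physical reads never exceed logical reads, and pages found never exceed logical reads).

```python
def gbp_hit_ratio(gbp_l_reads, gbp_p_reads):
    """Percentage of GBP logical reads satisfied without going to disk."""
    if gbp_l_reads == 0:
        return None  # no GBP activity, so the ratio is undefined
    return 100.0 * (gbp_l_reads - gbp_p_reads) / gbp_l_reads


def lbp_found_ratio(l_reads, lbp_pages_found):
    """Percentage of logical reads that found the page (valid or invalid) in the LBP."""
    if l_reads == 0:
        return None
    return 100.0 * lbp_pages_found / l_reads


# Hypothetical sample counters, e.g. POOL_DATA_GBP_L_READS = 1000 and
# POOL_DATA_GBP_P_READS = 150 read from a bufferpool table function.
print(gbp_hit_ratio(1000, 150))
print(lbp_found_ratio(10000, 9000))
```

Returning None rather than 0 when there were no reads avoids reporting an idle bufferpool as a poorly performing one.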
Monitoring – Cluster Caching Facility
• MON_GET_CF table function
– Retrieves configured, target and current memory information for cluster caching facilities on the system, in number of 4K pages
  – Current – the amount of memory currently allocated
  – Configured – the maximum amount of memory that will be used, per configuration
  – Target – during a dynamic resize, the amount that will be allocated when it completes
• Columns
– HOST_NAME, ID, DB_NAME
– CURRENT_CF_GBP_SIZE, CONFIGURED_CF_GBP_SIZE, TARGET_CF_GBP_SIZE
  – Group bufferpool memory
– CURRENT_CF_LOCK_SIZE, CONFIGURED_CF_LOCK_SIZE, TARGET_CF_LOCK_SIZE
  – Global lock memory
– CURRENT_CF_SCA_SIZE, CONFIGURED_CF_SCA_SIZE, TARGET_CF_SCA_SIZE
  – Shared communications area memory
– CURRENT_CF_MEM_SIZE, CONFIGURED_CF_MEM_SIZE
  – Total memory usage
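The current, configured and target columns can be compared per memory area once a MON_GET_CF row has been fetched. The Python sketch below assumes the row is available as a dictionary keyed by the column names listed on this slide; the function name and the sample values are hypothetical, and treating "target differs from current" as a sign of a resize in progress is an assumption drawn from the slide's definition of Target.

```python
def cf_memory_report(row):
    """Summarise one MON_GET_CF row (sizes in 4K pages) per memory area."""
    report = {}
    for area in ("GBP", "LOCK", "SCA"):
        current = row[f"CURRENT_CF_{area}_SIZE"]
        configured = row[f"CONFIGURED_CF_{area}_SIZE"]
        target = row[f"TARGET_CF_{area}_SIZE"]
        report[area] = {
            # Share of the configured maximum that is currently allocated.
            "percent_of_configured": round(100.0 * current / configured, 1)
            if configured else None,
            # Assumption: a target that differs from current means a
            # dynamic resize has not finished yet.
            "resize_in_progress": target != current,
        }
    return report


# Hypothetical sample row, not taken from a real system.
sample_row = {
    "CURRENT_CF_GBP_SIZE": 1000, "CONFIGURED_CF_GBP_SIZE": 2000, "TARGET_CF_GBP_SIZE": 2000,
    "CURRENT_CF_LOCK_SIZE": 500, "CONFIGURED_CF_LOCK_SIZE": 500, "TARGET_CF_LOCK_SIZE": 500,
    "CURRENT_CF_SCA_SIZE": 250, "CONFIGURED_CF_SCA_SIZE": 1000, "TARGET_CF_SCA_SIZE": 250,
}

report = cf_memory_report(sample_row)
for area, info in report.items():
    print(area, info)
```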
Monitoring – Summary and other views
• Summary views provide useful statistics based on data returned by the table functions
– List of summary views
  – MON_PKG_CACHE_SUMMARY, MON_SERVICE_SUBCLASS_SUMMARY, MON_WORKLOAD_SUMMARY, MON_CONNECTION_SUMMARY, MON_DB_SUMMARY, MON_BP_UTILIZATION, MON_TBSP_UTILIZATION
– Some useful columns
  – TOTAL_BP_HIT_RATIO_PERCENT, TOTAL_GBP_HIT_RATIO_PERCENT, CF_WAIT_TIME_PERCENT
• ENV_CF_SYS_RESOURCES
– Returns cluster caching facility system resource information, in row format
– Columns
  – ID – CF identifier
  – NAME – Resource name
  – VALUE – The value of the named resource
  – DATATYPE – The data type of the value (example: BIGINT)
  – UNIT – The data unit of the value, if applicable (example: 4K)
Monitoring – Miscellaneous
• Write-to-table event monitors restart automatically in member-down scenarios
• MONREPORT modules will work in pureScale
• Unformatted event table formatters
– Format binary data from the event table into consumable information
– EVMON_FORMAT_UE_TO_XML
– EVMON_FORMAT_UE_TO_STREAM
– EVMON_FORMAT_UE_TO_TABLES
• XML column-to-row formatters
– Each metric is presented as a single row of a table
– FORMAT_XML_WAIT_TIMES_BY_ROW
– FORMAT_XML_COMPONENT_TIMES_BY_ROW
– FORMAT_XML_TIMES_BY_ROW
– FORMAT_XML_METRICS_BY_ROW