IA B47 - NetBackup Performance Tuning: Lessons From the Field

David Smiley Senior Principal Business Critical Engineer

NetBackup Performance Tuning: Lessons From The Field SYMANTEC VISION 2013

Agenda

1. NetBackup Architecture and Scaling
2. Server Choices
3. NetBackup Tuning
4. OS Tuning
5. Partnership with BCS

Business Critical Service Plans – At a Glance

Business Critical Services

• Advanced Access: Top-of-queue rapid reactive response
• Remote Product Specialist: Direct access to a named technical guru
• Premier: A customized, comprehensive mission-critical service solution delivered by a dedicated support team
• Dedicated Residency Services: Dedicated onsite technical expert
• Managed Enterprise Vault: End-to-end management of Enterprise Vault technology and data
• Managed Back Up: End-to-end management of backup technology and data

NetBackup – Basic Architecture – How Big?

• Master
– 86,400 jobs per day (based on 1 job per second)
– Catalog size of approximately 750GB
• Media
– 150+ per Master
– No real limit on disk STUs
– 256 tape drives per Media Server
– LAN/HBA/PCIe limits are variable
– Number of Clients based on bandwidth
• Clients
– No hard limit per Media Server
– Limits based on bandwidth and backup window

Server Choice Overview - Master

• Master specifications are based on:
– Number of Clients
– Amount of data being backed up
– Number of tape drives
– Amount of disk in DSUs
– Number of Media Servers being managed
– Number of jobs per day

As Always – “It Depends”

Rule of Thumb:
• Multiple physical CPUs and cores
• Match RAM to cores (2GB per core)

Server Choices – Media Server

• Media specifications are based on:
– Bandwidth coming into server (normally LAN)
– Amount of data being moved
– What else is running
– Tuning goals
– Bandwidth out to tape drives/disk

Again – “It Depends”

Rule of Thumb:
• Multiple physical CPUs and cores are still good
• More RAM for tuning buffers
• PCIe slots are critical for I/O
• Fast LAN and/or HBAs are critical

Basic NBU Server Choices - Examples

• Master or Media Servers:
– Windows/Linux: HP ProLiant DL580G8
   • 4 CPUs with up to 8 cores on each
   • Up to 1TB RAM
   • 11 PCIe Expansion Slots
– SUN Unix: M Series
   • Configuration based on need
   • Better than early T-Series for Master
   • T4/5 can be used effectively with OS tuning

Most modern servers work great as a Master or Media Server. Before choosing hardware, test the server for Sybase performance if possible.

Advanced NBU Servers – Recommendations

• What if you want to run MSDP?
– Same CPU recommendations as a normal media server, BUT you need 1GB RAM for each TB of disk storage for cache
– Dedupe hashing does not really put much load on a modern server
– Avoid T2 and T3 Niagara chip servers as MSDP media servers
• What if you want to run MSEO?
– MSEO puts a great deal of load on the CPUs
– If you want to use it, make sure to use fast CPUs and as many of them as you can allocate
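The two RAM rules of thumb from the server slides (2GB per core for a Master, 1GB per TB of disk storage for MSDP cache) can be turned into a quick sizing check. This is only a sketch: the function names and the 64TB pool size are illustrative assumptions, and the deck's "It Depends" caveat still applies.

```python
def master_ram_gb(cores, gb_per_core=2):
    """RAM for a Master using the 2GB-per-core rule of thumb."""
    return cores * gb_per_core


def msdp_cache_ram_gb(storage_tb, gb_per_tb=1):
    """RAM needed for MSDP cache: 1GB for each TB of disk storage."""
    return storage_tb * gb_per_tb


# 4 CPUs x 8 cores, as in the DL580 example; 64TB is a hypothetical MSDP pool.
print(master_ram_gb(4 * 8))     # 64
print(msdp_cache_ram_gb(64))    # 64
```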

How To Move A LOT of Data?

[Diagram with the labels: SAN Media Server, SAN Client, FT Media Server, 10GbE Link, Media Server]

Tuning – What does it mean?

• Tuning is more than playing with buffers
– It is about making sure that the path, end to end, is adequate
• What works for you may not work for someone else
• What if you don’t tune correctly?
– Performance can actually be reduced from NBU defaults
– Performance issues where the speeds do not match the expectations
– Incorrect hardware purchases to solve problems

The bottom line? It is all about making use of the available bandwidth

Tuning – NetBackup

• Out of the box, NBU is partially tuned, but it needs more
• No exact recommendation. Testing is needed for high performance
• Think of data as a pool of water and buffers as buckets

Tuning – NetBackup – How To

• Increase the size and number of the “buckets”
– SIZE_DATA_BUFFERS
– NUMBER_DATA_BUFFERS
– SIZE_DATA_BUFFERS_DISK
– NUMBER_DATA_BUFFERS_DISK
– NET_BUFFER_SZ
– NET_BUFFER_SZ_REST
– A few others, but these are the “bang for the buck” settings
• Use at least 5GB of “real” data for tuning
• NBU Performance Tuning Guide: http://www.symantec.com/docs/doc4483
• Buffers and how they work: http://www.symantec.com/doc/TECH1724
• The defaults in 7.5 are 256KB for SIZE_DATA_BUFFERS and 30 for NUMBER_DATA_BUFFERS
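Bigger and more numerous buckets cost shared memory on the media server. The estimate below multiplies buffer size by buffer count by drives by multiplexing level. That formula is not on the slides; it is the commonly cited approximation and should be treated as an assumption to verify against the tuning guide. The "tuned" values are hypothetical.

```python
def buffer_shared_memory_bytes(size_data_buffers, number_data_buffers,
                               drives=1, mpx=1):
    """Approximate shared memory used by data buffers (assumed formula)."""
    return size_data_buffers * number_data_buffers * drives * mpx


# 7.5 defaults from the slide: 256KB buffers, 30 of them, one drive, no MPX.
default_bytes = buffer_shared_memory_bytes(256 * 1024, 30)
print(default_bytes)   # 7864320 bytes, i.e. 7.5MB per drive

# Hypothetical tuned values: 1MB buffers, 256 of them, 4 drives.
tuned_bytes = buffer_shared_memory_bytes(1024 * 1024, 256, drives=4)
print(tuned_bytes)     # 1073741824 bytes, i.e. 1GB
```

This is why the Media Server rule of thumb asks for more RAM and why the OS shared memory limits matter once the buffers are raised.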

Tuning – NetBackup – Closer Look

• SIZE_DATA_BUFFERS
– NetBackup will dump more data into the bucket before “emptying” it to the tape drive. On a fast system with fast drives, this makes streaming better
• NUMBER_DATA_BUFFERS
– NetBackup will have more buffers to fill, so while one is “dumping” it can be filling others. Very important when using MPX and Multi-Streaming
• Tune the Size first, then work on the Number

Tuning – NetBackup – Options to check speeds

• Use log files, not Activity Monitor
– The bptm log can be searched with grep for Kbytes
• Unix
– grep/awk/sed are very nice tools
• Windows
– Get Textpad to handle larger log files and parse them
– Find command is similar to grep

Tuning – NetBackup – Actual Tuning

– grep -i "kbytes/sec" in the bptm log (or a bperror log)
– grep -i "waited for" in the bptm log
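The "waited for" search can be summarized with a short script instead of reading the grep output by eye. This is a sketch only: the log line layout in `sample` and the regular expression are assumptions about what bptm writes, so check them against a real log before trusting the totals.

```python
import re

# Assumed bptm wording: "waited for full|empty buffer N times, delayed M times"
WAIT_RE = re.compile(
    r"waited for (full|empty) buffer (\d+) times, delayed (\d+) times",
    re.IGNORECASE,
)


def summarize_waits(lines):
    """Total the wait and delay counters per buffer type."""
    totals = {"full": {"waited": 0, "delayed": 0},
              "empty": {"waited": 0, "delayed": 0}}
    for line in lines:
        match = WAIT_RE.search(line)
        if match:
            kind = match.group(1).lower()
            totals[kind]["waited"] += int(match.group(2))
            totals[kind]["delayed"] += int(match.group(3))
    return totals


# Hypothetical log lines for illustration.
sample = [
    "12:00:01 [1234] <2> write_data: waited for full buffer 10 times, delayed 300 times",
    "12:00:01 [1234] <2> fill_buffer: waited for empty buffer 2 times, delayed 5 times",
    "12:00:02 [1234] <2> some unrelated line",
]
print(summarize_waits(sample))
```

Comparing these totals before and after each buffer change gives a repeatable measure alongside the kbytes/sec figure.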

NBU Buffer Tuning – Increasing VM Backup Performance

NUMBER_DATA_BUFFERS_DISK   SIZE_DATA_BUFFERS_DISK   Avg. VM Throughput (MB/Sec)   Total VM Throughput (MB/Sec)
DEFAULT                    DEFAULT                  24.167                        386.7
128                        262144                   24.034                        384.5
256                        262144                   24.378                        390.28
256                        524288                   28.150                        450.41
512                        1048576                  28.60                         457.54
1024                       524288                   28.31                         453.02
1024                       1048576                  27.47                         439.56
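To read the VM backup results quickly, the total throughput figures can be compared against the default row. The numbers below are copied from the table; the variable names are illustrative.

```python
# (NUMBER_DATA_BUFFERS_DISK, SIZE_DATA_BUFFERS_DISK, total VM throughput MB/sec)
results = [
    ("DEFAULT", "DEFAULT", 386.7),
    (128, 262144, 384.5),
    (256, 262144, 390.28),
    (256, 524288, 450.41),
    (512, 1048576, 457.54),
    (1024, 524288, 453.02),
    (1024, 1048576, 439.56),
]

baseline = results[0][2]
best = max(results, key=lambda row: row[2])
gain_pct = (best[2] - baseline) / baseline * 100

print(best[:2], round(gain_pct, 1))   # (512, 1048576) 18.3
```

The best run is about 18% faster than the defaults, while the largest combination (1024 x 1048576) is slower than the best one, which matches the point that testing is needed rather than simply raising every value.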

NET_BUFFER_SZ and NET_BUFFER_SZ_REST

• This is an often overlooked tunable setting. The default is only 32KB on most UNIX platforms.
• These settings only affect backups, restores and duplications going over a network, not SAN client or local media server backups
• On Windows this defaults to:
– For backup jobs: (<SIZE_DATA_BUFFERS> * 4) + 1024
– For restore jobs: (<SIZE_DATA_BUFFERS> * 2) + 1024
• Setting this to at least 262144 is advised
• Use a SIZE_DATA_BUFFERS of at least 262144 on Windows and then let this value auto set
• Duplication performance is also affected by NET_BUFFER_SZ_REST and dramatic gains are possible
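The Windows default formulas can be worked through for the advised SIZE_DATA_BUFFERS of 262144. The function name is an illustrative assumption; the arithmetic is taken directly from the slide.

```python
def windows_default_net_buffer(size_data_buffers, restore=False):
    """Windows default network buffer size, per the formulas on the slide."""
    factor = 2 if restore else 4
    return size_data_buffers * factor + 1024


print(windows_default_net_buffer(262144))                # 1049600 (backup)
print(windows_default_net_buffer(262144, restore=True))  # 525312  (restore)
```

Both results are well above the advised minimum of 262144, which is why letting the value auto set works once SIZE_DATA_BUFFERS is at least 262144.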

Tuning for UNIX Client Backups using GEN_DATA

• Set of directives to test throughput and performance in a repeatable fashion
• Reduces impact on client but does use client’s network path

GEN_DATA
   Start of directives for generating test data. Any subsequent file list entries, other than NEW_STREAM and GEN* entries, will be ignored.
GEN_KBSIZE=1
   Specify the size in KB of each generated file.
GEN_MAXFILES=1
   Specify the total number of files to generate.
GEN_PERCENT_RANDOM=0
   Specify the amount of the generated file's data that should be random. This affects the compressibility of the generated data, with a value of 0 resulting in completely compressible data, and a value of 100 resulting in uncompressible data.
GEN_PERCENT_INCR=100
   Specify the percentage of the total files that will be generated for an incremental backup.

• Further tunables for testing de-duplication workloads
• http://www.symantec.com/docs/TECH75213
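A small helper can build the GEN_DATA file list entries so that test runs stay repeatable. This is a sketch: the function name and the chosen sizes are assumptions, and only the directive names come from the slide.

```python
def gen_data_file_list(kbsize, maxfiles, percent_random=0, percent_incr=100):
    """Return GEN_DATA directives as policy file list entries."""
    for name, value in (("percent_random", percent_random),
                        ("percent_incr", percent_incr)):
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    return [
        "GEN_DATA",
        f"GEN_KBSIZE={kbsize}",
        f"GEN_MAXFILES={maxfiles}",
        f"GEN_PERCENT_RANDOM={percent_random}",
        f"GEN_PERCENT_INCR={percent_incr}",
    ]


# Hypothetical test: 5120 files of 1024KB each (5GB), fully uncompressible.
entries = gen_data_file_list(kbsize=1024, maxfiles=5120, percent_random=100)
print("\n".join(entries))
```

Using 100 for GEN_PERCENT_RANDOM keeps compression from flattering the throughput numbers.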

Windows Client NetBackup Tuning

• Communications buffer size is also HKLM\Software\Veritas\NetBackup\CurrentVersion\Config\Buffer_Size

• Raw partition read buffer size is for FlashBackup as well as Raw partition backups on Windows

Adjusting Batch Size for sending MetaData to the Catalog

• Can be used to tune problems with backing up file systems that contain many files, and with file adds into the catalog that exceed the bpbrm timeout
– /usr/openv/netbackup/MAX_FILES_PER_ADD: affects all backups, default is 5,000
– /usr/openv/netbackup/FBU_MAX_FILES_PER_ADD: affects FlashBackup, default is 95,000
– /usr/openv/netbackup/CAT_BU_MAX_FILES_PER_ADD: affects catalog backups, default is the maximum allowed, 100,000
• http://www.symantec.com/docs/HOWTO56209

SAN Client Tuning

• By default, a SAN Client supports a maximum of two Fibre Transport ports at any given time; this allows FT Media Servers to balance an I/O load fairly among multiple SAN Clients. To change this:
– nbftconfig -changeclient -np 4 -C <clientName>
• By default, one Fibre Transport port can only be used by up to two different SAN Clients at any given time; this prevents oversubscribing an FT Media Server port to multiple clients. To change this:
– nbftconfig -setconfig -ncp 4
• Windows services don’t allow enough time for the SAN client service to start. Change this to 90,000 (90 seconds) rather than the default of 30 seconds
– \\HKLM\SYSTEM\CurrentControlSet\Control\ServicesPipeTimeout
• On FT media servers, using a NUMBER_DATA_BUFFERS above 16 may not yield performance improvements and may cause backup failures.
– http://www.symantec.com/docs/HOWTO56200
• Use NUMBER_DATA_BUFFERS_FT to set this value for just FT backups. The default is 16 for tape and 12 for disk.
• It is recommended not to go above 320 for “FT pipes x data buffers”, which means 20 pipes with NUMBER_DATA_BUFFERS_FT=16. Linux has problems above 20 pipes
• Best Practices Document: http://www.symantec.com/docs/TECH54778
• Troubleshooting Guide: http://www.symantec.com/docs/TECH51454
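The “FT pipes x data buffers” ceiling of 320 is easy to check before changing NUMBER_DATA_BUFFERS_FT. The helper names below are illustrative assumptions; the limit and the defaults (16 for tape, 12 for disk) come from the slide.

```python
FT_PIPE_BUFFER_LIMIT = 320  # recommended ceiling for pipes x data buffers


def max_ft_pipes(number_data_buffers_ft):
    """Largest pipe count that stays within the recommended ceiling."""
    return FT_PIPE_BUFFER_LIMIT // number_data_buffers_ft


def within_ft_limit(pipes, number_data_buffers_ft):
    """True if pipes x buffers does not exceed the recommended ceiling."""
    return pipes * number_data_buffers_ft <= FT_PIPE_BUFFER_LIMIT


print(max_ft_pipes(16))         # 20 (tape default)
print(max_ft_pipes(12))         # 26 (disk default)
print(within_ft_limit(21, 16))  # False
```

Note that on Linux the separate caution about going above 20 pipes still applies even where the arithmetic allows more.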

Tuning nbrb for Resource Utilization

• The NetBackup Resource Broker (nbrb) handles granting resources to backups, restores and duplications

• If it is not tuned correctly, long delays can occur in jobs going from queued to active

• Good technote on tuning nbrb: http://www.symantec.com/docs/TECH137761

• nbrb.conf settings are moved into EMM in 7.1 and above, and nbrbutil -listSettings is used to view them.

• These settings should be reviewed after upgrading to 7.1, paying special attention to RESPECT_REQUEST_PRIORITY and DO_INTERMITTENT_UNLOADS.

• BREAK_EVAL_ON_DEMAND is a relatively new setting and should also be considered

Tuning server.conf Part 1

• On UNIX, /usr/openv/var/global/server.conf is the configuration file used by the Sybase ASA 11 database in 7.x (and in 6.x, which uses ASA 9)

• On Windows, by default the file is C:\Program Files\Veritas\Netbackup\var\global\server.conf

• The amount of memory that server.conf is set to use by default is rather low

• The -ch value in the file should be increased until the /usr/openv/db/log/server.log file no longer shows the cache increasing up to the limit set in server.conf
– Grep/find for “adjusting cache” to see how much cache Sybase is using
– server.log is in C:\Program Files\Veritas\NetbackupDB\log by default on Windows

• As a rule of thumb, -ch can also be set to 30% of system RAM

• -c and -cl can be increased to 500MB as well

• Here is a good tech note on server.conf tuning. This applies to OpsCenter’s Sybase ASA database as well. The only note is that the -gn setting no longer gets set explicitly in 7.x and above, but it can be added to allow more concurrent connections to Sybase: http://www.symantec.com/docs/HOWTO33625

Tuning server.conf Part 2

• The default -gn value in ASA 11 is 20, and that is often too low for busy masters (it was 10 in ASA 9 and below)

• To see if the master is using more than 20 concurrent connections to Sybase, run “grep terminated /usr/openv/db/log/server.log” and see if there are messages like:

W. 02/12 09:45:32. All threads were blocked when waiting to send or receive. A connection has been terminated. Increasing -gn may prevent this in the future.
I. 02/12 09:45:32. Connection terminated abnormally
I. 02/12 09:45:32. Disconnected SharedMemory client's AppInfo: IP=10.10.10.1;HOST=test;OSUSER=root;OS='SunOS 5.10 Generic_147440-06';EXE=/opt/openv/netbackup/bin/bpjobd;PID=0x1cb8;THREAD=0x7;VERSION=11.0.1.2753;API=ODBC;TIMEZONEADJUSTMENT=-360

• This is more common in 7.5 because Sybase now serves the DBM_DATA.db, JOBD_DATA.db and SEARCH_DATA.db databases in addition to the EMM_DATA.db and NBDB.db databases served in previous versions

• Several large customers have experienced this since going to 7.5

• Connections will retry, so this typically doesn’t cause a hard error but results in slowness
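Counting the -gn warnings over time shows whether the connection limit is hit occasionally or constantly. The sketch below scans log lines for the warning text quoted on the slide; the function name is an assumption, and the sample lines are shortened copies of the excerpt above.

```python
GN_HINT = "Increasing -gn may prevent this"


def count_gn_warnings(lines):
    """Count server.log lines where Sybase suggests raising -gn."""
    return sum(1 for line in lines if GN_HINT in line)


sample_log = [
    "W. 02/12 09:45:32. All threads were blocked when waiting to send or "
    "receive. A connection has been terminated. Increasing -gn may prevent "
    "this in the future.",
    "I. 02/12 09:45:32. Connection terminated abnormally",
]
print(count_gn_warnings(sample_log))   # 1
```

On a real master the lines would come from /usr/openv/db/log/server.log, for example by passing an open file object to `count_gn_warnings`.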

Tuning server.conf Part 3

• The transaction logs in nbrb.log can grow quite large and eventually cause issues with NBU operations. This typically only happens if catalog backups are not performed for an extended period, but it can also happen if the system is very busy

• To prevent this transaction log growth, a -m option can be added at the end of server.conf, after the -ud option on the last line. This automatically truncates and commits the transaction logs when a checkpoint is done, which happens many times a day.

• This setting is automatic with new 7.5 installations but may not get set during upgrades to 7.5, or in 7.1 or below.

• The -m setting is described in http://www.symantec.com/docs/HOWTO67149

• It can also be set via the Sybase admin CLI using this tech note: http://www.symantec.com/docs/HOWTO33588

Tuning emm.conf

• UNIX: /usr/openv/var/global/emm.conf is the configuration file used by nbemm, the Enterprise Media Manager.

• Windows: <install_path>\NetBackup\var\global

• With the default settings in emm.conf (or with the file not present), even a few admins opening the Device Manager in the GUI or running commands can exceed the number of connections. The default for DB browse connections is only 3 and for DB connections is 4!

• For large environments the following settings are recommended as a minimum:
– NUM_DB_BROWSE_CONNECTIONS=20
– NUM_DB_CONNECTIONS=21
– NUM_ORB_THREADS=31

• This makes it important that the EMM database in /usr/openv/db/data is on really fast disk, and it is often advisable to have it on a separate disk from the image catalog and any logging.

• Tech note on how these settings and nbrb settings can affect jobs getting resources and going active: http://www.symantec.com/docs/TECH57277
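An emm.conf can be checked against the recommended minimums with a few lines of code. This is a sketch under assumptions: the file is treated as simple NAME=value lines (as the settings are written on the slide), lines starting with "#" are skipped as comments, and a missing setting is counted as below the minimum.

```python
RECOMMENDED_MINIMUMS = {
    "NUM_DB_BROWSE_CONNECTIONS": 20,
    "NUM_DB_CONNECTIONS": 21,
    "NUM_ORB_THREADS": 31,
}


def below_minimum(emm_conf_text):
    """Return the settings that are missing or under the recommended minimum."""
    values = {}
    for raw in emm_conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        values[name.strip()] = int(value.strip())
    return sorted(name for name, minimum in RECOMMENDED_MINIMUMS.items()
                  if values.get(name, 0) < minimum)


# Hypothetical file contents for illustration.
example = "NUM_DB_BROWSE_CONNECTIONS=20\nNUM_DB_CONNECTIONS=4\n"
print(below_minimum(example))   # ['NUM_DB_CONNECTIONS', 'NUM_ORB_THREADS']
```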

Disk Layout Considerations

• Setting up separate file systems/disk spindles for the following components will improve the performance on large masters:

1. Unified logs

2. Catalog flat file components (in particular the image database)

3. Catalog relational database data files

4. Catalog relational database index files

5. Catalog relational database transaction logs

• http://www.symantec.com/docs/TECH144969

• Use ln -s on UNIX and the mklink command on Windows 2008

• Put databases and log files on a RAID-protected file system with the right balance of performance and redundancy

• Consider block size as well as keeping disk access times as low as possible

• To curb catalog growth, consider compression and archiving

• Consider SSD for relational databases, which are relatively small

Tuning LIFECYCLE_PARAMETERS for SLP and AIR

• Edit /usr/openv/netbackup/db/config/LIFECYCLE_PARAMETERS to modify the way SLP and AIR perform. If the file does not exist, defaults are used.

• MAX_MINUTES_TIL_FORCE_SMALL_DUPLICATION_JOB: 30 minute default, but may need to be reduced if most duplications are small backups

• IMAGE_EXTENDED_RETRY_PERIOD_IN_HOURS: 2 hour default. If duplications are being troubleshot and fail 3 times, it will take 2 hours before they are tried again.

• DUPLICATION_GROUP_CRITERIA: 1 is the default in 6.5.5 and above and allows multiple SLPs of the same priority in a job together

• TAPE_RESOURCE_MULTIPLIER: 2 is the default in 6.5.6 and above. It means that if there are, say, 3 write drives available, 6 SLP jobs will be queued, so that there are enough queued jobs at any given time to keep all the drives spinning. The old default was 1 and was not configurable

• More details on these settings and many other configuration tips for SLP can be found here: http://www.symantec.com/docs/TECH153154

Solaris 10 Part 1

• Disable tcp_fusion: in /etc/system add set ip:do_tcp_fusion=0

• Decrease tcp_time_wait_interval to 20,000 in a Sol10 project or in /etc/init.d by adding a startup file
– Get the current value using ndd -get /dev/tcp tcp_time_wait_interval; however, this will not show values set in a project file

• Increase the number of file descriptors to at least 8192; for 7.5, 65536 is recommended. Use ulimit -a to determine the current limit. This can be raised on a per-project basis or using /etc/system
– projadd -U NetBackup -K "process.max-file-descriptor=(priv,65536,deny)" user.nbu
– set rlim_fd_max = 65536 in /etc/system

• Increase the amount of shared memory available for NBU, especially on media servers, using set shmsys:shminfo_shmmax= one half of system memory (or higher if NBU is the only application)

• http://www.symantec.com/docs/TECH63229

Solaris 10 Part 2

Kernel Tuning

Old /etc/system Tunable   Solaris 10 project resource   Tuning
msgsys:msginfo_msgmnb     process.max-msg-qbytes        65536
msgsys:msginfo_msgmni     project.max-msg-ids           16384
msgsys:msginfo_msgtql     process.max-msg-messages      -
semsys:seminfo_semmni     project.max-sem-ids           8192
semsys:seminfo_semmsl     process.max-sem-nsems         -
semsys:seminfo_semopm     process.max-sem-ops           -
shmsys:shminfo_shmmax     project.max-shm-memory        half RAM
shmsys:shminfo_shmmni     project.max-shm-ids           8192

Linux

• Increase the number of file descriptors to at least 8192; again, 65536 is recommended with 7.5. Use ulimit -a to determine the current limit. This can be raised in /etc/security/limits.conf:
– * hard nofile 65536 (can be tuned to unlimited as well)
– * soft nofile 65536

• Increase the amount of shared memory available for NBU, especially on media servers, by editing /etc/sysctl.conf and adding or modifying kernel.shmmax= half or more of physical RAM

• These minimums are also required for other kernel parameters; customers with busy master/media servers often end up with higher values

• http://www.symantec.com/docs/TECH28934

Message Queues   Semaphores
msgmax=65536     semmsl = 300
msgmnb=65536     semmns = 1024
msgmni=16384     semopm = 32
                 semmni = 1024
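The Linux minimums can be rendered as an /etc/sysctl.conf fragment. This is a sketch: the function name and the 64GB server are assumptions, and it relies on the standard Linux sysctl layout in which kernel.sem takes the four values in the order semmsl, semmns, semopm, semmni.

```python
def sysctl_fragment(ram_bytes):
    """Build sysctl.conf lines from the slide's minimums and half of RAM."""
    return [
        f"kernel.shmmax = {ram_bytes // 2}",   # half of physical RAM
        "kernel.msgmax = 65536",
        "kernel.msgmnb = 65536",
        "kernel.msgmni = 16384",
        "kernel.sem = 300 1024 32 1024",       # semmsl semmns semopm semmni
    ]


# Hypothetical media server with 64GB of RAM.
fragment = sysctl_fragment(64 * 1024**3)
print("\n".join(fragment))
```

Busy servers often need values above these minimums, so treat the output as a starting point rather than a final configuration.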

Partnership with BCS

• Proactive tuning of common NBU components

• Deep knowledge of how NBU interacts with the OS and how to tune the OS

• Collaboration on the right time to migrate to new hardware for NBU servers

• Hardware migration and master rename proactive services

• Upgrade proactive services

• Guidance on how to implement new NBU features

• Timely notice of important fixes and known issues to avoid

• Guidance on when to patch and when to “hold”


Thank you!

• David Smiley

• 703-869-3183