DATA DOMAIN-OPTIMIZED BACKUP USING NETWORKER 8.1

Gururaj Kulkarni, Software QA Manager, EMC

Soumya Gupta, Senior Software QA Engineer, EMC


2014 EMC Proven Professional Knowledge Sharing

Table of Contents

Introduction

Test Approach

Test Results

Best Practices

Disclaimer: The views, processes, or methodologies published in this article are those of the authors. They do not necessarily reflect EMC Corporation’s views, processes, or methodologies.


Introduction

EMC NetWorker®, an enterprise-class backup and recovery solution, is three-tiered software: the NetWorker Server (which coordinates the entire backup and recovery process and tracks the metadata), the NetWorker Client (which hosts the data to be backed up), and the NetWorker Storage Node (which connects to diverse storage devices and writes and reads data).

EMC Data Domain® Deduplication Storage System is a storage appliance that is revolutionizing disk backup, archiving, and disaster recovery with high-speed, inline deduplication.

Two important new features were introduced in NetWorker 8.1 with Data Domain integration. This article summarizes the performance benefits gained from both features.

Data Domain Boost Over Fibre Channel

Customers using NetWorker with Virtual Tape Libraries (including EDL, Data Domain as VTL, and others), autochangers, or tape devices as their backup solution have been unable to transition to NetWorker backup to disk with Data Domain, because they have a dedicated Fibre Channel environment and Data Domain devices supported data transfer only over TCP/IP.

This article describes the new feature introduced in NetWorker 8.1 in which NetWorker clients and storage nodes support Fibre Channel connectivity to Data Domain devices for backup and recovery operations, leveraging the Fibre Channel capability available in the DD Boost 2.6 library.

This support not only optimizes customers’ existing investment in their Fibre Channel infrastructure but also offers both client-side deduplication and support for the Fibre Channel protocol in a backup-to-disk workflow.

• Backup via Boost over Fibre Channel with Client Direct is 20-25% faster than backup via Data Domain VTL.

• DFA-Recover throughput via Fibre Channel is 2.5x faster than recover throughput via Data Domain VTL.

Virtual Synthetics

In the current Synthetic Full (SF) backup feature, data is read from the DDR by a NetWorker process, which then sends it back to the same DDR. This increases both the time needed to synthesize the save set and the network bandwidth used.


This article describes the new feature introduced in NetWorker 8.1 in which Virtual Synthetic Full (VSF) backups are an out-of-the-box integration with NetWorker, making it ‘self-aware.’ Therefore, if a customer uses a Data Domain system as the backup target, NetWorker uses VSF as the default backup workflow when a synthetic full backup is scheduled, thus optimizing incremental backups for file systems.

VSF backups reduce the processing overhead associated with traditional synthetic full backups by using metadata on the Data Domain system to synthesize a full backup without moving data across the network.
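The idea of synthesizing a full backup from metadata can be illustrated with a toy model. The sketch below is not NetWorker or Data Domain code; the `Backup` type, the `synthesize_full` function, and the reference strings are hypothetical names introduced only for illustration. It assumes each backup is a map from file path to a reference to data already held on the deduplication target, and it ignores file deletions.

```python
from typing import Dict, List

# Toy model (assumption): a backup is a map from file path to an opaque
# reference to data already stored on the deduplication target.
# None of these names are NetWorker or Data Domain APIs.
Backup = Dict[str, str]


def synthesize_full(full: Backup, incrementals: List[Backup]) -> Backup:
    """Build a new full backup from references only, oldest to newest."""
    synthetic = dict(full)  # copies references, not file data
    for incr in incrementals:
        synthetic.update(incr)  # newer versions replace older references
    return synthetic


full = {"/data/a.db": "ref-a1", "/data/b.log": "ref-b1"}
incr1 = {"/data/b.log": "ref-b2"}  # b.log changed after the full
incr2 = {"/data/c.cfg": "ref-c1"}  # c.cfg was added later

virtual_full = synthesize_full(full, [incr1, incr2])
print(virtual_full)
```

In this model only references are combined, which is why no file data has to cross the network; a traditional synthetic full would instead read the referenced data and write it back.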

In the tests reported here, VSF backup is about 21x faster than a traditional full backup and about 29x faster than a Synthetic Full (SF) backup.

Test Approach

• The NetWorker Server, Storage Node, and Clients were installed with NetWorker dev.Build.6064 for Data Domain over Fibre Channel feature testing and NetWorker dev.Build.6297 for Virtual Synthetics feature testing.

• All tests were carried out on a Windows NetWorker Server platform.

• A network speed of 1GB and a Fibre Channel speed of 4GB were maintained in the setup.

• The Data Domain system was initialized to a “zero” state before each scenario.

• Tests were carried out via a Dedicated Storage Node (DSN) with a Boost over Fibre Channel device.

• Synthetic Full and Virtual Synthetic Full backups were run for all scenarios.

Test Results

Test results for both features (Data Domain Boost over Fibre Channel and Virtual Synthetic Full Backup) are shown in the following charts.


Data Domain Boost over Fibre Channel

A Client Direct backup via a Dedicated Storage Node with Boost over Fibre Channel was carried out.

• Backup via Boost over Fibre Channel with Client Direct is 20-25% faster than backup via Data Domain VTL.

• The subsequent full backup is 3x faster than the first full backup.

Resource Utilization

Memory and CPU usage by a single save process during Client Direct backup over Fibre Channel are shown below.

[Figure: Memory Consumption by Single Save Session during Client Direct Backup (DFA-FC). Memory usage in MB by OS platform: Linux 49, Windows 54.]

[Figure: CPU Usage by Single Save Session during Client Direct Backup (DFA-FC). CPU usage by OS platform: Linux 15, Windows 15.]


Recovery of 1TB of data was carried out. Recovery via Boost over Fibre Channel is 2.5x faster than recovery via Data Domain VTL.
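The recovery figures can be turned into approximate throughput numbers. The sketch below uses the recover times from the "Recover Time for 1TB" chart (1.4 hours over Fibre Channel, 3.6 hours via DD-VTL) and assumes binary units (1 TB = 1024 * 1024 MB), which the article does not state; the function name is illustrative.

```python
MB_PER_TB = 1024 * 1024  # assumption: binary units, not stated in the article


def throughput_mb_per_s(size_tb: float, hours: float) -> float:
    """Average throughput in MB/s for moving size_tb terabytes in the given hours."""
    return size_tb * MB_PER_TB / (hours * 3600)


# Recover times in hours for 1 TB, as read from the chart.
recover_hours = {"FC": 1.4, "DD-VTL": 3.6}
rates = {path: throughput_mb_per_s(1.0, h) for path, h in recover_hours.items()}

for path, rate in rates.items():
    print(f"{path}: {rate:.0f} MB/s")
print(f"FC vs DD-VTL: {rates['FC'] / rates['DD-VTL']:.2f}x")
```

Under these assumptions the ratio comes out slightly above 2.5, in line with the 2.5x figure quoted in the text.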

Virtual Synthetic Full Backup

Results of Virtual Synthetic Full Backup

The results figure depicts backups of a Medium Density FS (2360 files), with 1%, 2%, and 5% more data added across 3 incremental backups.

File size ranged from 200MB to 4GB.

Database file size ranged from 10GB to 35GB.

The Next Full was run with Virtual Synthetic Full, Traditional Full, and Synthetic Full. Results are given in the last column (Next FULL) of the results chart.

[Figure: Recover Time for 1TB. Time in hours: FC 1.4, DD-VTL 3.6.]

[Figure: Time taken in minutes per backup: Full 475, incr1 13, incr2 33, incr5 77, Next FULL 7.]

Time to synthesize a full backup is almost negligible using VSF.


• Virtual Synthetic Full backup is 21x faster than Traditional Full backup.

• Virtual Synthetic Full backup is 29x faster than Synthetic Full backup.
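These factors can be reproduced from the backup times reported in the "Backup Time for Medium Density FS of 4 TB" chart (7, 150, and 205 minutes). The short sketch below simply divides each time by the VSF time; the variable names are illustrative.

```python
# Backup times in minutes, as read from the chart for the 4 TB medium density FS.
backup_minutes = {
    "Virtual Synthetic Full": 7,
    "Traditional Backup": 150,
    "Synthetic Full": 205,
}

vsf_minutes = backup_minutes["Virtual Synthetic Full"]

# Speedup of VSF relative to each other backup level.
speedups = {
    level: minutes / vsf_minutes
    for level, minutes in backup_minutes.items()
    if level != "Virtual Synthetic Full"
}

for level, factor in speedups.items():
    print(f"VSF is about {factor:.0f}x faster than {level}")
```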

[Figure: Backup Time for Medium Density FS of 4 TB. Backup time in minutes by backup level: Virtual Synthetic Full 7, Traditional Backup 150, Synthetic Full 205.]

Note: This chart is for representation purposes only and is not intended to compare the feature with L0 (traditional) backup and Synthetic Full backup.


• As file size increases, Virtual Synthetic Full backup throughput increases.

• As the number of files increases, Virtual Synthetic Full backup throughput decreases.

Resource Utilization

Memory Usage

Memory usage by the nsrrecopy process on the Linux and Windows Storage Nodes and by the nsrconsolidate process on the Windows Server is shown below.

• Memory usage by nsrrecopy during Virtual Synthetic Full backup is about 3x lower than during Synthetic Full backup.

• Memory usage by the nsrrecopy process is the same during the backup of a High Density File System and a Medium Density File System.

[Figure: VSF Backup Throughput for MDF vs HDF. Throughput in MB/s: HDF (file size = 10KB) 144, HDF (file size = 100KB) 1389, MDF (file size = 200MB - 4GB) 8959.]

[Figure: Memory Usage per nsrrecopy on Windows Storage Node (32GB RAM). Memory in MB. MDF: VSF 35, SF 80. HDF: VSF 27, SF 85.]

[Figure: Memory Usage per nsrrecopy on Linux Storage Node (8GB RAM). Memory in MB. MDF: VSF 20, SF 72. HDF: VSF 16, SF 70.]


• Memory usage by the nsrconsolidate process on the server is the same during both Virtual Synthetic Full and Synthetic Full backup.

• Memory usage by the nsrconsolidate process on the server is the same during the backup of a Medium Density File System and a High Density File System.

CPU Usage

CPU usage by the nsrconsolidate process on the Windows NetWorker Server and by the nsrrecopy process on both the Linux and Windows Storage Nodes is shown below.

CPU usage by the nsrconsolidate process is higher during the High Density FS Virtual Synthetic Full backup because the index processing is done by the nsrconsolidate process.

[Figure: Memory Usage per nsrconsolidate Process on Windows Server (32GB RAM). Memory in MB. MDF: VSF 42, SF 41. HDF: VSF 43, SF 43.]

[Figure: User CPU Usage by nsrconsolidate on Windows Server (2 CPU @ 2.13 GHz). % CPU usage. MDF: VSF 1, SF 1. HDF: VSF 8, SF 1.]


Best Practices

• Recover throughput via Boost over Fibre Channel is 2.5x faster than recover throughput via Data Domain VTL.

• Virtual Synthetic Full backup throughput varies with the size of the files backed up. As file size increases, Virtual Synthetic Full backup throughput also increases.

• Virtual Synthetic Full backup time varies with the type of file system (high density or medium density). Virtual Synthetic Full backup of a Medium Density File System performs better than Virtual Synthetic Full backup of a High Density File System.

• During Virtual Synthetic Full backup, processing of data is off-loaded to Data Domain. Hence, resource utilization on the NetWorker Storage Node is significantly lower.

[Figure: User CPU Usage by nsrrecopy on Windows Storage Node (2 CPU @ 2.13 GHz). % CPU usage. MDF: VSF 20, SF 27. HDF: VSF 20, SF 31.]

[Figure: CPU Usage by nsrrecopy on Linux Storage Node (8 CPU @ ... GHz). % CPU usage. MDF: VSF 12, SF 10. HDF: VSF 12, SF 10.]


EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice. THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.