ZZ-99HPP999-99
Hitachi Protection Platform S-Series
Network Performance Data Collection
Rev 1.1
© 2015 Hitachi, Ltd. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying and recording, or stored in a database or retrieval system for any purpose without the express written permission of Hitachi, Ltd. and Hitachi Data Systems Corporation (hereinafter referred to as “Hitachi”).
Hitachi, Ltd. and Hitachi Data Systems reserve the right to make changes to this document at any time without notice and assume no responsibility for its use. Hitachi, Ltd. and Hitachi Data Systems products and services can only be ordered under the terms and conditions of Hitachi Data Systems' applicable agreements.
All of the features described in this document may not be currently available. Refer to the most recent product announcement or contact your local Hitachi Data Systems sales office for information on feature and product availability.
Notice: Hitachi Data Systems products and services can be ordered only under the terms and conditions of Hitachi Data Systems’ applicable agreements. The use of Hitachi Data Systems products is governed by the terms of your agreements with Hitachi Data Systems.
By using this software, you agree that you are responsible for:
1. Acquiring the relevant consents as may be required under local privacy laws or otherwise from employees and other individuals to access relevant data; and
2. Verifying that data continues to be held, retrieved, deleted, or otherwise processed in
accordance with relevant laws.
Hitachi is a registered trademark of Hitachi, Ltd. in the United States and other countries. Hitachi Data Systems is a registered trademark and service mark of Hitachi in the United States and other countries. All other trademarks, service marks, and company names are properties of their respective owners.
Export authorization is required for Data At Rest Encryption. Import/Use regulations may restrict export of the S-Series platform to certain countries:
China – S-Series is eligible for import but the License Key and encryption may not be sent to China
France – Import pending completion of registration formalities Hong Kong – Import pending completion of registration formalities Israel – Import pending completion of registration formalities
Table of Contents

Introduction
    Testing/Evaluation prerequisite
    Capturing session output for later review
    Grading network performance
Data Gathering
    Basic data gathering
    Using ‘sar’ and ‘scp’
    Using ‘iperf’
    Using Graphical-based tools
Network triage
    Bonding issues
    Firewalls
    Potential duplicate IP address
    End-to-end connectivity
Appendix A: CLI command usage statements
    ping
    tracepath
    iftop
    iperf
    ethtool
    nethogs
    net_perf_mon
Appendix B: Installing ‘iperf’
Appendix C: Bonding modes
Appendix D: Tools
Introduction
This document provides a list of tests and procedures to be run on each HPP system at the site to gather networking-related performance statistics.
This document is intended for personnel who assist in installing, maintaining, and managing the HPP platform in the backup environment.
Testing/Evaluation prerequisite

Before performing any of this testing, the customer should provide a network diagram at both the LAN and WAN level. The diagram should list the IP addresses and DNS names of all HPP systems and media servers.
Capturing session output for later review

Most (if not all) of the tests outlined in this document should be run from a Secure Shell (SSH) session, usually via the ‘PuTTY’ application. So that a Global Support engineer can review the data for analysis at a later time, all session output must be captured to a file on a local desktop.
To effect the capture with the ‘PuTTY’ SSH application, use the following steps:

1) Bring up the main ‘PuTTY’ window (not a session window)
2) On the ‘Category:’ tree, click on “Session Logging”.
3) Under ‘Session logging:’, click on the radio button “All session output”.
4) Select a log file name. The default is ‘putty.log’ but it is advisable to use a unique name. Be aware, if an existing ‘putty.log’ file exists on your desktop it will be overwritten, not appended.
5) On the ‘Category:’ tree, click on “Session”, connect to your target system, and run the procedure.
6) When the session is complete get the putty.log file from your desktop and archive it for analysis.
Grading network performance

In general terms, network performance can be gauged by the average percentage of line rate achieved in the tests:
Ratings apply to one LAN and no more than two interconnecting switches.
Any connections that would receive an ‘F’ grade should be investigated by the customer’s networking team and improved or at least satisfactorily explained. Any connections that would receive a ‘D’ grade should be reviewed by the customer’s networking team for any possible improvements.
        % Line Rate          MB/sec
Grade   Min     Max     1Gb network   10Gb network
A       90.0    100     112-125       1120-1250
B       80.0    89.9    100-112       1000-1120
C       70.0    79.9    87-100        870-1000
D       60.0    69.9    75-87         750-870
F       0.0     59.9    <75           <750
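The percent-of-line-rate arithmetic behind the table can be scripted. The following is a small sketch (not part of the HPP toolset); it assumes 125 MB/s as the line rate of a 1Gb link and 1250 MB/s for a 10Gb link.

```shell
# Hypothetical helper: convert a measured MB/s into % of line rate and a grade.
# Usage: grade <measured_MB_per_sec> <line_rate_MB_per_sec>  (125 for 1Gb, 1250 for 10Gb)
grade() {
  awk -v mbs="$1" -v line="$2" 'BEGIN {
    pct = mbs / line * 100
    if      (pct >= 90) g = "A"
    else if (pct >= 80) g = "B"
    else if (pct >= 70) g = "C"
    else if (pct >= 60) g = "D"
    else                g = "F"
    printf "%.1f%% of line rate: grade %s\n", pct, g
  }'
}

grade 62.5 125     # 62.5 MB/s on a 1Gb link -> 50.0% of line rate: grade F
grade 1120 1250    # 1120 MB/s on a 10Gb link -> 89.6% of line rate: grade B
```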
Data Gathering
This section specifies a number of methods that can be used to gather network performance data on HPP systems:
- Via a combination of the Linux ‘sar’ and ‘scp’ commands
- Via the public-domain tool ‘iperf’
- Via a graphical-based tool
In addition, a set of basic commands should be run to document the current configuration of the networking subsystem on the HPP system.
Basic data gathering

The following basic commands document the current configuration of the networking subsystem on the HPP system.
Steps:
1. Retrieve a full listing of the Ethernet devices:

ifconfig -a

2. Retrieve the kernel interface table for all bonds and interfaces:

netstat -i

3. Retrieve the state of all IP sockets:

netstat -a | more

4. Retrieve per-networking-protocol statistics:

netstat -s | more

5. Retrieve all active networking connections:

netstat -o | more
6. Retrieve the address resolution protocol (ARP) table:

arp -a | more

7. Retrieve the routing tables:

route

8. Check for TCP and UDP connection issues:

netstat -tunp

9. Retrieve network interface statistics:

ethtool -S eth#
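The nine commands above can be collected in one pass. The following wrapper is a sketch, not a supplied HPP utility: it appends each command's output to a single log file (handy for the PuTTY capture described earlier) and skips any command not installed on the system.

```shell
# Sketch: run the inventory commands and append each output to one log,
# marking each section with a '=====' header line.
LOG="net_config_capture.log"
: > "$LOG"                       # start a fresh log

run() {
  echo "===== $* =====" >> "$LOG"
  if command -v "$1" > /dev/null 2>&1; then
    "$@" >> "$LOG" 2>&1
  else
    echo "(command not available on this system)" >> "$LOG"
  fi
}

run ifconfig -a
run netstat -i
run netstat -s
run arp -a
run route
run netstat -tunp
```

On an HPP node the remaining per-interface commands (e.g. ‘ethtool -S eth#’) would be added for each configured NIC.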
Using ‘sar’ and ‘scp’

The system activity recorder (‘sar’) and secure copy (‘scp’) can be used together for simple tests over LAN (and WAN, where applicable) links to media servers.

On the source HPP system, use ‘scp’ to transfer a file over the desired ethX port and measure performance from the application. For example, run the ‘scp’ test from node0’s replication IP on HPP system “A” to node0’s replication IP on HPP system “B”. Continue with all slave nodes that are configured for I/O.
NOTE: If running ‘scp’ to Windows servers, you must install an SCP-compliant tool such as the free ‘WinSCP’:
http://winscp.net/eng/download.php
Steps:
1. Generate a 500MB data file:
dd if=/dev/zero of=/dev/shm/scptemp bs=1024 count=500K
This creates the 500MB data file in a RAM disk, which avoids the read latency of an on-disk data file during the transfer part of the test.
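The size arithmetic of the ‘dd’ command can be checked directly: ‘count=500K’ means 500 × 1024 = 512,000 blocks, each ‘bs=1024’ bytes.

```shell
# Block-count arithmetic for the dd command above:
echo $((500 * 1024))           # 512000 blocks of 1 KiB each
echo $((500 * 1024 * 1024))    # 524288000 bytes, the "524 MB" dd reports
```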
2. On the target, run the following to measure performance throughput on all eth# replication-duplication ports for each node.
sar –n DEV 1
This will display networking statistics (‘-n’) once every second (‘1’) for each networking ‘DEV’ice (i.e., each NIC).
3. Use SCP to copy and measure throughput:
scp /dev/shm/scptemp root@remoteIPaddress:/dev/null
Run this test from node0 and all processing nodes configured for I/O. If you notice a “stall” status from ‘scp’, it may indicate a problem with the customer’s network.
4. Watch the ‘sar’ output on the target system’s processing nodes. Pay attention to the ‘rxkB/s’ (received KB/s) field.
For example, if using ‘mahpp01’ as the source HPP system and ‘mahpp02’ as the “remote” HPP system, communicating over a 10Gb NIC (eth4 on both systems):
MAHPP01
[root@mahpp01 ~]# dd if=/dev/zero of=/dev/shm/scptemp bs=1024 count=500K
512000+0 records in
512000+0 records out
524288000 bytes (524 MB) copied, 0.394121 seconds, 1.3 GB/s
[root@mahpp01 ~]# df -k
[root@mahpp01 ~]# scp /dev/shm/scptemp [email protected]:/dev/null
scptemp 100% 500MB 62.5MB/s 00:08
[root@mahpp01 ~]#
MAHPP02
[root@mahpp02 ~]# sar -n DEV 1 | grep eth4
IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s
15:50:03 eth4 0.00 0.00 0.00 0.00 0.00 0.00 0.00
15:50:04 eth4 0.00 0.00 0.00 0.00 0.00 0.00 0.00
15:50:05 eth4 0.00 0.00 0.00 0.00 0.00 0.00 0.00
:
15:51:07 eth4 1.00 0.00 0.06 0.00 0.00 0.00 0.00
15:51:08 eth4 13.00 12.00 2.22 2.40 0.00 0.00 0.00
15:51:09 eth4 42645.00 7657.00 59268.57 427.65 0.00 0.00 0.00
15:51:10 eth4 50145.00 4727.00 69699.98 276.93 0.00 0.00 0.00
15:51:11 eth4 49572.00 4726.00 68903.79 276.88 0.00 0.00 0.00
15:51:12 eth4 49512.00 4721.00 68820.39 276.62 0.00 0.00 0.00
15:51:13 eth4 49536.00 4725.00 68853.75 276.78 0.00 0.00 0.00
15:51:14 eth4 49656.00 4737.00 69020.55 277.55 0.00 0.00 0.00
15:51:15 eth4 49621.00 4730.00 68970.75 277.09 0.00 0.00 0.00
15:51:16 eth4 43336.00 4156.00 60214.78 244.32 0.00 0.00 1.00
15:51:17 eth4 1.00 0.00 0.06 0.00 0.00 0.00 0.00
15:51:18 eth4 1.00 0.00 0.06 0.00 0.00 0.00 0.00
This shows a network “speed” of 62.5MB/sec (or 500Mb/sec) [ignoring the first and last statistics since they are most likely partial] on a single connection. Multiple, simultaneous runs of ‘scp’ can be used to achieve higher bandwidth. For example, twenty (20) simultaneous ‘scp’s were able to transfer data at an approximate combined transfer rate of 770MB/sec (6160Mb/sec).
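The multi-stream run mentioned above can be sketched as a simple loop. The target IP and stream count below are placeholder assumptions; ‘DRY_RUN=echo’ only prints the commands, so remove it on a real system to perform the actual transfers.

```shell
# Sketch: launch N simultaneous scp streams of the RAM-disk test file.
STREAMS=20
TARGET="root@10.124.2.20"      # hypothetical remote replication IP
DRY_RUN=echo                   # remove this line to run the real transfers

launched=0
for i in $(seq 1 "$STREAMS"); do
  $DRY_RUN scp /dev/shm/scptemp "$TARGET:/dev/null" &
  launched=$((launched + 1))
done
wait                           # block until all streams complete
echo "launched $launched scp streams"
```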
Using ‘iperf’

‘iperf’ uses TCP port 5001 by default. As a target, you can use the IP address of the remote HPP system or of the media servers you want to test against. The minimum test should run from each HPP node to the corresponding HPP node across the WAN: from site “A” node0, run ‘iperf’ to site “B” node0, then again in the reverse direction so performance can be compared bi-directionally. Continue with the same methodology from site “A” node1 to site “B” node1, and so forth.

‘iperf’ must be installed on each processing node of an HPP system. Run it with the “-s” option on the server side and with “-c” on the client side.
NOTES:
1) The ‘iperf64’ binary obtained from the FTP site is compiled to run on the HPP appliance (i.e., a 64-bit Red Hat-based Linux OS). Binaries for other operating systems (Windows, SUSE, AIX, etc.) must be compiled separately.
2) An ‘iperf’ server process can handle only one connection at a time. You cannot run parallel ‘iperf’ tests to the same server process.
For reference on iperf see the following: http://openmaniak.com/iperf.php
Steps:
1. Install ‘iperf’ on each node of the HPP system. See Appendix B for instructions.
2. On all processing nodes of the HPP system, you must temporarily shut down the firewall:
service iptables stop
3. Start the ‘iperf’ server process on all nodes. First run the command on node0 (the master node). Then ‘ssh’ into each of the other processing nodes from node0 and run the same command:
nohup iperf64 -s &
4. On all processing nodes, run ‘iperf’ in client mode, specifying the IP address of each processing node of the remote HPP system:
/root/iperf64 -c destinationIPaddress -i 1

For example: /root/iperf64 -c 10.124.2.20 -i 1
If you want to monitor for network errors, use: sar -n EDEV 1
5. Once testing is complete, on each node of both HPP systems:
a. Find the iperf server process with the following:
ps -ef | grep iperf
b. Kill the process:
kill -9 pid
6. On all processing nodes of the HPP system, restart the firewall:
service iptables start
Examples:
[root@DCD01-SEPT01 ~]# ./iperf64 -c 10.152.44.69
------------------------------------------------------------
Client connecting to 10.152.44.69, TCP port 5001
TCP window size: 8.34 MByte (default)
------------------------------------------------------------
[ 3] local 10.152.44.72 port 58179 connected with 10.152.44.69 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 9.90 GBytes 8.50 Gbits/sec
[root@DCD01-SEPT01 ~]#
Print stats every second, continuously:
[root@DCD01-SEPT02 ~]# ./iperf64 -c 10.152.44.69 -i 1
------------------------------------------------------------
Client connecting to 10.152.44.69, TCP port 5001
TCP window size: 8.34 MByte (default)
------------------------------------------------------------
[ 3] local 10.152.44.60 port 53251 connected with 10.152.44.69 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0- 1.0 sec 635 MBytes 5.32 Gbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 1.0- 2.0 sec 663 MBytes 5.56 Gbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 2.0- 3.0 sec 767 MBytes 6.44 Gbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 3.0- 4.0 sec 841 MBytes 7.06 Gbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 4.0- 5.0 sec 867 MBytes 7.27 Gbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 5.0- 6.0 sec 895 MBytes 7.51 Gbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 6.0- 7.0 sec 868 MBytes 7.28 Gbits/sec
[ ID] Interval Transfer Bandwidth
[root@DCD01-SEPT02 ~]#
Change TCP window size from OS default to 512KB, print stats every 5 seconds for 60 seconds (1 minute).
[root@node1 systest]# ./iperf64 -c 172.16.22.36 -w 524288 -i 5 -t 60
------------------------------------------------------------
Client connecting to 172.16.22.36, TCP port 5001
TCP window size: 1.00 MByte (WARNING: requested 512 KByte)
------------------------------------------------------------
[ 3] local 172.16.22.28 port 54066 connected with 172.16.22.36 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0- 5.0 sec 4.31 GBytes 7.40 Gbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 5.0-10.0 sec 4.71 GBytes 8.08 Gbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 10.0-15.0 sec 5.09 GBytes 8.74 Gbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 15.0-20.0 sec 4.46 GBytes 7.66 Gbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 20.0-25.0 sec 4.65 GBytes 7.99 Gbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 25.0-30.0 sec 4.91 GBytes 8.44 Gbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 30.0-35.0 sec 4.83 GBytes 8.29 Gbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 35.0-40.0 sec 5.06 GBytes 8.69 Gbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 40.0-45.0 sec 4.94 GBytes 8.49 Gbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 45.0-50.0 sec 5.16 GBytes 8.87 Gbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 50.0-55.0 sec 4.77 GBytes 8.20 Gbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 55.0-60.0 sec 4.90 GBytes 8.42 Gbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-60.0 sec 57.8 GBytes 8.27 Gbits/sec
[root@node1 systest]#
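The per-node sweep described in steps 3 and 4 can be scripted. This is a sketch only: the remote node IPs are placeholder assumptions, and the actual ‘iperf64’ invocation is left commented out so the loop structure can be seen (and run) without a live HPP system.

```shell
# Sketch: run the client test against every remote processing node in turn,
# keeping each result in its own log file.
REMOTE_NODES="10.124.2.20 10.124.2.21 10.124.2.22"   # hypothetical node IPs

tested=0
for ip in $REMOTE_NODES; do
  echo "=== iperf client test to $ip ==="
  # On a live system, uncomment the next line:
  # /root/iperf64 -c "$ip" -i 1 | tee "iperf_${ip}.log"
  tested=$((tested + 1))
done
echo "ran $tested client tests"
```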
Using Graphical-based tools

The following are more advanced routines, some product-specific, that can be run to gather additional information for the triage process. Note, though, that these are real-time tools whose output must be recorded with a screen shot.
Steps:
1. Monitor interface statistics in real time:

nethogs -m

This tool is most useful with the ‘-m’ flag, which shows totals received and the rate in KB/s.

2. Display a real-time graph of network performance:

net_perf_mon

3. Display bandwidth usage on an interface, by host:

iftop -n -N -p -P -i interface
Network triage
This section details some data-gathering steps for issues involving:
- Network interface bonding
- Communications through firewalls
- Potential duplicate IP addresses
- Basic end-to-end connectivity
Bonding issues

The HPP system has five predefined potential NIC bonding scenarios. If a bonding issue is suspected, the following commands should help to determine the configuration.

Name    Devices                               Description                        Mode
bond0   eth2/eth3                             HPP’s private backend network      1/active-backup
                                              (not used on S1500 systems)
bond1   eth4/eth5 (legacy), eth8/eth9 (HPP)   10Gb NIC ports                     4/LACP
bond2   eth4/eth5                             ports 1 and 2 of quad-1Gb NIC      4/LACP
bond3   eth6/eth7                             ports 3 and 4 of quad-1Gb NIC      4/LACP
bond4   eth4/eth5/eth6/eth7                   all four ports of quad-1Gb NIC     4/LACP
Steps:
1. Check the current OS settings for the bond:
cat /proc/net/bonding/bond#
i. Verify the bonding mode matches what is in the table above.
ii. Verify that the speed and duplex of the slave interfaces match each other.
iii. Verify that the ‘MII status’ is ‘up’.
2. Check the configured settings for the bond:
cat /etc/sysconfig/network-scripts/ifcfg-bond#
Verify the bonding mode matches what is in the table above.
3. Check the ARP table:
arp -e -v
4. Check the individual link statuses of the bond and its slave NICs:
ip link show bond# ; ip link show nic1 ; ip link show nic2
For example, assume we have an issue around the bonded 10Gb NIC ports of a G8 processing node (where the 10Gb ports are ‘eth8’ and ‘eth9’):
[root@mahpp01 bonding]# cat /proc/net/bonding/bond1
Ethernet Channel Bonding Driver: v3.7.0 (June 2, 2010)
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: eth2 (primary reselect always)
Currently Active Slave: eth2
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 45000
Down Delay (ms): 0
Slave Interface: eth8
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: ac:16:2d:78:af:da
Slave queue ID: 0
Slave Interface: eth9
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: ac:16:2d:78:af:db
Slave queue ID: 0
[root@mahpp01 bonding]#
[root@mahpp01 ~]# cat /etc/sysconfig/network-scripts/ifcfg-bond1
DEVICE=bond1
BOOTPROTO=static
IPADDR=10.1.1.135
NETMASK=255.255.255.0
ONBOOT=YES
BONDING_OPTS="mode=4 ad_select=bandwidth lacp_rate=fast
xmit_hash_policy=layer3+4 miimon=100 updelay=45000"
[root@mahpp01 ~]#
[root@mahpp01 ~]# arp -e -v
Address       HWtype  HWaddress          Flags Mask  Iface
dvmnode       ether   00:23:7d:34:87:08  C           bond0
192.168.16.1  ether   50:57:a8:5a:11:c5  C           eth0
Entries: 2 Skipped: 0 Found: 2
[root@mahpp01 ~]#
[root@mahpp01 ~]# ip link show bond1 ; ip link show eth8 ; ip link show eth9
15: bond1: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
13: eth8: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
link/ether 10:60:4b:94:c3:38 brd ff:ff:ff:ff:ff:ff
14: eth9: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
link/ether 10:60:4b:94:c3:3c brd ff:ff:ff:ff:ff:ff
[root@mahpp01 ~]#
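The fields checked in step 1 can be pulled out of the bond status file with a single ‘grep’. The sketch below runs against a saved sample copy so it can be tried anywhere; on a live system, point ‘BOND_FILE’ at /proc/net/bonding/bond1 instead.

```shell
# Sketch: extract the key triage fields from a bond status file.
BOND_FILE="bond1.sample"   # stand-in for /proc/net/bonding/bond1
cat > "$BOND_FILE" <<'EOF'
Bonding Mode: fault-tolerance (active-backup)
MII Status: up
Slave Interface: eth8
Speed: 1000 Mbps
Duplex: full
Slave Interface: eth9
Speed: 1000 Mbps
Duplex: full
EOF

# Keep only the lines an engineer compares against the bonding table.
SUMMARY=$(grep -E '^(Bonding Mode|MII Status|Slave Interface|Speed|Duplex):' "$BOND_FILE")
echo "$SUMMARY"
```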
Firewalls

Communications between the HPP system and other HPP systems at the site, as well as between the HPP system and other customer systems (e.g., NetBackup media servers), must be tested. The typical TCP ports to test are:

22    Secure Shell (ssh)
80    HPP GUI (http)
443   HPP GUI (secure http)
5562  OST master service
5564  OST I/O service
Version 8.1 and later
Use ‘nc’ to test that all required TCP ports are open through any firewalls that may exist between the HPP system(s) and other network hosts:
nc -v -z -w 5 destinationIPaddress port#
For example, to test that the ‘ssh’ port is open:
[root@mahpp01 ~]# nc -v -z -w 5 10.200.9.210 22
Connection to 10.200.9.210 22 port [tcp/ssh] succeeded!
[root@mahpp01 ~]#
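All five required ports can be swept in one loop. The destination IP below is a placeholder assumption, and the timeout is shortened to 2 seconds to keep a full sweep quick; the per-port open/blocked summary is what you would archive in the session log.

```shell
# Sketch: check each required HPP port against one destination host.
# 'nc -z' exits 0 when the port accepts a connection.
DEST="10.200.9.210"            # hypothetical destination

open=0; blocked=0
for port in 22 80 443 5562 5564; do
  if nc -z -w 2 "$DEST" "$port" 2>/dev/null; then
    echo "port $port: open"
    open=$((open + 1))
  else
    echo "port $port: closed or filtered"
    blocked=$((blocked + 1))
  fi
done
echo "$open open, $blocked blocked of 5 ports tested"
```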
Version 8.0 and earlier
Use ‘telnet’ to test that all required TCP ports are open through any firewalls that may exist between the HPP system(s) and other network hosts:
telnet destinationIPaddress port#
When the command is successful, it will appear to hang; it is actually waiting for input. In some cases (e.g., port 22 [ssh]), simply hitting the ‘Enter’ key should be enough to cause the command to exit (in ssh’s case, with an expected protocol-mismatch error message). For others, it may be necessary to type ‘Ctrl-C’ or ‘Ctrl-D’ to get telnet to terminate.
When the command is NOT successful, it should return almost immediately with an error.
For example, this shows a successful test that the ‘ssh’ port is open:
[root@mahpp01 ~]# telnet 10.200.9.210 22
Trying 10.200.9.210...
Connected to 10.200.9.210.
Escape character is '^]'.
SSH-2.0-OpenSSH_5.4
Protocol mismatch.
Connection closed by foreign host.
[root@mahpp01 ~]#
The following shows two failed tests:
[root@mahpp02 ~]# telnet 172.22.17.151 80
Trying 172.22.17.151...
telnet: connect to address 172.22.17.151: No route to host
[root@mahpp02 ~]# ping -c 3 172.22.17.151
PING 172.22.17.151 (172.22.17.151) 56(84) bytes of data.
From 172.22.17.140 icmp_seq=1 Destination Host Unreachable
From 172.22.17.140 icmp_seq=2 Destination Host Unreachable
From 172.22.17.140 icmp_seq=3 Destination Host Unreachable
--- 172.22.17.151 ping statistics ---
3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 1999ms
, pipe 3
[root@mahpp02 ~]# telnet 172.22.17.25 80
Trying 172.22.17.25...
telnet: connect to address 172.22.17.25: Connection refused
[root@mahpp02 ~]# ping -c 3 172.22.17.25
PING 172.22.17.25 (172.22.17.25) 56(84) bytes of data.
64 bytes from 172.22.17.25: icmp_seq=1 ttl=64 time=0.134 ms
64 bytes from 172.22.17.25: icmp_seq=2 ttl=64 time=0.153 ms
64 bytes from 172.22.17.25: icmp_seq=3 ttl=64 time=0.154 ms
--- 172.22.17.25 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1999ms
rtt min/avg/max/mdev = 0.134/0.147/0.154/0.009 ms
[root@mahpp02 ~]#
Potential duplicate IP address

If there are unexplained network issues after using the regular methods listed above, there may be a duplicate IP address on the network. To check for that condition, do the following:
Steps:
1. Run the ‘arping’ command:
arping -D -I eth# -c 2 ipAddress ; echo $?
where ‘ipAddress’ is the IP address to check for duplicates, ‘-c 2’ is the number of probes to send, and ‘-I eth#’ names the interface, which should be the one to which the address is assigned.
2. Examine the return/exit value of ‘arping’ (i.e., the shell variable '$?'):
0 = no duplication
1 = duplicated
For example, assume we suspect that the IP address assigned to ‘eth0’ of processing node #1 is duplicated someplace else on the customer’s network. The following would be run on NODE1:
[root@node1 ~]# ifconfig eth0
eth0 Link encap:Ethernet HWaddr D4:85:64:78:2B:58
inet addr:172.22.17.141 Bcast:172.22.31.255 Mask:255.255.240.0
inet6 addr: fe80::d685:64ff:fe78:2b58/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:368587 errors:0 dropped:0 overruns:0 frame:0
TX packets:9076 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:31809284 (30.3 MiB) TX bytes:682695 (666.6 KiB)
Interrupt:16 Memory:f4000000-f4012800
[root@node1 ~]# arping -D -I eth0 -c 2 172.22.17.141 ; echo $?
ARPING 172.22.17.141 from 0.0.0.0 eth0
Sent 2 probes (2 broadcast(s))
Received 0 response(s)
0
[root@node1 ~]#
In this case, the address is NOT duplicated.
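The exit-code check can be wrapped so the result is printed as a message. In this sketch, ‘true’ and ‘false’ stand in for ‘arping’ runs that did and did not find a duplicate (the real probe, which needs root and a live interface, would be: check_dup arping -D -I eth0 -c 2 172.22.17.141).

```shell
# Sketch: turn the probe's exit status into a human-readable verdict.
check_dup() {
  if "$@"; then
    echo "no duplicate detected"       # probe exited 0
  else
    echo "DUPLICATE address detected"  # probe exited nonzero
  fi
}

check_dup true     # stands in for an arping run that found no duplicate
check_dup false    # stands in for an arping run that found a duplicate
```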
End-to-end connectivity

Use ‘ping’ to test basic end-to-end connectivity. Any excessively high values need to be investigated; note that the definition of ‘excessively high’ depends on the network topology and on whether the test runs over LAN and/or WAN links.
If any of the pings fail:
- Verify with the customer the IP address and/or DNS-resolvable host name used
- Use the ‘tracepath’ utility to see where things may be failing
For example,
[root@mahpp01 ~]# ping -c 3 10.200.9.230
PING 10.200.9.230 (10.200.9.230) 56(84) bytes of data.
From 192.168.11.254 icmp_seq=3 Destination Net Unreachable
--- 10.200.9.230 ping statistics ---
3 packets transmitted, 0 received, +1 errors, 100% packet loss, time 2000ms
[root@mahpp01 ~]# tracepath 10.200.9.230
1: 172.22.17.181 (172.22.17.181) 0.292ms pmtu 1500
1: 192.168.16.1 (192.168.16.1) 0.715ms
2: 192.168.11.254 (192.168.11.254) 2.222ms
3: no reply
4: 192.168.11.254 (192.168.11.254) asymm 2 11.615ms !N
Resume: pmtu 1500
[root@mahpp01 ~]#
Steps:
1. Verify that a default gateway has been configured:
route -n
2. Ping the default gateway using the IP address.
3. Verify that DNS server(s) are defined:
cat /etc/resolv.conf
4. Ping the DNS servers.
5. If the HPP system will be replicating to another HPP system:
   a. Obtain from the customer the IP address and host name (resolvable by DNS) of that other HPP system
   b. Ping the IP address of that other system
   c. Ping the DNS-resolvable host name of the other HPP system
   d. Repeat step 5 on the other HPP system using this HPP system’s IP address and DNS-resolvable host name.
6. If the Symantec Netbackup (NBU) Open-Storage Technology (OST) protocol will be used to communicate with the HPP system:
a. Obtain from the customer the IP address and host name (resolvable by DNS) of all NBU servers that will communicate with the HPP system via OST
   b. Ping the IP address of each NBU server
   c. Ping the DNS-resolvable host name of each NBU server
For example:
Verify gateway configured
[root@karloff ~]# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 172.22.16.1 0.0.0.0 UG 0 0 0 eth0
10.153.0.0 0.0.0.0 255.255.0.0 U 0 0 0 bond0
10.154.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth8
169.254.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth8
172.22.16.0 0.0.0.0 255.255.240.0 U 0 0 0 eth0
172.28.22.0 0.0.0.0 255.255.255.0 U 0 0 0 eth7
[root@karloff ~]#
In this case, a default gateway is configured and its IP address is: 172.22.16.1
Ping gateway
[root@karloff ~]# ping -c 5 172.22.16.1
PING 172.22.16.1 (172.22.16.1) 56(84) bytes of data.
64 bytes from 172.22.16.1: icmp_seq=1 ttl=255 time=0.697 ms
64 bytes from 172.22.16.1: icmp_seq=2 ttl=255 time=0.687 ms
64 bytes from 172.22.16.1: icmp_seq=3 ttl=255 time=1.11 ms
64 bytes from 172.22.16.1: icmp_seq=4 ttl=255 time=0.655 ms
64 bytes from 172.22.16.1: icmp_seq=5 ttl=255 time=0.665 ms
--- 172.22.16.1 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4001ms
rtt min/avg/max/mdev = 0.655/0.763/1.115/0.179 ms
[root@karloff ~]#
Verify DNS servers configured
[root@karloff ~]# cat /etc/resolv.conf
domain sepaton.com
search sepaton.com sangate.com
nameserver 172.22.0.10
nameserver 172.22.0.11
[root@karloff ~]#
In this case, two DNS servers are configured and their IP addresses are: 172.22.0.10 and 172.22.0.11
Ping DNS server
[root@karloff ~]# ping -c 5 172.22.0.10
PING 172.22.0.10 (172.22.0.10) 56(84) bytes of data.
64 bytes from 172.22.0.10: icmp_seq=1 ttl=63 time=0.172 ms
64 bytes from 172.22.0.10: icmp_seq=2 ttl=63 time=0.179 ms
64 bytes from 172.22.0.10: icmp_seq=3 ttl=63 time=0.184 ms
64 bytes from 172.22.0.10: icmp_seq=4 ttl=63 time=0.165 ms
64 bytes from 172.22.0.10: icmp_seq=5 ttl=63 time=0.152 ms
--- 172.22.0.10 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4000ms
rtt min/avg/max/mdev = 0.152/0.170/0.184/0.016 ms
[root@karloff ~]#
Ping OST storage server
[root@karloff ~]# ping -c 5 ssdnaught10gn0
PING ssdnaught10gn0 (172.28.22.140) 56(84) bytes of data.
64 bytes from ssdnaught10gn0 (172.28.22.140): icmp_seq=1 ttl=64 time=0.221 ms
64 bytes from ssdnaught10gn0 (172.28.22.140): icmp_seq=2 ttl=64 time=0.153 ms
64 bytes from ssdnaught10gn0 (172.28.22.140): icmp_seq=3 ttl=64 time=0.170 ms
64 bytes from ssdnaught10gn0 (172.28.22.140): icmp_seq=4 ttl=64 time=0.182 ms
64 bytes from ssdnaught10gn0 (172.28.22.140): icmp_seq=5 ttl=64 time=0.163 ms
--- ssdnaught10gn0 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4000ms
rtt min/avg/max/mdev = 0.153/0.177/0.221/0.028 ms
[root@karloff ~]# ping -c 5 172.28.22.140
PING 172.28.22.140 (172.28.22.140) 56(84) bytes of data.
64 bytes from 172.28.22.140: icmp_seq=1 ttl=64 time=0.213 ms
64 bytes from 172.28.22.140: icmp_seq=2 ttl=64 time=0.153 ms
64 bytes from 172.28.22.140: icmp_seq=3 ttl=64 time=0.149 ms
64 bytes from 172.28.22.140: icmp_seq=4 ttl=64 time=0.125 ms
64 bytes from 172.28.22.140: icmp_seq=5 ttl=64 time=0.156 ms
--- 172.28.22.140 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 3999ms
rtt min/avg/max/mdev = 0.125/0.159/0.213/0.030 ms
[root@karloff ~]#
Ping NBU media server
[root@karloff ~]# ping -c 5 sylvester
PING sylvester (172.22.17.24) 56(84) bytes of data.
64 bytes from sylvester (172.22.17.24): icmp_seq=1 ttl=64 time=2.03 ms
64 bytes from sylvester (172.22.17.24): icmp_seq=2 ttl=64 time=0.419 ms
64 bytes from sylvester (172.22.17.24): icmp_seq=3 ttl=64 time=0.314 ms
64 bytes from sylvester (172.22.17.24): icmp_seq=4 ttl=64 time=0.219 ms
64 bytes from sylvester (172.22.17.24): icmp_seq=5 ttl=64 time=0.381 ms
--- sylvester ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4000ms
rtt min/avg/max/mdev = 0.219/0.674/2.039/0.686 ms
[root@karloff ~]#
Appendix A: CLI command usage statements
ping Usage: ping [-LRUbdfnqrvVaA] [-c count] [-i interval] [-w deadline]
[-p pattern] [-s packetsize] [-t ttl] [-I interface or address]
[-M mtu discovery hint] [-S sndbuf]
[ -T timestamp option ] [ -Q tos ] [hop1 ...] destination
tracepath Usage: tracepath [-nc] <destination>[/<port>]
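Beyond plain reachability, tracepath confirms the per-hop route and path MTU to the same targets pinged earlier. A minimal sketch, reusing the example hosts from this document (substitute your own); the loop only prints the commands to run, so it is safe to execute anywhere:

```shell
# Hosts taken from the ping examples in this document; substitute your own.
DNS_SERVER=172.22.0.10
OST_SERVER=ssdnaught10gn0

# -n suppresses reverse-DNS lookups so each hop prints without delay.
for host in "$DNS_SERVER" "$OST_SERVER"; do
    echo "tracepath -n $host"
done
```

On a jumbo-frame link, a reported pmtu below 9000 points at a switch or interface in the path configured with a smaller MTU.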
iftop iftop: display bandwidth usage on an interface by host
Synopsis: iftop -h | [-npbBP] [-i interface] [-f filter code] [-N net/mask]
-h display this message
-n don't do hostname lookups
-N don't convert port numbers to services
-p run in promiscuous mode (show traffic between other
hosts on the same network segment)
-b don't display a bar graph of traffic
-B Display bandwidth in bytes
-i interface listen on named interface
-f filter code use filter code to select packets to count
(default: none, but only IP packets are counted)
-F net/mask show traffic flows in/out of network
-P show ports as well as hosts
-m limit sets the upper limit for the bandwidth scale
-c config file specifies an alternative configuration file
iftop, version 0.17
copyright (c) 2002 Paul Warren <[email protected]> and contributors
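A typical invocation for watching live backup traffic might look like the following. The interface name 'eth2' is an assumption (confirm the real name with ifconfig first), and the sketch only prints the command:

```shell
IFACE=eth2   # assumed 10GbE data interface; verify the real name first
# -n/-N : skip DNS and service-name lookups (less overhead on a busy link)
# -P    : show ports, to tell OST, NFS, and replication flows apart
# -B    : report in bytes, which compares directly with backup throughput
echo "iftop -i $IFACE -n -N -P -B"
```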
A
20
Hitachi Protection Platform S-Series Network Performance Data Collection
iperf Usage: iperf [-s|-c host] [options]
iperf [-h|--help] [-v|--version]
Client/Server:
-f, --format [kmKM] format to report: Kbits, Mbits, KBytes, MBytes
-i, --interval # seconds between periodic bandwidth reports
-l, --len #[KM] length of buffer to read or write (default 8 KB)
-m, --print_mss print TCP maximum segment size (MTU - TCP/IP
header)
-o, --output <filename> output the report or error message to this
specified file
-p, --port # server port to listen on/connect to
-u, --udp use UDP rather than TCP
-w, --window #[KM] TCP window size (socket buffer size)
-B, --bind <host> bind to <host>, an interface or multicast address
-C, --compatibility for use with older versions; does not send extra
msgs
-M, --mss # set TCP maximum segment size (MTU - 40 bytes)
-N, --nodelay set TCP no delay, disabling Nagle's Algorithm
-V, --IPv6Version Set the domain to IPv6
Server specific:
-s, --server run in server mode
-U, --single_udp run in single threaded UDP mode
-D, --daemon run the server as a daemon
Client specific:
-b, --bandwidth #[KM] for UDP, bandwidth to send at in bits/sec
(default 1 Mbit/sec, implies -u)
-c, --client <host> run in client mode, connecting to <host>
-d, --dualtest Do a bidirectional test simultaneously
-n, --num #[KM] number of bytes to transmit (instead of -t)
-r, --tradeoff Do a bidirectional test individually
-t, --time # time in seconds to transmit for (default 10 secs)
-F, --fileinput <name> input the data to be transmitted from a file
-I, --stdin input the data to be transmitted from stdin
-L, --listenport # port to receive bidirectional tests back on
-P, --parallel # number of parallel client threads to run
-T, --ttl # time-to-live, for multicast (default 1)
-Z, --linux-congestion <algo> set TCP congestion control algorithm (Linux
only)
Miscellaneous:
-x, --reportexclude [CDMSV] exclude C(connection) D(data) M(multicast)
S(settings) V(server) reports
-y, --reportstyle C report as a Comma-Separated Values
-h, --help print this message and quit
-v, --version print version information and quit
[KM] Indicates options that support a K or M suffix for kilo- or mega-
The TCP window size option can be set by the environment variable
TCP_WINDOW_SIZE. Most other options can be set by an environment variable
‘IPERF_longOptionName’, such as ‘IPERF_BANDWIDTH’.
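A representative server/client pair, plus a quick sanity figure for a 10GbE link, can be sketched as follows. The hostname is the example OST interface from this document, the stream count and duration are illustrative choices, and the commands are only printed:

```shell
# Assumed roles: the HPP node runs the server, the media server the client.
SERVER=ssdnaught10gn0
echo "on the HPP node : /root/iperf64 -s"
echo "on the client   : iperf -c $SERVER -t 30 -P 4 -i 5 -f m"

# Rough sanity figure: ~9.4 Gbit/s is healthy on a clean 10GbE link.
TENTHS_GBIT=94                                # 9.4 Gbit/s, in tenths
MB_PER_S=$(( TENTHS_GBIT * 1000 / 8 / 10 ))   # Gbit/s -> MB/s
echo "expect roughly $MB_PER_S MB/s"
```

Multiple parallel streams (-P) matter on high-latency links, where a single TCP connection often cannot fill the pipe.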
ethtool Usage:
ethtool DEVNAME Display standard information about device
ethtool -s|--change DEVNAME Change generic options
[ speed 10|100|1000|2500|10000 ]
[ duplex half|full ]
[ port tp|aui|bnc|mii|fibre ]
[ autoneg on|off ]
[ advertise %x ]
[ phyad %d ]
[ xcvr internal|external ]
[ wol p|u|m|b|a|g|s|d... ]
[ sopass %x:%x:%x:%x:%x:%x ]
[ msglvl %d ]
ethtool -a|--show-pause DEVNAME Show pause options
ethtool -A|--pause DEVNAME Set pause options
[ autoneg on|off ]
[ rx on|off ]
[ tx on|off ]
ethtool -c|--show-coalesce DEVNAME Show coalesce options
ethtool -C|--coalesce DEVNAME Set coalesce options
[adaptive-rx on|off]
[adaptive-tx on|off]
[rx-usecs N]
[rx-frames N]
[rx-usecs-irq N]
[rx-frames-irq N]
[tx-usecs N]
[tx-frames N]
[tx-usecs-irq N]
[tx-frames-irq N]
[stats-block-usecs N]
[pkt-rate-low N]
[rx-usecs-low N]
[rx-frames-low N]
[tx-usecs-low N]
[tx-frames-low N]
[pkt-rate-high N]
[rx-usecs-high N]
[rx-frames-high N]
[tx-usecs-high N]
[tx-frames-high N]
[sample-interval N]
ethtool -g|--show-ring DEVNAME Query RX/TX ring parameters
ethtool -G|--set-ring DEVNAME Set RX/TX ring parameters
[ rx N ]
[ rx-mini N ]
[ rx-jumbo N ]
[ tx N ]
ethtool -k|--show-offload DEVNAME Get protocol offload information
ethtool -K|--offload DEVNAME Set protocol offload
[ rx on|off ]
[ tx on|off ]
[ sg on|off ]
[ tso on|off ]
[ ufo on|off ]
[ gso on|off ]
[ gro on|off ]
ethtool -i|--driver DEVNAME Show driver information
ethtool -d|--register-dump DEVNAME Do a register dump
[ raw on|off ]
[ file FILENAME ]
ethtool -e|--eeprom-dump DEVNAME Do a EEPROM dump
[ raw on|off ]
[ offset N ]
[ length N ]
ethtool -E|--change-eeprom DEVNAME Change bytes in device EEPROM
[ magic N ]
[ offset N ]
[ value N ]
ethtool -r|--negotiate DEVNAME Restart N-WAY negotiation
ethtool -p|--identify DEVNAME Show visible port identification (e.g.
blinking)
[ TIME-IN-SECONDS ]
ethtool -t|--test DEVNAME Execute adapter self test
[ online | offline ]
ethtool -S|--statistics DEVNAME Show adapter statistics
ethtool -h|--help DEVNAME Show this help
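A quick link-health pass with ethtool usually needs only three read-only commands per data port. The device name 'eth2' is an assumption (repeat for each port in the bond); the sketch only prints the commands:

```shell
DEV=eth2   # assumed port name; repeat for every data port in the bond
echo "ethtool $DEV"      # expect Speed: 10000Mb/s, Duplex: Full, Link detected: yes
echo "ethtool -i $DEV"   # driver name/version, useful in a support bundle
echo "ethtool -S $DEV"   # counters; rising error/drop counts suggest a bad cable or SFP
```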
nethogs usage: nethogs [-V] [-b] [-d seconds] [-t] [-p] [device [device [device
...]]]
-V : prints version.
-d : delay for update refresh rate in seconds. default is 1.
-t : tracemode.
-b : bughunt mode - implies tracemode.
-p : sniff in promiscuous mode (not recommended).
device : device(s) to monitor. default is eth0
When nethogs is running, press:
q: quit
m: switch between total and kb/s mode
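Since nethogs attributes traffic per process, it is a quick way to see whether a transfer is the backup stream itself or some other process. A sketch, with assumed slave-interface names and a slower refresh; only the command is printed:

```shell
DELAY=5              # refresh every 5 seconds instead of the 1-second default
IFACES="eth2 eth3"   # assumed slave interfaces of the data bond
echo "nethogs -d $DELAY $IFACES"
```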
net_perf_mon Usage:
net_perf_mon [options]
Options:
-?, -h : this help message
-W <num> : set history Window to <num> seconds (default 10)
-R <num> : set Refresh frequency to <num> seconds (default 3)
-N : No Graph. CSV output on each refresh interval.
-S <num> : For use with -N. Just take <num> samples, then exit.
-X : For use with -N, -S. Use XML output instead of CSV output
-M <num> : Mode
-o <file> : Output file
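For unattended collection, the -N/-S/-o combination above captures a fixed number of CSV samples and exits. The output path below is an assumption; the sketch only prints the command:

```shell
SAMPLES=100             # 100 samples at the default 3 s refresh is about 5 minutes
OUT=/tmp/net_perf.csv   # assumed output path; choose one with free space
echo "net_perf_mon -N -S $SAMPLES -o $OUT"
```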
Appendix B: Installing ‘iperf’
For a more in-depth network performance test, use ‘iperf’. The file can be obtained from our FTP* server at the following location:
ftp.sepaton.com Login: upgrade Password: Upgrade!S3paton
1. cd /users/upgrade/VTL-TOOLS
2. get iperf.tgz
Steps:
The iperf64 executable must be present in the /root directory of every node involved in the test (scp iperf64 root@nodeX:/root). Follow the procedure below:
1) SCP copy the ‘iperf.tgz’ file into the ‘/root’ directory on each node of
the HPP system to be tested.
2) Un-TAR the ‘iperf.tgz’ file: tar zxf iperf.tgz
3) Add execute permission to the extracted file: chmod a+x /root/iperf64
4) Copy the executable to each processing node in the HPP system: scp iperf64 root@node#:/root
For example: scp iperf64 root@node3:/root
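Step 4 above can be condensed into a short loop. The node names are placeholders and must match your HPP system; the loop only prints the commands to run:

```shell
# Run from the node where iperf.tgz was extracted and made executable.
NODES="node2 node3 node4"   # placeholder node names; adjust to your system
for n in $NODES; do
    echo "scp /root/iperf64 root@$n:/root/"
done
```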
* At least for as long as the Sepaton FTP server remains available. Future plans call for all FTP-server-available files to be moved to the HDS TUF server by the beginning of September 2015.
Appendix C: Bonding modes†
Modes for the Linux bonding driver (network interface aggregation modes) are supplied as parameters to the kernel bonding module at load time. The behavior of the single logical bonded interface depends upon its specified bonding driver mode.
NOTE: HPP systems have been qualified with modes 0, 1, and 4 only.
Round-robin (balance-rr) [Linux driver mode ‘0’]
Transmit network packets in sequential order from the first available network interface (NIC) slave through the last. This mode provides load balancing and fault tolerance.
Active-backup (active-backup) [Linux driver mode ‘1’]
Only one NIC slave in the bond is active. A different slave becomes active if, and only if, the active slave fails. The single logical bonded interface's MAC address is externally visible on only one NIC (port) to avoid distortion in the network switch. This mode provides fault tolerance.
XOR (balance-xor) [Linux driver mode ‘2’]
Transmit network packets based on [(source MAC address XOR'd with destination MAC address) modulo NIC slave count]. This selects the same NIC slave for each destination MAC address. This mode provides load balancing and fault tolerance.
Broadcast (broadcast) [Linux driver mode ‘3’]
Transmit network packets on all slave network interfaces. This mode provides fault tolerance.
† Taken from the Wikipedia page (http://en.wikipedia.org/wiki/Link_aggregation).
IEEE 802.3ad Dynamic Link Aggregation (802.3ad) [a.k.a. LACP] (Default mode for HPP system bonds) [Linux driver mode ‘4’]
Creates aggregation groups that share the same speed and duplex settings. Utilizes all slave network interfaces in the active aggregator group according to the 802.3ad specification.
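On a running Linux system, the active mode of a bond can be confirmed from /proc. The bond name 'bond0' is an assumption; the sketch only prints the read-only command:

```shell
BOND=bond0   # assumed bond interface name; check /proc/net/bonding/ for the real one
# For mode 4, the output should begin with:
#   Bonding Mode: IEEE 802.3ad Dynamic link aggregation
echo "cat /proc/net/bonding/$BOND"
```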
Adaptive transmit load balancing (balance-tlb) [Linux driver mode ‘5’]
Linux bonding driver mode that does not require any special network-switch support. The outgoing network packet traffic is distributed according to the current load (computed relative to the speed) on each network interface slave. Incoming traffic is received by one currently designated slave network interface. If this receiving slave fails, another slave takes over the MAC address of the failed receiving slave.
Adaptive load balancing (balance-alb) [Linux driver mode ‘6’]
Includes balance-tlb plus receive load balancing (rlb) for IPv4 traffic, and does not require any special network switch support. The receive load balancing is achieved by ARP negotiation. The bonding driver intercepts the ARP Replies sent by the local system on their way out and overwrites the source hardware address with the unique hardware address of one of the NIC slaves in the single logical bonded interface, such that different network peers use different MAC addresses for their network packet traffic.
Appendix D: Tools
NOTE: The following tools are neither preloaded on HPP systems nor available via HDS sites. The links below are provided for reference purposes only.
Useful freeware can be obtained at the following sources:
http://packetsender.com/
http://www.simplecomtools.com/productcart/pc/viewPrd.asp?idproduct=6&
http://udp-test-tool.software.informer.com/
http://tcp.software.informer.com/download-tcp-load-test-tool/
http://winscp.net/eng/download.php
http://openmaniak.com/iperf.php
This tool is great for testing Windows networking performance:
http://gallery.technet.microsoft.com/NTttcp-Version-528-Now-f8b12769
This tool benchmarks TCP and UDP performance between two hosts. It is very useful when trying to determine network performance or latency issues.
http://www.pcausa.com/Utilities/pcattcp.htm
ZZ-99HPP999-99
Hitachi Data Systems
Corporate Headquarters 2845 Lafayette Street Santa Clara, California 95050-2639 U.S.A www.hds.com
Regional Contact Information
Americas +1 408 970 1000 [email protected]
Europe, Middle East, and Africa +44 (0)1753 618000 [email protected]
Asia Pacific +852 3189 7900 [email protected]