The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops...

33

Transcript of The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops...

Page 1: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3
Page 2: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

The Motivation for Analyzing perfSONAR

●○○○

○○

2

Page 3: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

Network Measurement Platform Overview

●○

Collector

Store (long-term) Store (short-term)

pS MonitoringpS Configuration

Tape

Experiments

MONIT-GRAFANA

pS Dashboard

3

Page 4: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

The NSF SAND Project

SAND: Service Analysis and Network Diagnosis1827116) focusing on combining,

visualizing, and analyzing disparate network monitoring and service logging data.

. (GOAL: capitalize on our rich network dataset!!)

4

Website https://sand-ci.org/ (Project started in September 2018 and has 2 years funding)

PI: Brian Bockelman, Co-PIs: Shawn McKee, Rob Gardner

Page 5: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

Finding Relevant Information

https://toolkitinfo.opensciencegrid.org/toolkitinfo/

5

Page 6: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

Available Data Overview

●●●●

6

Page 7: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

Machine Learning and a Network Database

7

Page 8: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

perfSONAR Data Details

●●●●●●●●

https://atlas-kibana.mwt2.org/s/networking/app/kibana#/discover?_g=()8

Page 9: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

Network Topology

●●

○○

9

Page 10: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

Updates to Traceroute Data

●●

●●●

10

Page 11: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

Gathering ESnet Interface Data

11

Page 12: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

Replay of 2018, 2019 Network Data

12

Page 13: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

Students Working with SAND on Network Analytics

●●

13

Page 15: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

Yuan Li’s Work

15

5. Plot owd data of top 10 `dest_host` with a given `src_host` and a predefined time range. E.g.,

Code snippets: dest_hosts = getDestHosts('perfsonar01.datagrid.cea.fr')

plotTop10DestHosts(src_host='perfsonar01.datagrid.cea.fr', dest_hosts = dest_hosts, from_date='2019/10/05', to_date='2019/11/05')

Page 16: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

Manjari Trivedi’s Work

https://gitlab.com/963/sand/tree/master/common_hops )

Method: Find the list of hops from site1 → site2 and compare with the list of hops from site3 → site4.Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3 → site4.Problems:

● There may be several paths between a source and destination.● For now, found the most common hash and mapped it to a sequence of hops. In the future, this

could be extended to the n most common hashes.● The most common hash from site1 → site2 may map to IPv4 hops, but the hash from site3→site4

maps to IPv6 hops (or vice versa), so they cannot be compared.● Added an argument to the function to specify whether we are looking at IPv4 or IPv6 (it may be

useful to do both). This is a prerequisite for the most common hash.16

Page 17: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

Petya Vasileva’s Work

○○○○

17

Page 18: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

Site-wise in some cases it is clear there is a problem with a specific site for a long period.

Average packet loss for each month and for each site being a source and a destination

SourceDestination

Data analysis on the packet loss tests

18

Page 19: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

Analyzing data for a single day (12-13 Jan 2020) to see problematic points

Test entries > 6 MIn most cases packet loss is between 0 and 5%

Results aggregated by 1h Distribution of average packet loss in %

Results aggregated by src-dest Distribution of average packet loss in %

Petya Vasileva - work reportSummary of Packet Loss for a Day (Jan 12-13, 2020)

19

Page 20: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

Petya Vasileva - work reportAverage Packet Loss by Src-Dest Host 12-13 Jan 2020

20

Page 21: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

Petya Vasileva - work reportAverage Packet Loss by Src-Dest Site 12-13 Jan 2020

21

Page 22: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

Average packet loss by src-dest IPs

Petya Vasileva - work reportAverage Packet Loss by Src-Dest IP 12-13 Jan 2020

22

Page 23: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

Related Work In Canada

○○○

○○

23

Page 24: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

Related Work In Canada II

24

Page 25: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

Collaboration with MEPhI on Network Visualization

The SAND project is collaborating with MEPHI (Moscow Engineering Physics Institute) on network path

visualization

25

https://perfsonar.uc.ssl-hep.org/graph/viewer

Page 26: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

Near-term Work and Beyond...

●●

●○

■○

26

Page 27: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

Summary

●○○

27

Page 28: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

CHEP 2019, Adelaide

Acknowledgements

●●●

28

Page 29: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

Questions, Comments?

29

Page 31: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020 31

Page 32: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

Issues with Traceroute and Network Paths

●●●●

https://www.cellstream.com/reference-reading/tipsandtricks/403-ecmp-linux-paristr

32

●●

Page 33: The Motivation for Analyzing perfSONAR · 2020. 2. 11. · site3 → site4. Output: Number of hops in common, followed by a list of hops in common between site1 → site2 and site3

GDB Jan 2020

Some Context: IRIS-HEP

The Institute for Research and Innovation in Software in High Energy Physics (IRIS-HEP) project has been funded by National Science Foundation in the US as grant OAC-1836650 as of 1 September, 2018.

The institute focuses on preparing for High Luminosity (HL) LHC and is funded at $5M / year for 5 years. There are three primary development areas:

● Innovative algorithms for data reconstruction and triggering;● Highly performant analysis systems that reduce `time-to-insight’ and maximize the HL-LHC

physics potential; ● Data organization, management and access systems for the community’s upcoming Exabyte era.

The institute also funds the LHC part of Open Science Grid, including the networking area and will create a new integration path (the Scalable Systems Laboratory) to deliver its R&D activities into the distributed and scientific production infrastructures.

33