XRootD Monitoring Report A.Beche D.Giordano
![Page 1: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/1.jpg)
IT-SDC : Support for Distributed Computing
XRootD Monitoring Report
A. Beche, D. Giordano
![Page 2: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/2.jpg)
Outline
Talk 1: XRootD Monitoring Dashboard
• Context
• Dataflow and deployment model
• Database: storage & aggregation
• User interface & use cases
• Open issues & future work
• Summary
Talk 2: Beyond XRootD monitoring
• HTTP/WebDAV integration
10 April 2014, A. Beche, Federated Workshop
![Page 3: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/3.jpg)
XRootD federation monitoring
• Activity started during summer 2012: 4 sites for FAX, 11 for AAA
• Monitoring data increased accordingly

| Federation | July 2012 | March 2014 |
|---|---|---|
| AAA | 606k | 43M |
| FAX | 15k | 22M |

[Figure: Number of sites reporting (# sites), July 2012 to March 2014]
![Page 4: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/4.jpg)
Why monitoring?
• Understand data flows to estimate data traffic
• Provide information for efficient operations
• Identify access patterns and propose data placement strategies
![Page 5: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/5.jpg)
XRootD monitoring dataflow
[Diagram: Federation → (UDP) → GLED Collector → (stomp) → ActiveMQ → (stomp) → Consumer → Raw (real time) → Stats (asynchronous, every 10 minutes) → Web API → Dashboard UI and external applications]
![Page 6: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/6.jpg)
GLED Deployment model
AMQ @ CERN: shared cluster, 5 nodes
GLED collectors:
• AAA, UCSD (16 Hz)
• AAA EU
• EOS CMS, CERN (10 Hz)
• EOS ATLAS, CERN (150 Hz)
• FAX US, SLAC (9 Hz)
• FAX EU, CERN (1 site)
[Figures: EOS monitoring data rate (Hz); Federation monitoring data rate (Hz)]
![Page 7: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/7.jpg)
Consolidated dataflow
Two uses of this raw data:
• Dashboard monitoring
• XRootD popularity
Both now share the same database:
• Storage optimization
• Consistency guaranteed
![Page 8: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/8.jpg)
Database
• AAA: ~300 GB, ~1B records
• FAX: ~600 GB, ~2B records
• Daily insert: 2 GB / 6M rows
Storage
• Raw, statistics, metadata
• Tables daily partitioned, no global indexes
[Figure: Database usage growth (GB), 2012-02 to 2014-02; indexes excluded]
![Page 9: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/9.jpg)
Database
Raw data aggregation:
• Done using PL/SQL procedures
• Events are unordered
• Stateless: full re-computation of touched bins each time
• Compute stats from raw data in 10 min bins
• Aggregate 10 min stats in daily bins
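The stateless aggregation described on this slide can be sketched in a few lines. This is only an illustration in Python, not the PL/SQL procedures used in production; the function names, the `(end_time, bytes_read)` event layout and the in-memory storage are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timezone

BIN_SECONDS = 600  # 10-minute bins, as on the slide


def bin_start(ts):
    """Return the start (epoch seconds) of the 10-minute bin containing ts."""
    return int(ts // BIN_SECONDS) * BIN_SECONDS


def recompute_touched_bins(raw_events, new_events, stats):
    """Append new_events to raw_events, then rebuild every bin they touch.

    raw_events: list of (end_time, bytes_read) tuples, in any order
                (hypothetical layout).
    stats: dict mapping bin start -> {"transfers": n, "bytes": b}.
    """
    raw_events.extend(new_events)
    touched = {bin_start(ts) for ts, _ in new_events}
    # Stateless: discard the old values and recompute from all raw data,
    # so late or unordered events cannot corrupt a bin.
    rebuilt = {b: {"transfers": 0, "bytes": 0} for b in touched}
    for ts, nbytes in raw_events:
        b = bin_start(ts)
        if b in rebuilt:
            rebuilt[b]["transfers"] += 1
            rebuilt[b]["bytes"] += nbytes
    stats.update(rebuilt)
    return stats


def daily_from_bins(stats):
    """Aggregate 10-minute stats into daily (UTC) bins."""
    daily = defaultdict(lambda: {"transfers": 0, "bytes": 0})
    for b, s in stats.items():
        day = datetime.fromtimestamp(b, tz=timezone.utc).date().isoformat()
        daily[day]["transfers"] += s["transfers"]
        daily[day]["bytes"] += s["bytes"]
    return dict(daily)
```

Because each touched bin is rebuilt from the raw events, an event that arrives late simply triggers another recomputation of its bin; no running counters need to be kept consistent.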
![Page 10: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/10.jpg)
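Aggregation methods

Easy method:

| | 2pm-3pm | 3pm-4pm | 4pm-5pm | 5pm-6pm | 6pm-7pm |
|---|---|---|---|---|---|
| Transfers | 1 | 0 | 0 | 2 | 1 |
| Bytes | 10 | 0 | 0 | 15 | 20 |

Adopted method:

| | 2pm-3pm | 3pm-4pm | 4pm-5pm | 5pm-6pm | 6pm-7pm |
|---|---|---|---|---|---|
| Transfers | 1 (1) | 1 (0) | 2 (0) | 3 (2) | 1 (1) |
| Bytes | 8 | 1 | 14 (9+6) | 15 (1+9+5) | 5 |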
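One plausible reading of the two methods compared on this slide is that the easy method books all bytes of a transfer in the bin where it ends, while the adopted method spreads them over every bin the transfer spans. The sketch below illustrates that reading only; the slide does not say how bytes are split, so the constant-rate assumption and the function names are hypothetical.

```python
def easy_method(end, nbytes, bin_seconds=3600):
    """Attribute all bytes to the bin in which the transfer ended."""
    return {int(end // bin_seconds) * bin_seconds: float(nbytes)}


def spread_transfer(start, end, nbytes, bin_seconds=3600):
    """Split one transfer's bytes over the time bins it overlaps.

    Assumes (hypothetically) a constant transfer rate between start and end.
    Returns {bin_start: bytes_in_bin}.
    """
    if end <= start:
        # Zero-length transfer: everything goes to the bin of its end time.
        return easy_method(end, nbytes, bin_seconds)
    shares = {}
    duration = end - start
    b = int(start // bin_seconds) * bin_seconds
    while b < end:
        # Length of the intersection between the transfer and this bin.
        overlap = min(end, b + bin_seconds) - max(start, b)
        shares[b] = nbytes * overlap / duration
        b += bin_seconds
    return shares
```

For a transfer running from 00:30 to 01:30 with hourly bins, `spread_transfer(1800, 5400, 10)` puts half of the bytes in each of the two bins, whereas `easy_method(5400, 10)` books all of them in the second one.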
![Page 11: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/11.jpg)
Visualization Interface
![Page 12: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/12.jpg)
Pre-defined set of views
![Page 13: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/13.jpg)
Matrix Example
27 May 2013
Matrix showing the remote IO CNAF served in the last hour:
• # operations
• # bytes
• Averaged throughput
![Page 14: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/14.jpg)
Use case example: understand site access patterns
1. Which sites are reading from FNAL
2. Zoom to a specific site to understand which users are reading
3. Understand which files are read by a user
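The three drill-down steps amount to the same grouped sum with progressively tighter filters. A minimal sketch, assuming transfer records are dictionaries with hypothetical keys `server_site`, `client_site`, `user`, `file` and `bytes` (the dashboard's real schema is not shown in the slides):

```python
from collections import defaultdict


def bytes_by(records, key, **filters):
    """Sum bytes grouped by `key`, keeping only records matching all filters."""
    totals = defaultdict(int)
    for r in records:
        if all(r[k] == v for k, v in filters.items()):
            totals[r[key]] += r["bytes"]
    return dict(totals)
```

Step 1 would be `bytes_by(records, "client_site", server_site="FNAL")`, step 2 adds `client_site=...` and groups by `"user"`, and step 3 adds `user=...` and groups by `"file"`.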
![Page 15: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/15.jpg)
Data popularity
XRootD monitoring provides information about file access patterns:
• Including non-official collections (i.e. user files)
• Contributes to simpler and more efficient usage of disk resources
Popularity data analytics built on this information:
• Already adopted for CMS-EOS
• Will be extended to full AAA
![Page 16: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/16.jpg)
Archive recommendation for CMS-EOS
• Helps to manage the disk space of EOS, including user space
• No central bookkeeping system
Unused files: created more than 4 months ago, no access in the last 3 months
• ~500 TB of space occupied and not used, i.e. 30% of the total for these areas
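The "unused file" criterion can be expressed as a simple filter. The sketch below is an assumption-laden illustration: the record keys (`path`, `size`, `created`, `last_access`) are hypothetical, and 4 and 3 months are approximated as 120 and 90 days.

```python
from datetime import datetime, timedelta


def find_unused(files, now, min_age_days=120, idle_days=90):
    """Return files created more than min_age_days ago with no access
    in the last idle_days.

    files: iterable of dicts with hypothetical keys
           "path", "size", "created", "last_access" (datetime or None).
    """
    created_before = now - timedelta(days=min_age_days)
    accessed_after = now - timedelta(days=idle_days)
    unused = []
    for f in files:
        old_enough = f["created"] < created_before
        last = f["last_access"]
        idle = last is None or last < accessed_after
        if old_enough and idle:
            unused.append(f)
    return unused


def unused_fraction(files, unused):
    """Fraction of the total size occupied by the unused files."""
    total = sum(f["size"] for f in files)
    return sum(f["size"] for f in unused) / total if total else 0.0
```

Applied to the monitored areas, `unused_fraction` corresponds to the percentage quoted on the slide, and the summed sizes of the unused files to the occupied space.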
![Page 17: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/17.jpg)
Open issues
Servers should provide their site name:
• CMS: only 5 sites (to be followed up)
• ATLAS: done
GLED Collector improvements:
• Reliability of the service: recovery time can be long due to time difference
• GLED should be operated as a production service
Multi-VO sites:
• Discrimination will happen at GLED level
![Page 18: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/18.jpg)
Future work
Strong requirement from ATLAS to understand efficiency:
• Need the concept of error / failure
• How could the XRootD server be instrumented to report it?
Topology:
• Resolution will be based on the new “server site” field
Data-mining activity (2 years of data, ~1 TB)
![Page 19: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/19.jpg)
Application usage
[Figure: application usage for FAX and AAA; chart labels: 20, 10, 30, 15]
![Page 20: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/20.jpg)
HTTP Federation is coming
The HTTP protocol will be used in the future:
• XRootD servers can be accessed over HTTP
• See Fabrizio’s presentation on xrdhttp
Two kinds of access:
• Pure HTTP access (through Apache)
• HTTP gate to XRootD server
They cannot be monitored in the same way
![Page 21: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/21.jpg)
Monitoring XRootD access protocol
XRootD 4 will now report the user protocol:
• All the monitoring chain needs to be updated
• Dashboard DB and UI are fully ready
[Figure labels: HTTP, XRootD]
![Page 22: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/22.jpg)
HTTP/WebDAV federation monitoring
[Diagram labels: JOB, XRootD Federation, Site, XRootD, SE, GLED collector, ActiveMQ]
![Page 23: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/23.jpg)
HTTP/WebDAV federation monitoring
[Diagram labels: JOB, XRootD Federation, HTTP Federation, Site, XRootD, SE, GLED collector, ActiveMQ]
![Page 24: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/24.jpg)
HTTP/WebDAV federation monitoring
[Diagram labels: JOB (x2), XRootD Federation, HTTP Federation, Site, XRootD, XrdHTTP, SE, GLED collector, ActiveMQ]
29 November 2013, Alexandre Beche, ITTF
![Page 25: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/25.jpg)
HTTP/WebDAV federation monitoring
[Diagram labels: JOB (x3), XRootD Federation, HTTP Federation, Site, XRootD, XrdHTTP, Apache, SE, GLED collector, ActiveMQ]
![Page 26: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/26.jpg)
HTTP/WebDAV federation monitoring
[Diagram labels: JOB (x3), XRootD Federation, HTTP Federation, Site, XRootD, XrdHTTP, Apache, SE, GLED collector, ActiveMQ, "?"]
![Page 27: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/27.jpg)
Summary
A lot of effort has been put into the XRootD monitoring workflow and dashboard in the last 2 years:
• Reliable system achieved
• Lots of use cases covered
HTTP monitoring has already started:
• Will require a lot of effort to reach the XRootD monitoring level
• First prototype for pure HTTP monitoring will be ready by autumn, thanks to the DPM team
![Page 28: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/28.jpg)
Credits
• Andreeva Julia
• Cons Lionel
• Giordano Domenico
• Saiz Pablo
• Tadel Matevz
• Tuckett David
• Vukotic Ilija
• The AAA and FAX deployment team
![Page 29: XRootD Monitoring Report A.Beche D.Giordano](https://reader036.fdocuments.net/reader036/viewer/2022081514/5681654b550346895dd7c327/html5/thumbnails/29.jpg)
Useful links
AAA Dashboard: http://dashb-cms-xrootd-transfers.cern.ch
FAX Dashboard: http://dashb-atlas-xrootd-transfers.cern.ch
CHEP materials:
• https://indico.cern.ch/abstractDisplay.py?abstractId=101&confId=214784
• https://indico.cern.ch/getFile.py/access?contribId=94&sessionId=6&resId=0&materialId=slides&confId=214784
• https://indico.cern.ch/getFile.py/access?contribId=265&sessionId=6&resId=1&materialId=slides&confId=214784
Xbrowse framework: https://twiki.cern.ch/twiki/bin/view/ArdaGrid/XbrowseFramework
Thanks for your attention