XROOTD Storage
Recent directions
Fabrizio Furano
XROOTD storage and ALICE AF - Recent directions 2
The ALICE recipe for storage
- Many sites, exposing the XROOTD protocol
  - Native XROOTD
  - A few with DPM+XROOTD
  - One with CASTOR+XROOTD
  - A few with dCache’s Xrootd protocol implementation
- Native XROOTD + 2 plugins + MonALISA
  - In a simple bundled setup
- AliEn points directly to single SEs
  - Privileges local data according to the catalogue’s content
- OCDB accessed via WAN
A unique protocol
- Having a unique WAN+LAN compliant protocol allows us to do the right thing
  - Exploit locality whenever possible (= most of the time)
  - Do not worry too much if a job accesses some data files that are not at the same site
    - This has to be possible and foreseen
    - Explicitly creating 100s of replicas just for a job takes much more time and risk
  - Access condition data ONLY via WAN
The XROOTD way
- Each server manages a portion of the storage
  - Many servers with small disks, or fewer servers with huge disks
- Low-overhead, DB-free aggregation of servers
  - Gives the functionality of a single system
  - A non-transactional file system
- Efficient LAN/WAN byte-level data access
- Protocol/architecture built on the tough HEP requirements
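The DB-free aggregation idea can be illustrated with a toy model. This is only a sketch under stated assumptions: the class names `DataServer` and `Redirector`, the server names and the file paths are hypothetical, and the real cmsd/xrootd protocol is far richer. The point shown is that the entry point keeps no persistent catalogue: it asks the servers and remembers the answer in memory.

```python
from dataclasses import dataclass, field


@dataclass
class DataServer:
    """Hypothetical stand-in for one data server and the files on its disks."""
    name: str
    files: set = field(default_factory=set)

    def has(self, path: str) -> bool:
        return path in self.files


@dataclass
class Redirector:
    """Hypothetical stand-in for the aggregating entry point (no database)."""
    servers: list
    # In-memory location cache only: it can be lost and rebuilt at any time.
    location_cache: dict = field(default_factory=dict)

    def locate(self, path: str):
        """Return the name of a server holding `path`, or None."""
        cached = self.location_cache.get(path)
        if cached is not None:
            return cached
        # No catalogue to consult: ask the servers themselves.
        for server in self.servers:
            if server.has(path):
                self.location_cache[path] = server.name
                return server.name
        return None


servers = [
    DataServer("server1", {"/data/run1.root"}),
    DataServer("server2", {"/data/run2.root"}),
]
redirector = Redirector(servers)
where = redirector.locate("/data/run2.root")
```

In this model a client would first call `locate` and then open the file on the returned server; losing `location_cache` costs only a new round of questions to the servers.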
What we can do
- Build efficient storage clusters
- Aggregate storage clusters into WAN federations
- Access remote data efficiently
- Build proxies that can cache an external repository
  - And increase the data access performance (or decrease the WAN traffic) through a decent ‘hit rate’
- Build hybrid proxies
  - Caching an external repository while storing local data locally
  - In practice, the ‘Jamboree demonstrator’
Aggregated sites
- Suppose that we can easily aggregate sites
  - And provide an efficient entry point that “knows them all natively”
- We could use it to access data directly
- We could use it as a building block for a proxy-based structure called VMSS
  - If site A is asked for file X, A will fetch X from some other ‘friend’ site, through the unique entry point
  - A itself is a potential source, accessible through the same entry point
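The fetch-on-miss behaviour described for the VMSS can be sketched as a small simulation. Everything here is an assumption made for illustration: `Site`, `GlobalRedirector`, the site names and the file content are hypothetical, and files are modelled as in-memory byte strings rather than real transfers.

```python
class GlobalRedirector:
    """Hypothetical unique entry point that knows all the sites."""

    def __init__(self, sites):
        self.sites = list(sites)
        for site in self.sites:
            site.global_redirector = self

    def locate(self, path, exclude=None):
        """Return a site holding `path`, skipping the site that is asking."""
        for site in self.sites:
            if site is not exclude and path in site.files:
                return site
        return None


class Site:
    """Hypothetical storage cluster acting as a whole-file proxy."""

    def __init__(self, name, files=None):
        self.name = name
        self.files = dict(files or {})  # path -> content
        self.global_redirector = None   # set when the site joins a redirector

    def read(self, path):
        # Local hit: serve directly. Miss: fetch from a friend site first.
        if path not in self.files:
            friend = self.global_redirector.locate(path, exclude=self)
            if friend is None:
                raise FileNotFoundError(path)
            self.files[path] = friend.files[path]  # whole-file copy
        return self.files[path]


site_a = Site("A")
site_b = Site("B", {"X": b"payload of X"})
redirector = GlobalRedirector([site_a, site_b])

content = site_a.read("X")  # A misses, fetches X from B, then serves it
```

After the first read, site A holds its own copy of X, so A itself becomes a potential source for the other sites through the same entry point.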
The VMSS
[Diagram: several Xrootd sites, each running cmsd and xrootd, joined into a globalized cluster by the ALICE global redirector. Diagram labels: “Xrootd site”, “Any other Xrootd site”, “A smart client could point here”.]
- Local clients work normally at each site
- Missing a file? The storage asks the global redirector, gets redirected to the right collaborating cluster, and fetches it. Immediately.
- Virtual Mass Storage System… built on data globalization
The ALICE CAF storage
- Data is proxied locally to adequately feed PROOF
  - From the 91 AliEn sites
[Diagram: the ALICE CAF, a set of nodes each running cmsd and xrootd, fed from AliEn through the data management tools.]
The SKAF/AAF storage
- Take a PROOF cluster with XROOTD storage, make it easily installable and well monitored (MonALISA)
- Add the xrd-dm plugin by M. Vala
  - Transform your AF into a proxy of the ALICE globalized storage, through the ALICE GR
  - If something needed is not present, it will be fetched in, FAST
  - Also support sites not seen by the GR, through internal dynamic prioritization of the AliEn sites
- Data management: how does the data appear?
  - (Pre)staging requests
    - This means that it works with the usual ROOT tools, but also without them
  - Suppose that a user runs the same analysis several times
    - Which is almost always true
    - The first round will not be so fast, but it will work; the subsequent ones will be fast
- The first one was the ALICE SKAF (Kosice, Slovakia)
AliEn LFN/PFN
- Often a source of misunderstandings
- AliEn has LFNs
  - They are the user-readable names
- AliEn converts them to PFNs
  - The ugly filenames with numbers
- An AliEn PFN is considered by XROOTD as an XROOTD LFN
- XROOTD takes care internally of its PFN translation
  - Hiding the internal mount points
- In the end:
  - USERS see AliEn LFNs
  - SYSADMINS see XROOTD PFNs (= AliEn PFNs with a prefix)
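The two naming layers can be made concrete with a short sketch. All specifics are assumptions for illustration only: the catalogue entry, the numeric PFN and the mount point `/data/disk1/xrdnamespace` are invented, and the real AliEn lookup goes through the catalogue service rather than a dictionary.

```python
import posixpath

# Hypothetical catalogue content: AliEn LFN -> AliEn PFN.
ALIEN_CATALOGUE = {
    "/alice/sim/run001/AliESDs.root":
        "/13/34934/0a1b2c3d-0000-1111-2222-333344445555",
}

# Hypothetical local mount point, hidden from clients by the storage.
XROOTD_LOCAL_PREFIX = "/data/disk1/xrdnamespace"


def alien_lfn_to_pfn(lfn: str) -> str:
    """AliEn layer: user-readable name -> 'ugly' numeric name."""
    return ALIEN_CATALOGUE[lfn]


def xrootd_lfn_to_pfn(xrootd_lfn: str, prefix: str = XROOTD_LOCAL_PREFIX) -> str:
    """XROOTD layer: the name clients use -> path on the server's disk."""
    return posixpath.join(prefix, xrootd_lfn.lstrip("/"))


user_name = "/alice/sim/run001/AliESDs.root"
alien_pfn = alien_lfn_to_pfn(user_name)      # what XROOTD treats as its LFN
disk_path = xrootd_lfn_to_pfn(alien_pfn)     # what the sysadmin sees
```

The AliEn PFN is simply passed on as the XROOTD LFN, and only the second function knows about the local prefix.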
The ALICE PROOF farms
- Historically, the *AF admins didn’t like to deal with the AliEn PFNs
  - The ugly filenames made of numbers
  - They wanted to store only LFNs (i.e. the human-readable filenames)
  - So, AFs are ALREADY storing native LFNs
- If these XROOTD-based storages get aggregated by the Global Redirector:
  - Their content will be accessible as a whole, with no need to translate names through AliEn; the files are there with their true names
- Interesting wild experiment (pioneered by SKAF)
  - The *AFs could give data to each other by using the VMSS
  - ATLAS-US is doing this as a demonstrator for Tier-3s
  - So, a part of the ALICE storage could be accessed directly with nice names, skipping the overhead of the AliEn translation
What’s needed… in the end?
- The storage part acting as an automatic stager (an LFN-based proxy of the ALICE storage)
  - Like this by default now!
  - Looks for friend AFs hosting the LFN through the GR
  - Eventually, looks in AliEn-only SEs
    - Through the AliEn mechanism (lfn->guid->pfn)
    - And keeps the file named with the LFN
  - Internally prioritizes sites with a “penalty” mechanism
- WAN accessibility of the cluster, without NATs
  - OR: a small set of machines that proxy it through the firewall (maybe a future dev)
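The slides do not say how the “penalty” mechanism works internally, so the following is only one plausible sketch, not the actual implementation. It assumes that each site carries a numeric penalty, that a failed fetch raises it and a successful one lowers it, and that sites are tried in order of increasing penalty. The names `SitePrioritizer`, `fetch`, `try_fetch` and the numeric values are all hypothetical.

```python
class SitePrioritizer:
    """Hypothetical penalty table: lower penalty means tried earlier."""

    def __init__(self, sites, failure_penalty=10.0, decay=0.5):
        self.penalty = {site: 0.0 for site in sites}
        self.failure_penalty = failure_penalty  # added on each failure
        self.decay = decay                      # multiplier on each success

    def ranked(self):
        # sorted() is stable, so sites with equal penalty keep their order.
        return sorted(self.penalty, key=self.penalty.get)

    def report_failure(self, site):
        self.penalty[site] += self.failure_penalty

    def report_success(self, site):
        self.penalty[site] *= self.decay


def fetch(lfn, prioritizer, try_fetch):
    """Try the sites in priority order; `try_fetch(site, lfn)` returns data or None."""
    for site in prioritizer.ranked():
        data = try_fetch(site, lfn)
        if data is not None:
            prioritizer.report_success(site)
            return site, data
        prioritizer.report_failure(site)
    raise FileNotFoundError(lfn)


# Hypothetical situation: only site "B" holds the file.
def only_b_has_it(site, lfn):
    return b"data" if site == "B" else None


prioritizer = SitePrioritizer(["A", "B", "C"])
source, data = fetch("/alice/some/file.root", prioritizer, only_b_has_it)
```

With this scheme a site that keeps failing drifts to the end of the list but is never excluded for good, so it is retried once the others have accumulated penalties of their own.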
PROOF on the GRID?
- GRID WNs to start PROOF interactive workers
  - Ongoing interesting developments, e.g. PoD by Anar Manafov
- Data globalization/proxying seems an interesting match to feed them with data
  - Ideas are welcome
- The purpose is:
  - Give handles to build a lightweight/dynamic Data Management structure
    - Whose unique goal is to work well
  - Enable interactivity for users
Proxy sophistication
- Proxying is a concept; there are basically two ways it could work:
  - Proxying whole files (e.g. the VMSS)
    - The client waits for the entire file to be fetched into the SE
  - Proxying chunks (or data pages)
    - The client’s requests are forwarded, and the chunks are cached in the proxy as they pass through
- In HEP we do have examples of the former
- It makes sense to make the latter possible as well
  - Some work has been done (the original XrdPss proxy or the newer, better prototype plugin by A. Peters)
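The second way, proxying chunks, can be sketched as follows. This is a minimal model under stated assumptions, not the XrdPss proxy or the prototype plugin: `ChunkCachingProxy`, `origin_read`, the 4-byte chunk size and the in-memory "remote" file are hypothetical, and the cache never evicts anything.

```python
class ChunkCachingProxy:
    """Hypothetical proxy that forwards reads and caches fixed-size chunks."""

    def __init__(self, origin_read, chunk_size=4):
        self.origin_read = origin_read  # callable(path, offset, length) -> bytes
        self.chunk_size = chunk_size
        self.cache = {}                 # (path, chunk index) -> bytes
        self.hits = 0
        self.misses = 0

    def _chunk(self, path, index):
        key = (path, index)
        if key in self.cache:
            self.hits += 1
        else:
            # Miss: forward the request and keep the chunk as it passes through.
            self.misses += 1
            self.cache[key] = self.origin_read(
                path, index * self.chunk_size, self.chunk_size)
        return self.cache[key]

    def read(self, path, offset, length):
        if length <= 0:
            return b""
        first = offset // self.chunk_size
        last = (offset + length - 1) // self.chunk_size
        data = b"".join(self._chunk(path, i) for i in range(first, last + 1))
        start = offset - first * self.chunk_size
        return data[start:start + length]


# Hypothetical remote repository with a single file.
REMOTE = {"/f": b"0123456789abcdef"}


def origin_read(path, offset, length):
    return REMOTE[path][offset:offset + length]


proxy = ChunkCachingProxy(origin_read, chunk_size=4)
first_read = proxy.read("/f", 2, 5)   # touches chunks 0 and 1: two misses
second_read = proxy.read("/f", 4, 4)  # chunk 1 again: one hit
```

Unlike the whole-file approach, the client gets its bytes as soon as the chunks it asked for have arrived, and the counters give the ‘hit rate’ that decides how much WAN traffic is saved.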
The eXtreme Copy
- Let’s suppose that we have to get a (big) file
  - And that there are several replicas at different sites
- Big question: where to fetch it from?
  - The closest one?
    - How can we tell if it’s the closest? Closest to what? Will it be faster as well?
  - The best connected one?
    - It can always be overloaded or weak or broken
  - Whatever we choose, the situation can change over time
  - Instead we always want the maximum efficiency
The eXtreme Copy
[Diagram: the copy program (xrdcp -x) wants to get ‘myfile’ from the repository. A “Locate ‘myfile’” request goes to the globalized cluster (the ALICE global redirector), which aggregates Xrootd site A, Xrootd site B and any other Xrootd site, each running cmsd and xrootd.]
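The idea of fetching one file from several replicas at once can be sketched with a shared queue of chunks. This is not how xrdcp is implemented; it is a simplified model in which each source pulls the next pending chunk as soon as it is free, so faster sources naturally serve more chunks and a failing source is dropped. The function `extreme_copy`, the site names and the in-memory sources are hypothetical.

```python
import queue
import threading


def extreme_copy(size, chunk_size, sources):
    """Fetch `size` bytes from `sources`: name -> callable(offset, length) -> bytes."""
    todo = queue.Queue()
    for offset in range(0, size, chunk_size):
        todo.put((offset, min(chunk_size, size - offset)))

    result = bytearray(size)
    served_by = {name: 0 for name in sources}
    alive = dict(sources)
    lock = threading.Lock()

    def worker(name, read):
        while True:
            try:
                offset, length = todo.get_nowait()
            except queue.Empty:
                return
            try:
                data = read(offset, length)
            except OSError:
                todo.put((offset, length))  # give the chunk back
                with lock:
                    alive.pop(name, None)   # stop using this source
                return
            with lock:
                result[offset:offset + length] = data
                served_by[name] += 1

    # Repeat while chunks are pending: a chunk given back by a failed
    # source may arrive after the healthy workers have already finished.
    while not todo.empty() and alive:
        threads = [threading.Thread(target=worker, args=item)
                   for item in list(alive.items())]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()

    if not todo.empty():
        raise OSError("no working source left")
    return bytes(result), served_by


content = bytes(range(256)) * 64  # 16 KiB of test data


def make_source(blob):
    def read(offset, length):
        return blob[offset:offset + length]
    return read


def broken_source(offset, length):
    raise OSError("site unreachable")


data, served_by = extreme_copy(
    len(content), 1024,
    {"siteA": make_source(content),
     "siteB": make_source(content),
     "siteC": broken_source},
)
```

No site has to be chosen in advance: the split of work between the sources follows whatever their speed turns out to be during the transfer, and it adapts if that changes.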
Open items (1/2)
- Hot: putting clients into servers (e.g. to make efficient proxies)
  - Or: different criteria to fully differentiate clients in the same app
  - E.g. how to instantiate together:
    - a client tuned for WAN TTreeCache-based random access
    - one optimized for blasting non-TTreeCache LAN traffic
    - one optimized for large file transfers
- The Extreme Copy: Torrent-like dynamic multiserver file fetching
  - Needs the previous item to be really strong
- Components for site cooperation
  - Proxies and caching proxies (proofs of concept right now)
  - Bandwidth/queuing manager (early alpha)
- ‘Personal’ persistent caching proxy, caching chunks on a local disk
- A full-featured ‘xrd’ command line interface
  - The current one is quite a rough tool
Open items (2/2)
- Client-side data management functions (e.g. ‘recursive ls’ or ‘df’): good level but incomplete for now
- WAN performance: huge breakthrough, still more to gain
  - Both for file transfer and data access (TTreeCache and not)
- A robust and complete server-to-server file copy
- ROOT integration: very good quality, but still more to gain
  - Only partially asynchronous (XrdClient can be fully async instead)
  - Will be more evident with the parallelization of the computing; I/O will likely become the bottleneck again
  - More “intelligent” readahead
- A homogeneous, top-class support structure
Greedier data consumers
- In the data access frameworks (e.g. ROOT) many things evolve
  - Applications tend to become more efficient (= greedier)
  - Applications exploiting multicore CPUs will be even more so
- An opportunity for interactive data access (e.g. from a laptop)
- A challenge for the data access providers (the sites)
- The massive deployment of newer technologies could be the real challenge for the next years
Questions? Thank you!