XROOTD Storage

Page 1: XROOTD  Storage

XROOTD Storage

Recent directions

Fabrizio Furano

Page 2: XROOTD  Storage

XROOTD storage and ALICE AF - Recent directions 2

The ALICE recipe for storage

- Many sites, exposing the XROOTD protocol
  - Native XROOTD
  - A few with DPM+XROOTD
  - One with CASTOR+XROOTD
  - A few with dCache’s Xrootd protocol implementation
- Native XROOTD + 2 plugins + MonALISA
  - In a simple bundled setup
- AliEn points directly to single SEs
  - Privileges local data according to the catalogue’s content
- OCDB accessed via WAN

Page 3: XROOTD  Storage

A unique protocol

- Having a single WAN+LAN-compliant protocol allows us to do the right thing
  - Exploit locality whenever possible (= most of the time)
  - Do not worry too much if a job accesses some data files that are not at the same site. This has to be possible and foreseen.
  - Explicitly creating 100s of replicas just for a job takes much more time and carries more risk.
- Access condition data ONLY via WAN

Page 4: XROOTD  Storage

The XROOTD way

- Each server manages a portion of the storage
  - Many servers with small disks, or fewer servers with huge disks
- Low-overhead, DB-free aggregation of servers
  - Gives the functionalities of a single entity
  - A non-transactional file system
- Efficient LAN/WAN byte-level data access
- Protocol/architecture built on the tough HEP requirements

Page 5: XROOTD  Storage

What we can do

- Build efficient storage clusters
- Aggregate storage clusters into WAN federations
- Access remote data efficiently
- Build proxies that can cache an external repository
  - And increase the data access performance (or decrease the WAN traffic) through a decent ‘hit rate’
- Build hybrid proxies
  - Caching an external repository while storing local data locally
  - In practice, the ‘Jamboree demonstrator’

Page 6: XROOTD  Storage

Aggregated sites

- Suppose that we can easily aggregate sites
  - And provide an efficient entry point that “knows them all natively”
- We could use it to access data directly
- We could use it as a building block for a proxy-based structure called VMSS
  - If site A is asked for file X, A will fetch X from some other ‘friend’ site, through the unique entry point
  - A itself is a potential source, accessible through the same entry point

Page 7: XROOTD  Storage

The VMSS

Virtual Mass Storage System… built on data globalization

[Diagram: several Xrootd sites (each shown with its Xrootd and Cmsd daemons) and “any other Xrootd site”, joined into a globalized cluster by the ALICE global redirector.]

- Local clients work normally at each site
- Missing a file? The storage asks the global redirector, gets redirected to the right collaborating cluster, and fetches it. Immediately.
- A smart client could point here (at the global redirector)
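
The fetch-on-miss behaviour of the VMSS can be illustrated with a small in-memory model. This is only a sketch of the idea, not XROOTD code: the classes `Site`, `GlobalRedirector` and `VmssSite`, their methods, and the file name are all hypothetical, and "fetching" is just a copy between Python dictionaries.

```python
class Site:
    """Toy storage site: a name plus an in-memory map of file name -> bytes."""

    def __init__(self, name, files=None):
        self.name = name
        self.files = dict(files or {})

    def has(self, lfn):
        return lfn in self.files

    def read(self, lfn):
        return self.files[lfn]


class GlobalRedirector:
    """Hypothetical stand-in for the global redirector: it only locates files."""

    def __init__(self):
        self.sites = []

    def subscribe(self, site):
        self.sites.append(site)

    def locate(self, lfn, exclude=None):
        # Return the first subscribed site holding the file, skipping the asker.
        for site in self.sites:
            if site is not exclude and site.has(lfn):
                return site
        return None


class VmssSite(Site):
    """A site that stages a missing file in from a collaborating site on demand."""

    def __init__(self, name, redirector, files=None):
        super().__init__(name, files)
        self.redirector = redirector
        self.wan_fetches = 0  # how many times we had to go to another site

    def open(self, lfn):
        if not self.has(lfn):
            source = self.redirector.locate(lfn, exclude=self)
            if source is None:
                raise FileNotFoundError(lfn)
            self.files[lfn] = source.read(lfn)  # whole-file staging
            self.wan_fetches += 1
        return self.files[lfn]


gr = GlobalRedirector()
site_b = Site("B", {"/alice/data/run1.root": b"payload"})
site_a = VmssSite("A", gr)
gr.subscribe(site_b)
gr.subscribe(site_a)

first = site_a.open("/alice/data/run1.root")   # miss: fetched from B
second = site_a.open("/alice/data/run1.root")  # hit: served locally
```

After the first access, site A also holds the file, so in this model it becomes a potential source for other sites asking the same redirector.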

Page 8: XROOTD  Storage

The ALICE CAF storage

- Data is proxied locally to adequately feed PROOF
  - From the 91 AliEn sites

[Diagram: the ALICE CAF shown as five Xrootd/Cmsd server pairs, with the labels “AliEn” and “Data mgmt tools”.]

Page 9: XROOTD  Storage

The SKAF/AAF storage

- Take a PROOF cluster with XROOTD storage, make it easily installable and well monitored (MonALISA)
- Add the xrd-dm plugin by M. Vala
  - Transform your AF into a proxy of the ALICE globalized storage, through the ALICE GR
  - If something needed is not present, it will be fetched in, FAST
  - Also supports sites not seen by the GR, through internal dynamic prioritization of the AliEn sites
- Data management: how does the data appear?
  - (Pre)staging requests
  - This means that it works with the usual ROOT tools, but also without
  - Suppose that a user always runs the same analysis several times
    - Which is almost always true
    - The first round will not be so fast, but it will work; the subsequent ones will be fast
- The first one was the ALICE SKAF (Kosice, Slovakia)

Page 10: XROOTD  Storage

AliEn LFN/PFN

- Often a source of misunderstandings
- AliEn has LFNs
  - They are the user-readable names
- AliEn converts them to PFNs
  - The ugly filenames with numbers
- An AliEn PFN is considered by XROOTD as an XROOTD LFN
- XROOTD takes care internally of its PFN translation
  - Hiding the internal mount points
- At the end:
  - USERS see AliEn LFNs
  - SYSADMINS see XROOTD PFNs (= AliEn PFNs with a prefix)
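
The two-step name translation can be made concrete with a few lines of Python. This is a sketch under stated assumptions: the catalogue entry, the numeric PFN and the local mount prefix below are made up for illustration, and the real translations are done by the AliEn catalogue and by XROOTD internally.

```python
import posixpath

# Hypothetical catalogue content: AliEn LFN -> AliEn PFN (made-up values).
CATALOGUE = {
    "/alice/sim/2010/run42/AliESDs.root": "/12/34567/0a1b2c3d-made-up-guid",
}

# Hypothetical local mount point, known only to the storage server.
LOCAL_PREFIX = "/data/disk1/xrdnamespace"


def alien_lfn_to_pfn(lfn):
    """Step 1 (AliEn): user-readable LFN -> numeric-looking AliEn PFN."""
    return CATALOGUE[lfn]


def xrootd_lfn_to_pfn(xrootd_lfn, prefix=LOCAL_PREFIX):
    """Step 2 (XROOTD): what AliEn calls a PFN is an LFN here; add the prefix."""
    return posixpath.join(prefix, xrootd_lfn.lstrip("/"))


user_name = "/alice/sim/2010/run42/AliESDs.root"   # what USERS see
alien_pfn = alien_lfn_to_pfn(user_name)            # what XROOTD is asked for
disk_path = xrootd_lfn_to_pfn(alien_pfn)           # what SYSADMINS see on disk
```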

Page 11: XROOTD  Storage

The ALICE PROOF farms

- Historically, the *AF admins didn’t like to deal with the AliEn PFNs
  - The ugly filenames made of numbers
  - They wanted to store only LFNs (i.e. the human-readable filenames)
  - So, AFs are ALREADY storing native LFNs
- If these XROOTD-based storages get aggregated by the Global Redirector:
  - Their content will be accessible as a whole, with no need to translate names through AliEn; the files are there with their true names.
- Interesting wild experiment (pioneered by SKAF)
  - The *AFs could give data to each other by using the VMSS
  - ATLAS-US is doing this as a demonstrator for tier-3s
  - So, a part of the ALICE storage could be accessed directly with nice names, skipping the overhead of the AliEn translation.

Page 12: XROOTD  Storage

What’s needed… at the end?

- The storage part acting as an automatic stager (an LFN-based proxy of the ALICE storage)
  - Like this by default now!
  - Looks for friend AFs hosting the LFN through the GR
  - Eventually, looks in AliEn-only SEs
    - Through the AliEn mechanism (lfn->guid->pfn)
    - And keeps the file named with the LFN
  - Internally prioritizes sites with a “penalty” mechanism
- WAN accessibility of the cluster, without NATs
  - OR: a small set of machines that proxy it through the firewall (maybe a future dev)
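
The slide does not say how the “penalty” mechanism works, so the following is only one plausible sketch of such a scheme, not the actual implementation: a failed fetch adds a fixed penalty to a site, a successful one decays it, and sites are tried in order of increasing penalty. The class name, the parameter values and the `try_fetch` callback are all assumptions.

```python
class SitePrioritizer:
    """Rank sites by an accumulated penalty: lower penalty means tried earlier."""

    def __init__(self, sites, failure_penalty=10.0, decay=0.5):
        self.penalty = {site: 0.0 for site in sites}
        self.failure_penalty = failure_penalty  # added on each failure
        self.decay = decay                      # multiplier applied on success

    def ranked(self):
        # sorted() is stable, so sites with equal penalty keep their given order.
        return sorted(self.penalty, key=self.penalty.get)

    def report(self, site, ok):
        if ok:
            self.penalty[site] *= self.decay
        else:
            self.penalty[site] += self.failure_penalty


def fetch(lfn, prioritizer, try_fetch):
    """Try sites in priority order; try_fetch(site, lfn) returns bytes or None."""
    for site in prioritizer.ranked():
        data = try_fetch(site, lfn)
        prioritizer.report(site, data is not None)
        if data is not None:
            return site, data
    raise FileNotFoundError(lfn)


def demo_try_fetch(site, lfn):
    # Hypothetical transport: only siteB answers in this demo.
    return b"x" if site == "siteB" else None


prioritizer = SitePrioritizer(["siteA", "siteB"])
winner, payload = fetch("/alice/some/file.root", prioritizer, demo_try_fetch)
```

In this model a site that failed once is not excluded forever; it is simply tried later until its penalty decays.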

Page 13: XROOTD  Storage

PROOF on the GRID?

- GRID WNs to start PROOF interactive workers
  - Ongoing interesting developments, e.g. PoD by Anar Manafov
- Data globalization/proxying seems an interesting match to feed them with data
  - Ideas are welcome
- The purpose is:
  - Give handles to build a lightweight/dynamic Data Management structure
    - Whose only goal is to work well
  - Enable interactivity for users

Page 14: XROOTD  Storage

Proxy sophistication

- Proxying is a concept; there are basically two ways it could work:
  - Proxying whole files (e.g. the VMSS)
    - The client waits for the entire file to be fetched into the SE
  - Proxying chunks (or data pages)
    - The client’s requests are forwarded, and the chunks are cached in the proxy as they pass through
- In HEP we do have examples of the former
- It makes sense to make the latter possible as well
  - Some work has been done (the original XrdPss proxy, or the newer, better prototype plugin by A. Peters)
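
The second approach, proxying chunks, can be sketched as follows. This is a minimal in-memory illustration, not the XrdPss proxy or the prototype plugin mentioned above: the class name, the tiny chunk size and the `read_remote(offset, length)` callback are assumptions made for the example.

```python
class ChunkCachingProxy:
    """Forward reads to a remote source and cache fixed-size chunks on the way."""

    def __init__(self, read_remote, chunk_size=4):
        self.read_remote = read_remote  # callable(offset, length) -> bytes
        self.chunk_size = chunk_size
        self.cache = {}                 # chunk index -> bytes
        self.remote_reads = 0

    def _chunk(self, index):
        if index not in self.cache:
            offset = index * self.chunk_size
            self.cache[index] = self.read_remote(offset, self.chunk_size)
            self.remote_reads += 1
        return self.cache[index]

    def read(self, offset, length):
        if length <= 0:
            return b""
        first = offset // self.chunk_size
        last = (offset + length - 1) // self.chunk_size
        # Only the chunks touched by this request are fetched, never the whole file.
        data = b"".join(self._chunk(i) for i in range(first, last + 1))
        start = offset - first * self.chunk_size
        return data[start:start + length]


remote_file = b"0123456789ABCDEF"
proxy = ChunkCachingProxy(lambda off, n: remote_file[off:off + n], chunk_size=4)

part1 = proxy.read(2, 5)   # touches chunks 0 and 1: two remote reads
part2 = proxy.read(4, 4)   # chunk 1 is already cached: no remote read
```

Unlike whole-file staging, the client here gets its first bytes as soon as the first chunk arrives, and a sparse reader never pays for the parts of the file it does not touch.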

Page 15: XROOTD  Storage

The eXtreme Copy

- Let’s suppose that we have to get a (big) file
  - And that there are several replicas at different sites
- Big question: where to fetch it from?
  - The closest one?
    - How can we tell if it’s the closest? Closest to what? Will it be faster as well?
  - The best connected one?
    - It can always be overloaded or weak or broken
- Whatever we choose, the situation can change over time
  - Instead, we always want the max efficiency

Page 16: XROOTD  Storage

The eXtreme Copy

[Diagram: a copy program (xrdcp -x) wants to get ‘myfile’ from the repository. A “Locate ‘myfile’” request goes to the globalized cluster (ALICE global redirector), which covers Xrootd site A, Xrootd site B and any other Xrootd site, each shown with its Xrootd and Cmsd daemons.]
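
The idea of fetching one file from several replicas at once can be sketched with a shared work queue: every source pulls the next missing chunk when it is free, so faster sources naturally serve more chunks and a broken source simply drops out. This is an illustrative model, not the xrdcp implementation; the function name, the `sources` mapping of name to `read(offset, length)` callables, and the use of `OSError` to signal a failed source are assumptions.

```python
import queue
import threading


def extreme_copy(size, sources, chunk_size=4):
    """Fetch `size` bytes using every source in parallel, chunk by chunk."""
    todo = queue.Queue()
    for offset in range(0, size, chunk_size):
        todo.put(offset)
    result = bytearray(size)
    served = {name: 0 for name in sources}  # chunks delivered per source
    alive = dict(sources)                   # sources that have not failed yet
    lock = threading.Lock()

    def worker(name, read):
        while True:
            try:
                offset = todo.get_nowait()
            except queue.Empty:
                return
            length = min(chunk_size, size - offset)
            try:
                data = read(offset, length)
            except OSError:
                todo.put(offset)            # give the chunk back to the others
                with lock:
                    alive.pop(name, None)   # stop using this source
                return
            with lock:
                result[offset:offset + length] = data
                served[name] += 1

    # Repeat while chunks remain: a failing source may return a chunk to the
    # queue after the healthy workers have already seen it empty.
    while not todo.empty() and alive:
        threads = [threading.Thread(target=worker, args=item)
                   for item in list(alive.items())]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
    if not todo.empty():
        raise OSError("no working source left")
    return bytes(result), served


replica = b"0123456789ABCDEFGHIJ"


def good_site(offset, length):
    return replica[offset:offset + length]


def broken_site(offset, length):
    raise OSError("site is down")


copied, served = extreme_copy(len(replica), {"siteA": good_site, "siteB": broken_site})
```

No source has to be chosen up front as "the closest" or "the best connected"; the split between sources follows whatever each one manages to deliver during the copy.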

Page 17: XROOTD  Storage

Open items (1/2)

- Hot: putting clients into servers (e.g. to make efficient proxies)
  - Or: different criteria to fully differentiate clients in the same app
  - E.g. how to instantiate together:
    - a client tuned for WAN TTreeCache-based random access
    - one optimized for blasting non-TTreeCache LAN traffic
    - one optimized for large file transfers?
- The Extreme Copy: Torrent-like dynamic multiserver file fetching
  - Needs the previous item to be really strong
- Components for site cooperation
  - Proxies and caching proxies (proofs of concept right now)
  - Bandwidth/queuing manager (early alpha)
- ‘Personal’ persistent caching proxy, caching chunks on a local disk
- A full-featured ‘xrd’ command line interface
  - The current one is quite a rough tool

Page 18: XROOTD  Storage

Open items (2/2)

- Client-side data management functions (e.g. ‘recursive ls’ or ‘df’): good level, but incomplete for now
- WAN performance: huge breakthrough, still more to gain
  - Both for file transfer and data access (TTreeCache and not)
- A robust and complete server-to-server file copy
- ROOT integration: very good quality, but still more to gain
  - Only partially asynchronous (XrdClient can be fully async instead)
  - Will be more evident with the parallelization of the computing; I/O will likely become the bottleneck again
  - More “intelligent” readahead
- A homogeneous, top-class support structure

Page 19: XROOTD  Storage

Greedier data consumers

- In the data access frameworks (e.g. ROOT) many things evolve
  - Applications tend to become more efficient (= greedier)
  - Applications exploiting multicore CPUs will be even more so
- An opportunity for interactive data access (e.g. from a laptop)
- A challenge for the data access providers (the sites)
- The massive deployment of newer technologies could be the real challenge for the next years

Page 20: XROOTD  Storage

Questions? Thank you!