TeraGrid Archival Migration of Data to the XD Era
Phil Andrews et al
The Moving Finger writes; and having writ,
Moves on; nor all your Piety nor Wit
Shall lure it back to cancel half a Line,
Nor all your Tears wash out a Word of it.
– Omar Khayyam
Users’ view: Once the data is written there is an ongoing responsibility of the TeraGrid to both ensure its survival and to provide continued access (within limits)
Science Advisory Board Meeting, January '09
Significant Archival Data is at TeraGrid RP sites unfunded in XD
What to do about data at current TeraGrid RP sites not currently funded for the XD era? We feel that the TeraGrid has a communal obligation to continue data availability past the funding of centers where it was written.
It’s Later Than You Think! – A Tale of Two Cities, Charles Dickens
Task Force Considered the Issue
• Representation from all TG archival sites
• Users need the option of having copies of their data at XD-funded sites
• Would like to solve the general problem: replicating user data across TeraGrid
• Needed to know how much data is involved
• Need supplementary funding for extra archival storage
Half of my money went on women, booze, and gambling – the rest of it, I just frittered away! – George Best
How much data is there?
• Performed a data census of all TG sites
• Critical sites are SDSC and NCSA
• SDSC and NCSA dominate the total XD-unfunded archival storage
• SDSC has 3.7 PB of data that needs continued support (3.1 PB in HPSS, 0.6 PB in SAM)
• NCSA has about 2.5 PB of unique data
• There is continuing growth
In those days a decree went out from Caesar Augustus that the whole world should be counted – Luke, 2:1
What to do about it?
• Data is the essential life force of some efforts
• These transitions are ongoing and will continue
• TeraGrid needs to be seen as a responsible data guardian, independently of RP changes
• Rather than a one-off solution, want to increase TeraGrid's data offerings to handle this requirement
• Data replication between sites and a single persistent archive are possible solutions
• Replication capability is a necessity in any solution
Plus ça change, plus c'est la même chose ("The more things change, the more they stay the same")
[Chart: projected archival holdings (PB, 0–80) by year, 2008–2016, broken down by site/system: NCSA-BW, 2D+XD (6), PSC (2C), NICS (2B), TACC (2A), Other TG, IU, NCSA-TG, SDSC, Limbo]
Looking ahead … this will be a recurring and growing issue
Notional projections for illustration only! Estimated 3 PB/yr/PF (peak) for Track 2 systems, 5 PB/yr for Blue Waters.
Current transition
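The projection above can be sketched as a simple rate model using the slide's notional estimates; the aggregate peak-petaflop figure in the usage example is a hypothetical placeholder, not a real system count.

```python
# Notional archive-growth model from the slide's estimates:
# ~3 PB/yr per peak PF for Track 2 systems, ~5 PB/yr for Blue Waters.
def annual_growth_pb(t2_peak_pf_total, blue_waters_online):
    """Estimated PB of new archival data generated per year."""
    growth = 3.0 * t2_peak_pf_total
    if blue_waters_online:
        growth += 5.0
    return growth

# Illustrative only: 2 PF of aggregate Track 2 peak, Blue Waters running.
print(annual_growth_pb(2.0, True))  # 11.0 PB/yr
```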
The 1/99 experience at SDSC
• 1% of the users use 99% of the storage
• Can use this to help with replication (contacting users, complete tapes, etc.)
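A minimal sketch of how a census could exploit this skew, greedily selecting the smallest set of users holding 99% of the storage; the user names and TB figures are hypothetical.

```python
def heavy_users(tb_by_user, fraction=0.99):
    """Smallest set of users that together account for `fraction` of total
    storage, picked greedily from the largest holdings down."""
    total = sum(tb_by_user.values())
    selected, running = [], 0.0
    for user, tb in sorted(tb_by_user.items(), key=lambda kv: -kv[1]):
        if running >= fraction * total:
            break
        selected.append(user)
        running += tb
    return selected

# Hypothetical census: one user dominating the archive, as in the 1/99 rule.
print(heavy_users({"u1": 990.0, "u2": 6.0, "u3": 4.0}))  # ['u1']
```

Contacting just the users in this set (and migrating their complete tapes) covers nearly all of the data at a small fraction of the coordination effort.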
What are the high-end archival users doing?
• (From 12/18/08 interview with Mike Norman)
• Needs data from the last 3–5 years
  – But this is generally the largest data
  – Experimental data must be stored long-term, with dual copies
• Hero runs can generate 100+ TB, take 1 year to generate and 1–2 years to analyze, and need easy access during that period
  – Can lots of disk replace an archive with that usage model? Yes, if there's enough of it for 1–3 years!
• "Deep" user: runs cannot be done elsewhere, data cannot be stored elsewhere. We need to meet these needs.
How much will replication cost?
• Tape costs are approximately $100 per TB
• Assume 10 PB to be replicated
• Tape libraries, drives, etc. approximately equal tape cartridge costs
• 10 PB equates to about $2M in hardware
• Some people costs
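The arithmetic above can be checked with a back-of-envelope model: cartridges at ~$100/TB, with libraries and drives assumed to roughly double the cartridge cost. The multiplier and rate are the slide's estimates, not vendor figures.

```python
# Back-of-envelope replication hardware cost from the slide's figures:
# ~$100 per TB of tape, with libraries/drives roughly doubling that.
def replication_cost_dollars(petabytes, dollars_per_tb=100, infra_multiplier=2):
    """Estimated hardware cost for replicating `petabytes` of data to tape."""
    return petabytes * 1000 * dollars_per_tb * infra_multiplier

print(replication_cost_dollars(10))  # 2000000 -> ~$2M for 10 PB
```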
How priceless is this peace of mind – Master Hsu Yun
How long would a network transfer take?
• Assume 10 PB to be moved via network
• 10 Gb/s = 10 PB/100 days (will achieve less)
• Most archives cannot move data at 1 GB/s
• Overhead in SRB is high
• 6 months would be a good achievement
• 12 months should be the top end
• Need to start soon!
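The "10 PB/100 days" figure follows from simple unit conversion, as a quick sketch confirms (assuming decimal petabytes and a perfectly sustained link, which real archives will not achieve):

```python
def transfer_days(petabytes, gbits_per_sec):
    """Days to move `petabytes` over a link sustained at `gbits_per_sec`."""
    bytes_total = petabytes * 1e15           # decimal petabytes
    bytes_per_sec = gbits_per_sec * 1e9 / 8  # bits -> bytes
    return bytes_total / bytes_per_sec / 86400

print(round(transfer_days(10, 10)))  # ~93 days at an ideal 10 Gb/s
```

With archive read rates well under 1 GB/s and middleware overhead, the slide's 6–12 month estimate is the realistic range.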
For though his body's under hatches, His soul has gone aloft. – Charles Dibdin
How should it be done?
Multiple replication approaches are both possible and necessary
• SRB and iRODS support replication via middleware.
• GPFS is working with HPSS for archival federation.
• Slash2 from PSC would federate multiple Lustre sites.
A merry road, a mazy road, and such as we did tread. The night we went to Birmingham by way of Beachy Head! – G.K. Chesterton
Specifics
• SDSC has 0.6 PB under SRB in SAM: SRB can start replication quickly
• SDSC has 3.1 PB in HPSS and exports gpfs-wan: could be used for replication
• PSC working on federation of Lustre file systems via Slash2
• Intent to provide replication is paramount
Cut the cackle and come to the hosses – Physical applications of the operational method, Jeffreys and Jeffreys
Philosophy:
• Current funding approach allows continual advent and elimination of RP sites
• Need replication to allow for a frequent gain and loss of Data RPs, either by itself or in combination with a persistent archive
• Shouldn't wait for XD to solve this problem
• Ask NSF for supplementary funding of ~$2M
• Can the SAB speak to the importance of this effort?
There is a tide in the affairs of men
Which, taken at the flood, leads on to fortune
– Shakespeare