The GSI Mass Storage for Experiment Data
DVEE-Palaver GSI Darmstadt
Feb. 15, 2005
Horst Göringer, GSI Darmstadt
Horst Göringer GSI DVEE Palaver 15.2.2005 2
Overview
• different views
• current status
• last enhancements:
  - write cache
  - on-line connection to DAQ
• future plans
• conclusions
GSI Mass Storage System
Gsi mass STORagE system
gstore
gstore: storage view
[Diagram with three columns: central tape (ATL), central disk (read cache: RetrievePool, StagePool, ...; write cache: ArchivePool, DAQPool, ...), and clients (tsmcli client, RFIO client, DAQ client). Further labels: disk, memory.]
gstore: hardware view
3 automatic tape libraries (ATL):
(1) IBM 3494 (AIX)
8 tape drives IBM 3590 (14 MByte/s)
ca. 2300 volumes (47 TByte, 13 TByte backup)
1 data mover (adsmsv1)
access via adsmcli, RFIO read
read cache 1.1 TByte
StagePool, RetrievePool
gstore: hardware view
(2) StorageTek L700 (Windows 2000)
8 tape drives LTO2 ULTRIUM (35 MByte/s)
ca. 170 volumes (32 TByte)
8 data movers (gsidmxx), connected via SAN
access via tsmcli, RFIO
read cache 2.5 TByte
StagePool, RetrievePool
write cache
ArchivePool: 0.28 TByte
DAQPool: 0.28 TByte
gstore: hardware view
(3) StorageTek L700 (Windows 2000)
4 tape drives LTO1 ULTRIUM (15 MByte/s)
ca. 80 volumes (10 TByte):
backup copy of 'irrecoverable' archives ...raw
mainly for backup of user data (~ 30 TByte)
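As a rough cross-check of the hardware figures above, the nominal aggregate tape bandwidth of each library is the number of drives times the per-drive rate. The sketch below assumes that all drives stream simultaneously at their quoted rate, which is an upper bound rather than a measured value; the dictionary and function names are illustrative only.

```python
# Nominal per-library tape bandwidth from the drive counts and rates quoted above.
# Assumption: every drive streams at its quoted rate at the same time (upper bound).
LIBRARIES = {
    "IBM 3494": (8, 14),                # (tape drives, MByte/s per drive)
    "StorageTek L700 (LTO2)": (8, 35),
    "StorageTek L700 (LTO1)": (4, 15),
}


def aggregate_rates(libraries):
    """Return {library: drives * rate} in MByte/s."""
    return {name: drives * rate for name, (drives, rate) in libraries.items()}


if __name__ == "__main__":
    for name, mbyte_per_s in aggregate_rates(LIBRARIES).items():
        print(f"{name}: {mbyte_per_s} MByte/s")
```

Under these assumptions the three libraries come out at 112, 280 and 60 MByte/s respectively.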
gstore: software view
2 major components:
• TSM (Tivoli Storage Manager), commercial
handles tape drives and robots
data base
• GSI software (~ 80,000 lines of code)
C, sockets, threads
- interface to user (tsmcli / adsmcli, RFIO)
- interface to TSM (TSM API client)
- cache administration
gstore user view: tsmcli
tsmcli subcommands:
  archive     file*  archive  path
  retrieve    file*  archive  path
  query       file*  archive  path*
  stage       file*  archive  path
  delete      file   archive  path
  ws_query    file*  archive  path
  pool_query  pool*
*: any combination of wildcard characters (*, ?) allowed
soon: file may contain a list of files (with wildcard chars)
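To illustrate the wildcard rule above, here is a minimal sketch of matching file names against a pattern in which only '*' and '?' are special. It is not the tsmcli implementation: the function names and the sample file names are hypothetical, and whether '*' may span path separators in tsmcli is not stated in the slides (here it matches any characters).

```python
import re


def wildcard_to_regex(pattern):
    """Translate a pattern using only '*' and '?' into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")           # any run of characters, possibly empty
        elif ch == "?":
            parts.append(".")            # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return re.compile("".join(parts), re.DOTALL)


def match_files(pattern, names):
    """Return the names that match the whole pattern, in input order."""
    regex = wildcard_to_regex(pattern)
    return [name for name in names if regex.fullmatch(name)]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    names = ["run001.lmd", "run002.lmd", "run010.lmd", "calib.dat"]
    print(match_files("run00?.lmd", names))
    print(match_files("*.lmd", names))
```

With the sample names, "run00?.lmd" selects run001.lmd and run002.lmd, while "*.lmd" selects all three run files.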
gstore user view: RFIO
rfio_[f]open
rfio_[f]read
rfio_[f]write
rfio_[f]close
rfio_[f]stat
rfio_lseek
GSI extensions (for on-line DAQ connection):
rfio_[f]endfile
rfio_[f]newfile
gstore server view: query
[Diagram: client, entry server, write cache server, read cache server, TSM server, three DBs]
gstore server view: archive to cache
[Diagram: client, entry server, write cache server, read cache server, TSM server, three DBs, and data mover i (of n) with mover server and write cache]
gstore server view: archive from cache
[Diagram: write cache server, TSM server, two DBs, data mover i (of n) with archive server, write cache and TSM Stor. Agent, SAN, tape]
gstore server view: retrieve from tape
[Diagram: client, entry server, write cache server, read cache server, TSM server, three DBs, data mover i (of n) with mover server, read cache and TSM Stor. Agent, SAN, tape]
server view: retrieve from write cache
[Diagram: client, entry server, write cache server, read cache server, TSM server, three DBs, data movers i and j with two mover servers, a read cache and a write cache]
gstore: overall server view
[Diagram: client, entry server, write cache server, read cache server, TSM server, three DBs, data mover i (of n) with mover server, archive server, cache server, read cache, write cache and TSM Stor. Agent, SAN, tapes]
server view: gstore design concepts
• strict separation of control and data flow
• no bottleneck for data
• scalable in
    capacity (tape and disk)
    I/O bandwidth
• hardware independent
    (as long as supported by TSM)
• platform independent
• unique name space
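The first two design concepts can be pictured with a small sketch: a control component only answers the question "which data mover?", and the payload then travels directly between client and data mover. This is a toy model, not the gstore code; the class names, the method names and the selection policy (fewest bytes handled so far) are all assumptions made for illustration.

```python
class DataMover:
    """Holds file payloads; stands in for a data mover with cache disks."""

    def __init__(self, name):
        self.name = name
        self.files = {}
        self.bytes_handled = 0

    def store(self, path, payload):
        self.files[path] = payload
        self.bytes_handled += len(payload)


class EntryServer:
    """Answers control requests only; it never sees file payloads."""

    def __init__(self, movers):
        self.movers = movers
        self.requests = 0
        self.bytes_handled = 0  # stays 0: no data flows through the entry server

    def select_mover(self, path):
        self.requests += 1
        # Hypothetical policy: the mover that has handled the fewest bytes so far.
        return min(self.movers, key=lambda mover: mover.bytes_handled)


def archive(entry, path, payload):
    mover = entry.select_mover(path)  # control flow: small request and response
    mover.store(path, payload)        # data flow: client talks to the mover directly
    return mover.name


if __name__ == "__main__":
    movers = [DataMover("dm1"), DataMover("dm2")]
    entry = EntryServer(movers)
    for path, size in [("a", 100), ("b", 50), ("c", 10)]:
        print(path, "->", archive(entry, path, b"x" * size))
    print("bytes through entry server:", entry.bytes_handled)
```

Because the entry server handles only small control messages, adding data movers increases data bandwidth without making the control path a bottleneck.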
server view: cache administration
• multithreaded servers for read and write cache
• each with own metadata DB
• main tasks:
  - lock/unlock files
  - select data movers and file systems
  - collect current info on disk space
    soon: data mover and disk load -> load balancing
  - trigger asynchronous archiving
  - disk cleaning
• several disk pools with different attributes:
    StagePool, RetrievePool, ArchivePool, DAQPool, ...
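Two of the tasks listed above, file locking and disk cleaning, interact: a cleaner must not remove files that are locked. The sketch below shows one possible policy, removing unlocked files in order of oldest access until a fill target is reached. The slides do not describe the actual cleaning policy, so the least-recently-used order, the fill target and all names here are assumptions.

```python
from dataclasses import dataclass


@dataclass
class CachedFile:
    name: str
    size: int            # bytes
    last_access: float   # seconds since an arbitrary epoch
    locked: bool = False


def clean_pool(files, capacity, target_fill):
    """Return (kept, removed) so that used space <= target_fill * capacity if possible.

    Locked files are never removed; unlocked files go oldest access first.
    """
    used = sum(f.size for f in files)
    limit = target_fill * capacity
    kept = list(files)
    removed = []
    for f in sorted(files, key=lambda f: f.last_access):
        if used <= limit:
            break
        if f.locked:
            continue
        kept.remove(f)
        removed.append(f)
        used -= f.size
    return kept, removed


if __name__ == "__main__":
    pool = [
        CachedFile("a", 40, 1.0),
        CachedFile("b", 30, 2.0, locked=True),
        CachedFile("c", 20, 3.0),
        CachedFile("d", 10, 4.0),
    ]
    kept, removed = clean_pool(pool, capacity=100, target_fill=0.5)
    print("removed:", [f.name for f in removed])
    print("kept:", [f.name for f in kept])
```

Different pools could use different targets or life times, which is one way to read "disk pools with different attributes".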
usage profile: batch farm
batch farm: ~120 dual-processor nodes
=> highly parallel mass storage access (read and write)
• read requests:
    'good' user: stage all files beforehand
        use wildcard chars
    'bad' user: read lots of single files from tape
    'bad' system: stage disk/DM (data mover) crashes during analysis
• write requests:
    via write cache
    distribute as uniformly as possible
usage profile: experiment DAQ
• several continuous data streams from DAQ
• keep same DM (data mover) during life time of data stream
• only via RFIO
• GSI extensions necessary:
    rfio_[f]endfile, rfio_[f]newfile
• disks emptied faster than filled:
    network -> disk: ~10 MByte/s
    disk -> tape: ~30 MByte/s
    => time to stage for on-line analysis
• enough disk buffer necessary in case of problems
    (robot, TSM, ...)
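The rates above allow a back-of-the-envelope estimate of how long a disk buffer lasts when the tape side is unavailable. The sketch assumes decimal units (1 TByte = 10^12 bytes), takes the 0.28 TByte DAQPool size from the hardware view as the buffer, and treats each DAQ stream as a constant ~10 MByte/s; the function name and the number of streams in the example are illustrative.

```python
MBYTE = 10**6
GBYTE = 10**9


def seconds_until_full(pool_bytes, fill_rate, drain_rate=0.0):
    """Time in seconds until a disk pool overflows.

    Rates are in bytes per second. Returns None if the pool drains at
    least as fast as it fills, i.e. it never overflows.
    """
    net = fill_rate - drain_rate
    if net <= 0:
        return None
    return pool_bytes / net


if __name__ == "__main__":
    daq_pool = 280 * GBYTE  # 0.28 TByte DAQPool

    # Normal operation: ~10 MByte/s in, ~30 MByte/s out to tape.
    print(seconds_until_full(daq_pool, 10 * MBYTE, 30 * MBYTE))

    # Tape side down (robot, TSM, ...): nothing drains.
    for streams in (1, 3):
        t = seconds_until_full(daq_pool, streams * 10 * MBYTE)
        print(f"{streams} stream(s): {t / 3600:.1f} h")
```

With a single stream and no draining, 0.28 TByte at 10 MByte/s lasts 28,000 s, a little under 8 hours; several parallel streams shorten this proportionally.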
current plans: new hardware
more and safer disks:
• write cache: all RAID
    4 TByte (ArchivePool, DAQPool)
• read cache: +7.5 TByte new RAID
    StagePool, RetrievePool,
    new pools, e.g. with longer file life time
• 5 new data movers:
new fail-safe entry server
• hosts query server, cache administration servers
    -> query performance!
• take-over in case of host failure
• metadata DBs mirrored on 2nd host
current plans: merge tsmcli / adsmcli
new command gstore:
• replaces tsmcli and adsmcli
• unique name space (already available)
• users need not care in which robot data reside
• new archive: policy computing center
brief excursion: future of IBM 3494?
• still heavily used
• rather full
• hardware highly reliable
• should be decided this year!
usage IBM 3494 (AIX)
brief excursion: future of IBM 3494?
2 extreme options (and more in between):
• no further investment
    use as long as possible
    in a few years: move data to other robot
• upgrade tape drives and connect to SAN
    3590 (~30 GB, 14 MB/s) -> 3592 (300 GB, 40 MB/s)
    new media: => 700 TByte capacity
    access with available data movers via SAN
    new fail-safe TSM server (Linux?)
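The 700 TByte figure is consistent with simple arithmetic if the library keeps roughly the ca. 2300 volumes quoted for the IBM 3494 in the hardware view and each volume becomes a 300 GB 3592 cartridge. That the volume count carries over is an assumption, as is the use of decimal units (1 TByte = 1000 GB); the function name is illustrative.

```python
def library_capacity_tbyte(volumes, gbyte_per_volume):
    """Nominal library capacity in TByte (decimal units, 1 TByte = 1000 GB)."""
    return volumes * gbyte_per_volume / 1000


if __name__ == "__main__":
    volumes = 2300  # "ca. 2300 volumes" from the hardware view
    print("3590 media:", library_capacity_tbyte(volumes, 30), "TByte")
    print("3592 media:", library_capacity_tbyte(volumes, 300), "TByte")
```

The 3592 case gives 690 TByte, close to the quoted 700 TByte; the 3590 case gives 69 TByte nominal, the same order as the 47 + 13 TByte listed for the library today.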
current plans: load balancing
• acquire current info on no. of read/write processes
    for each disk, data mover, pool
• new write request:
    select resource with lowest load
• new read request:
    avoid 'hot spots'
    -> create additional instances of stage file
• new option '-randomize' for stage/retrieve
    distribute equally to different data movers / disks
    split into n (parallel) jobs
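Two of the planned mechanisms lend themselves to a short sketch: picking the least loaded resource for a new write request, and splitting a file list into n randomized jobs. This is only an illustration of the ideas; the function names, the load metric (count of active processes per resource) and the shuffle-then-deal strategy are assumptions, not the planned gstore implementation.

```python
import random


def select_lowest_load(load_by_resource):
    """Pick the resource (disk, data mover or pool) with the fewest active processes."""
    return min(load_by_resource, key=load_by_resource.get)


def randomize_jobs(files, n_jobs, seed=None):
    """Shuffle the file list, then deal it round-robin into n_jobs lists.

    Job sizes differ by at most one file.
    """
    rng = random.Random(seed)
    shuffled = list(files)
    rng.shuffle(shuffled)
    return [shuffled[i::n_jobs] for i in range(n_jobs)]


if __name__ == "__main__":
    # Hypothetical load figures and file names.
    load = {"dm1": 5, "dm2": 2, "dm3": 7}
    print("write goes to:", select_lowest_load(load))

    files = [f"file{i:02d}" for i in range(10)]
    for i, job in enumerate(randomize_jobs(files, 3, seed=1)):
        print(f"job {i}: {job}")
```

Shuffling before dealing means that files which are adjacent by name, and therefore likely to be on the same tape or disk, tend to land in different jobs.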
current plans: new org. of DMs
• Linux platform
more familiar environment (shell scripts, Unix commands, ...)
case sensitive file names
current mainstream OS for experiment data processing
• '2nd level' data movers
no SAN connection
disks filled via ('1st level') DMs with SAN connection
for stage pools with guaranteed life time of files
current plans: new org. of DMs
• integration of selected group file servers
    as '2nd level' data movers
    disk space (logically) reserved for owners
    pool policy according to owners
    many advantages:
        no NFS => much faster I/O
        files physically distributed over several servers
        load balancing of gstore
        disk cleaning
    disadvantages:
        only for exp. data, access via gstore interface
current plans: user interface
• a large number of user requests:
  - longer file names
  - option to rename files
  - more specific return codes
  - ...
• program code consolidation
• improved error recovery after HW failures
• support for successor of AliEn
• GRID support
  - gstore as Storage Element (SE)
  - Storage Resource Manager (SRM)
    -> new functionalities, e.g. reserve resources
Conclusions
• GSI concept for mass storage successfully verified
• hardware and platform independent
• scalable in capacity and bandwidth to keep up with
  - requirements of future batch farm(s)
  - data rates of future experiments
• gstore able to manage very different usage profiles
• but still a lot of work ...
    to fully realize all plans discussed