
Experience of a low-maintenance distributed data management system

W. Takase¹, Y. Matsumoto¹, A. Hasan², F. Di Lodovico³, Y. Watase¹, T. Sasaki¹

1. High Energy Accelerator Research Organization (KEK), Japan
2. University of Liverpool, UK
3. Queen Mary, University of London, UK

Contents

• KEK iRODS system
  – Running in production for over 2 years
  – Rules enable files to be stored efficiently
  – Federation with QMUL
• iRODS applications
  – SCALA: visualization tool for iRODS
  – iRODS XOR-based backup
• Summary

iRODS overview

• Distributed data management system
• Client-server architecture
• Allows data management policies to be enforced on the server side
• Provides an interface to many different types of storage
• Clients can access iRODS via:
  – i-commands: command-line utilities
  – iRODS Browser: web interface

KEK iRODS Systems

• 4 iRODS servers
  – RHEL 5.6
  – iRODS 2.5 ⇒ 3.2
  – PostgreSQL 9.1.1
  – In operation for ~2 years
• iRODS Zones
  – KEK-T2K
  – KEK-MLF
  – KEKZone
  – demoKEKZone
• Storage resources
  – HPSS (High Performance Storage System)
  – Disk system

Data Management for T2K

• Tokai to Kamioka (T2K) neutrino experiment group
• The experimental data is stored in KEK storage
• The group needed an easy way to quickly access the collected data and to evaluate its quality from outside of KEK
• iRODS provided the solution

(Map: http://t2k-experiment.org/wp-content/uploads/t2kmap.gif)

Data Management for T2K

• The KEK-T2K zone for the experiment group has been in operation since October 2010
• Detected data are processed and then transferred to KEK iRODS
• People in the group became able to access the stored data easily and quickly
  – i-commands
  – iRODS Browser

iRODS Rules for KEK-T2K Zone

• Bundle and replicate the data
  – Each experimental data file is small (~several MB)
  – HPSS prefers large files

(Diagram: files uploaded by clients via the iRODS server land on the disk system, where a rule bundles them into tar files that are replicated to HPSS.)
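The bundling step can be sketched in Python with the standard `tarfile` module. This is only an illustrative stand-in: the real bundling runs server-side in the iRODS rule engine, and the function name here is an invention of this sketch.

```python
import io
import tarfile


def bundle(files: dict) -> bytes:
    """Bundle many small files into one tar archive, so that HPSS
    stores one large file instead of many small ones (sketch of the
    KEK-T2K bundling rule's effect, not the rule itself)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()
```

Replication of the resulting tar file to HPSS would then be a single large transfer rather than many small ones.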

iRODS Rules for KEK-T2K Zone

• Response to a request

(Diagram: when a client requests a file, the iRODS server stages the bundled tar file from HPSS to the disk system, extracts the requested file, and returns it to the client.)
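The retrieval side can be sketched the same way: once the tar file is staged back from HPSS, only the requested member needs to be extracted. Again a simplified client-side stand-in for the server-side rule, with a hypothetical function name.

```python
import io
import tarfile


def extract_member(archive: bytes, name: str) -> bytes:
    """Pull a single requested file out of a bundled tar archive
    (sketch of the response-to-request flow described above)."""
    with tarfile.open(fileobj=io.BytesIO(archive)) as tar:
        member = tar.extractfile(name)
        if member is None:
            raise FileNotFoundError(name)
        return member.read()
```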

Federation with QMUL

• Data replication between the two sites
• Each site's data is shared with the other

(Diagram: the KEK-T2K zone holds experimental data and the QMULZone holds analytical data; federation connects the two zones.)

Amount of data in KEK-T2K

• The T2K group started data taking on 22nd December 2011

SCALA : Visualization tool for iRODS

• Statistical Charts And Log Analyzer
• iRODS lacked an interface for usage statistics and for debugging problems
• We developed a web interface for visualizing an overview of the iRODS status
  – Statistical Charts page
  – Log Analyzer page
• SCALA has been installed on the KEK iRODS system

SCALA Overview

• Input: iRODS outputs (resource usage and log files)
• Output: visualized daily system status as charts

(Diagram: SCALA parses and summarizes the iRODS log files and resource usage into database tables, then displays the results as charts.)
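The parse-then-summarize step can be sketched as follows. The log-line format used here is hypothetical and chosen only for illustration; real iRODS `rodsLog` entries differ in detail, as do SCALA's actual table schemas.

```python
import re
from collections import Counter

# Hypothetical log-line format for illustration only; real iRODS
# log entries are formatted differently.
LOG_LINE = re.compile(
    r"^(?P<date>\w+ \d+ \d+) \d+:\d+:\d+ (?P<level>NOTICE|ERROR): (?P<msg>.*)$"
)


def summarize(lines):
    """Parse log lines and count messages per (date, level),
    mimicking SCALA's parse-and-summarize stage."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m:
            counts[(m.group("date"), m.group("level"))] += 1
    return counts
```

A charting front end would then read such summarized counts per day and render them as bar charts.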

Statistical Charts

• Visualizes iRODS daily operational data

Log Analyzer

• Provides an error debugging tool

1. The user clicks a bar
2. Error details are displayed
3. The user clicks an error message
4. The related log is displayed

Download SCALA

• http://tgwww.kek.jp/scala/


iRODS XOR-based backup

• Full-file replication
  – The current method for reliable data storage is to replicate the data
  – If a disk or server fails, a copy still exists
  – Requires much more storage space
  – If a portion of a file becomes corrupt, the full file must be replaced
• XOR-based backup
  – Reduces the space required while keeping the same robustness
  – Splits a file into blocks and creates parity blocks
  – If a block becomes corrupt, only the corrupted block must be recreated

XOR-based backup: 100% recovery when any 2 servers fail

• Full-file replication uses 3 servers and needs 300 GB
• XOR-based backup uses 4 servers but needs only 200 GB
• An iRODS rule enables automatic processing

(Diagram: a file is split into data blocks A, B, C, D across Server1–Server4, with parity blocks E = B + C, F = C + D, G = A + D, H = A + B distributed among the servers; + denotes XOR.)
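The encoding above can be reproduced in a few lines of Python, with + meaning bitwise XOR. The slide fixes only the parity equations; the exact pairing of data and parity blocks per server is an assumption of this sketch.

```python
def xor(x: bytes, y: bytes) -> bytes:
    """Bitwise XOR of two equal-length blocks ('+' on the slide)."""
    return bytes(a ^ b for a, b in zip(x, y))


def encode(data: bytes) -> dict:
    """Split data into blocks A-D and compute parities
    E = B+C, F = C+D, G = A+D, H = A+B (the slide's equations).
    Each server is assumed to hold one data and one parity block."""
    n = len(data) // 4
    A, B, C, D = (data[i * n:(i + 1) * n] for i in range(4))
    E, F, G, H = xor(B, C), xor(C, D), xor(A, D), xor(A, B)
    return {"Server1": (A, E), "Server2": (B, F),
            "Server3": (C, G), "Server4": (D, H)}
```

For a 100 GB file this stores eight 25 GB blocks, i.e. 200 GB across four servers, versus 300 GB for three full replicas.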

XOR-based backup: Decoding flow

(Diagram: Server1–Server4 hold data blocks A, B, C, D and parities E = B + C, F = C + D, G = A + D, H = A + B; a lost data block is rebuilt by XORing a surviving parity block with its surviving operand, e.g. A = G + D = (A + D) + D.)
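Decoding is the same operation run backwards: XORing a parity block with one of its surviving operands yields the other operand. A sketch for one failure case, where Server1 and Server2 are lost (assuming the block placement used in the encoding sketch):

```python
def xor(x: bytes, y: bytes) -> bytes:
    """Bitwise XOR of two equal-length blocks."""
    return bytes(a ^ b for a, b in zip(x, y))


def recover_after_losing_servers_1_and_2(C: bytes, G: bytes,
                                         D: bytes, H: bytes):
    """Rebuild the lost blocks A and B from the survivors on
    Server3 (C, G = A+D) and Server4 (D, H = A+B)."""
    A = xor(G, D)  # (A + D) + D = A
    B = xor(H, A)  # (A + B) + A = B
    return A, B
```

The other five two-server failure cases are handled analogously, each with its own chain of XORs, which is what gives 100% recovery from any two failures.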

Summary

• The KEK iRODS system has been running in production for over 2 years
• iRODS gives a way to quickly and easily access data from outside of KEK
• The bundle-and-replicate rule stores files efficiently
• Federation with QMUL enables the two sites to share data and back each other up
• SCALA is a visualization tool and has been installed on KEK iRODS
  – It leads to better management of the overall iRODS service
• XOR-based backup provides data reliability at lower storage cost than replication
  – An iRODS rule enables automatic processing

Thank you for your attention!

Wataru Takase
[email protected]