Belle/Gfarm Grid Experiment at SC04
National Institute of Advanced Industrial Science and Technology
Belle/Gfarm Grid Experiment at SC04
Osamu Tatebe
Grid Technology Research Center, AIST
APAN Workshop, Jan 27, 2005, Bangkok
Goal and feature of Grid Datafarm
Goal
- Dependable data sharing among multiple organizations
- High-speed data access, high-performance data computing
Grid Datafarm
- Gfarm File System – Global dependable virtual file system
  - Federates scratch disks in PCs
- Parallel & distributed data computing
  - Associates Computational Grid with Data Grid
Features
- Secured based on Grid Security Infrastructure
- Scalable depending on data size and usage scenarios
- Data location transparent data access
- Automatic and transparent replica selection for fault tolerance
- High-performance data access and computing by accessing multiple dispersed storages in parallel (file affinity scheduling)
Grid Datafarm (1): Gfarm file system - World-wide virtual file system [CCGrid 2002]
Transparent access to dispersed file data via global namespace
- Files can be stored somewhere in a Grid
- Applications can access Gfarm file system without any modification as if it were mounted at /gfarm
- Automatic and transparent replica selection for fault tolerance and access-concentration avoidance
[Figure: Gfarm File System. The global namespace rooted at /gfarm (directories ggf, jp, aist, gtrc; files file1 to file4) is mapped onto physical storage, with file replica creation for file1 and file2.]
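The idea of automatic replica selection behind a global namespace can be illustrated with a minimal sketch. This is not Gfarm's implementation: the catalog layout, the host names, the path /gfarm/jp/file1, and the least-loaded selection rule are all assumptions made for illustration.

```python
# Hypothetical replica catalog: logical path -> hosts holding a replica.
# Paths and host names are invented for this sketch.
REPLICAS = {
    "/gfarm/jp/file1": ["node-a.example", "node-b.example"],
    "/gfarm/jp/file2": ["node-b.example"],
}

def select_replica(path, alive, load):
    """Pick a reachable replica host for `path`, preferring the least loaded.

    alive: set of hosts currently reachable (fault tolerance)
    load:  dict host -> current number of accesses (access-concentration avoidance)
    """
    candidates = [h for h in REPLICAS.get(path, []) if h in alive]
    if not candidates:
        raise FileNotFoundError(f"no reachable replica for {path}")
    return min(candidates, key=lambda h: load.get(h, 0))
```

A caller only names the logical path; which physical host serves the data is decided at access time, so a failed or busy host is skipped without the application noticing.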
Grid Datafarm (2): High-performance data access and computing support [CCGrid 2002]
[Figure: user's view and physical execution view. CPU nodes of the Computational Grid are connected by a network and share the Gfarm File System. Labels: compute and file system nodes; shared network file system.]
User A submits Job A that accesses File A
- Job A is allocated on a node that has File A
User B submits Job B that accesses File B
- Job B is allocated on a node that has File B
Do not separate Storage and CPU
Scalable file I/O by exploiting local I/O
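The allocation rule on this slide (run each job on a node that already holds its input file) can be sketched as a small scheduler. The function name, the slot model, and the fallback to any free node are assumptions for illustration, not the Gfarm scheduler itself.

```python
def schedule(jobs, file_locations, free_slots):
    """Assign each job to a node that stores its input file, if one has a free slot.

    jobs:           dict job name -> input file
    file_locations: dict file -> list of nodes holding a copy
    free_slots:     dict node -> number of free CPU slots (updated in place)
    """
    placement = {}
    for job, path in jobs.items():
        local = [n for n in file_locations.get(path, []) if free_slots.get(n, 0) > 0]
        # Assumed fallback: use any free node when no data-local node is available.
        pool = local or [n for n, s in free_slots.items() if s > 0]
        if not pool:
            raise RuntimeError(f"no free node for {job}")
        node = max(pool, key=lambda n: free_slots[n])
        free_slots[node] -= 1
        placement[job] = node
    return placement
```

Because a job reads its file from the local disk of the node it runs on, aggregate I/O bandwidth grows with the number of nodes instead of being limited by a central file server.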
Gfarm™ Data Grid middleware
Open source development
- Gfarm™ version 1.0.4-4 released on Jan 11th, 2005 (http://datafarm.apgrid.org/)
  - Read-write mode support, more support for existing binary applications
- A shared file system in a cluster or a grid
  - Accessibility from legacy applications without any modification
  - Standard protocol support by scp, GridFTP server, samba server, . . .
[Figure: an application* using the Gfarm client library talks to the metadata server (gfmd, slapd) and to the gfsd daemons running on the compute and file system nodes (CPU).]
* Existing applications can access Gfarm file system without any modification using LD_PRELOAD
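How an unmodified program picks up a preloaded library can be shown with a short sketch that prepares the environment for a child process. The helper name and the library path in the comment are placeholders; the slide does not name the hook library, so substitute the one shipped with your Gfarm installation.

```python
import os

def preload_env(hook_library, base_env=None):
    """Return a copy of the environment with `hook_library` prepended to LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    env["LD_PRELOAD"] = f"{hook_library}:{existing}" if existing else hook_library
    return env

# Example use (placeholder library path, not a real Gfarm file name):
#   import subprocess
#   subprocess.run(["ls", "/gfarm"], env=preload_env("/path/to/gfarm-hook.so"))
```

The dynamic linker loads the preloaded library before libc, so file system calls made by the existing binary can be intercepted without relinking or recompiling it.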
Demonstration
File manipulation
- cd, ls, cp, mv, cat, . . .
- grep
Gfarm command
- File replica creation, node & process information
Remote (parallel) program execution
- gfrun prog args . . .
- gfrun -N #procs prog args . . .
- gfrun -G filename prog args . . .
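The three gfrun forms listed above can be assembled programmatically, for example when driving the demonstration from a script. The helper below is hypothetical and uses only the -N and -G options shown on the slide; since the slide shows them separately, the sketch does not combine them.

```python
def gfrun_argv(prog, args=(), nprocs=None, affinity_file=None):
    """Build an argument vector for one of the gfrun forms listed on the slide.

    nprocs:        value for -N (number of processes)
    affinity_file: value for -G (file name)
    """
    if nprocs is not None and affinity_file is not None:
        raise ValueError("this sketch supports -N or -G, not both")
    argv = ["gfrun"]
    if nprocs is not None:
        argv += ["-N", str(nprocs)]
    if affinity_file is not None:
        argv += ["-G", affinity_file]
    return argv + [prog, *args]
```

The resulting list can be passed to `subprocess.run` on a host where gfrun is installed.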
Belle/Gfarm Grid experiment at SC2004
1. Online KEKB/Belle distributed data analysis
2. KEKB/Belle large-scale data analysis (terabyte-scale US-Japan file replication)
1. Online KEKB/Belle distributed data analysis (1)
Online distributed and parallel data analysis of raw data using AIST and SDSC clusters
Realtime histogram and event data display at SC2004 conference hall
[Figure: KEK sends raw data at 10 MB/sec into the Gfarm file system spanning the AIST and SDSC clusters (128 nodes, 3.75 TB; 64 nodes, 50 TB; 192 nodes, 53.75 TB in total), which provides on demand data replication and distributed & parallel data analysis. SC2004 shows the realtime histogram display and the realtime event data display.]
1. Online KEKB/Belle distributed data analysis (2)
Construct a shared network file system between Japan and US
Store KEKB/Belle raw data to the Gfarm file system
- Physically, it is divided into N fragments, and stored on N different nodes
Every compute node can access it as if it were mounted at /gfarm
[Figure: KEK sends raw data at 10 MB/sec into the Gfarm File System spanning the AIST and SDSC clusters (128 nodes, 3.75 TB; 64 nodes, 50 TB; 192 nodes, 53.75 TB in total). SC2004 shows the realtime histogram display and the realtime event data display.]
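The statement that a file is physically divided into N fragments stored on N different nodes can be made concrete with a toy example. The slides do not say how fragments are cut, so the contiguous byte-range split used here is an assumption, and the node names are invented.

```python
def split_into_fragments(data: bytes, nodes):
    """Split `data` into len(nodes) contiguous fragments, one per node."""
    n = len(nodes)
    if n == 0:
        raise ValueError("need at least one node")
    size = -(-len(data) // n)  # ceiling division
    return {node: data[i * size:(i + 1) * size] for i, node in enumerate(nodes)}

def reassemble(fragments, nodes):
    """Concatenate the fragments in node order to recover the logical file."""
    return b"".join(fragments[node] for node in nodes)
```

Each node then holds one fragment on its local disk, while clients continue to see a single logical file under /gfarm.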
1. Online KEKB/Belle distributed data analysis (3)
Basf is installed at /gfarm/~/belle
- Install once, run everywhere
The raw data will be analyzed at AIST or SDSC just after it is stored
Analyzed data can be viewed at SC2004 in realtime
- Histogram snapshot is generated every 5 minutes
[Figure: KEK sends raw data at 10 MB/sec into the Gfarm File System spanning the AIST and SDSC clusters (128 nodes, 3.75 TB; 64 nodes, 50 TB; 192 nodes, 53.75 TB in total), which also act as the Computational Grid. SC2004 shows the realtime histogram display and the realtime event data display.]
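The figures quoted for this setup can be cross-checked with a few lines of arithmetic. The only assumption is that the 10 MB/sec raw data rate is sustained over a whole snapshot interval; the variable names are chosen for this sketch.

```python
RAW_RATE_MB_PER_S = 10        # raw data rate from KEK, as quoted on the slide
SNAPSHOT_INTERVAL_S = 5 * 60  # a histogram snapshot is generated every 5 minutes

def data_per_snapshot_mb(rate_mb_per_s=RAW_RATE_MB_PER_S,
                         interval_s=SNAPSHOT_INTERVAL_S):
    """Raw data arriving between two snapshots, assuming a sustained rate."""
    return rate_mb_per_s * interval_s

clusters = [(128, 3.75), (64, 50.0)]   # (nodes, TB) for the two clusters
total_nodes = sum(n for n, _ in clusters)
total_tb = sum(tb for _, tb in clusters)

print(data_per_snapshot_mb(), total_nodes, total_tb)  # 3000 192 53.75
```

Under that assumption about 3000 MB of new raw data arrives between two histogram snapshots, and the two clusters add up to the 192 nodes and 53.75 TB shown for the combined file system.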
2. KEKB/Belle large-scale data analysis in a Grid
Gfarm file system using SC conference hall and AIST F cluster
Assume data is stored at SC conference hall
- Terabyte-scale mock data
Submit data analysis job at AIST F cluster
- Required data is automatically transferred from SC to AIST on demand
- Users just see a shared file system
- Network transfer rate is measured
Conf 1: 8 parallel processes (2GBx8 data)
Conf 2: 16 parallel processes (8GBx16 data)
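On-demand transfer behind a shared file system view can be sketched as "copy on first open". This is a local stand-in, not Gfarm code: two directories play the roles of the SC conference hall storage and the AIST F cluster, and the function name is hypothetical.

```python
import shutil
from pathlib import Path

def open_on_demand(rel_path, local_root, remote_root, mode="rb"):
    """Open a file under local_root, copying it from remote_root first if missing."""
    local = Path(local_root) / rel_path
    if not local.exists():
        remote = Path(remote_root) / rel_path
        local.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(remote, local)  # stands in for the wide-area transfer
    return open(local, mode)
```

The analysis job only calls `open_on_demand` with a logical path; whether the bytes were already local or had to be fetched first is invisible to it, which is the behavior the slide describes.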
2. Network & machine configuration
[Figure: network and machine configuration. Labels: SC2004 StorCloud (Pittsburgh; PC x 8, 1G x 8, FC x 16, 2TB x 16); Chicago; JGN2 Japan-US link (10 Gbps, 10G OC192); Tokyo; JGN2 and Tsukuba WAN (10 Gbps each); AIST F cluster (PC x 256).]
SC→AIST (Iperf x 8)
7,925,607,155 bps (Wed Nov 10 17:13:22 JST 2004) (5-sec average bandwidth, 991 Mbps / TCP flow)
Iperf measurement
Standard TCP (Linux 2.4)
- Socket buffer size and txqueuelen
No kernel patch, no TCP modification
No traffic shaping
No bottleneck, no problem
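Why the socket buffer size matters on a long, fast path follows from the bandwidth-delay product: a TCP flow needs roughly rate times round-trip time of data in flight to keep the pipe full. The slides do not state the round-trip time, so the 200 ms value below is purely an assumed figure for illustration; the 991 Mbps per-flow rate is the one reported for the Iperf run.

```python
def bdp_bytes(rate_bps, rtt_s):
    """Bandwidth-delay product: bytes in flight needed to sustain rate_bps over rtt_s."""
    return rate_bps * rtt_s / 8

ASSUMED_RTT_S = 0.2    # hypothetical round-trip time; not given in the slides
PER_FLOW_BPS = 991e6   # per-flow rate reported for the Iperf measurement

buffer_mb = bdp_bytes(PER_FLOW_BPS, ASSUMED_RTT_S) / 1e6
print(f"socket buffer needed per flow: about {buffer_mb:.0f} MB")
```

With the assumed round-trip time this comes to about 25 MB per flow, far above typical default socket buffer sizes, which is why the buffer had to be tuned even though TCP itself was left unmodified.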
Conf 1: 8 processes (2GBx8)
2,084,209,307 bps (Fri Nov 12 03:41:54 JST 2004) (5-sec average, 261 Mbps / TCP flow, ~ disk performance of F cluster)
Conf 2: 16 processes (8GBx16)
738,920,649 bps (Fri Nov 12 05:30:35 JST 2004) (5-sec average, 46 Mbps!! / TCP flow, ?????)
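The per-flow figures quoted with the three measurements follow from dividing each aggregate rate by the number of flows. The sketch assumes one TCP flow per Iperf stream or analysis process (8, 8, and 16 respectively), which is consistent with the quoted values.

```python
# Aggregate 5-sec average rates (bps) and flow counts from the three slides.
measurements = {
    "Iperf x 8": (7_925_607_155, 8),
    "Conf 1": (2_084_209_307, 8),
    "Conf 2": (738_920_649, 16),
}

def per_flow_mbps(total_bps, flows):
    """Average rate per TCP flow in Mbps."""
    return total_bps / flows / 1e6

for name, (bps, flows) in measurements.items():
    print(f"{name}: {per_flow_mbps(bps, flows):.0f} Mbps per TCP flow")
# Iperf x 8: 991 Mbps per TCP flow
# Conf 1: 261 Mbps per TCP flow
# Conf 2: 46 Mbps per TCP flow
```

Doubling the number of processes from Conf 1 to Conf 2 cut the aggregate rate to roughly a third and the per-flow rate to less than a fifth.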
Conf 2: network traffic of JGN2 int’l link
Heavy traffic when application started
Heavy packet loss→ssthresh decreases
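The effect of repeated loss on ssthresh can be illustrated with a simplified model of standard TCP congestion control, in which each loss event sets ssthresh to half the current window (with a floor of two segments). The model ignores window growth between losses and approximates the amount of data in flight by cwnd; the actual Linux 2.4 behavior may differ in detail.

```python
def ssthresh_after_losses(cwnd_segments, losses):
    """Track ssthresh (in segments) across consecutive loss events.

    Simplified model: each loss sets ssthresh = max(cwnd // 2, 2) and the
    window restarts from ssthresh; growth between losses is ignored.
    """
    history = []
    cwnd = cwnd_segments
    for _ in range(losses):
        ssthresh = max(cwnd // 2, 2)
        history.append(ssthresh)
        cwnd = ssthresh
    return history

print(ssthresh_after_losses(1024, 4))  # [512, 256, 128, 64]
```

A burst of losses at application start therefore leaves each flow with a small ssthresh, after which the window grows only linearly, which on a long round-trip path takes a long time to recover.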
Summary
Belle/Gfarm Grid experiment at SC2004
1. Online KEKB/Belle distributed data analysis
2. KEKB/Belle large-scale data analysis
We succeeded in distributed & parallel data analysis of KEKB/Belle data and realtime display at SC conference hall
Development Status and Future Plan
Gfarm – Grid file system
- Global virtual file system
  - A dependable network shared file system in a cluster or a grid
- High performance data computing support
  - Associates Computational Grid with Data Grid
Gfarm v1 Data Grid middleware
- Version 1.0.4-4 released on Jan 11, 2005 (http://datafarm.apgrid.org/)
- Existing programs can access Gfarm file system as if it were mounted at /gfarm
Gfarm v2 – towards *true* global virtual file system
- POSIX compliant - supports read-write mode, advisory file locking, . . .
- Performance and robustness improved, security enhanced
- Can be substituted for NFS, AFS, . . .
Application area
- Scientific application (High energy physics, Astronomic data analysis, Bioinformatics, Computational Chemistry, Computational Physics, . . .)
- Business application (Dependable data computing in eGovernment and eCommerce, . . .)
- Other applications that need dependable file sharing among several organizations
Standardization effort with GGF Grid File System WG (GFS-WG)
https://datafarm.apgrid.org/