Performance and Scalability of xrootd

Transcript of Performance and Scalability of xrootd

Page 1: Performance and Scalability of xrootd

Performance and Scalability of xrootd

Andrew Hanushevsky (SLAC), Wilko Kroeger (SLAC), Bill Weeks (SLAC),

Fabrizio Furano (INFN/Padova), Gerardo Ganis (CERN), Jean-Yves Nief (IN2P3), Peter Elmer (U Wisconsin)

Les Cottrell (SLAC), Yee Ting Li (SLAC)

Computing in High Energy Physics

13-17 February 2006

http://xrootd.slac.stanford.edu

xrootd is largely funded by the US Department of Energy

Contract DE-AC02-76SF00515 with Stanford University

Page 2: Performance and Scalability of xrootd

Outline

Architecture Overview

Performance & Scalability

Single Server Performance: speed, latency, and bandwidth; resource overhead

Scalability: server and administrative

Conclusion

Page 3: Performance and Scalability of xrootd

xrootd Plugin Architecture

Diagram of the plugin stack: Protocol Driver (XRD); Protocol, 1 of n (xrootd); File System (ofs, sfs, alice, etc.); authorization (name based); lfn2pfn prefix encoding; Storage System (oss, drm/srm, etc.); Clustering (olbd); authentication (gsi, krb5, etc.). A generic sketch of the plugin-loading idea follows.
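The stack above is realized as run-time plugins selected through the xrootd configuration file. As a purely illustrative sketch of that general idea (the interface, symbol name, and library path below are hypothetical; the real xrootd plugins are C++ components, not this C vtable), loading one layer might look like:

```c
/* Illustrative only: a generic dlopen + vtable plugin pattern, meant to
 * convey the idea of stackable, swappable layers. The names here are
 * hypothetical and are NOT the actual xrootd plugin interfaces. */
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical interface a storage plugin would export. */
typedef struct {
    int  (*open)(const char *path);
    long (*read)(int fd, void *buf, long len, long off);
    int  (*close)(int fd);
} StoragePlugin;

int main(void)
{
    /* Each layer is a shared library chosen by the configuration file. */
    void *h = dlopen("./libmystorage.so", RTLD_NOW);
    if (!h) { fprintf(stderr, "dlopen: %s\n", dlerror()); return 1; }

    /* The library exposes a single factory symbol returning its vtable. */
    StoragePlugin *(*get)(void) =
        (StoragePlugin *(*)(void))dlsym(h, "GetStoragePlugin");
    if (!get) { fprintf(stderr, "dlsym: %s\n", dlerror()); return 1; }

    StoragePlugin *sp = get();
    int fd = sp->open("/store/file.root");       /* hypothetical path   */
    char buf[4096];
    long n = sp->read(fd, buf, sizeof buf, 0);   /* read first 4K block */
    printf("read %ld bytes through the plugin\n", n);
    sp->close(fd);
    dlclose(h);
    return 0;
}
```

The point of the design is that any layer, protocol, file system, storage system, clustering, or authentication, can be replaced without touching the others.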

Page 4: Performance and Scalability of xrootd

Performance Aspects

Speed for large transfers (MB/sec): random vs sequential, synchronous vs asynchronous, memory mapped (copy vs "no-copy"; see the sketch at the end of this slide)

Latency for small transfers: round-trip time per request (sec)

Bandwidth for scalability: "your favorite unit"/sec vs increasing load
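To make the copy vs "no-copy" distinction concrete, here is a minimal C sketch of the two access paths (the file path and block size are placeholders; this illustrates the general OS mechanism, not xrootd's own I/O code): read()/pread() copies each block from the kernel page cache into a user buffer, while mmap() lets the process address the page cache directly.

```c
/* Illustrative sketch: copy vs "no-copy" reads of the same file.
 * The path and block size are placeholders, not from the slides. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/tmp/data.bin", O_RDONLY);     /* placeholder file */
    if (fd < 0) { perror("open"); return 1; }
    struct stat st;
    fstat(fd, &st);

    /* Copy path: pread() moves bytes from the kernel page cache
     * into this user-space buffer.                               */
    char buf[8192];
    ssize_t n = pread(fd, buf, sizeof buf, 0);
    if (n < 0) { perror("pread"); return 1; }

    /* "No-copy" path: map the file and touch the pages in place;
     * the page cache is addressed directly, no memcpy to user space. */
    void *map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) { perror("mmap"); return 1; }
    unsigned long sum = 0;
    for (off_t i = 0; i < st.st_size; i += 4096)
        sum += ((const unsigned char *)map)[i];   /* fault pages in */

    printf("pread copied %zd bytes; mmap walk sum = %lu\n", n, sum);
    munmap(map, (size_t)st.st_size);
    close(fd);
    return 0;
}
```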

Page 5: Performance and Scalability of xrootd

Raw Speed I (sequential)

Chart of sequential transfer speed, annotated with the disk limit. Test machine: Sun V20z, 2x 1.86GHz Opteron 244, 16GB RAM, Seagate ST373307LC 73GB 10K rpm SCSI.

sendfile() anyone?
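"sendfile() anyone?" refers to the kernel's zero-copy path for pushing file data straight onto a socket, skipping the user-space buffer entirely. A minimal Linux sketch of the call (the file, address, and port are placeholders; this shows the system-call mechanism, not how xrootd performs its network I/O):

```c
/* Illustrative sketch of zero-copy file-to-socket transfer on Linux.
 * The file path, address, and port are placeholders.              */
#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/tmp/data.bin", O_RDONLY);            /* placeholder file */
    if (fd < 0) { perror("open"); return 1; }
    struct stat st;
    fstat(fd, &st);

    /* Assume a connected TCP peer; here a trivial local connect. */
    int sock = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in peer = { 0 };
    peer.sin_family = AF_INET;
    peer.sin_port   = htons(1094);                       /* placeholder port */
    inet_pton(AF_INET, "127.0.0.1", &peer.sin_addr);
    if (connect(sock, (struct sockaddr *)&peer, sizeof peer) < 0) {
        perror("connect"); return 1;
    }

    /* sendfile() moves the data disk -> socket inside the kernel:
     * no copy into user space and no user-space buffer at all.   */
    off_t off = 0;
    while (off < st.st_size) {
        ssize_t n = sendfile(sock, fd, &off, (size_t)(st.st_size - off));
        if (n <= 0) { perror("sendfile"); break; }
    }
    close(sock);
    close(fd);
    return 0;
}
```

sendfile() suits large sequential transfers, which is presumably why the question appears on the sequential-speed slide.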

Page 6: Performance and Scalability of xrootd

Raw Speed II (random I/O)

(file not preloaded)

Page 7: Performance and Scalability of xrootd

Latency Per Request

Page 8: Performance and Scalability of xrootd

Event Rate Bandwidth

NetApp FAS270: 1250 dual 650 MHz cpu, 1Gb NIC, 1GB cache, RAID 5 FC 140 GB 10k rpm

Apple Xserve: UltraSparc 3 dual 900MHz cpu, 1Gb NIC, RAID 5 FC 180 GB 7.2k rpm

Sun 280r, Solaris 8, Seagate ST118167FC

Cost factor: 1.45

Page 9: Performance and Scalability of xrootd

Latency & Bandwidth

Latency & bandwidth are closely related: inversely proportional if linear scaling is present (expressed as a formula below)

The smaller the overhead, the greater the bandwidth

Underlying infrastructure is critical: OS and devices
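One way to make the inverse relation explicit (my formulation, not a formula from the slides): with linear scaling the per-request latency L stays constant as load grows, so for block size B a single request stream delivers

\[
\mathrm{rate} \approx \frac{1}{L}, \qquad \mathrm{bandwidth} \approx \frac{B}{L},
\]

i.e. halving the per-request overhead doubles the achievable small-block event rate.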

Page 10: Performance and Scalability of xrootd

Server Scaling (Capacity vs Load)

Page 11: Performance and Scalability of xrootd

I/O Bandwidth (wide area network): SC2005 BW Challenge (latency & bandwidth)

Chart of WAN throughput, SLAC to Seattle and Seattle to SLAC, over the ESnet routed path and the ESnet SDN layer 2 path via USN.

• 8 xrootd servers: 4 @ SLAC & 4 @ Seattle; Sun V20z w/ 10Gb NIC; dual 1.8/2.6GHz Opterons; Linux 2.6.12
• 1,024 parallel clients: 128 per server
• 35Gb/sec peak (higher speeds killed the router); 2 full-duplex 10Gb/s links; provided 26.7% of overall BW
• BW averaged 106Gb/sec across 17 monitored links total

http://www-iepm.slac.stanford.edu/monitoring/bulk/sc2005/hiperf.html

Page 12: Performance and Scalability of xrootd

xrootd Server Scaling

Linear scaling relative to load allows deterministic sizing of the server: disk, NIC, CPU, memory (see the worked example below)

Performance is tied directly to hardware cost; the underlying hardware & software are critical
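As a hedged illustration of what "deterministic sizing" means once scaling is linear (the figures are invented for the example, not measurements from this talk): each resource imposes an independent ceiling and capacity is simply their minimum,

\[
N_{\mathrm{clients}} \approx \min\!\left(\frac{B_{\mathrm{disk}}}{b},\ \frac{B_{\mathrm{NIC}}}{b},\ \frac{1}{u\,r}\right)
\]

where b is the per-client data rate, B_disk and B_NIC the sustainable disk and network bandwidths, u the CPU time per request, and r the per-client request rate. With hypothetical values b = 5 MB/s, B_disk = 60 MB/s, and B_NIC ≈ 110 MB/s (1Gb NIC), the disk caps the server at about 12 concurrent clients and is the component to upgrade first.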

Page 13: Performance and Scalability of xrootd

Overhead Distribution

Page 14: Performance and Scalability of xrootd

OS Effects

Page 15: Performance and Scalability of xrootd

Device & File System Effects

Chart annotations: CPU-limited vs I/O-limited regions; 1 event = 2K.

UFS good on small reads; VXFS good on big reads

Page 16: Performance and Scalability of xrootd

NIC Effects

Page 17: Performance and Scalability of xrootd

Super Scaling

xrootd servers can be clustered: support for over 256,000 servers per cluster, with an open overhead of 100us * log64(number of servers) (see the worked example below)

Uniform deployment: the same software and configuration file everywhere; no inherent 3rd-party software requirements

Linear administrative scaling; effective load distribution
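Reading the open overhead as a formula (the 100 us per hop and the 64-way fan-out are from the slide; the worked values are simple arithmetic):

\[
t_{\mathrm{open}} \approx 100\,\mu\mathrm{s} \times \lceil \log_{64} N \rceil
\]

so a 64-server cluster adds about 100 us per file open, a 4,096-server cluster about 200 us, and a full 262,144-server (64^3) cluster, comfortably above the 256,000 quoted above, about 300 us.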

Page 18: Performance and Scalability of xrootd

Cluster Data Scattering (usage)

Page 19: Performance and Scalability of xrootd

Cluster Data Scattering (utilization)

Page 20: Performance and Scalability of xrootd

Low Latency Opportunities

New programming paradigm: ultra-fast access to small random blocks

Accommodate object data: memory I/O instead of CPU to optimize access (see the comparison below)

Allows superior ad hoc object selection; structured clustering to scale access to memory

Multi-terabyte memory systems at commodity prices: the PetaCache Project and SCALLA (Structured Cluster Architecture for Low Latency Access)

Increased data exploration opportunities
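For a sense of scale (typical order-of-magnitude figures of the period, mine rather than the talk's): a 10K rpm disk costs several milliseconds of seek plus rotation per small random read, while RAM delivers the same block in well under a microsecond,

\[
\frac{t_{\mathrm{disk}}}{t_{\mathrm{memory}}} \sim \frac{5\,\mathrm{ms}}{100\,\mathrm{ns}} = 5 \times 10^{4},
\]

which is the gap that makes the ad hoc object selection described above practical.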

Page 21: Performance and Scalability of xrootd

Memory Access Characteristics

Two charts comparing Disk I/O vs Mem I/O: the block size effect on average overall latency per I/O (1 job, 100k I/O's), and the scaling effect on average overall latency vs number of clients (5 - 40 jobs).

Page 22: Performance and Scalability of xrootd

Conclusion

The system performs far better than we anticipated. Why? Excruciating attention to detail: protocols, algorithms, and implementation.

Effective software collaboration:

INFN/Padova: Fabrizio Furano, Alvise Dorigo; ROOT: Fons Rademakers, Gerri Ganis; ALICE: Derek Feichtinger, Guenter Kickinger; Cornell: Gregory Sharp; SLAC: Jacek Becla, Tofigh Azemoon, Wilko Kroeger, Bill Weeks; BaBar: Pete Elmer

Critical operational collaboration: BNL, CNAF, FZK, INFN, IN2P3, RAL, SLAC

Commitment to "the science needs drive the technology"