UTA Site Report


Page 1: UTA Site Report

UTA Site Report

Jae Yu
Univ. of Texas, Arlington

5th DOSAR Workshop
Louisiana Tech University

Sept. 27 – 28, 2007

Page 2: UTA Site Report


• UTA is a partner of ATLAS SWT2
– Actively participating in ATLAS production

• Kaushik De is co-leading Panda development
• Phase I implementation at UTACC completed and running
• Phase II hardware installation completed
– Software installation in progress
– MonALISA-based OSG Panda monitoring implemented, allowing OSG sites to show up on the LHC Dashboard
– Working on DDM monitoring

• HEP group working with other disciplines on shared use of existing computing resources
– Interacting with the campus HPC community

• Working with HiPCAT, Texas grid community

Introduction

Page 3: UTA Site Report


• UTA HEP-CSE + UTSW Medical joint project through NSF MRI

• Primary equipment for D0 reconstruction and MC production up to 2005

• Now primarily participating in ATLAS MC production and reprocessing as part of SWT2 resources

• Other disciplines also use this facility, but at a minimal level
– Biology, Geology, UTSW Medical, etc.

• Hardware capacity
– PC-based Linux system assisted by some 70TB of IDE disk storage
– 3 IBM PS157-series shared-memory systems

UTA DPCC – The 2003 Solution

Page 4: UTA Site Report


UTA – DPCC
• 100 P4 Xeon 2.6GHz CPUs = 260 GHz
• 64TB of IDE RAID + 4TB internal
• NFS file system

• 84 P4 Xeon 2.4GHz CPUs = 202 GHz
• 5TB of FBC + 3.2TB IDE internal
• GFS file system

• Total CPU: 462 GHz
• Total disk: 76.2TB
• Total memory: 168GByte
• Network bandwidth: 68Gb/sec

• HEP – CSE joint project
• DØ + ATLAS
• CSE research
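The quoted totals can be sanity-checked from the per-cluster figures above (a minimal sketch; the quoted 462 GHz is the rounded sum of 260 + 201.6):

```python
# Recompute the DPCC aggregate numbers from the per-cluster slide figures.
clusters = [
    {"name": "P4 Xeon 2.6GHz", "cpus": 100, "ghz": 2.6, "disk_tb": 64 + 4},    # IDE RAID + internal
    {"name": "P4 Xeon 2.4GHz", "cpus": 84,  "ghz": 2.4, "disk_tb": 5 + 3.2},   # FBC + IDE internal
]
total_ghz = sum(c["cpus"] * c["ghz"] for c in clusters)   # 260 + 201.6 = 461.6 GHz
total_disk = sum(c["disk_tb"] for c in clusters)          # 68 + 8.2 = 76.2 TB
print(round(total_ghz, 1), round(total_disk, 1))
```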

Page 5: UTA Site Report


• SWT2: joint effort between UTA, OU, LU and UNM

• 2000 ft² in the new building
• Designed for 3000 1U nodes
• Could go up to 24k cores
• 1MW total power capacity
• Cooling with 5 Liebert units

Page 6: UTA Site Report


Installed SWT2 Phase I Equipment
• 160-node cluster (Dell SC1425)
– 320 cores (3.2GHz Xeon EM64T)
– 2GB RAM/core
– 160GB SATA local disk drive

• 8 head nodes (Dell 2850)
– Dual 3.2 GHz Xeon EM64T
– 8GB RAM
– 2x73GB (RAID1) SCSI storage

• 16TB storage system
– DataDirect Networks S2A3000 system
– 80x250GB SATA drives
– 6 I/O servers
– IBRIX Fusion file system
– Dedicated internal storage network (Gigabit Ethernet)

• Has been operating and conducting Panda production for over a year

Page 7: UTA Site Report


SWT2 Phase II Equipment
• 50-node cluster (Dell SC1435)
– 200 cores (2.4GHz dual Opteron 2216)
– 8GB RAM (2GB/core)
– 80GB SATA disk

• 2 head nodes
– Dual Opteron 2216
– 8GB RAM
– 2x73GB (RAID1) SAS drives

• 75TB (raw) storage system
– 10xMD1000 enclosures
– 150x500GB SATA disk drives
– 8 I/O nodes
– dCache will be used for aggregating storage

• 10Gb internal network capacity
• Hardware installation completed and software installation in progress

Page 8: UTA Site Report


ATLAS SWT2 (2007)

SWT2-PH1@UTACC

• 320 Xeon 3.2 GHz cores = 1024 GHz
• 2GB RAM/core = 640GB
• 160GB internal/unit = 25.6TB
• 8 dual-core server nodes
• 16TB of storage assisted by 6 I/O servers
• Dedicated Gbit internal connections

SWT2-PH2@UTACPB

• 200 Opteron 2.4 GHz cores = 480 GHz
• 2GB RAM/core = 400GB
• 80GB SATA/unit = 4TB
• 2 dual-core server nodes
• 75TB of storage in 10 Dell MD1000 RAID enclosures assisted by 8 I/O servers
• Dedicated Gbit internal connections
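Combining the Phase I and Phase II figures gives the aggregate SWT2 capacity at UTA (a back-of-the-envelope sketch using only the core, RAM, and storage-array numbers above):

```python
# Aggregate SWT2 capacity at UTA from the Phase I and Phase II slide figures.
phase1 = {"cores": 320, "ram_gb": 640, "storage_tb": 16}   # UTACC
phase2 = {"cores": 200, "ram_gb": 400, "storage_tb": 75}   # UTA CPB
combined = {k: phase1[k] + phase2[k] for k in phase1}
print(combined)  # {'cores': 520, 'ram_gb': 1040, 'storage_tb': 91}
```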

Page 9: UTA Site Report


• Had DS3 (44.7 Mbits/sec) till late 2004
– Choked the heck out of the network for about a month downloading D0 data for re-reconstruction
– Met with VP of Research at UTA and emphasized the importance of the network backbone for attracting external funds
• Increased to OC3 (155 Mbits/s) in early 2005
• OC12 as of early 2006
• Connected to NLR (10Gb/s) through LEARN (http://www.tx-learn.org/) via 1Gb connection to NTGP
– $9.8M ($7.3M for optical fiber network) in state of Texas funds approved in Sept. 2004

Network Capacity History at UTA
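A rough calculation from the rated link speeds above (ignoring protocol overhead) shows why the re-reconstruction download saturated the DS3 for a month, and what each upgrade bought:

```python
# Approximate data volume movable in 30 days at each rated link speed.
links = {"DS3": 44.7, "OC3": 155.0, "OC12": 622.0}  # Mbit/s
seconds = 30 * 24 * 3600
for name, mbps in links.items():
    tb = mbps * 1e6 * seconds / 8 / 1e12  # bits -> bytes -> TB
    print(f"{name}: ~{tb:.0f} TB/month")
```

So a fully saturated DS3 moves only ~14 TB in a month, versus roughly 200 TB for OC12.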

Page 10: UTA Site Report


LEARN Status

Page 11: UTA Site Report


NLR – National LambdaRail

10Gb/sec connections

LONI

ONENET

LEARN

Page 12: UTA Site Report


Software Development Activities
• MonALISA-based ATLAS distributed analysis monitoring
– A good, scalable system
– Software development and implementation completed

• ATLAS-OSG sites are on the LHC Dashboard
– New server purchased for OSG at UTA; activation to follow shortly

• Working on DDM monitoring project

Page 13: UTA Site Report


Centralized LHC Distributed Computing Monitor

Page 14: UTA Site Report


CSE Student Exchange Program
• Joint effort between HEP and CSE
– David Levine is the primary contact at CSE

• A total of 10 CSE MS students have worked on the SAM-Grid team
– Five generations of students
– Many of them playing leading roles in the grid community
• Abishek Rana at UCSD
• Parag Mashilka at FNAL
• Sudhamsh Reddy working for UTA at BNL

• New program with BNL implemented
– First student has completed the tenure and is in on-the-job training
– Second set of two Ph.D. students at BNL
– Participating in ATLAS Panda project
– One student working on a pilot factory using Condor glide-ins

• Working on developing middleware
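The pilot-factory idea mentioned above can be sketched as a simple replenishment loop: keep a target number of idle pilot jobs queued at a site. This is a hedged illustration only; both helper functions are hypothetical stubs standing in for real condor_q/condor_submit calls, not the actual Panda or Condor interfaces.

```python
# Hypothetical stub: a real factory would query the batch system (e.g. condor_q).
def idle_pilots(queue):
    return queue["idle"]

# Hypothetical stub: a real factory would submit a glide-in via condor_submit.
def submit_pilot(queue):
    queue["idle"] += 1

def replenish(queue, target=5):
    """Top the site's queue back up to `target` idle pilots."""
    while idle_pilots(queue) < target:
        submit_pilot(queue)
    return queue["idle"]

q = {"idle": 2}
print(replenish(q))  # 5
```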

Page 15: UTA Site Report


Conclusions
• MonALISA-based Panda monitoring activated
– New server needs to be brought up

• Working on DDM monitoring
– Will be involved in further DDM work

• Leading EG2 (photon) CSC note exercise
• Connected to 10Gb/s NLR via 1Gb connection to UTD
• Working closely with HiPCAT on state-wide grid activities
• Need to get back onto the LAW project, but awaiting successes at OU and ISU