SDSC TG RP Report September 07

Page 1: SDSC TG RP Report September 07

SAN DIEGO SUPERCOMPUTER CENTER

SDSC TG RP Report, September 07

Mark Sheddon, SDSC TG RP PI

[email protected]

Page 2: SDSC TG RP Report September 07

SDSC RP Quarterly Meeting Report

• HPCOPS Plans FY08
• RP Plans CY08
• New and Cool
• Draft SDSC YR3 TG RP Milestones
• Quarterly RP Accomplishments/Milestone Status
• Wrap-up

Page 3: SDSC TG RP Report September 07

HPCOPS FY08

• Operate DataStar and BlueGene
• Provide archival storage for HPC users
• Support advanced consulting and porting of applications codes
• Support GPFS-WAN
• Continue Technical Collaborations with NCSA
  • Middleware (Metascheduling)
  • Visualization Pipeline
  • Training
• Conduct EOT activities
  • Summer intern program
  • HPC Summer Institute
  • TeacherTECH
  • Develop online training
  • …

Page 4: SDSC TG RP Report September 07

New and Cool: Meta-scheduling

• Co-scheduling
• User set-able reservations
• Automatic resource selection
• On demand
• Ensemble scheduling
• Workflow

Themes from TG futures user workshops

• Managing scheduling flexibility and allocations, such as throughput on demand or opportunistic scheduling
• Take meta-scheduling seriously, not as a future dream: allocate funding for development (11)
• Standardization that would make TeraGrid a real grid that could support the effective use of allocations and meta-scheduling

Page 5: SDSC TG RP Report September 07

Co-scheduling / User Set-able Reservations

• GUR/HARC and local schedulers
  • GUR and HARC make reservations on local schedulers (Catalina, Moab, etc.)
  • No communication between GUR and HARC is required.
• Current Installations
  • GUR (from SDSC)
    • NCSA (IA-64)
    • SDSC (IA-64, DataStar)
  • HARC (from LSU)
    • NCSA (All machines, except Abe)
    • SDSC (IA-64)

[Diagram: reservation software (GUR from SDSC, HARC from LSU) layered above the local schedulers (Catalina from SDSC, Moab).]
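The slide does not describe how GUR or HARC choose a reservation time. As a minimal sketch of the general co-scheduling idea only, the Python below finds the earliest start time at which every site has a free window long enough for the job. The `Window` type, the `earliest_common_start` function, the site labels, and the hour-based time units are all hypothetical and are not taken from either tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """A free slot on one site's local scheduler (hypothetical model)."""
    start: int  # hours from now, inclusive
    end: int    # hours from now, exclusive


def earliest_common_start(free_by_site, duration):
    """Return the earliest start at which every site is free for `duration`
    hours, or None if no such start exists.

    Only window start times need to be tried: any feasible start can be
    moved back to the latest window start among the sites involved.
    """
    candidates = sorted({w.start for ws in free_by_site.values() for w in ws})
    for start in candidates:
        fits_everywhere = all(
            any(w.start <= start and start + duration <= w.end for w in ws)
            for ws in free_by_site.values()
        )
        if fits_everywhere:
            return start
    return None


# Illustrative free windows; the values are invented for the example.
free = {
    "SDSC IA-64": [Window(0, 4), Window(10, 20)],
    "NCSA IA-64": [Window(2, 6), Window(12, 18)],
}
print(earliest_common_start(free, 3))
```

In a real deployment each site's free windows would come from its local scheduler (Catalina or Moab), and the reservation would then be placed at every site for the chosen start time.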

Page 6: SDSC TG RP Report September 07

New and Cool: Co-scheduling

UK e-science All-Hands Demo (Sep07)
SC’07 Demo (Nov07)

• SDSC/NCSA/UK
  • Working w/Peter Coveney
  • GUR, HARC
  • Catalina, Moab
• Clinical Application
  • HemeLB used in the GENIUS project to characterize blood flow in arteries and vessels of the brain (cerebral haemodynamics).
• 4 sites = 2+2
  • SDSC, NCSA, 2 UK - Manchester, Oxford (?)

Page 7: SDSC TG RP Report September 07

New and Cool: SC’07 Demo: GPFS/HPSS

• SDSC and IBM
  • GPFS <=> HPSS integration
• Largest archive system (# of files)
  • 1,000,000,000 file archive
• Auto migration
  • GPFS' ILM policy scan performance -- how long it takes to find (pre-migration and migration) candidates
  • HPSS' archive performance using GHI and file aggregation -- how fast HPSS can update GPFS' extended attributes and store data into HPSS
  • HPSS' backup performance -- how long it takes HPSS to capture a file system snapshot
  • HPSS' restore performance -- how long it takes for HPSS to rebuild the GPFS namespace after a disaster

[Diagram: GPFS and HPSS]
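The slide mentions a policy scan that finds pre-migration and migration candidates but gives no selection rules. The sketch below illustrates the general hierarchical-storage idea under stated assumptions: a file with no archive copy that has been idle for a while is a pre-migration candidate (copy to tape, keep on disk), and a file that already has an archive copy and has been idle longer is a migration candidate (free its disk blocks). The `FileRecord` fields, the `scan` function, the thresholds, and the paths are hypothetical. This is plain Python, not GPFS policy-rule syntax or the GHI interface.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    """Metadata a policy scan might examine (hypothetical fields)."""
    path: str
    size_bytes: int
    idle_days: int      # days since last access
    has_archive_copy: bool


def scan(records, premigrate_after=7, migrate_after=30):
    """Split records into (pre-migration, migration) candidate path lists."""
    premigrate = [
        r.path for r in records
        if not r.has_archive_copy and r.idle_days >= premigrate_after
    ]
    migrate = [
        r.path for r in records
        if r.has_archive_copy and r.idle_days >= migrate_after
    ]
    return premigrate, migrate


# Invented sample data for illustration.
records = [
    FileRecord("/gpfs/a", 10, 40, True),
    FileRecord("/gpfs/b", 10, 10, False),
    FileRecord("/gpfs/c", 10, 2, False),
    FileRecord("/gpfs/d", 10, 8, True),
]
premigrate, migrate = scan(records)
print(premigrate, migrate)
```

At the billion-file scale the demo targets, the interesting measurement is how long such a scan takes, which this toy example does not attempt to model.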

Page 8: SDSC TG RP Report September 07

New and Cool: SC’07 Demo: pNFS

SCinet Bandwidth Challenge

• TG partners and IBM
• TG Global Filesystem
• No GPFS-WAN license required

[Diagram: GPFS-WAN server at SDSC exported via pNFS over the TeraGrid network to multiple pNFS clients.]

Page 9: SDSC TG RP Report September 07

Draft SDSC Major TG RP YR3 Milestones (by category)

Computational Resources
• Deploy CTSSv4/5 on SDSC production resources as kits are released.
• Evaluate and deploy software in conjunction with the Metascheduling WG, including deploying co-scheduling and user-settable reservations, automatic resource management, on demand (preemption, on selected machines), and on demand (highest priority).
• Evaluate and deploy GPFS-HPSS in production.
• Develop GPFS-WAN Best Practices Guide (with NCSA).
• Evaluate and test-export GPFS-WAN via beta pNFS software.

Inca
• Enhance and use historical data/graphs produced from previous year and fine tune the information that is displayed to better identify trends and problem areas.
• Investigate the ability to tune test frequency based on previous pass/fail history.
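The last Inca milestone, tuning test frequency from pass/fail history, is only stated as something to investigate. One possible approach, sketched here as an assumption and not as an Inca feature or API, is to run a test more often right after a failure and less often after a run of passes. The function name, the minute-based intervals, and the bounds are hypothetical.

```python
def next_interval(history, current, min_interval=15, max_interval=1440,
                  stable_runs=5):
    """Suggest the next test interval in minutes.

    history: list of bools, oldest first (True = pass, False = fail).
    current: the interval currently in use, in minutes.
    """
    # A failure on the most recent run: test twice as often, within bounds.
    if history and not history[-1]:
        return max(min_interval, current // 2)
    # A long enough streak of passes: test half as often, within bounds.
    if len(history) >= stable_runs and all(history[-stable_runs:]):
        return min(max_interval, current * 2)
    # Otherwise leave the schedule unchanged.
    return current


print(next_interval([True, True, True, True, True], 60))
print(next_interval([True, False], 60))
```

The doubling and halving factors are arbitrary; any policy of this kind trades test load on the resource against how quickly a new failure is noticed.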

Page 10: SDSC TG RP Report September 07

Draft SDSC Major TG RP YR3 Milestones (by category)

Storage
• Operate 2+ PB Fibre Channel and SATA-based storage infrastructure.
• Operate 25 Petabyte HPSS and SAM-QFS tape archives.
• Upgrade HPSS servers. (HPSS will be upgraded with two new IBM p575 and ten new IBM x3655 systems. The x3655 systems will serve as the HPSS disk and tape movers, with a 10GbE connection each, providing roughly 80 Gb/s of bandwidth in and out of HPSS. The two IBM p575 nodes will work as the HPSS Core and HPSS DB2 servers.)

Networking
• Work daily with systems group for moves, adds, deletions of servers and compute nodes within the HPC environment.
• Track daily network performance of major interfaces and have online graphs available for performance comparison or troubleshooting.

Security
• Collaborate with NCSA/TG partners on log analysis, evaluating tighter integration of Kerberos authentication and use of KX.509 for GSI.
• Improve intrusion detection systems for TG, for example, through increased use of Bro.

Accounting
• Develop charging and uploading of TG storage usage, in concert with TeraGrid policies and procedures.
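The HPSS upgrade milestone quotes ten movers with one 10GbE link each and roughly 80 gigabits per second of bandwidth. The slide does not say how the 80 figure was derived. One reading, offered here only as an assumption, is that it is the nominal aggregate link rate discounted for protocol and system overhead. The arithmetic below makes the implied ratio explicit; the variable names are illustrative.

```python
movers = 10             # x3655 disk and tape movers (from the slide)
link_gbps = 10          # one 10GbE connection per mover (from the slide)
quoted_gbps = 80        # "roughly 80" figure quoted on the slide

nominal_gbps = movers * link_gbps                 # aggregate line rate
implied_efficiency = quoted_gbps / nominal_gbps   # assumed interpretation

print(nominal_gbps, implied_efficiency)
```

Under this reading the quoted figure corresponds to about 80% of the 100 Gb/s nominal aggregate.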

Page 11: SDSC TG RP Report September 07

Draft SDSC Major TG RP YR3 Milestones (by category)

Accounting (cont.)
• Merge TeraGrid Roaming/Non-roaming projects to simplify account/user reporting (local SDSC interface with TG accounting).
• Improve support for project activation/inactivation.

Applications
• Select and support two new ASTA applications.

User Services
• Act as primary consulting contact for new and existing users as defined by the TG Services WG.
• Continue and complete the preparation work for participation in TeraGrid's Shibboleth testbed.

DB & Data Collection Services
• Provide collections space on GPFS-WAN and archival systems for collections approved by the allocation committee.

Dissemination & Communication
• Plan and execute successful SC’08 booth/demos.
• Contribute to planning and participate in TG’08.

Page 12: SDSC TG RP Report September 07

SDSC Q-Update: Resources
GPFS-WAN “2.0” (milestone complete)

• GPFS-WAN 2.0 “Test Day” March 21, 07
  • Sites participating: ANL, PSC, NCAR, NCSA, TACC, SDSC
• GPFS-WAN 2.0 Production April 2, 07
  • Production sites: ANL, NCSA, NCAR (Frost front-end), SDSC
  • Successfully migrated 150 TB from old to new system
  • Successfully tested GPFS-WAN w/network outage and fail-over to Abilene on June 12
• GPFS-WAN 2.0 serving from SDSC
  • New IBM p575 (16, 8-way) servers in fully redundant, high-availability configuration
  • Purged “scratch” (63TB)
  • “Projects” space (404TB)
  • “Collections” space (105TB)

Page 13: SDSC TG RP Report September 07

SDSC Q-Update: Resources

• Systems
  • Implement CTSSv4 kits as they become available on DataStar, BlueGene, and the IA-64 clusters (milestone complete to date)
    • CTSSv4 kits implemented on DataStar, BlueGene, and IA-64 clusters
• Archive
  • Expand disk cache for HPSS to provide increased performance (milestone complete)
    • HPSS disk cache expanded from 33TB to 200TB
  • Put SDSC remote silo at PSC into production (milestone complete)
    • Developing usage policies
  • Provide high-speed GridFTP access to HPSS (milestone in progress)
    • Dependent on HPSS 6.2 upgrade, production target by year end

Page 14: SDSC TG RP Report September 07

SDSC Q-Update: Data Collections

• Add collections approved by the allocation committee (ongoing milestone complete to date)
  • Multiphase Flow Simulations
    • PI: Josette Bellan, Caltech/JPL
  • The Next Generation Biology Workbench: A Resource for Biological Data Analysis
    • PI: Mark Miller, SDSC
  • Projects in Astrophysical and Cosmological Structure Formation
    • PI: Michael Norman, UCSD
  • Employment Responses to Global Markets
    • PI: Marc Muendler, UCSD
  • Insight into biomolecular structure, function and interactions from simulation
    • PI: Thomas Cheatham, Utah
  • Water Infrastructure Information System
    • PI: Sunil Sinha, Penn State
  • Dynamic Astronomical Image Services for the Virtual Observatory
    • PI: Roy Williams, Caltech
• Expand online disk available to collections from 230TB to 400TB (milestone complete)

Page 15: SDSC TG RP Report September 07

SDSC Q-Updates: ASTA

• ASTA - Current
  • Southern California Earthquake Center (SCEC)
    • PI Tom Jordan (USC), et al. - SDSC Lead Yifeng Cui
    • The TeraShake-3 project is currently generating very-large-scale 1-Hz 100m simulations on both DataStar and Lonestar, eight times larger than the original TeraShake case, requiring 80 hours run time on DataStar. (SDSC staff worked with Tommy Minyard at TACC to port the TeraShake code to Lonestar.)
    • The Anelastic Wave Model (AWM) CVS was updated to version 4.6.4 and new features were added to AWM for portability.
    • Support was provided to the CyberShake project, managing the execution of more than 100K jobs on SDSC’s IA-64 cluster.
    • Support was provided to the DynaShake project, implementing new AWM MPI I/O features for the staggered-grid split node method.
• ASTA - Complete
  • Demographic Patterns in Small-Scale Population Groups
    • PI Steve Lancing, U. Arizona - SDSC Lead Amit Majumdar
    • Implemented a checkpointing feature and automatic job submission script

Page 16: SDSC TG RP Report September 07

SDSC Q-Updates: User Services

• Consulting/Training
  • Participate in TG’07 training activities by contributing to “Making Efficient Use of TeraGrid” workshop (milestone complete)
    • Krishna Muriki worked with Sergiu in planning the workshop, deciding on content, and coordinating speakers. Dave Hart from SDSC presented an overview of NSF allocations.
  • Simplify and promote data management across the TeraGrid through improvement of SRB configuration for users. (milestone in progress)
    • Krishna Muriki developed a script through which all TG users can set up their SRB accounts. Along with George Kremenek, he is working on adding this script to the CTSS SRB distribution.
• Documentation
  • Create robust database-backed User News application and integrate with Inca for system outages. (milestone in progress - delayed temporarily, redirected to CTSSv4 documentation)

Page 17: SDSC TG RP Report September 07

SDSC Q-Updates: Networking

• Support proposed 2nd 10Gb/s link from LA to Chicago (milestone revised and completed)
  • Shared 10Gb/s Framenet link between Chicago and LA now in place

Page 18: SDSC TG RP Report September 07

SDSC Q-Update: EOT
Institutes/Workshops April – August 07

13 TeacherTECH, 1 StudentTECH, 9 Other = 23 Total

April

• Introduction to Parallel Programming for Community College Students; SC-07 Education Program
• Blue Gene Applications Workshop
• TeacherTECH Science - Sensors and the Environment

May

• TeacherTECH Science - Sensing the Environment in San Diego County
• TeacherTECH Tools - Creating a Successful Science Webquest for Your Students
• TeacherTECH Tools - SMART™ Interactive User Group

June

• StudentTECH, Introduction to Maya and 3D Modeling for High School Students

July

• TeacherTECH Tools - Podcasting for the Science Educator
• IT – Engineering and Environmental Education Tools (with UCSD Jacobs School of Engineering)
• TeacherTECH Science - Computer Mapping and GIS for Educators
• NSF CI-TEAM – Current Projects and Future Directions (NSF OCI - Washington, D.C.)
• TeacherTECH Tools - Beginning Podcasting for the Science Educator, SDSC

Page 19: SDSC TG RP Report September 07

SDSC Q-Update: EOT
Institutes/Workshops April – August 07

July (continued)

• Reducing Time to Solution – SDSC HPC Summer Institute

• Introduction to Grid Computing and CI Science; Navajo Technical College; SC-07 Education Program

• TeacherTECH Math - Using Technology to Enhance the Math Classroom

• TeacherTECH Tools - Advanced SMART Board Technology for the Classroom

• TeacherTECH Tools - Advanced Podcasting for the Science Classroom

• Computational Biology Institute; NCSA; SC-07 Education Program

• TeacherTECH Tools - iMovie for the 21st Century Educator

August

• Cyberinfrastructure for Humanities, Arts, and Social Sciences, SDSC; SC-07 Education Program

• Society of American Archivists CI Summer Camp

• TeacherTECH Tools - Advanced SMART Technology for the Science Classroom

• TeacherTECH Science - Molecular Modeling for Biology and Chemistry Teachers

Page 20: SDSC TG RP Report September 07

SDSC Q-Update: Misc.

• Data Collections
  • New working group formed - Natasha Balac (SDSC) lead
• TG’07 Program Committee
  • Nancy Wilkins-Diehr, Amit Majumdar, David Hart, Mark Sheddon

Page 21: SDSC TG RP Report September 07

SDSC Q-Update: New Building (beginning of April, 07)

• 80K GSF
  • (50K ASF)

• Summer 08 completion

Page 22: SDSC TG RP Report September 07

SDSC Q-Update: New Building (beginning of June, 07)

• 80K GSF
  • (50K ASF)

• Summer 08 completion

Page 23: SDSC TG RP Report September 07

SDSC Q-Update: New Building (beginning of September, 07)

• 80K GSF
  • (50K ASF)

• Summer 08 completion

Page 24: SDSC TG RP Report September 07

Dr. Phil

TeraGrid Roaming

HPC Counseling and Heavy Lifting…