Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

Page 1: Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

Al Geist
May 10-11, 2005
Chicago, IL

Page 2: Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

Scalable Systems Software

Participating Organizations: ORNL, ANL, LBNL, PNNL, NCSA, PSC, SNL, LANL, Ames; with industry partners IBM, Cray, Intel, SGI

Problem
• Computer centers use incompatible, ad hoc sets of systems tools
• Present tools are not designed to scale to multi-teraflop systems

Goals
• Collectively (with industry) define standard interfaces between systems components for interoperability
• Create scalable, standardized management tools for efficiently running our large computing centers

Component areas: Resource Management; Accounting & User Management; System Monitoring; System Build & Configure; Job Management

To learn more, visit www.scidac.org/ScalableSystems

Page 3: Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

Scalable Systems Software Suite

[Architecture diagram: the suite's components – Meta Services (Meta Scheduler, Meta Monitor, Meta Manager, Grid Interfaces), Accounting, Event Manager, Service Directory, Scheduler, Node State Manager, Allocation Management, Process Manager, Usage Reports, System & Job Monitor, Job Queue Manager, Node Configuration & Build Manager, Checkpoint/Restart, Validation & Testing, and Hardware Infrastructure Manager – connected by standard XML interfaces with authentication and communication; packaged as SSS-OSCAR.]

Components written in any mixture of C, C++, Java, Perl, and Python can be integrated into the Scalable Systems Software Suite.

Any updates to this diagram?
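Since the glue between components is standard XML over authenticated connections, a component written in any of those languages only needs to parse and emit the agreed messages. Below is a deliberately minimal Python sketch of that idea; the port, element names, and message shapes are invented for illustration and are not the suite's actual schemas.

    # A minimal sketch of why any language can participate: a component
    # only has to speak the agreed XML messages over an authenticated
    # connection (ssslib handles that in the real suite). The port,
    # element names, and message shape here are invented placeholders.
    import socketserver
    import xml.etree.ElementTree as ET

    class ComponentHandler(socketserver.StreamRequestHandler):
        def handle(self):
            # Read one XML request (client closes its write side when done).
            request = ET.fromstring(self.rfile.read())
            response = ET.Element("Response")
            status = "Success" if request.get("action") == "Query" else "Failure"
            ET.SubElement(response, "Status").text = status
            self.wfile.write(ET.tostring(response))

    if __name__ == "__main__":
        # A real component would also register itself with the Service
        # Directory so that peers can discover it.
        with socketserver.TCPServer(("localhost", 6001), ComponentHandler) as srv:
            srv.serve_forever()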

Page 4: Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

Components in Suites

[Diagram, mirroring the architecture on the previous slide, with implementation names filled in:]
• Accounting / Allocation Management: Gold
• Event Manager: EM; Service Directory: SD
• Scheduler: Maui sched (plus a grid scheduler)
• Node State Manager: NSM
• Process Manager: PM
• System & Job Monitor: Warehouse (SuperMon, NWPerf); MetaManager
• Queue Manager: Bamboo QM
• Node Build & Configure Manager: BCM
• Checkpoint/Restart: BLCR
• Validation & Testing: APITest
• Hardware Infrastructure Manager: HIM
• Compliant with PBS and LoadLeveler job scripts
• Communication/infrastructure: ssslib

Multiple component implementations exist.

Page 5: Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

Scalable Systems Users

Production use today:
• Running an SSS suite at ANL and Ames
• Running components at PNNL
• Maui w/ SSS API (3000/mo), Moab (Amazon, Ford, TeraGrid, …)

Who can we involve before the end of the project?
• National Leadership-class facility? NLCF is a partnership between ORNL (Cray), ANL (BG), PNNL (cluster)
• NERSC and NSF centers: NCSA cluster(s), NERSC cluster?

Page 6: Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

Goals for This Meeting

• Updates on the Integrated Software Suite components
• Planning for SciDAC phase 2 – discuss new directions and the June SciDAC meeting
• Preparing for the next SSS-OSCAR software suite release: What is missing? What needs to be done?
• Getting more outside users: production use and feedback on the suite
• Discussion of involvement with NLCF machines: IBM BG/L, Cray XT3, clusters

Page 7: Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

Highlights of Last Meeting (Jan. 25-26 in DC)

Details in the main project notebook.

• Fred attended – he gave the state of MICS, SciDAC-2, and his vision for a changed focus
• Discussion of the whitepaper and the presentation for Strayer – ideas and Fred's feedback
• API discussions – voted to approve the Process Manager API (12 yes, 0 no, 0 abstain); new Warehouse protocol presented
• Agreed to quarterly suite releases this year – and set the dates

Page 8: Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

Since Last Meeting

• CS ISICs met with SciDAC director (Strayer) Feb 17 in DC
  – Whitepaper – some issues with Mezzacappa
  – Gave an hour "highlight" presentation on goals, impact, and potential CS ISIC ideas for the next round
  – Strayer was very positive. Fred reported that the meeting could not have gone any better.
• Cray Software Workshop (called by Fred)
  – January in Minneapolis
  – Status of Cray SW and how DOE research could help
  – Several SSS members there. Anything since?
• Telecoms and new entries in the electronic notebooks
  – Pretty sparse since the last meeting

Page 9: Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

Major Topics for This Meeting

• Latest news on the Software Suite components
• Preparing for the next SSS-OSCAR software suite release
• Discuss ideas for the next round of CS ISICs
• Preparation for upcoming meetings in June
• Presentation / 1st vote on Queue Manager API
• Getting more users and feedback on the suite

Page 10: Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

Agenda – May 10

8:00  Continental Breakfast
8:30  Al Geist – Project status
9:00  Discussion of ideas presented to Strayer
9:30  Scott Jackson – Resource Management components
10:30 Break
11:00 Will McLendon – Validation and Testing; Ron Oldfield – Integrated SSS test suites
12:00 Lunch (on your own at the cafeteria)
1:30  Paul Hargrove – Process Management and Monitoring
2:30  Narayan Desai – Node Build, Configure, and Cobalt on BG/L
3:30  Break
4:00  Craig Steffen – SSSRMAP in ssslib
4:30  Discussion of getting SSS users and feedback
5:30  Adjourn for dinner

Page 11: Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

Agenda – May 11

8:00  Continental Breakfast
8:30  Thomas Naughton – SSS-OSCAR software releases through SC05
9:30  Discussion and voting: Bret Bode – XML API for Queue Manager
10:30 Group discussion of ideas for SciDAC-2
11:00 Preparations for upcoming meetings:
      FastOS meeting June 8-10; SciDAC PI meeting June 26-30 (poster and panels);
      set next meeting date/location: August 17-19, ORNL
12:00 Meeting ends

Page 12: Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

Ideas Presented to SciDAC Director Mike Strayer

February 17, 2005
Washington, DC

Page 13: Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

View to the Future

HW, CS, and Science teams all contribute to the science breakthroughs.

[Diagram: Ultrascale hardware (Rainier, Blue Gene, Red Storm OS/HW teams) provides the leadership-class platforms; a common computing environment gives a common look & feel across diverse HW; SciDAC CS teams supply software & libs and tuned codes; each research team brings a high-end science problem; SciDAC science teams deliver breakthrough science.]

Page 14: Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

SciDAC Phase 2 and CS ISICs

Future CS ISICs need to be mindful of the needs of:

• National Leadership Computing Facility – w/ Cray, IBM BG, SGI, clusters, multiple OS; no one architecture is best for all applications
• SciDAC science teams – needs depend on the application areas chosen; end stations? do they have special SW needs?
• FastOS research projects – complement, don't duplicate these efforts
• Cray software roadmap – making the Leadership computers usable, efficient, fast

Page 15: Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

Gaps and potential next steps

• Heterogeneous leadership-class machines – science teams need a robust environment that presents similar programming interfaces and tools across the different machines
• Fault tolerance requirements in apps and systems software – particularly as systems scale up to petascale around 2010
• Support for application users submitting interactive jobs – computational steering as a means of scientific discovery
• High-performance file system and I/O research – increasing demands of security, scalability, and fault tolerance
• Security – one-time passwords and their impact on scientific progress

Page 16: Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

Heterogeneous Machines

• Heterogeneous architectures – vector architectures, scalar, SMP, hybrids, clusters. How is a science team to know what is best for them?
• Multiple OS – even within one machine, e.g., Blue Gene, Red Storm. How to effectively and efficiently administer such systems?
• Diverse programming environment – science teams need a robust environment that presents similar programming interfaces and tools across the different machines
• Diverse system management environment – managing and scheduling multiple node types; system updates, accounting, … everything will be harder in round 2

Page 17: Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

Fault Tolerance

• Holistic fault tolerance – research into schemes that take into account the full impact of faults: application, middleware, OS, and hardware
• Fault tolerance in systems software – research into prediction and prevention; survivability and resiliency when faults cannot be avoided
• Application recovery
  – Transparent failure recovery
  – Research into intelligent checkpointing based on active monitoring, sophisticated rule-based recovery, diskless checkpointing, …
  – For petascale systems, research into recovery w/o checkpointing

Page 18: Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

Interactive Computing

• Batch jobs are not always the best for science – good for large numbers of users and a wide mix of jobs, but the National Leadership Computing Facility has a different focus
• Computational steering as a paradigm for discovery – break the cycle: simulate, dump results, analyze, rerun simulation; more efficient use of the computer resources
• Needed for application development – scaling studies on terascale systems; debugging applications which only fail at scale

Page 19: Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

File System and I/O Research

• Lustre is today's answer – there are already concerns about its capabilities as systems scale up to 100+ TF
• What is the answer for 2010? – research is needed to explore the file system and I/O requirements for the petascale systems that will be here in 5 years
• I/O continues to be a bottleneck in large systems
  – Hitting the memory access wall on a node
  – Too expensive to scale I/O bandwidth with teraflops across nodes
  – Research needed to understand how to structure applications or modify I/O to allow applications to run efficiently

Page 20: Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

Security

• New, stricter access policies at computer centers – attacks on supercomputer centers have gotten worse
• One-time passwords, PIV? – sites are shifting policies, tightening firewalls, going to SecurID tokens
• Impact on scientific progress – collaborations within international teams; foreign nationals' clearance delays; access to data and computational resources
• Advances required in system software – to allow compliance with different site policies and handle the tightest requirements; study how to reduce the impact on scientists

Page 21: Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

Meeting notes

Al Geist – project status
Al Geist – ideas for CS ISICs in the next round of SciDAC

Scott Jackson – Resource Management components
• Production use at more places, e.g., the U. Utah Icebox (430 procs)
• Incorporation of SSSRMAP into ssslib in progress
• Paper accepted and new documents (see the RM notebook)
• SOAP as the basis for SSSRMAP v4 – discussion of pros and cons (scalability issues, but ssslib can support it)
• Fault tolerance in Gold using hot failover
• New Gold release (v2 b2.10.2) – includes distributed accounting, simplified allocation management, and enabled support for the MySQL database
• Bamboo QM v1.1 released
• New Fountain component, an alternative to Warehouse – work toward support for SuperMon, Ganglia, and NWPerf
• Maui – improved grid scheduler, multisite authentication, support for Globus 4
• Future work – increase deployment base; ssslib integration; portability; support for LoadLeveler-like multi-step jobs and the PBS job language; release of the Silver meta-scheduler
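For flavor, here is a rough Python sketch of composing an SSSRMAP-style query envelope like the ones Gold and the other resource-management components exchange. The nesting follows the general SSSRMAP envelope pattern, but the object and field names are illustrative placeholders; the authoritative schema (and the SOAP question above) is defined by the SSSRMAP spec itself.

    # Rough sketch of composing an SSSRMAP-flavored request with
    # ElementTree. Object and field names below are illustrative.
    import xml.etree.ElementTree as ET

    def build_query(obj, get_field, where_name, where_value):
        env = ET.Element("Envelope")
        body = ET.SubElement(env, "Body")
        req = ET.SubElement(body, "Request", action="Query")
        ET.SubElement(req, "Object").text = obj
        ET.SubElement(req, "Get", name=get_field)
        ET.SubElement(req, "Where", name=where_name).text = where_value
        return ET.tostring(env)

    # Illustrative use: ask an accounting service for an account's balance.
    print(build_query("Account", "Amount", "Name", "fusion").decode())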

Page 22: Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

Meeting notes

Will McLendon – APITest project status, current release v1.0
• Latest work – new look using cascading style sheets
• New capabilities – pass/fail batch files, better parse-error reporting
• User Guide documentation done (50 pages) and SNL-approved
• SW requirements: Python 2.3+, ElementTree, MySQL, ssslib, Twisted (version 2.0 added new dependencies)
• Helping fix bad tests – led to a good discussion of this utility
• Future work: config file, test-developer GUI, more…

Ron Oldfield – Testing SSS suites
• Two weeks ago hired a full-time contractor (Tod Cordenbach) plus a summer student
• Goals and deliverables for the summer work: performance testing of SSS-OSCAR, comparison to other components, write a tech report of the results
• What is important for each component: scheduler, job launch, queue, I/O, …
• Discussion of metrics. Scalability? User time, admin time, HW resource efficiency
• Report what works, what doesn't, and what is performance-critical
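In the same spirit as the testing discussion above, a trivial pass/fail probe might look like the following Python sketch. This is not APITest's actual test format; the message schema, host, and port are placeholders.

    # Hedged sketch of the validation idea: send one XML request to a
    # component and grade the response pass/fail. NOT APITest's format.
    import socket
    import xml.etree.ElementTree as ET

    def run_probe(host="localhost", port=6001):
        req = ET.tostring(ET.Element("Request", action="Query", object="Node"))
        with socket.create_connection((host, port), timeout=10) as conn:
            conn.sendall(req)
            conn.shutdown(socket.SHUT_WR)          # mark end of request
            resp = ET.fromstring(conn.recv(65536))
        ok = resp.findtext("Status") == "Success"
        print("PASS" if ok else "FAIL")
        return ok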

Page 23: Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

Meeting notes

Paul Hargrove – PM update
• Checkpoint (BLCR) status: users on four continents; bug fixes; works with Linux 2.6.11; partial AMD64/EM64T port
• Next step is process groups/sessions
• OpenMPI work this summer (a student of Lumsdaine's)
• Have a sketch of the less-restrictive-syntax API
• Process Manager status: complete rewrite of MPD – more OO and Pythonic; provided a non-MPD implementation for BG/L using the SSS API
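As a concrete reminder of the BLCR workflow, here is a small Python sketch driving BLCR's standard command-line tools (cr_run, cr_checkpoint, cr_restart). The application name is a placeholder, and the flags and default context-file naming should be double-checked against the BLCR documentation for a given release.

    # Sketch of checkpoint/restart with BLCR, driven from Python.
    import subprocess
    import time

    proc = subprocess.Popen(["cr_run", "./my_app"])   # run under BLCR control
    time.sleep(30)                                    # let it make progress

    # Checkpoint the process and terminate it (context.<pid> in the cwd).
    subprocess.check_call(["cr_checkpoint", "--term", str(proc.pid)])
    proc.wait()

    # Later: resume the job from its saved context file.
    subprocess.check_call(["cr_restart", "context.%d" % proc.pid])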

Narayan Desai – BCM update
• SSS infrastructure in use at ANL: clusters, BG/L, IA32, PPC64
• Better documentation
• LRS syntax: spec done, SDK complete; ssslib integration still to do
• BG/L: arrived in January; initial Cobalt (SSS) suite up in February; many features being requested, e.g., node modes set in mpirun; DB2 used for everything
• Cobalt – same as the SW on Chiba City; all Python components implemented using the SSS-SDK; several major extensions required for BG/L

Page 24: Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

Meeting notes

Narayan Desai – Cobalt update for BG/L
• Scheduler (bgsched): new implementation – needed to be topology-aware and use DB2; partition unit is 512 nodes
• Queue Manager (cqm): same SW as Chiba; the OS change on BG/L is trivial since the system is rebooted for each job
• Process Manager (bgpm): new implementation – compute nodes don't run a full OS, so no MPD; mpirun is complicated
• Allocation Manager (am): same as Chiba – very simple design
• Experiences: SSS really works
  – Easy to port; the simple approach makes the system easy to understand
  – Agility required for BG/L
  – Comprehensive interfaces expose all information: admins can access internal state, component behavior is less mysterious, extracting new info is easy
• Shipping Cobalt to a couple of other sites
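To illustrate the 512-node partition constraint that drove the new bgsched implementation, here is a toy Python sketch of partition-granular allocation; the data structures and policy are invented for illustration, not Cobalt's actual code.

    # Toy illustration: on BG/L the scheduler allocates whole partitions
    # (512-node units), not individual nodes.
    def pick_partition(partitions, nodes_requested, unit=512):
        """Return the smallest free partition that fits the request, or None."""
        units_needed = -(-nodes_requested // unit)          # ceiling division
        fits = [p for p in partitions
                if p["free"] and p["size"] >= units_needed * unit]
        # Prefer the smallest adequate partition to limit fragmentation.
        return min(fits, key=lambda p: p["size"], default=None)

    parts = [{"name": "R00-B0", "size": 512,  "free": True},
             {"name": "R00",    "size": 1024, "free": True}]
    print(pick_partition(parts, 600))   # needs 2 units -> the 1024-node block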

Page 25: Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

Meeting notes

Craig Steffen – (no slides)
• Not as much to report – sidetracked for the past three months on other projects (gave reasons)
• Warehouse bugs also not done; fixes to be done by the next OSCAR release
• Graphical display for Warehouse created – same interfaces as Maui w.r.t. requesting everything from all nodes
• SSSRMAP into ssslib – initial skeleton code for the integration begun; needs questions answered by Jackson and Narayan to proceed

Page 26: Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

Meeting notes

Thomas Naughton – SSS-OSCAR releases
• Testing for the v1.1 release
• Base OSCAR v4.1 includes SSS; APItest runs post-install tests on packages
• Discussion that Debian support will require both RPM and DEB formats
• Future work: complete v1.1 testing; migrate distribution to the FRE repository; extend SSS component tests; distribute as a basic OSCAR "package set"; ordering needed within a phase (worked around for now)
• Release schedule:

  version  freeze   release     new
  v1.0     –        Nov (SC04)  first full suite release
  v1.1     Feb 15   May         Gold update, bug fixes
  v1.2     Jun 15   July        RH9 to Fedora Core 2, OSCAR 4.1, BLCR to Linux 2.6, improved tests, close known bug reports
  v2.0b    Aug 15   Sept        less-restrictive-syntax switchover, perf tests, Silver meta-scheduler, Fedora Core 4
  v2.0     Oct 15   Nov (SC05)  bug fixes, minor updates; in OSCAR 5.0 as a package set (after SC05)

• Remove the Bugzilla link from the web page

Page 27: Working Group updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2

Meeting notes

Bret Bode – Queue Manager API
• Listed all the functions, then went through the detailed schema of each
• Bamboo uses SSSRMAP messaging and wire protocol
• Authentication – uses ssslib
• Authorization – uses info in the SSSRMAP wire protocol
• Questions and discussion of the interfaces
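To make the discussion concrete, a job submission under the SSSRMAP-style envelope pattern sketched earlier might look like the Python snippet below; the action name "SubmitJob" and the job fields are hypothetical, not the API text that was presented for the vote.

    # Illustrative only: a hypothetical queue-manager submission message.
    import xml.etree.ElementTree as ET

    env = ET.Element("Envelope")
    req = ET.SubElement(ET.SubElement(env, "Body"), "Request", action="SubmitJob")
    job = ET.SubElement(req, "Job")
    ET.SubElement(job, "Executable").text = "/home/user/a.out"
    ET.SubElement(job, "NodeCount").text = "64"
    ET.SubElement(job, "WallTimeLimit").text = "3600"
    print(ET.tostring(env).decode())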