Advancing Scientific Discovery through TeraGrid


Transcript of Advancing Scientific Discovery through TeraGrid

Page 1: Advancing Scientific Discovery  through TeraGrid

Advancing Scientific Discovery through TeraGrid

Scott Lathrop

TeraGrid Director of Education, Outreach and Training

University of Chicago and Argonne National Laboratory

[email protected]

www.teragrid.org

Page 2: Advancing Scientific Discovery  through TeraGrid

11 Resource Providers, One Facility

[Map of TeraGrid sites. Sites labeled: SDSC, TACC, UC/ANL, NCSA, ORNL, PU, IU, PSC, NCAR, LONI, NICS, Caltech, USC/ISI, UNC/RENCI, UW. Legend: Resource Provider (RP); Software Integration Partner; Grid Infrastructure Group (UChicago).]

Page 3: Advancing Scientific Discovery  through TeraGrid

TeraGrid Objectives

• DEEP Science: Enabling Petascale Science
– Make science more productive through an integrated set of very-high capability resources
• Address key challenges prioritized by users

• WIDE Impact: Empowering Communities
– Bring TeraGrid capabilities to the broad science community
• Partner with science community leaders - “Science Gateways”

• OPEN Infrastructure, OPEN Partnership
– Provide a coordinated, general purpose, reliable set of services and resources
• Partner with campuses and facilities

Page 4: Advancing Scientific Discovery  through TeraGrid

TeraGrid Resources and Services

• Computing - nearly a petaflop of computing power today and growing
– 500 Tflop Ranger system at TACC
– NICS (U Tenn) system to come on-line this year
– Centralized help desk for all resource providers
• Remote visualization servers and software
• Data
– Allocation of data storage facilities
– Over 100 Scientific Data Collections
• Central allocations process
• Technical Support
– Central point of contact for support of all systems
– Advanced Support for TeraGrid Applications (ASTA)
– Education and training events and resources
– Over 20 Science Gateways

Page 5: Advancing Scientific Discovery  through TeraGrid

Requesting Allocations of Time

• TeraGrid resources are provided for free to academic researchers and educators

• Development Allocations Committee (DAC): requests for start-up and course accounts of up to 30,000 hours of time are processed within two weeks

• Medium Resource Allocations Committee (MRAC): requests of up to 500,000 hours of time are reviewed four times a year

• Large Resource Allocations Committee (LRAC): requests of over 500,000 hours of time are reviewed twice a year
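The thresholds on this slide can be read as a simple routing rule. The sketch below is only an illustration of that reading: the function name `allocation_committee` is hypothetical, and treating each "up to" limit as inclusive is an assumption, not something the slide states.

```python
# Hypothetical helper: maps a requested number of hours to the committee
# named on the slide. Inclusive upper bounds are an assumption.
DAC_LIMIT_HOURS = 30_000     # start-up and course accounts
MRAC_LIMIT_HOURS = 500_000   # medium requests, reviewed four times a year


def allocation_committee(requested_hours: int) -> str:
    """Return the committee that would handle a request of this size."""
    if requested_hours <= 0:
        raise ValueError("requested_hours must be positive")
    if requested_hours <= DAC_LIMIT_HOURS:
        return "DAC"
    if requested_hours <= MRAC_LIMIT_HOURS:
        return "MRAC"
    return "LRAC"


if __name__ == "__main__":
    for hours in (10_000, 200_000, 2_000_000):
        print(hours, allocation_committee(hours))
```

A course instructor asking for 10,000 hours would fall under the DAC in this reading, while a 2,000,000-hour request would go to the LRAC.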

Page 6: Advancing Scientific Discovery  through TeraGrid

TeraGrid User Community

[Stacked bar chart, 0% to 100%, showing the share of each field of science in five categories: PIs (879), Active Users (3,197), Charging Users (1,141), Allocations (1.8B NUs), NUs (618M NUs). Fields shown: Molecular Biosciences; Chemistry; Physics; Astronomical Sciences; Materials Research; Chemical, Thermal Systems; Atmospheric Sciences; All 20 Others (< 2% usage each).]

Page 7: Advancing Scientific Discovery  through TeraGrid

TeraGrid Usage

[Chart of monthly usage in Normalized Units (millions), January 2004 through June 2007, split into Specific Allocations and Roaming Allocations. Annotation: 33% Annual Growth.]

TeraGrid currently delivers an average of 420,000 cpu-hours per day -> ~21,000 CPUs DC

Dave Hart ([email protected])
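The cpu-hours figure can be converted into an equivalent CPU count with one division. The sketch below shows that arithmetic. The slide does not say how the ~21,000 figure was derived, so the utilization value computed here is only one possible reading, and the function names are hypothetical.

```python
# Figures taken from the slide.
CPU_HOURS_PER_DAY = 420_000
QUOTED_CPUS = 21_000
HOURS_PER_DAY = 24


def continuously_busy_cpus(cpu_hours_per_day: float) -> float:
    """CPUs that would have to run around the clock to deliver this many cpu-hours."""
    return cpu_hours_per_day / HOURS_PER_DAY


def implied_utilization(cpu_hours_per_day: float, cpu_count: float) -> float:
    """Average busy fraction if the work were spread over cpu_count CPUs (assumed reading)."""
    return continuously_busy_cpus(cpu_hours_per_day) / cpu_count


if __name__ == "__main__":
    print(continuously_busy_cpus(CPU_HOURS_PER_DAY))
    print(round(implied_utilization(CPU_HOURS_PER_DAY, QUOTED_CPUS), 3))
```

Under this reading, 420,000 cpu-hours per day corresponds to 17,500 CPUs busy around the clock, which would match the quoted ~21,000 CPUs at an average utilization of roughly five sixths.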

Page 8: Advancing Scientific Discovery  through TeraGrid

TeraGrid Usage Modes in CY2006

Use Modality: Community Size (est. number of people/projects)

Batch Computing on Individual Resources: 850
Exploratory and Application Porting: 650
Workflow, Ensemble, and Parameter Sweep: 160
Science Gateway Access: 100
Remote Interactive Steering and Visualization: 35
Tightly-Coupled Distributed Computation: 10

(Side label on the table: Grid-y Users)

Page 9: Advancing Scientific Discovery  through TeraGrid

Advanced Support for TeraGrid Applications

Coupled Simulation: Full Body Arterial Tree Simulation

Karniadakis (Brown)

Virtualized Resources, Ensembles: FOAM Climate Model

Liu (UWisc)

Sources: Ian Foster (UC/ANL), Mike Papka (UC/ANL), George Karniadakis (Brown). Images by UC/ANL.

Page 10: Advancing Scientific Discovery  through TeraGrid

On Demand: Predicting Severe Weather

Droegemeier (OU) and LEAD

Large Data; Virtualized Resources: Earthquake Simulation

Olsen (SDSU), Okaya (USC), Southern California Earthquake Center

Sources: Kelvin Droegemeier (OU), Dennis Gannon (IU), Tom Jordan (USC). Images by PSC and SDSC.

Page 11: Advancing Scientific Discovery  through TeraGrid

TeraGrid Science Highlights 2007

Page 12: Advancing Scientific Discovery  through TeraGrid

Cosmology

Tiziana di Matteo, Carnegie Mellon U

• Gas density is shown (increasing with brightness) with temperature (increasing from blue to red color). Yellow circles indicate black holes (diameter increasing with mass). At about 6 billion years, the universe has many black holes and a pronounced filamentary structure.

• Found that black holes regulate galaxy formation. As they swallow gas, they radiate so much energy that they stop the inflow of gas.

• Worked with PSC to improve scaling and use hybrid MPI-shared memory programming for GADGET.

Page 13: Advancing Scientific Discovery  through TeraGrid

Arterial Tree Simulation and Visualization

Brown University, Northern Illinois University, and University of Chicago/Argonne National Laboratory

Blood flow visualization demonstration at SC07

Simulation runs across multiple TeraGrid sites

Computation:
NCSA: 256 processors
UC/ANL: 64 processors
SDSC: 128 processors
SDSC: 144 processors
Total: 592 processors

Data transfer from compute to visualization site (GridFTP):
UC/ANL: 4 processors

Visualization:
UC/ANL: 16 processors
SC07 Exhibit floor

Page 14: Advancing Scientific Discovery  through TeraGrid

Storm prediction

Ming Xue, U. of Oklahoma

• Better alerts for thunderstorms, especially supercells that spawn tornados, could save millions of dollars and many lives.

• Unprecedented experiment, every day from April 15 to June 8 (tornado season), to test the ability of storm-scale ensemble prediction under real forecasting conditions for US east of the Rockies.

• First time for
– ensemble forecasting at storm scale
– real-time in a simulated operational environment

• Successful predictions of the overall pattern and evolution of many of the convective-scale features, sometimes out to the second day, and good ability to capture storm-scale uncertainties

Top: prediction 21 hours ahead of time for May 24, 2007; Bottom: observed.

Page 15: Advancing Scientific Discovery  through TeraGrid

Protein Structure

David Baker, U. of Washington

• David Baker’s Rosetta code has proved the best at predicting protein 3-D structure from sequence in biennial competitions (CASP - Critical Assessment of Structure Prediction)

• Used 1.3 M hours on NCSA Condor to identify promising targets, then refined 22 promising targets on 730,000 hours of SDSC Blue Gene.

• SDSC helped improve scaling to run on 40,960 processor BlueGene at IBM, which reduced the running time for a single prediction to 3 hours, instead of weeks on a typical 1,000 processor cluster.

Protein structure prediction by the Rosetta code, showing the predicted structure (blue), the X-ray structure (red), and a low-resolution NMR structure (green).

Page 16: Advancing Scientific Discovery  through TeraGrid

Solve any Rubik’s Cube in 26 moves?

• Rubik's Cube is perhaps the most famous combinatorial puzzle of its time.

• > 43 quintillion states (4.3x10^19)

• Gene Cooperman and Dan Kunkle of Northeastern Univ. just proved any state can be solved in 26 moves.

• 7TB of distributed storage on TeraGrid allowed them to develop the proof

URL: http://www.physorg.com/news99843195.html
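The "> 43 quintillion states" figure can be reproduced with the standard counting argument for the 3x3x3 cube. That argument is not part of the slide and is unrelated to the 26-move proof itself; the function name is hypothetical and the sketch is shown only to make the size of the search space concrete.

```python
from math import factorial


def rubiks_cube_states() -> int:
    """Count reachable positions of a 3x3x3 Rubik's Cube (standard counting argument)."""
    # 8 corners: any permutation, 3 orientations each, the last orientation is forced.
    corner_arrangements = factorial(8) * 3**7
    # 12 edges: any permutation, 2 orientations each, the last orientation is forced.
    edge_arrangements = factorial(12) * 2**11
    # Corner and edge permutations must have matching parity, which halves the total.
    return corner_arrangements * edge_arrangements // 2


if __name__ == "__main__":
    states = rubiks_cube_states()
    print(f"{states:,} states (about {states:.1e})")
```

A state space of this size is far too large to enumerate naively, which gives some context for why the proof relied on 7TB of distributed storage.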

Page 17: Advancing Scientific Discovery  through TeraGrid

TeraGrid Web Resources

TeraGrid provides a rich array of web-based resources:

• TeraGrid User Portal for managing user allocations and job flow

• Knowledge Base for quick answers to technical questions

• User Information including documentation, information about hardware and software resources

• Science Highlights

• News and press releases

• Education, outreach and training events and resources

In general, seminars and workshops will be accessible via video on the Web. Extensive documentation will also be Web-based.

Page 18: Advancing Scientific Discovery  through TeraGrid

Science Gateways: Broadening Participation in TeraGrid

• Increasing investment by communities in their own cyberinfrastructure, but heterogeneous:
• Resources
• Users – from expert to K-12
• Software stacks, policies

• Science Gateways
– Provide “TeraGrid Inside” capabilities
– Leverage community investment

• Three common forms:
– Web-based Portals
– Application programs running on users' machines but accessing services in TeraGrid
– Coordinated access points enabling users to move seamlessly between TeraGrid and other grids.

Biomedical and Biology, Building Biomedical Communities

Technical Approach

• Build standard portals to meet the domain requirements of the biology communities
• Develop federated databases to be replicated and shared across TeraGrid

[Diagram: OGCE Science Portal architecture. Labels: OGCE Portlets with Container; Apache Jetspeed Internal Services; Service API; Grid Service Stubs; Local Portal Services; Remote Content Services; Remote Content Servers (HTTP); Grid Protocols; Grid Services; Java CoG Kit; Grid Resources; Open Source Tools; Workflow Composer.]

Source: Dennis Gannon ([email protected])

Page 19: Advancing Scientific Discovery  through TeraGrid

Gateways are Expanding

• 10 initial projects as part of TG proposal
• >20 Gateway projects today
• No limit on how many gateways can use TG resources
– Prepare services and documentation so developers can work independently

• Open Science Grid (OSG)
• Special PRiority and Urgent Computing Environment (SPRUCE)
• National Virtual Observatory (NVO)
• Linked Environments for Atmospheric Discovery (LEAD)
• Computational Chemistry Grid (GridChem)
• Computational Science and Engineering Online (CSE-Online)
• GEON (GEOsciences Network)
• Network for Earthquake Engineering Simulation (NEES)
• SCEC Earthworks Project
• Network for Computational Nanotechnology and nanoHUB
• GIScience Gateway (GISolve)
• Biology and Biomedicine Science Gateway
• Open Life Sciences Gateway
• The Telescience Project
• Grid Analysis Environment (GAE)
• Neutron Science Instrument Gateway
• TeraGrid Visualization Gateway, ANL
• BIRN
• Gridblast Bioinformatics Gateway
• Earth Systems Grid
• Astrophysical Data Repository (Cornell)

Page 20: Advancing Scientific Discovery  through TeraGrid

TeraGrid as a Social Network

• Annual TeraGrid conference - TeraGrid ‘08 - Las Vegas - June

• Science Gateway community very successful
– Transitioning to consulting model

• Campus Champions
– Campus Representatives assisting local users

• HPC University
– training and education resources and events

• Education and Outreach
– Engaging thousands of people

Page 21: Advancing Scientific Discovery  through TeraGrid

TeraGrid ‘08 Conference

Riviera Hotel and Casino, Las Vegas

June 9th-13th, 2008

• Science, Technology and Education Papers
• Tutorials
• BOFs
• Student Competitions
• Visualization Showcase

Call for Participation!

Page 22: Advancing Scientific Discovery  through TeraGrid

Student Competition Teams

Page 23: Advancing Scientific Discovery  through TeraGrid

Campus Champions Program

• Training program for campus representatives
• Campus advocate for TeraGrid and CI
• TeraGrid ombudsman for local users
• Quick start-up accounts for campus
• TeraGrid contacts for problem resolution
• We’re looking for interested campuses!

Page 24: Advancing Scientific Discovery  through TeraGrid

HPC Education and Training

TeraGrid partners offer training and education events and resources to educators and researchers:

• Workshops, institutes and seminars on high-performance scientific computing

• Hands-on tutorials on porting and optimizing code for the TeraGrid systems

• On-line self-paced tutorials

• High-impact educational and visual materials suitable for K–12, undergraduate and graduate classes

Page 25: Advancing Scientific Discovery  through TeraGrid


“HPC University”

• Advance researchers’ HPC skills
– Catalog of live and self-paced training
– Schedule series of training courses
– Gap analysis of materials to drive development

• Work with educators to enhance the curriculum
– Search catalog of HPC resources
– Schedule workshops for curricular development
– Leverage good work of others

• Offer Student Research Experiences
– Enroll in HPC internship opportunities
– Offer Student Competitions

• Publish Science and Education Impact
– Promote via TeraGrid Science Highlights, iSGTW
– Publish education resources to NSDL-CSERD

Page 26: Advancing Scientific Discovery  through TeraGrid

Sampling of Training Topics Offered

• HPC Computing
– Introduction to Parallel Computing
– Toward Multicore Petascale Applications
– Scaling Workshop - Scaling to Petaflops
– Effective Use of Multi-core Technology
– TeraGrid-Wide BlueGene Applications
– Introduction to Using SDSC Systems
– Introduction to the Cray XT3 at PSC
– Introduction to & Optimization for SDSC Systems
– Parallel Computing on Ranger & Lonestar

• Domain-specific Sessions
– Petascale Computing in the Biosciences
– Workshop on Infectious Disease Informatics at NCSA

• Visualization
– Introduction to Scientific Visualization
– Intermediate Visualization at TACC
– Remote/Collaborative TeraScale Visualization on the TeraGrid

• Other Topics
– NCSA to host workshop on data center design
– Rocks Linux Cluster Workshop
– LCI International Conference on HPC Clustered Computing

• Over 30 on-line asynchronous tutorials

Page 27: Advancing Scientific Discovery  through TeraGrid

SC08-SC10 Education Program

• Multi-year, year-long, Education Programs to provide continuity and sustained impact

• Integrate HPC into high school and undergraduate science, technology, engineering and mathematics classrooms
– Foster High School - College partnerships

• Significantly expanded digital libraries of resources for teaching and learning - CSERD/NSDL, ACM Digital Library

• Sponsors: ACM, IEEE, TeraGrid, NCSI, CSERD, Krell, and NSF

• Recruiting faculty and institutions to innovate their curriculum

Page 28: Advancing Scientific Discovery  through TeraGrid

Internships and Fellowships

TeraGrid Partners offer internships and fellowships that allow undergraduates, post-graduate students and faculty to be located on-site and work with TeraGrid staff and researchers in areas critical to advancing scientific discovery:

• Computer science in user support and operations

• Future technologies

• Research activities

Page 29: Advancing Scientific Discovery  through TeraGrid

Broadening Participation in TeraGrid

• Broaden awareness of TeraGrid
– Campus Visits (coupled with CI Days)
– Professional Society Meetings
– Develop promotional materials

• Build human capacity for Terascale research
– In-depth consulting (5-8 consultants)
– TeraGrid Fellowship Program for faculty and students
– Mentoring Program
– Campus Champions

• Enhance the usability and access of TG via SGs
– Assess Science Gateway readiness and community requirements
– Develop replicable strategies for integrating TeraGrid resources into SGs, with an emphasis on under-served community needs

Page 30: Advancing Scientific Discovery  through TeraGrid

For More Information

www.teragrid.org

www.s-education.org

cserd.nsdl.org

[email protected]