W. Sudholt, K. Baldridge 28.09.2006 / 1
Swiss Grid Day, Geneva 28.09.2006
Grid Computing for Computational Chemistry and Beyond
Wibke Sudholt (1) and Kim Baldridge (1,2)
(1) Institute of Organic Chemistry, University of Zurich, Switzerland
(2) San Diego Supercomputer Center, University of California, San Diego, USA
Overview
• Why do we use Grid Computing to answer scientific questions?
• How are we involved in Grid Computing?
• What can we do to support users to apply Grid Computing?
• How does Grid Computing help us to solve real-world problems?
Towards a Global Cyberinfrastructure
Workstations and PCs
Supercomputers and clusters
Internet and grids
Computing
Storage
Instruments
Networking
Collaboration
Information
Interdisciplinary Research
Mathematics
Physics
Chemistry
Biology
Computer Science
Computational Chemistry
Bridging Scientific Gaps
Nuclei: 10^-14 m
Atoms: 10^-10 m
Molecules: 10^-9 m
Proteins: 10^-8 m
Cells: 10^-4 m
Organs: 10^-1 m
Organisms: 10^0 m
Accuracy
Complexity
Quantum mechanics
Classical mechanics
Grid Computing
“A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.”
Foster/Kesselman, 1999
• Grid types:
- Computational Grids
- Desktop Grids
- Data Grids
- Knowledge Grids
• Grid opportunities:
- Performance
- Throughput
- Scalability
- Fairness
- Collaboration
- Knowledge exchange and dissemination
• Grid challenges:
- Heterogeneity
- Network speed
- Fault tolerance
- Distribution algorithms
- Job scheduling
- Legacy codes
- Policy issues
Grid Infrastructure Layers
• Applications:
- Scientific or business software
- Scientific or business data
- Visualization
• Upper-level middleware:
- User interfaces
- Web portals
- Resource brokers
- Workflow systems
• Lower-level middleware:
- Security infrastructure
- Resource management
- Information services
- Data management
• Resources:
- Network
- Job managers
- Operating systems
- Hardware
Baldridge Group Hardware
• Chempossible cluster at SDSC
• Individual resources
• Grid resources
• Mac laptops
• Matterhorn cluster at UniZH
Y. Potier, M. Packard, W. Sudholt, UniZH, et al.
Grid Middleware Experience
• Cluster management:
- Rocks
- SGE
• Open source or freeware:
- Globus
- Nimrod
- ProActive
- UNICORE
- Condor
- BOINC
- GridPort
- SRB
- Web services
• Commercial products:
- DataSynapse
- United Devices
• Workflow infrastructures:
- Kepler
- Informnet
• Security:
- CA
- GAMA
Virtual Organizations
• Swiss Grid Initiative: http://www.gridinitiative.ch/
• Swiss Bio Grid (SBG): http://www.swissbiogrid.org/
• Southern European Partnership for Advanced Computing (SEPAC): http://www.sepac-grid.org/
• Pacific Rim Applications and Grid Middleware Assembly (PRAGMA): http://www.pragma-grid.net/
• Chemomentum: http://www.chemomentum.org/
• I2CAM: http://www.i2cam.org/
Computational Chemistry Grid User Interfaces
• Molecular visualization and remote execution (K. Baldridge, J. Greenberg, SDSC):
- QMView
• GridPort web portals (J. Greenberg, SDSC, et al.):
- GAMESS
- APBS
- Euler
- AMBER
- CE
• Workflow and integrated infrastructure projects (UniZH/SDSC):
- Resurgence
- Gemstone
The Resurgence Project
• RESearch sURGe ENabled by CyberinfrastructurE
• http://www.baldridge.unizh.ch/resurgence/
• Description:
- Workflow tool for computational chemistry
- Allow researchers to easily combine existing computational chemistry tools in innovative ways
- Exploit the possibilities of the growing web and grid infrastructure
- Focus on high-throughput calculations
- Based on and included in the collaborative Kepler scientific workflow system
• Interfaced programs:
- GAMESS (quantum chemistry), Babel, Open Babel (file format conversion), QMView (molecular visualization)
- In preparation: Nimrod/G (grid distribution)
- Planned: APBS (biomolecular continuum electrostatics)
• Participants:
- UniZH/ETHZ: W. Sudholt et al.
- SDSC: I. Altintas et al.
• Status:
- Project has ended
- Many ideas to be taken over into the Gemstone framework
The Gemstone Project
• Grid Enabled Molecular Science Through Online Networked Environments
• http://gemstone.mozdev.org/
• Goals:
- Integrated framework for accessing grid resources
- Support scientific exploration, workflow capture and replay, and a dynamic services oriented architecture
- Provide researchers in the molecular sciences with a tool to discover and compose remote grid application services
• Components:
- Dynamic rich-client desktop interface (Firefox extension, XUL, registry lookup)
- Strongly typed data schemas (XML Schema, CML, GamessXML)
- Molecular visualization tools (Flash, Garnet, interface to QMView)
- Optional: Authenticated interaction (GAMA)
- Planned: Workflow integration (Informnet)
• Application web services:
- APBS, GAMESS, Autodock, SIESTA codes
- MolPrep, Babel, PDB2PQR, Psize utilities
• Related and interfaced projects:
- Opal: Web services wrapping toolkit
- Topaz: GridFTP Firefox extension
• Status:
- Ongoing NSF and NBCR-supported project
- Version 1.0 released on 26.09.2006
• Participants:
- UniZH: K. Baldridge, C. Amoreira, A. Bowen, Y. Potier et al.
- SDSC: K. Bhatia, J. Greenberg, S. Krishnan, S. Mock, B. Stearn et al.
Example Computational Chemistry Grid Projects and Collaborations
• Parameterization of a Group Difference Pseudopotential for QM/MM calculations using GAMESS and the Nimrod distributed parametric modeling tool (W. Sudholt, UCSD/UniZH/ETHZ, D. Abramson, Monash, et al.)
• Coupling of the GAMESS quantum chemical code with the BOINC desktop grid platform (M. Taufer, UCSD/UTEP, et al.)
• Investigation of protein-ligand interactions based on a GAMESS and APBS pipeline using Nimrod and Gemstone on the PRAGMA testbed (C. Amoreira, UniZH, et al.)
• Implementation of the material science application SIESTA into the Gemstone framework (A. Garcia, UPV/ICMAB, Spain)
Parameterization of a Group Difference Pseudopotential
• Challenge:
- Parameterization of a pseudopotential for QM/MM calculations
- Embarrassingly parallel parameter sweeps and optimizations
• Setup:
- GAMESS quantum chemistry code (pre-deployed)
- Globus and Nimrod grid middleware
- PRAGMA testbed resources (mainly at the PRAGMA 4 and Supercomputing 2003 conferences)
• Results:
- Up to about 60’000 jobs
- More than 200 days of computing time in less than 48 hours real time
- Parameterized group difference potential
- http://www.baldridge.unizh.ch/~wibke/personal/pubs.html
• Participants:
- UCSD/UniZH/ETHZ: W. Sudholt
- Monash University, Australia: D. Abramson, C. Enticott, S. Garic
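To make the "embarrassingly parallel parameter sweep" idea concrete, here is a minimal Python sketch. It is not the Nimrod or GAMESS setup used in the project: the objective function is a hypothetical stand-in for one quantum chemistry run, the two parameters `a` and `b` are invented for illustration, and a local thread pool takes the place of grid resources. The helper `effective_concurrency` only restates the arithmetic behind the reported numbers: 200 days of computing time delivered in 48 hours means that, on average, about 100 jobs were running at once (a lower bound, since the slide says "more than" 200 days and "less than" 48 hours).

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def objective(params):
    """Hypothetical stand-in for one GAMESS run: returns an error to minimize."""
    a, b = params
    return (a - 1.5) ** 2 + (b + 0.5) ** 2


def sweep(a_values, b_values, workers=4):
    """Evaluate every parameter combination independently and return the best.

    Each grid point is an independent job, which is what makes the sweep
    embarrassingly parallel: jobs never need to talk to each other.
    """
    grid = list(product(a_values, b_values))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        errors = list(pool.map(objective, grid))
    best_error, best_params = min(zip(errors, grid))
    return best_params, best_error, len(grid)


def effective_concurrency(cpu_days, wall_hours):
    """Average number of jobs running at once, from CPU time and wall time."""
    return cpu_days * 24 / wall_hours


if __name__ == "__main__":
    best, err, n_jobs = sweep([0.0, 0.5, 1.0, 1.5, 2.0], [-1.0, -0.5, 0.0])
    print(f"{n_jobs} jobs, best parameters {best}, error {err}")
    print(f"average concurrency: {effective_concurrency(200, 48):.0f}")
```

On a real grid, the thread pool would be replaced by a job distribution tool, and each job would write its result to a file that is collected afterwards; the structure of the sweep stays the same.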
http://www.baldridge.unizh.ch/nsf/ITR_RTIGNS/
Collaborative Research Project with Swiss Re
• NatCat application:
- Probabilistic modeling of losses for insurance portfolios from natural catastrophes (earthquakes, tropical cyclones etc.) based on pre-simulated events at specific locations
- Java sources, Oracle database, test cases
• Goals:
- Distribution of main Rate process over a computational grid
- Improvement of performance, scalability, stability, and fairness
- Testing of the DataSynapse GridServer and INRIA ProActive grid middleware tools
• Results:
- Distribution over a number of Linux machines by an event set-based algorithm
- Performance considerably improved
- Database access represents bottleneck
- Some results already in production version
- Currently working on improving the distribution algorithm
• Participants:
- Swiss Re: M. Spühler, P. Pfister et al.
- UniZH: W. Sudholt, M. Packard, H. Mahmood, M. Dänzer, M. Monroe, K. Baldridge
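The slide does not describe the event set-based distribution algorithm itself, so the following Python sketch is only one plausible way to split work by event set, not the method used in the project. It assumes each event set has a known cost estimate and assigns sets, largest first, to whichever node currently has the least work. The event-set names and costs in the usage example are invented.

```python
import heapq


def partition_event_sets(costs, n_nodes):
    """Assign event sets to nodes, largest first, always to the least-loaded node.

    costs: dict mapping a (hypothetical) event-set name to an estimated cost.
    Returns one (total_cost, [event-set names]) pair per node.
    """
    # Min-heap of (current load, node index): the least-loaded node is on top.
    heap = [(0.0, node) for node in range(n_nodes)]
    heapq.heapify(heap)
    assignments = [[] for _ in range(n_nodes)]

    # Sort by descending cost; ties are broken by name to keep runs repeatable.
    for name, cost in sorted(costs.items(), key=lambda kv: (-kv[1], kv[0])):
        load, node = heapq.heappop(heap)
        assignments[node].append(name)
        heapq.heappush(heap, (load + cost, node))

    loads = [0.0] * n_nodes
    for load, node in heap:
        loads[node] = load
    return list(zip(loads, assignments))


if __name__ == "__main__":
    example_costs = {"eq_1": 8, "eq_2": 7, "tc_1": 6, "tc_2": 5, "tc_3": 4}
    for load, names in partition_event_sets(example_costs, n_nodes=2):
        print(load, names)
```

A greedy split like this balances compute load only. It does nothing about a shared bottleneck such as database access, which the results above identify as the limiting factor.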
[Figure: chart of Rate time/s (axis from 0 to 14000) for the test cases medium_alm_no_inuring, medium_dlm_with_inuring, large_dlm_no_inuring_1, large_dlm_no_inuring_2, large_dlm_no_inuring_5, and large_dlm_with_inuring_4, each run locally and on 2, 4, and 8 nodes.]
Summary
• Conclusions:
- Grid computing is important for our domain-specific as well as our computer science research and has helped us to establish new local and international collaborations.
- Our group and coworkers now have a lot of experience in grid computing, including project participation and organization, infrastructure setup, software development, and application to scientific problems.
- By developing grid user interfaces, we try to make grids more easily accessible to domain scientists.
- We are building a record of grid projects in computational chemistry and are also reaching out to new fields, concepts, and collaborations.
- Grid computing still requires a lot of effort, but the future is bright if we learn from the successes and failures, are aware of the limits, have clearly defined needs and goals, and do not reinvent the wheel.
• Thanks for funding:
- UniZH, UCSD, SDSC, NSF, DAAD, ETHZ, EU, Swiss Re and others
• Swiss Grid Initiative:
- This is an important initiative, and we are interested in participating.
- We could provide our open source software and expertise.
- We would contribute personnel and hardware resources only for well-defined, collaborative, and financed projects.
- We expect knowledge exchange and dissemination, new collaborations, access to resources and funding, and not much organizational overhead.
- This has to be a win-win situation with mutual trust, clear goals, and freedom for research.