Grid Challenges It’s the vision, stupid …but it NEEDS TO be followed by operational standards...
Grid Challenges
It’s the vision, stupid…
…but it NEEDS TO be followed by operational standards based on real applications…
The Global Grid Forum
25 June 2003
Gordon Bell
Microsoft Corporation
A quick look at some past visions and a challenge
NREN >> Internet
WWW
Challenge: Will match any Grid enabled application that wins a Gordon Bell Prize for parallelism
FCCSET NREN Plan 11/1987
[Chart: planned network bandwidth by year, 1988 to 2000, on a log scale from 10K to 10G. Plotted milestones: 56K; 1.5 M (Phase 1); 45 M (Phase 2); 3 G (Optical).]
a factor of 1000 makes a difference
Originating Bandwidth (Gb/s): U.S. Interstate Comm. traffic (L Roberts ’92)
ARPAnet Goals c1972 = Grid Goals
[Chart: originating bandwidth in Gb/s, log scale from 1 to 10,000, versus year, 1990 to 2020. Series: Broadcast TV, Voice, FAX, Video on Demand, Video Conf., NSF bb.]
Growth in hype vs reality
[Chart, c 1995, data from Gordon’s WAG. Curves: Infoway speculation, “how great it’ll be” (politicians, telecoms & futurists); Infoway regulation; conferences; WWW; Infoway addiction; lawsuits; books, newspapers.]
Articles per newspaper versus orders per second sent via Internet
[Chart, c 1995, data from Gordon’s WAG. Curves: orders per second; articles per newspaper.]
Articles about security, privacy, & fraud versus commerce ($M)
[Chart, c 1995, data from Gordon’s WAG. Curves: actual commerce; articles about risk and NOT doing commerce; organized crime on Internet.]
The virtuous cycle of bandwidth supply and demand
[Diagram: a cycle linking Standards (IP), Increase Capacity (circuits & bw), Lower response time, Create new service, and Increased Demand. Services shown: Telnet & FTP; WWW; Audio; Video; Voice!; Grids; Video Conf.; FTP; Web Svcs.]
From Carl Kesselman, ISI
For More Information
Grid Book c1998 from 1996- www.mkp.com/grids
The Globus Project™- www.globus.org
OGSA- www.globus.org/ogsa
Global Grid Forum- www.gridforum.org
Grid Computing 2003
651 pp. 22 chapters, 41 authors
1080 pages 43 chapters, O(100) authors
Progress...a review
Grid started out with great promise…c1998
Interesting use at NASA for coupled programs
NMI (National Middleware Infrastructure)…State_Tools.gov, funded by NSF.gov
  clearly open, clearly not “free”
  not IETF model
Tools vs. standards & evolving working code
Some examples:
  C1980: Seti@home, folding@home, >> Napster p2p
  2001 15 TB Terraserver > Terraservice w/Web Services
  2003 Alex Szalay & Jim Gray: Skyserver/skyservice
  Cornell Theory Center Web Services based apps
  NEES: good poster child. An XML task
  GRADs and Teragrid… dream or research or just $$s?
TerraServer c1998: The “Whole Earth” Database
TerraServer Experience c2001
Successful Web Site
  50,000 daily users satisfied with “human-accessible” data
  59 GB imagery transmitted daily
New Feature Requests
  Programmable access to meta-data
  User selectable image sizes, i.e. “a map server”
  Permission to use TerraServer data within server applications
To the rescue!
.NET TerraService Architecture
[Diagram components: Existing DB Server (three boxes labeled SQL 2000, 1.0 TB Db); 668 m Rows; OLEDB; ADO.NET; TerraServer Web Service; Map Server Http Handler; Map UI Web Forms; Standard Browsers; Smart Clients (Windows Forms, .NET Framework). Data formats on the links: XML, HTML, image/jpeg.]
Data Intensive Science: the Next Frontier
Alex Szalay
The Johns Hopkins University
Department of Physics and Astronomy
The W.M. Keck Fellows in Advanced Scientific Data Analysis
National Virtual Observatory
The NSF ITR project “Building the Framework for the National Virtual Observatory” is a collaboration of 17 funded and 3 unfunded organizations
  Astronomy data centers
  National observatories
  Supercomputer centers
  University departments
  Computer science/information technology specialists
PI and project director: Alex Szalay (JHU)
CoPI: Roy Williams (Caltech/CACR)
Scientific Data Exploration
1. A thousand years ago: science was empirical, describing natural phenomena
2. Last few hundred years: a theoretical branch, using models and generalizations
3. Last few decades: a computational branch, simulating complex phenomena
4. Today: data exploration is emerging, synthesizing theory, experiment and computation with advanced data management and statistics
Living in an Exponential World
Astronomers have a few hundred TB now
  1 pixel (byte) / sq arc second ~ 4TB
  Multi-spectral, temporal, … → 1PB
They mine it looking for
  new (kinds of) objects, more of interesting ones (quasars)
  density variations in 400-D space
  correlations in 400-D space
Data doubles every year
  Data is public after 1 year
  So, 50% of the data is public
  Same trend appears in all sciences
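The 50% figure follows directly from the doubling assumption: if the archive doubles every year, then everything older than one year is half of today's total. A minimal sketch in Python, assuming idealized exponential growth and a strict embargo period (the function name and parameters are illustrative, not from the talk):

```python
def public_fraction(doubling_time_years: float = 1.0,
                    embargo_years: float = 1.0) -> float:
    """Fraction of all data collected so far that is older than the embargo.

    Assumes total volume grows as 2 ** (t / doubling_time_years), so the
    volume that already existed `embargo_years` ago, relative to today,
    is 2 ** (-embargo_years / doubling_time_years).
    """
    return 2.0 ** (-embargo_years / doubling_time_years)


# Doubling every year with a 1-year embargo: half of the data is public.
print(public_fraction())
# Slower growth (doubling every 2 years): a larger share is already public.
print(public_fraction(doubling_time_years=2.0, embargo_years=1.0))
```

The same relation shows why the public share stays constant over time: under steady doubling it depends only on the ratio of embargo to doubling time, not on the absolute size of the archive.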
Why Is Astronomy Special?
It has no commercial value
  No privacy concerns, freely share results with others
  Great for experimenting with algorithms
It is real and well documented
  High-dimensional (with confidence intervals)
  Spatial, temporal
Diverse and distributed
  Many different instruments from many different places and many different times
The questions are interesting
There is a lot of it (soon petabytes)
GB: It is not over-funded aka it’s poor
[Figure: sky images from surveys at different wavelengths. Panel labels: IRAS 100; ROSAT ~keV; DSS Optical; 2MASS 2; IRAS 25; NVSS 20cm; WENSS 92cm; GB 6cm.]
Making Discoveries
When and where are discoveries made?
  Always at the edges and boundaries
  Going deeper, collecting more data, using more colors….
Metcalfe’s law
  Utility of computer networks grows as the number of possible connections: O(N^2)
VO: Federation of N archives
  Possibilities for new discoveries grow as O(N^2)
Current sky surveys have proven this
  Very early discoveries from SDSS, 2MASS, DPOSS
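The O(N^2) claim is the count of distinct pairs among N archives, N(N-1)/2. A short Python sketch of that arithmetic (the function name is illustrative; treating every pair of archives as one possible cross-match is a simplifying assumption):

```python
def pairwise_federations(n_archives: int) -> int:
    """Number of distinct archive pairs that could be cross-matched."""
    return n_archives * (n_archives - 1) // 2


# Doubling the number of archives roughly quadruples the possible pairings.
for n in (2, 5, 10, 20):
    print(n, pairwise_federations(n))
```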
What can be learned from Sky Server?
It’s about data, not about harvesting flops
  1-2 hr. query programs versus 1 wk programs based on grep
  10 minute runs versus 3 day compute & searches
  Database viewpoint. 100x speed-ups
Avoid costly re-computation and searches
  Use indices and PARALLEL I/O. Read / Write >>1.
  Parallelism is automatic, transparent, and just depends on the number of computers/disks.
Limited experience and talent to use dbases.
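The contrast between a grep-style pass over every record and an indexed database query can be sketched with Python's built-in sqlite3 module. This is only an illustration: SkyServer runs on SQL Server, and the table, column names, and synthetic magnitudes below are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical catalog: object id and magnitude stored in integer millimagnitudes.
conn.execute("CREATE TABLE objects (obj_id INTEGER PRIMARY KEY, mag_mmag INTEGER)")
rows = [(i, 14000 + (i % 1000) * 10) for i in range(100_000)]
conn.executemany("INSERT INTO objects VALUES (?, ?)", rows)
conn.execute("CREATE INDEX idx_mag ON objects (mag_mmag)")

LOW, HIGH = 14000, 14050

# "grep" style: touch every record and test it in application code.
scan_count = sum(1 for _, mag in rows if LOW <= mag <= HIGH)

# Database style: the engine can answer the range query from the index.
query = "SELECT COUNT(*) FROM objects WHERE mag_mmag BETWEEN ? AND ?"
indexed_count = conn.execute(query, (LOW, HIGH)).fetchone()[0]

# Ask SQLite how it intends to run the query (the last column is the plan text).
for step in conn.execute("EXPLAIN QUERY PLAN " + query, (LOW, HIGH)):
    print(step[-1])

print(scan_count, indexed_count)
```

Both approaches return the same count; the difference is that the indexed query does not have to read rows outside the requested range, which is where the speed-up on large archives comes from.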
Soon: The Virtual Observatory
Many new surveys are coming
  SDSS is a dry run for the next ones
  LSST will be 5TB/night
All the data will be on the Internet
  ftp, web services…
Data and applications will be associated with the instruments
  Distributed world wide, cross-indexed
  Federation is a must
Will be the best telescope in the world
  World Wide Telescope
Finds the “needle in the haystack”
  Successful demonstrations in Jan’03
Emerging Concepts
Standardizing distributed data access
  Web Services, supported on all platforms
  XML: Extensible Markup Language
  SOAP: Simple Object Access Protocol
  WSDL: Web Services Description Language
Standardizing distributed computing
  Grid Services
  Custom configure remote computing dynamically
  Build your own remote computer, and discard
  Virtual Data: new data sets on demand
Both needed for Data Exploration
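To make the XML/SOAP layer concrete, here is a sketch of what a SOAP 1.1 request message looks like, built with Python's standard xml.etree.ElementTree. The envelope namespace is the standard SOAP 1.1 one; the service namespace, the operation name FindObjects, and its parameters are hypothetical and do not describe any real SkyServer or VO interface.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.org/skyservice"           # hypothetical service namespace


def build_request(operation: str, params: dict) -> str:
    """Wrap one operation call and its parameters in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical operation: find objects near a sky position.
request_xml = build_request("FindObjects", {"ra": 185.0, "dec": -0.5, "radius": 0.1})
print(request_xml)
```

In practice a WSDL document would describe the operation and its parameter types, and client toolkits would generate this envelope automatically; the sketch only shows what travels over the wire.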
Computational Science Simulations based on Web Services
Gerd Heber
Cornell Theory Center
[email protected]
International Conference on Computational Science 2003
Three Flavors of Adaptivity
Application-level
  Mathematical model
  High/low confidence
Algorithm-level
  Discretization method
  Solution technique
System-level
  Resource availability
  Fault tolerance
The Problem
Do distributed, coupled and adaptive multi-physics simulations of
  Mechanics of chemically-reacting flows
  (Damage) Thermo-Mechanics of solids
Components provided as Web Services
Geography
Cornell University
  Theory Center
  Department of Computer Science
  Department of Civil Engineering
University of Alabama
Mississippi State University
College of William and Mary
Workflow
Components
MiniCAD
Meshers
  Surface (Delaunay, quality guarantees)
  Volume (Dmesh, Jmesh, Gmesh)
Fluid/Thermal simulation (Loci, CHEM)
Thermo-mechanical component (CPTC)
Fracture mechanics
Visualization (OpenDX + SQL Server)
Web Services
“Web Services are self-contained, modular applications that can be described, published, located, and invoked over a network, …” (IBM)
Service oriented architecture: Publish, find, bind
XML, SOAP, UDDI, WSDL
Features and Requirements
Distributed expertise
No porting
Network accessibility (“firewall compliant”)
Platform and language neutrality
Security
Industry standards
Metadata
State
Students shouldn’t waste too much time with coding!
GrADS Vision
• Build a National Problem-Solving System on the Grid
  - Transparent to the user, who sees a problem-solving system
• Software Support for Application Development on Grids
  - Goal: Design and build programming systems for the Grid that broaden the community of users who can develop and run applications in this complex environment
• Challenges:
  - Presenting a high-level application development interface*
    – If programming is hard, the Grid will not reach its potential
  - Designing and constructing applications for adaptability
  - Late mapping of applications to Grid resources
  - Monitoring and control of performance
    – When should the application be interrupted and remapped?
*GB note: This is a superset of the previously unsolved clusters programming problem!
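The question of when to interrupt and remap can be framed as a simple cost comparison: move only if the predicted finish time on the new resources, including the cost of migrating, beats staying put. The Python sketch below illustrates that idea only; it is not the GrADS scheduler, and the function name, parameters, and rate model are assumptions made for the example.

```python
def should_remap(remaining_work: float,
                 current_rate: float,
                 candidate_rate: float,
                 migration_cost_s: float) -> bool:
    """Return True when moving is predicted to finish sooner than staying.

    remaining_work   -- work units left (hypothetical measure, e.g. iterations)
    current_rate     -- observed work units per second on current resources
    candidate_rate   -- predicted work units per second on candidate resources
    migration_cost_s -- seconds to checkpoint, transfer state, and restart
    """
    time_if_stay = remaining_work / current_rate
    time_if_move = migration_cost_s + remaining_work / candidate_rate
    return time_if_move < time_if_stay


# A 4x faster resource easily pays for a 100 s migration on a long run...
print(should_remap(1000, 1.0, 4.0, 100))
# ...but a 20% faster resource does not pay for a 500 s migration.
print(should_remap(1000, 1.0, 1.2, 500))
```

A real system would also have to account for prediction error and for how often performance is sampled, which is part of why the slide lists monitoring and control as an open challenge.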
GrADSoft Architecture
[Diagram components: Source Application; Whole-Program Compiler; Libraries; Software Components; Configurable Object Program; Binder; Scheduler; Resource Negotiator; Grid Runtime System; Real-time Performance Monitor; Performance Problem. Labeled links: Performance Feedback; Negotiation.]
From Carl Kesselman, ISI
Network for Earthquake Eng. Simulation
NEESgrid: US national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other
On-demand access to experiments, data streams, computing, archives, collaboration
NEESgrid: Argonne, Michigan, NCSA, UIUC, USC www.neesgrid.org
A Universal Architecture for Web Services… Microsoft Vision
“Scales Down” to devices
“Scales In” on a machine
“Scales Up” on large systems
“Scales Away” spans organizations & geographies
“Scales Out” by adding machines
Security, Reliable Messaging, Transactions, Routing, …
Messaging Infrastructure, Distributed applications, Vertical processes, Embedded systems, Network equipment, …
Web Services: Level I
Foundation to Build Upon
Basic profile
Defined by WS-I
XML, SOAP, WSDL, UDDI
Broad vendor support
WS-I assures widespread compatibility
Level II
Secure, Reliable, Transacted
[Diagram layers: Connected Applications; Management; Business Process; …; Secure; Reliable; Transacted; Metadata; Messaging; XML; Transports.]
Level III
From Infrastructure to Solutions
Application schemas
Domain specific profiles
Vertical industry services
Vision: Community/Data-Centric Computing Versus Machine-Centered Centers
Goal: Enable technical communities to create and take responsibility for their own computing environments of personal, data, and program collaboration and distribution
Design based on technology and cost, e.g. networking, apps programs maintenance, databases, and providing 24x7 web and other services
Many alternative styles and locations are possible
  Service from existing centers, including many state centers
  Software vendors could be encouraged to supply apps web services
  NCAR style center based on shared data and apps
  Instrument- and model-based databases. Both central & distributed when multiple viewpoints create the whole.
  Wholly distributed services supplied by many individual groups
Community/Data Centric: “web service”
Community is responsible
  Planned & budgeted as resources
  Responsible for its infrastructure
  Apps are from community
  Computing is integral to work
In sync with technologies
  1-3 Tflops/$M; 1-3 PBytes/$M to buy smallish Tflops & PBytes.
New scalables are “centers”
  Community can afford and evolve
  Dedicated to a community
  Program, data & database centric
  May be aligned with instruments or other community activities
Output = web service; Can communities form that can supply services?
Commitment to standards
A general architecture comes largely from understanding the problems
Understanding the problems comes from actually solving such problems
This is bottom-up, based on experience
Microsoft is committed to developing community-wide web services standards…
Is the Grid Forum equally committed?
The End
How can GRIDs become a real, useful, computer structure?
Get a life.
Use the standards and tools.
Adopt an application and/or community…now!