Introduction of Grid Systems

Grid Basics: CS-780-3 Notes, courtesy of Chaman Singh Verma

Transcript of Introduction of Grid Systems

Page 1: Introduction of Grid Systems

Grid Basics

CS-780-3 Notes

Courtesy of Chaman Singh Verma

Page 2: Introduction of Grid Systems

Power Generation

Past: Until the end of the 19th century, power generation was considered a local luxury. Only the rich could generate power in their own backyards.

Present: We take electricity for granted, without knowing its sources or the complexities of its distribution. We use the service and pay for it. Many countries provide a high quality of service.

Future: We want to break the 19th-century model in computer usage. We want to provide a service model for computation and storage similar to that of power generation.

Page 3: Introduction of Grid Systems

What is a Grid? Checklist

A Grid is a system that

► Coordinates resources that are not subject to centralized control (not for each single node)

► Uses standard, open, general-purpose protocols and interfaces

► Provides a high quality of service

Reference: "What is the Grid?" by Ian Foster

Page 4: Introduction of Grid Systems

Grid: A Virtual Organization

The Grid resource-sharing paradigm has a greater scope than P2P systems. A Grid implicitly allows direct access to computers, software, data, and any other resources.

Both providers and consumers clearly define what they will share, who can share, and the conditions under which sharing will take place.

A set of individuals and/or institutions defined by such sharing rules forms what we call a Virtual Organization.

Page 5: Introduction of Grid Systems

Grid: An Evolution, Not a Revolution

Source: IBM Grid Computing

The Grid can be seen as the latest and most complete evolution of more familiar developments.

► Like the Web: the Grid keeps complexity hidden; multiple users enjoy a single unified experience.
► Unlike the Web: it enables full collaboration toward real business goals.
► Like peer-to-peer: it allows users to share files.
► Unlike peer-to-peer: not only files, but everything that can be shared.
► Like clusters and distributed computing: it brings computing resources together.
► Unlike clusters and distributed computing: a Grid can be geographically distributed and heterogeneous.
► Like virtualization technologies: it enables virtualization of IT resources.
► Unlike virtualization technologies: it can enable virtualization of vast and disparate resources.

Page 6: Introduction of Grid Systems

Originally Targeted Applications

What types of applications will the Grid be used for?

► Distributed Supercomputing
► High-throughput Computing: cracking cryptosystems
► On-demand Computing: NetSolve, large archives
► Data-Intensive Computing: Sloan Digital Sky Survey, weather forecasting
► Collaborative Computing: Insors, GriPhyN, SciRUN

Page 7: Introduction of Grid Systems

Grid Problem Defined

► The Grid problem is defined as "coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations".

► This sharing raises many issues that were not addressed by distributed computing, for example:

1. How to structure flexible, transient relationships.
2. How to structure fine-grained access control over resources, taking care of local and global policies.
3. How to agree on quality of service, scheduling, and co-allocation.

Page 8: Introduction of Grid Systems

Top 500 Supercomputers (June 2003)

Earth Simulator: NEC : Yokohama : 35.86 TFlops

ASCI Q: LANL: Los Alamos: HP Alphaserver SC: 13.88 TFlops

MCR Linux Cluster: LLNL Livermore, 7.634 TFlops

ASCI White: LLNL, Livermore IBM SP Power3, 7.304 TFlops

Seaborg: NERSC/LBNL, Berkeley, IBM SP Power3, 7.303 TFlops

Source : http://www.top500.org

Page 9: Introduction of Grid Systems

Latest News, Nov 8, 2003

► Virginia Tech's Big Mac took over the 3rd position. It consists of 1100 Macintosh PCs and performed at 17 TFlops.

Page 10: Introduction of Grid Systems

General Highlights from the Top 500 (June 2003)

► 157 systems reported peak performance above 1 TFlops.
► Total accumulated performance is 375 TFlops (up from 293 TFlops).
► Entry-level performance is 245.1 GFlops (up from 195.8 GFlops).
► A total of 119 systems (up from 56) use Intel processors.
► 149 systems are now labeled as clusters (up from 53).
► 23 of them are self-made (up from 14).
► Among the top 10, 7 are from the US, 2 from Japan, and 1 from France.

Page 11: Introduction of Grid Systems

Economics and Control

These infrastructures are very expensive and require years of hard work.

The sheer force of economics will require that these resources be kept under strict control and be optimally utilized.

Freedom is often costly and chaotic. This is the starting point of what we call Grid Computing.

Page 12: Introduction of Grid Systems

Changing Face of Enterprise Computing

► Most recent enterprise systems are collections of heterogeneous resources.

► The qualities of service traditionally associated with mainframe-centric computing are now essential to the effective conduct of e-business across distributed resources, inside as well as outside the enterprise.

► Recently there has been an upsurge of service providers (SPs) of various types, such as web-hosting SPs, storage SPs, and application SPs.

All of these require standardization.

Page 13: Introduction of Grid Systems

Bird's Eye View

In the next few slides, we will look at the broader picture, followed by technical details.

Page 14: Introduction of Grid Systems

Web Services Architecture

Universal Description, Discovery and Integration (UDDI): allows us to find Web Services that meet certain requirements.

Web Services Description Language (WSDL): Web Services must be self-describing and should tell the invoker which operations they support and how to invoke them.

Simple Object Access Protocol (SOAP): message passing between client and server uses SOAP.

Note: UDDI, WSDL, SOAP and HTTP are just examples. Different implementations can use different technologies.
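To make the message-passing step concrete, here is a minimal sketch of a SOAP 1.1 request envelope, built and parsed with Python's standard library. The envelope namespace is the standard SOAP 1.1 one; the application namespace `urn:example:math`, the operation name, and the helper functions are hypothetical and exist only for illustration. A real deployment would normally use stubs generated from the WSDL rather than hand-built XML.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:math"  # hypothetical application namespace


def build_request(operation, **params):
    """Return a SOAP 1.1 request envelope as a string."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_request(xml_text):
    """Extract (operation, params) from an envelope built by build_request."""
    envelope = ET.fromstring(xml_text)
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    call = body[0]                          # first element inside the Body
    operation = call.tag.split("}", 1)[1]   # strip the "{namespace}" prefix
    return operation, {child.tag: child.text for child in call}


message = build_request("add", a=2, b=3)
operation, params = parse_request(message)
```

The server side would dispatch on `operation` and read its arguments from `params`; all values travel as text and must be converted by the receiver.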

Page 15: Introduction of Grid Systems

A Typical Web Service Invocation

Page 16: Introduction of Grid Systems

End User's Perspective

Page 17: Introduction of Grid Systems

Stateless Machines

The above model is stateless. It cannot remember what was done from one invocation to the next.

One client can interfere with another client's operations.

Page 18: Introduction of Grid Systems

Factories

The concept of factories solves the problems mentioned earlier.
• Make the Grid a stateful machine
• Create transient services
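The difference can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not OGSA code: `SharedCounter`, `CounterFactory`, and `CounterInstance` are hypothetical names, and everything runs in one process instead of over a network. The first part shows two clients colliding on a single shared service; the second shows a factory handing each client its own transient, stateful instance.

```python
class SharedCounter:
    """A single service instance whose state is shared by every client."""

    def __init__(self):
        self.total = 0

    def add(self, amount):
        self.total += amount
        return self.total


class CounterInstance:
    """A transient, stateful service instance owned by one client."""

    def __init__(self):
        self.total = 0

    def add(self, amount):
        self.total += amount
        return self.total


class CounterFactory:
    """Persistent service whose only job is to create new instances."""

    def create(self):
        return CounterInstance()


# Without a factory: Alice and Bob interfere with each other.
shared = SharedCounter()
shared.add(5)                # Alice
shared.add(10)               # Bob
alice_view = shared.add(2)   # Alice expected 7 but sees Bob's work too

# With a factory: each client gets a private instance.
factory = CounterFactory()
alice = factory.create()
bob = factory.create()
alice.add(5)
alice.add(2)
bob.add(10)
```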

Page 19: Introduction of Grid Systems

Web Service Application

Client and server stubs are generated automatically from the specifications.

Page 20: Introduction of Grid Systems

Technical Details

Service: A service is a network-enabled entity that provides a specific capability (for example, the ability to move files, create processes, or verify access rights).

Service = protocols + behavior

Grid services are defined by OGSA (Open Grid Services Architecture) (Open Grid Forum).

Grid services are specified by OGSI (Open Grid Services Infrastructure).

The Globus Toolkit is the most popular open implementation of OGSA.

Page 21: Introduction of Grid Systems

Major Players in the Grid Service World

Page 22: Introduction of Grid Systems

Example from NetSolve

Suppose you want to multiply matrix A and matrix B. There is a site that provides this facility. You may want to integrate the function directly into your software.

request = netsolve("matmul", a, b)

C = netsolve("wait", request)
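The two calls above follow a submit-then-wait pattern: the first returns a request handle immediately, and the second blocks until the result is ready. The sketch below imitates that pattern locally with Python's `concurrent.futures`. It is not the NetSolve API; `submit`, `wait`, and the `ROUTINES` table are hypothetical stand-ins, and the "remote" routine is just a local function run on a worker thread.

```python
from concurrent.futures import ThreadPoolExecutor


def matmul(a, b):
    """Plain-Python matrix product standing in for the remote routine."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


# Hypothetical table of routines a server might offer.
ROUTINES = {"matmul": matmul}
_pool = ThreadPoolExecutor(max_workers=2)


def submit(name, *args):
    """Start a request and return a handle immediately (non-blocking)."""
    return _pool.submit(ROUTINES[name], *args)


def wait(request):
    """Block until the request finishes and return its result."""
    return request.result()


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
request = submit("matmul", a, b)   # returns at once
# ... the caller is free to do other work here ...
c = wait(request)                  # [[19, 22], [43, 50]]
```

Separating submission from retrieval lets the caller overlap local work with the remote computation, which matters when the routine runs for a long time.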

Page 23: Introduction of Grid Systems

Nature of Grid Architecture

Grid architecture is a set of protocols for the establishment, management, and usage of dynamic, cross-organizational virtual organizations.

The main issues in the architecture are
► Interoperability
► Standard protocols
► Services
► Application Programming Interfaces (APIs) and Software Development Kits (SDKs)

Page 24: Introduction of Grid Systems

Hourglass Model

The narrow neck of the hourglass defines a small set of core abstractions and protocols. It consists of protocols for
• Connectivity
• Resource Management

These protocols must be chosen so as to capture the fundamental mechanisms of sharing across many different resource types.

Page 25: Introduction of Grid Systems

Grid Architecture

The Fabric layer implements the local, resource-specific operations that occur on specific resources.

Connectivity protocols are concerned with communication and authentication.

Resource protocols are concerned with negotiating access to individual resources.

Collective protocols and services are concerned with coordinating the use of multiple resources.

Page 26: Introduction of Grid Systems

General List of Services

► Identity & authentication
► Authorization & policy
► Resource discovery
► Resource characterization
► Resource allocation
► Co-reservation, workflow
► High-speed data transfer
► Remote data access
► Performance guarantees
► Monitoring
► Adaptation
► Intrusion detection
► Resource management
► Accounting and payment
► Fault management

Page 27: Introduction of Grid Systems

Resource Management

At a minimum, the following resources should be available for query:

► Computational: mechanisms for starting programs, monitoring and controlling their execution, advance reservations, hardware and software characteristics, and state information such as current load.

► Storage: mechanisms for putting and getting files, and state information such as available space and bandwidth utilization.

► Network: mechanisms for control over the resources allocated to network transfers, and information about network characteristics and load.

► Code repositories: management of versioned source and object code (CVS style).

► Catalogs

Page 28: Introduction of Grid Systems

Connectivity Layer

This layer defines core communication and authentication protocols.

Communication protocols enable the exchange of data between different fabric layers. They include transport, routing, and naming services.

Authentication protocols build on communication services to provide cryptographically secure mechanisms for verifying the identity of users and resources.

Page 29: Introduction of Grid Systems

Authentication Characteristics

► Single sign-on: a single "log on" should be sufficient for access to multiple grid resources.

► Delegation: run a program on the user's behalf.

► Integration with local security: for example, Kerberos or Unix security.

► User-based trust relationships: if a user uses services from multiple service providers at the same time, the security mechanism should not require the resource providers to cooperate or interact with each other.

Page 30: Introduction of Grid Systems

Resource Layer

It is built on top of communications. It defines protocols for

► Secure negotiation
► Initiation
► Monitoring
► Control
► Accounting
► Payment for sharing resources

Page 31: Introduction of Grid Systems

Resource Layer

► Information protocols are used for obtaining information about the structure and state of a resource (current load, usage policy, configuration, etc.).

► Management protocols are used to negotiate access to a shared resource, specifying resource requirements:

1. Advance reservation
2. Quality of service
3. Operations to perform

Page 32: Introduction of Grid Systems

Collective: Coordinating Multiple Resources

► Directory services: a user may query for a resource by name and/or by its attributes, such as type, availability, and load.

► Co-allocation, scheduling, and brokering services: allow VO participants to request specific resources for a specific purpose and duration.

► Monitoring and diagnostic services: allow monitoring for resource failures, attacks, overload, etc.

► Data replication services: allow management of VO storage to maximize data access performance with respect to metrics such as response time, reliability, and cost.
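A directory service of the kind described above can be sketched as an in-memory registry that answers queries by name or by attribute. The class and field names below (`Resource`, `Directory`, `kind`, `load`) are hypothetical; a real Grid directory would be a distributed, access-controlled service rather than a Python dictionary.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    kind: str         # e.g. "compute" or "storage"
    available: bool
    load: float       # fraction of capacity in use, 0.0 to 1.0


class Directory:
    """Toy directory: register resources, then query by name or attributes."""

    def __init__(self):
        self._entries = {}

    def register(self, resource):
        self._entries[resource.name] = resource

    def by_name(self, name):
        return self._entries.get(name)

    def query(self, kind=None, available=None, max_load=None):
        """Return matching resources, least loaded first."""
        matches = []
        for r in self._entries.values():
            if kind is not None and r.kind != kind:
                continue
            if available is not None and r.available != available:
                continue
            if max_load is not None and r.load > max_load:
                continue
            matches.append(r)
        return sorted(matches, key=lambda r: r.load)


directory = Directory()
directory.register(Resource("node-a", "compute", True, 0.80))
directory.register(Resource("node-b", "compute", True, 0.25))
directory.register(Resource("node-c", "compute", False, 0.00))
directory.register(Resource("disk-1", "storage", True, 0.40))

idle_compute = directory.query(kind="compute", available=True, max_load=0.5)
```

A broker or scheduler would sit on top of such queries, choosing among the returned candidates.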

Page 33: Introduction of Grid Systems

Collective (continued)

► Grid-enabled programming systems: enable familiar programming models to be used in a Grid environment, using other grid services such as resource discovery and security. Example: Globus MPI.

► Workload management and collaboration: allow problem-solving environments.

► Software discovery: allows selection of the best software implementation and execution platform. Examples: NetSolve and Ninf.

► Accounting and payment services: gather usage information for the purposes of accounting and payment for services.

Page 34: Introduction of Grid Systems

Collective

Page 35: Introduction of Grid Systems

OGSA

Building on both the Grid and Web Services communities, OGSA defines a uniform service semantics called Grid Services.

► OGSA defines a few persistent and many transient services.

► OGSA defines interfaces for managing Grid service instances: factory, registry, discovery, lifetime.

► OGSA defines interfaces and behavior for reliable invocation, lifetime management, discovery, authorization, notification, upgradeability, concurrency, and manageability.

► OGSA also defines WSDL interfaces and associated conventions.

► Protocols for reliable and secure management of distributed state.

Page 36: Introduction of Grid Systems

Need for a Service-Oriented View

► It allows us to address the need for standard interface definitions, local/remote transparency, and adaptation to the local OS.

► It allows multiple protocol bindings, to facilitate localized optimization of services.

► It simplifies virtualization, which in turn allows consistent resource access across multiple heterogeneous platforms.

► With a service-oriented view, we can partition the interoperability problem into two sub-problems: the definition of service interfaces, and the identification of the protocols that can be used to invoke a particular interface.

Page 37: Introduction of Grid Systems

Globus Toolkit

The Globus Toolkit is an open-architecture, open-source set of services and software libraries that support Grids and Grid applications.

The toolkit addresses issues of security, information discovery, resource management, data management, communication, fault detection, and portability.

GRAM: Grid Resource Allocation and Management
MDS: Monitoring and Discovery Service (originally Metacomputing Directory Service)
GSI: Grid Security Infrastructure

The toolkit will be described in detail in the next presentation, so no further description is given here.

Page 38: Introduction of Grid Systems

Nature of Service

► Services are location transparent.

► Services are created and destroyed dynamically.

► Services are stateful. Every service is assigned a globally unique name, called a Grid Service Handle (GSH).

► Grid services can change during their lifetime (for example, to support new protocols).
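The handle idea can be illustrated with a small registry that creates instances on demand, names each with a unique handle, and resolves or destroys them later. This is only a sketch: the `gsh://example.org/...` handle format, the `HandleRegistry` class, and the use of a random UUID for uniqueness are assumptions made for illustration, not the format or mechanism defined by OGSI.

```python
import uuid


class HandleRegistry:
    """Toy registry mapping unique handles to live service instances."""

    def __init__(self):
        self._instances = {}

    def create(self, state):
        """Create an instance and return its handle (hypothetical format)."""
        handle = f"gsh://example.org/{uuid.uuid4()}"
        self._instances[handle] = state
        return handle

    def resolve(self, handle):
        """Return the instance a handle refers to, wherever it lives."""
        return self._instances[handle]

    def exists(self, handle):
        return handle in self._instances

    def destroy(self, handle):
        """End the instance's lifetime; the handle no longer resolves."""
        del self._instances[handle]


registry = HandleRegistry()
job_a = registry.create({"status": "running"})
job_b = registry.create({"status": "queued"})

registry.resolve(job_a)["status"] = "done"   # state persists between calls
registry.destroy(job_b)                      # services are destroyed dynamically
```

Clients hold only the handle, never the instance itself, which is what keeps the service location transparent.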

Page 39: Introduction of Grid Systems

Web Services

► Web Services are the basis for Grid services, which are the cornerstones of OGSA and OGSI.

► Web Services use simple Internet-based protocols to address heterogeneous distributed computing.

► Web Services define a technique for describing the software components to be accessed, methods for accessing them, and ways of discovering the components.

► Web Services are neutral with respect to language, programming model, and system software.

Page 40: Introduction of Grid Systems

Upgradeability

► Services within complex systems must be independently upgradeable.

► Versioning and compatibility between services must be managed and expressed so that clients can discover not only specific service versions but also compatible services.

► OGSA defines conventions that allow us to identify when a service changes and when those changes are backward compatible with respect to interface and semantics.
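One way to picture compatible-version discovery is a check over version labels. The sketch below assumes a "major.minor" convention in which a service is backward compatible when the major number matches and the minor number is at least the one the client requires. That convention is an assumption made for illustration; it is not the convention OGSA defines, and the registry contents are invented.

```python
def parse_version(text):
    """Split a 'major.minor' label into a tuple of integers."""
    major, minor = text.split(".")
    return int(major), int(minor)


def is_compatible(offered, required):
    """Assumed rule: same major version, and minor at least as new."""
    o_major, o_minor = parse_version(offered)
    r_major, r_minor = parse_version(required)
    return o_major == r_major and o_minor >= r_minor


def compatible_services(registry, name, required):
    """List the offered versions of a service that a client can safely use."""
    offered = registry.get(name, [])
    usable = [v for v in offered if is_compatible(v, required)]
    return sorted(usable, key=parse_version)


# Hypothetical registry: service name -> versions currently deployed.
registry = {"transfer": ["1.0", "1.2", "1.10", "2.0"]}
usable = compatible_services(registry, "transfer", "1.2")
```

Sorting by the parsed tuple rather than by the raw string keeps "1.10" after "1.2".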

Page 41: Introduction of Grid Systems

Some Myths (Misunderstandings) about Grid Computing

► The Grid is the next-generation Internet.
► The Grid is a source of free cycles.
► The Grid requires a distributed operating system.
► The Grid requires a new programming model.
► The Grid makes high-performance computing superfluous.

Page 42: Introduction of Grid Systems

Distributed Computing Economics (Views of Jim Gray)

The following items have an equivalent price:
► one database access
► 10 bytes of Internet traffic
► 100,000 instructions
► 10 bytes of disk storage
► a megabyte of disk bandwidth

The break-even point is 10,000 instructions per byte. This serves as a basis for deciding when Internet-based computing, such as grid computing, is cost-effective.

Page 43: Introduction of Grid Systems

How are the numbers computed?

► A box with a 2 GHz CPU and 2 GB RAM: $2,000
► A 200 GB disk, 100 accesses/s or 50 MB/s: $200
► 1 Mbps WAN link: $100/month

$1 is equivalent to:
► 3.24 GB sent over the WAN (7.2 hours)
► 100+ tera CPU instructions = 7.2 hours of CPU time
► 1 GB of disk
► 2.592 million database accesses (in 7.2 hours)
► 1.296 terabytes of disk bandwidth (in 7.2 hours)
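Several of these figures can be reproduced with simple arithmetic. The sketch below assumes a 30-day month, so that $1 of a $100/month link buys 1/100 of a month, or 7.2 hours. The slide does not say how the hardware purchase prices are amortized, so the sketch simply reuses the same 7.2-hour window for the disk figures, as the slide does; the CPU instruction count is left out because it depends on an instructions-per-cycle figure the slide does not give.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600     # assumption: 30-day month
WAN_BITS_PER_SECOND = 1_000_000        # 1 Mbps link
WAN_DOLLARS_PER_MONTH = 100
DISK_ACCESSES_PER_SECOND = 100
DISK_BYTES_PER_SECOND = 50_000_000     # 50 MB/s

# $1 buys 1/100 of a month of the WAN link.
seconds_per_dollar = SECONDS_PER_MONTH // WAN_DOLLARS_PER_MONTH
hours_per_dollar = seconds_per_dollar / 3600

# Bytes moved over the WAN in that window.
wan_bytes = WAN_BITS_PER_SECOND // 8 * seconds_per_dollar

# Disk figures over the same window.
db_accesses = DISK_ACCESSES_PER_SECOND * seconds_per_dollar
disk_bandwidth_bytes = DISK_BYTES_PER_SECOND * seconds_per_dollar

# Disk capacity: a $200 disk holding 200 GB costs $1 per GB.
disk_gb_per_dollar = 200 / 200

print(f"{hours_per_dollar} hours per dollar")
print(f"{wan_bytes / 1e9} GB over the WAN")
print(f"{db_accesses / 1e6} million database accesses")
print(f"{disk_bandwidth_bytes / 1e12} TB of disk bandwidth")
```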

Page 44: Introduction of Grid Systems

Cycle-based Computing is Almost Free

► The accumulated cycles in SETI@Home amount to 54 teraflops.
► Google freely provides a trillion searches a year from the largest database (2 petabytes).
► Hotmail freely carries a trillion e-mails per year.
► Amazon.com offers a free book search tool.
► Many well-known media sites offer free news.
► The maintenance costs paid are low and worthwhile.

Page 45: Introduction of Grid Systems

What is SETI@Home?

► It uses millions of computers in homes and offices worldwide to analyze radio signals from space.
► SETI: the Search for Extraterrestrial Intelligence aims to detect intelligent life outside Earth.
► It uses a radio telescope to listen for (collect) narrow-bandwidth radio signals from space.
► Data analysis: (1) computing power spectra, (2) finding "candidate signals", (3) eliminating meaningless signals.
► Embarrassing parallelism: CPU and data intensive, but with infrequent communication (the high-bandwidth interconnects of supercomputers are not necessary).
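The first two analysis steps can be sketched on a toy signal. This is an illustration only, not the SETI@Home pipeline: it uses a naive discrete Fourier transform from the standard library, a synthetic noise-free tone, and an arbitrary threshold (ten times the mean power) to flag candidates. Step (3), rejecting meaningless signals, is not attempted.

```python
import cmath
import math


def power_spectrum(samples):
    """Naive O(N^2) DFT; returns |X[k]|^2 for k = 0 .. N/2."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        x_k = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        spectrum.append(abs(x_k) ** 2)
    return spectrum


def candidate_bins(spectrum, factor=10.0):
    """Flag frequency bins whose power exceeds factor times the mean power."""
    mean_power = sum(spectrum) / len(spectrum)
    return [k for k, p in enumerate(spectrum) if p > factor * mean_power]


# Synthetic work unit: a narrow-band tone at frequency bin 5.
N = 64
signal = [math.cos(2 * math.pi * 5 * i / N) for i in range(N)]
candidates = candidate_bins(power_spectrum(signal))   # [5]
```

Each work unit is analyzed independently of every other, which is why the problem parallelizes across home computers with almost no communication.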

Page 46: Introduction of Grid Systems

Who Pays for the "Free" Computing?

► Advertisers pay for it.
  Google, Hotmail, and Amazon.com collect $1 from a company if its site is visited 1,000 times via these "free" services: Cost Per thousand iMpressions (CPM).
► Big companies are eager to pay for maintenance.
  It is a low-cost but very effective form of promotion.
  A Web site has almost become the only "spokesman".
► SETI@Home relies on cycles donated worldwide.
  It provided 1,300 years of free computing on 2/3/03.

Page 47: Introduction of Grid Systems

Cases for Grid Computing: at least 10,000 Ins/Byte

► A cryptographic search problem: only a few KB of input/output, but days of computing.
► A representative job submitted to SETI@Home: 12 hours of computing on 1/2 MB of input.
► A CFD computation at Cornell: 7 years of computing for 100 MB of input and 10 GB of output.
► Making the animated movie Toy Story: a 200 MB image takes several hours to render (200,000-600,000 Ins/Byte).
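The 10,000 instructions-per-byte rule can be applied to these cases with a rough calculation. The sketch below assumes a 2 GHz CPU retiring one instruction per cycle, which is an assumption made here and not a figure from the slides, so the results are order-of-magnitude estimates only. The "bulk scan" job is a hypothetical counter-example added for contrast.

```python
BREAK_EVEN = 10_000                # instructions per byte, from the slides
ASSUMED_INSTR_PER_SECOND = 2e9     # assumption: 2 GHz, one instruction per cycle


def intensity(cpu_seconds, io_bytes):
    """Estimated instructions executed per byte moved over the network."""
    return cpu_seconds * ASSUMED_INSTR_PER_SECOND / io_bytes


def suits_grid(cpu_seconds, io_bytes):
    """True when the job clears the break-even compute intensity."""
    return intensity(cpu_seconds, io_bytes) >= BREAK_EVEN


# SETI@Home work unit: 12 hours of computing on 1/2 MB of input.
seti = intensity(12 * 3600, 0.5e6)

# Cornell CFD run: 7 years of computing, 100 MB in plus 10 GB out.
cfd = intensity(7 * 365 * 24 * 3600, 100e6 + 10e9)

# Hypothetical counter-example: scanning 1 GB of data in 10 seconds.
bulk_scan = intensity(10, 1e9)

print(f"SETI@Home: {seti:.3g} instructions/byte")
print(f"CFD:       {cfd:.3g} instructions/byte")
print(f"Bulk scan: {bulk_scan:.3g} instructions/byte")
```

Under these assumptions both slide cases sit several orders of magnitude above the break-even point, while the bulk scan falls far below it and is better run next to its data.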

Page 48: Introduction of Grid Systems

Grid Computing Should Follow the Economics

► Suitable applications can be very limited.
  A good solution: sending a GB over the Internet to save years of computing.
  It is not economical to send a KB if the result can be computed locally in a second.
► If Internet cost drops more slowly than Moore's Law, the analysis becomes stronger.
  Over the past 40 years, network cost has fallen much more slowly.
► Cluster computing has different economics.
  Gbps Ethernet costs $200/port and delivers 50 MBps.
  This is comparable to disk bandwidth cost and 10,000 times lower than Internet costs (so the CFD computation fits better on clusters).