Grids Challenged by a Web 2.0 and Multicore Sandwich

53
1 Grids Challenged by a Web 2.0 and Multicore Sandwich CCGrid 2007 Windsor Barra Hotel Rio de Janeiro Brazil May 15 2007 Geoffrey Fox Computer Science, Informatics, Physics Pervasive Technology Laboratories Indiana University Bloomington IN 47401 [email protected] http://www.infomall.org

description

Grids Challenged by a Web 2.0 and Multicore Sandwich. CCGrid 2007 Windsor Barra Hotel Rio de Janeiro Brazil May 15 2007 Geoffrey Fox Computer Science, Informatics, Physics Pervasive Technology Laboratories Indiana University Bloomington IN 47401 [email protected] http://www.infomall.org. - PowerPoint PPT Presentation

Transcript of Grids Challenged by a Web 2.0 and Multicore Sandwich

Page 1: Grids Challenged by a Web 2.0 and Multicore Sandwich

11

Grids Challenged by a Web 2.0 and Multicore Sandwich

CCGrid 2007Windsor Barra HotelRio de Janeiro Brazil

May 15 2007

Geoffrey Fox

Computer Science, Informatics, PhysicsPervasive Technology Laboratories

Indiana University Bloomington IN 47401

[email protected]://www.infomall.org

Page 2: Grids Challenged by a Web 2.0 and Multicore Sandwich

22

Abstract Grids provide managed support for distributed Internet Scale services

and although this is clearly a broadly important capability, adoption of Grids has been slower than perhaps expected.

Two important trends are Web 2.0 and Multicore that have tremendous momentum and large natural communities and that both overlap in important ways with Grids.

Web 2.0 has services but does not require some of the strict protocols needed by Grids or even Web Services. Web 2.0 offers new approaches to composition and portals with a rich set of user oriented services.

Multicore anticipates 100’s of core per chip and the ability and need to “build a Grid on a chip”. This will use functional parallelism that is likely to derive its technologies from parallel computing and not the Grid realm.

We discuss a Grid future that embraces Web 2.0 and multicore and suggests how it might need to change.

Virtual Machines (Virtualization) is another important development which has attracted more interest than Grids in Enterprise scene

Page 3: Grids Challenged by a Web 2.0 and Multicore Sandwich

33

e-moreorlessanything is an Application ‘e-Science is about global collaboration in key areas of science,

and the next generation of infrastructure that will enable it.’ from its inventor John Taylor Director General of Research Councils UK, Office of Science and Technology

e-Science is about developing tools and technologies that allow scientists to do ‘faster, better or different’ research

Similarly e-Business captures an emerging view of corporations as dynamic virtual organizations linking employees, customers and stakeholders across the world.

This generalizes to e-moreorlessanything A deluge of data of unprecedented and inevitable size must be

managed and understood. People (see Web 2.0), computers, data and instruments must be

linked. On demand assignment of experts, computers, networks and

storage resources must be supported

Page 4: Grids Challenged by a Web 2.0 and Multicore Sandwich

44

Role of Cyberinfrastructure Supports distributed science – data, people, computers Exploits Internet technology (Web2.0) adding (via Grid

technology) management, security, supercomputers etc. It has two aspects: parallel – low latency (microseconds)

between nodes and distributed – highish latency (milliseconds) between nodes

Parallel needed to get high performance on individual 3D simulations, data analysis etc.; must decompose problem

Distributed aspect integrates already distinct components Cyberinfrastructure is in general a distributed collection of

parallel systems Cyberinfrastructure is made of services (often Web services)

that are “just” programs or data sources packaged for distributed access

Page 5: Grids Challenged by a Web 2.0 and Multicore Sandwich

Not so controversial Ideas Distributed software systems are being “revolutionized” by

developments from e-commerce, e-Science and the consumer Internet. There is rapid progress in technology families termed “Web services”, “Grids” and “Web 2.0”

The emerging distributed system picture is of distributed services with advertised interfaces but opaque implementations communicating by streams of messages over a variety of protocols• Complete systems are built by combining either services or predefined/pre-

existing collections of services together to achieve new capabilities

Note messaging (MPI and some thread systems) interesting in parallel computing to support either “safe concurrency without side effects” or distributed memory

We can use the term Grids strictly (Narrow or even more strictly OGSA Grids) or just call any collections of services as “Broad Grids” which actually is quite often done – in this talk Grid means Narrow or Web service Grid

Page 6: Grids Challenged by a Web 2.0 and Multicore Sandwich

Web 2.0 and Web Services I Web Services have clearly defined protocols (SOAP) and a well

defined mechanism (WSDL) to define service interfaces• There is good .NET and Java support• The so-called WS-* specifications provide a rich sophisticated but

complicated standard set of capabilities for security, fault tolerance, meta-data, discovery, notification etc.

“Narrow Grids” build on Web Services and provide a robust managed environment with growing adoption in Enterprise systems and distributed science (so called e-Science)

Web 2.0 supports a similar architecture to Web services but has developed in a more chaotic but remarkably successful fashion with a service architecture with a variety of protocols including those of Web and Grid services• Over 400 Interfaces defined at http://www.programmableweb.com/apis

Web 2.0 also has many well known capabilities with Google Maps and Amazon Compute/Storage services of clear general relevance

There are also Web 2.0 services supporting novel collaboration modes and user interaction with the web as seen in social networking sites, portals, MySpace, YouTube,

Page 7: Grids Challenged by a Web 2.0 and Multicore Sandwich

Web 2.0 and Web Services II I once thought Web Services were inevitable but this is

no longer clear to me Web services are complicated, slow and non functional

• WS-Security is unnecessarily slow and pedantic (canonicalization of XML)

• WS-RM (Reliable Messaging) seems to have poor adoption and doesn’t work well in collaboration

• WSDM (distributed management) specifies a lot There are de facto standards like Google Maps and

powerful suppliers like Google which “define the rules” One can easily combine SOAP (Web Service) based

services/systems with HTTP messages but the “lowest common denominator” suggests additional structure/complexity of SOAP will not easily survive

Page 8: Grids Challenged by a Web 2.0 and Multicore Sandwich

Applications, Infrastructure, Technologies

The discussion is confused by inconsistent use of terminology – this is what I mean

Multicore, Narrow and Broad Grids and Web 2.0 (Enterprise 2.0) are technologies

These technologies combine and compete to build infrastructures termed e-infrastructure or Cyberinfrastructure• Although multicore can and will support “standalone” clients probably

most important client and server applications of the future will be internet enhanced/enabled so key aspect of multicore is its role and integration in e-infrastructure

e-moreorlessanything is an emerging application area of broad importance that is hosted on the infrastructures e-infrastructure or Cyberinfrastructure

Page 9: Grids Challenged by a Web 2.0 and Multicore Sandwich

Attack of the Killer Multicores Today commodity Intel systems are sold with 8 cores spread over

two processors Specialized chips such as GPU’s and IBM Cell processor have

substantially more cores Moore’s Law implies and will be satisfied by and imply

exponentially increasing number of cores doubling every 1.5-3 Years• Modest increase in clock speed• Intel has already prototyped a 80 core Server chip ready in

2011? Huge activity in parallel computing programming (recycled from

the past?)• Some programming models and application styles similar to

Grids We will have a Grid on a chip …………….

99

Page 10: Grids Challenged by a Web 2.0 and Multicore Sandwich

PC07Intro [email protected] 10

IBM Cell Processor• This supports pipelined

(through 8 cores) or data parallel operations distributed on 8 SPE’s

Applications running well on

Cell or AMD GPU should run scalably on future mainline

multicore chipsFocus on memory

bandwidth key(dataflow not

deltaflow)

Page 11: Grids Challenged by a Web 2.0 and Multicore Sandwich

Grids meet Multicore Systems The expected rapid growth in the number of cores per chip has

important implications for Grids With 16-128 cores on a single commodity system 5 years from

now one will both be able to build a Grid like application on a chip and indeed must build such an application to get the Moore’s law performance increase• Otherwise you will “waste” cores …..

One will not want to reprogram as you move your application from a 64 node cluster or transcontinental implementation to a single chip Grid

However multicore chips have a very different architecture from Grids• Shared not Distributed Memory• Latencies measured in microseconds not milliseconds

Thus Grid and multicore technologies will need to “converge” and converged technology model will have different requirements from current Grid assumptions 1111

Page 12: Grids Challenged by a Web 2.0 and Multicore Sandwich

Grid versus Multicore Applications It seems likely that future multicore applications will

involve a loosely coupled mix of multiple modules that fall into three classes• Data access/query/store

• Analysis and/or simulation

• User visualization and interaction This is precisely mix that Grids support but Grids of

course involve distributed modules Grids and Web 2.0 use service oriented architectures to

describe system at module level – is this appropriate model for multicore programming?

Where do multicore systems get their data from?1212

Page 13: Grids Challenged by a Web 2.0 and Multicore Sandwich

Pradeep K. Dubey, [email protected] 13

Tomorrow

What is …? What if …?Is it …?

Recognition Mining Synthesis

Create a model instance

RMS: Recognition Mining SynthesisRMS: Recognition Mining Synthesis

Model-basedmultimodalrecognition

Find a modelinstance

Model

Real-time analytics ondynamic, unstructured,

multimodal datasets

Photo-realism andphysics-based

animation

Today

Model-less Real-time streaming andtransactions on

static – structured datasets

Very limited realism

Intel has probably most sophisticated analysis of future “killer” multicore applications

Page 14: Grids Challenged by a Web 2.0 and Multicore Sandwich

Pradeep K. Dubey, [email protected] 14

What is a tumor? Is there a tumor here? What if the tumor progresses?

It is all about dealing efficiently with complex multimodal datasetsIt is all about dealing efficiently with complex multimodal datasets

Recognition Mining Synthesis

Images courtesy: http://splweb.bwh.harvard.edu:8000/pages/images_movies.html

Page 15: Grids Challenged by a Web 2.0 and Multicore Sandwich

PC07Intro [email protected] 15Intel’s Application Stack

Page 16: Grids Challenged by a Web 2.0 and Multicore Sandwich

Role of Data in Grid/Multicore I One typically is told to place compute (analysis) at the

data but most of the computing power is in multicore clients on the edge

These multicore clients can get data from the internet i.e. distributed sources• This could be personal interests of client and used by client to

help user interact with world

• It could be cached or copied

• It could be a standalone calculation or part of a distributed coordinated computation (SETI@Home)

Or they could get data from set of local sensors (video-cams and environmental sensors) naturally stored on client or locally to client

1616

Page 17: Grids Challenged by a Web 2.0 and Multicore Sandwich

Role of Data in Grid/Multicore Note that as you increase sophistication of data

analysis, you increase ratio of compute to I/O• Typical modern datamining approach like Support Vector

Machine is sophisticated (dense) matrix algebra and not just text matching

• http://grids.ucs.indiana.edu/ptliupages/presentations/PC2007/PC07BYOPA.ppt

Time complexity of Sophisticated data analysis will make it more attractive to fetch data from the Internet and cache/store on client• It will also help with memory bandwidth problems in

multicore chips In this vision, the Grid “just” acts as a source of data

and the Grid application runs locally1717

Page 18: Grids Challenged by a Web 2.0 and Multicore Sandwich

PC07Intro [email protected] 18

Three styles of Multicore “Jobs”

• Totally independent or nearly so (B C E F) – This used to be called embarrassingly parallel and is now pleasingly so– This is preserve of job scheduling community and one gets efficiency by

statistical mechanisms with (fair) assignment of jobs to cores– “Parameter Searches” generate this class but these are often not optimal

way to search for “best parameters”– “Multiple users” of a server is an important class of this type– No significant synchronization and/or communication latency constraints

• Loosely coupled (D) is “Metaproblem” with several components orchestrated with pipeline, dataflow or not very tight constraints– This is preserve of Grid workflow or mashups– Synchronization and/or communication latencies in millisecond to second

or more range

• Tightly coupled (A) is classic parallel computing program with components synchronizing often and with tight timing constraints– Synchronization and/or communication latencies around a microsecond

A1 A2 A3 A4 CB E D1 D2 F

Page 19: Grids Challenged by a Web 2.0 and Multicore Sandwich

PC07Intro [email protected] 19

Multicore Programming Paradigms• At a very high level, there are three broad classes of

parallelism• Coarse grain functional parallelism typified by workflow

and often used to build composite “metaproblems” whose parts are also parallel– This area has several good solutions getting better– Pleasingly parallel applications can be considered special cases of

functional parallelism

• Large Scale loosely synchronous data parallelism where dynamic irregular work has clear synchronization points as in most large scale scientific and engineering problems

• Fine grain thread parallelism as used in search algorithms which are often data parallel (over choices) but don’t have universal synchronization points– Discrete Event Simulations are either a fourth class or a variant of

thread parallelism

Page 20: Grids Challenged by a Web 2.0 and Multicore Sandwich

Programming Models • So the Fine grain thread parallelism and Large Scale loosely

synchronous data parallelism styles are distinctive to parallel computing while

• Coarse grain functional parallelism of multicore overlaps with workflows from Grids and Mashups from Web 2.0

• Seems plausible that a more uniform approach evolve for coarse grain case although this is least constrained of programming styles as typically latency issues are not critical– Multicore would have strongest performance constraints– Web 2.0 and Multicore the most important usability constraints

• A possible model for broad use of multicores is that the difficult parallel algorithms are coded as libraries (Fine grain thread parallelism and Large Scale loosely synchronous data parallelism styles) while the general user uses composes with visual interfaces, scripting and systems like Google MapReduce

Page 21: Grids Challenged by a Web 2.0 and Multicore Sandwich

PC07Intro [email protected] 21

Google MapReduceSimplified Data Processing on Large Clusters• http://labs.google.com/papers/mapreduce.html• This is a dataflow model between services where services can do useful

document oriented data parallel applications including reductions• The decomposition of services onto cluster engines is automated• The large I/O requirements of datasets changes efficiency analysis in favor of

dataflow• Services (count words in example) can obviously be extended to general

parallel applications• There are many alternatives to language expressing either dataflow and/or

parallel operations and indeed one should support multiple languages in spirit of services

Page 22: Grids Challenged by a Web 2.0 and Multicore Sandwich

Old and New (Web 2.0) Community Tools e-mail and list-serves are oldest and best used Kazaa, Instant Messengers, Skype, Napster, BitTorrent for P2P

Collaboration – text, audio-video conferencing, files del.icio.us, Connotea, Citeulike, Bibsonomy, Biolicious manage

shared bookmarks MySpace, YouTube, Bebo, Hotornot, Facebook, or similar sites

allow you to create (upload) community resources and share them; Friendster, LinkedIn create networks• http://en.wikipedia.org/wiki/List_of_social_networking_websites

Writely, Wikis and Blogs are powerful specialized shared document systems

ConferenceXP and WebEx share general applications Google Scholar tells you who has cited your papers while

publisher sites tell you about co-authors• Windows Live Academic Search has similar goals

Note sharing resources creates (implicit) communities• Social network tools study graphs to both define communities

and extract their properties

Page 23: Grids Challenged by a Web 2.0 and Multicore Sandwich

2323

“Best Web 2.0 Sites” -- 2006 Extracted from http://web2.wsj2.com/ Social Networking

Start Pages

Social Bookmarking

Peer Production News

Social Media Sharing

Online Storage (Computing)

Page 24: Grids Challenged by a Web 2.0 and Multicore Sandwich

Web 2.0 Systems are Portals, Services, Resources Captures the incredible development of interactive Web

sites enabling people to create and collaborate

Page 25: Grids Challenged by a Web 2.0 and Multicore Sandwich

2525

Mashups v Workflow? Mashup Tools are reviewed at http://blogs.zdnet.com/Hinchcliffe/?p=63 Workflow Tools are reviewed by Gannon and Fox

http://grids.ucs.indiana.edu/ptliupages/publications/Workflow-overview.pdf Both include

scripting in PHP, Python, sh etc. as both implement distributed programming at level of services

Mashups use all types of service interfaces and do not have the potential robustness (security) of Grid service approach

Typically “pure” HTTP (REST)

Page 26: Grids Challenged by a Web 2.0 and Multicore Sandwich

2626

Grid Workflow Datamining in Earth Science Work with Scripps Institute Grid services controlled by workflow process real time

data from ~70 GPS Sensors in Southern California

Streaming DataSupport

TransformationsData Checking

Hidden MarkovDatamining (JPL)

Display (GIS)

NASA GPS

Earthquake

Real Time

Archival

Page 27: Grids Challenged by a Web 2.0 and Multicore Sandwich

2727

Web 2.0 uses all types of Services Here a Gadget Mashup uses a 3 service workflow with

a JavaScript Gadget Client

Page 28: Grids Challenged by a Web 2.0 and Multicore Sandwich

Web 2.0 APIs

http://www.programmableweb.com/apis has (May 14 2007) 431 Web 2.0 APIs with GoogleMaps the most often used in Mashups

This site acts as a “UDDI” for Web 2.0

Page 29: Grids Challenged by a Web 2.0 and Multicore Sandwich

The List of Web 2.0 API’s Each site has API and

its features Divided into broad

categories Only a few used a lot

(42 API’s used in more than 10 mashups)

RSS feed of new APIs Amazon S3 growing

in popularity

Page 30: Grids Challenged by a Web 2.0 and Multicore Sandwich

APIs/Mashups per Protocol Distribution

REST SOAP XML-RPC REST,XML-RPC

REST,XML-RPC,

SOAP

REST,SOAP

JS Other

google google mapsmaps

netvibesnetvibes

live.comlive.com

virtual virtual earthearth

google google searchsearch

amazon S3amazon S3

amazon amazon ECSECS

flickrflickrebayebay

youtubeyoutube

411sync411syncdel.icio.usdel.icio.us

yahoo! searchyahoo! searchyahoo! geocodingyahoo! geocoding

technoratitechnorati

yahoo! imagesyahoo! imagestrynttrynt

yahoo! localyahoo! local

Number ofMashups

Number ofAPIs

Page 31: Grids Challenged by a Web 2.0 and Multicore Sandwich

4 more Mashups each day For a total of 1906

April 17 2007 (4.0 a day over last month)

Note ClearForest runs Semantic Web Services Mashup competitions (not workflow competitions)

Some Mashup types: aggregators, search aggregators, visualizers, mobile, maps, gamesGrowing number of commercial Mashup Tools

Page 32: Grids Challenged by a Web 2.0 and Multicore Sandwich

32

Mash Planet

Web 2.0 Architecture

http://www.imagine-it.org/mashplanetDisplay too large to be a Gadget

Page 33: Grids Challenged by a Web 2.0 and Multicore Sandwich

33

Searched on Transit/TransportationSearched on Transit/Transportation

Page 34: Grids Challenged by a Web 2.0 and Multicore Sandwich

34

Browser +Google Map API

Cass County Map Server

(OGC Web Map Server)

Hamilton County Map Server(AutoDesk)

Marion County Map Server

(ESRI ArcIMS)

Browser client fetches image tiles for the bounding box using Google Map API. Tile Server

Cache Server

Adapter Adapter Adapter

Tile Server requests map tiles at all zoom levels with all layers. These are converted to uniform projection, indexed, and stored. Overlapping images are combined.

Must provide adapters for each Map Server type .

The cache server fulfills Google map calls with cached tiles at the requested bounding box that fill the bounding box.

Google Maps Server

A “Grid” Workflow(built in Java!)

Uses Google Maps clients and server and non Google map APIs

Page 35: Grids Challenged by a Web 2.0 and Multicore Sandwich

35

GIS Grid of “Indiana Map” and ~10 Indiana counties with accessible Map (Feature) Servers from different vendors. Grids federate different data repositories (cf Astronomy VO federating different observatory collections)

Indiana Map Grid Workflow/Mashup

Page 36: Grids Challenged by a Web 2.0 and Multicore Sandwich

Now to Portals3636

Grid-style portal as used in Earthquake GridThe Portal is built from portlets

– providing user interface fragments for each service that are composed into the full interface – uses OGCE technology as does planetary science VLAB portal with University of Minnesota

Page 37: Grids Challenged by a Web 2.0 and Multicore Sandwich

3737

Portlets v. Google Gadgets Portals for Grid Systems are built using portlets with

software like GridSphere integrating these on the server-side into a single web-page

Google (at least) offers the Google sidebar and Google home page which support Web 2.0 services and do not use a server side aggregator

Google is more user friendly! The many Web 2.0 competitions is an interesting model

for promoting development in the world-wide distributed collection of Web 2.0 developers

I guess Web 2.0 model will win!

Note the many competitions powering Web 2.0 Mashup Development

Page 38: Grids Challenged by a Web 2.0 and Multicore Sandwich

Typical Google Gadget Structure

… Lots of HTML and JavaScript </Content> </Module>Portlets build User Interfaces by combining fragments in a standalone Java ServerGoogle Gadgets build User Interfaces by combining fragments with JavaScript on the client

Google Gadgets are an example of Start Page technologySee http://blogs.zdnet.com/Hinchcliffe/?p=8

Page 39: Grids Challenged by a Web 2.0 and Multicore Sandwich

Web 2.0 v Narrow Grid I Web 2.0 allows people to nurture the Internet Cloud and such

people got Time’s person of year award Whereas Narrow Grids support Internet scale Distributed

Services with similar architecture Maybe Narrow Grids focus on (number of) Services (there aren’t

many scientists) and Web 2.0 focuses on number of People Both agree on service oriented architectures but have different

emphasis Narrow Grids have a strong emphasis on standards and

structure; Web 2.0 lets a 1000 flowers (protocols) and a million developers bloom and focuses on functionality, broad usability and simplicity• Semantic Web/Grid has structure to allow reasoning• Annotation in sites like del.icio.us and uploading to

MySpace/YouTube is unstructured and free text search replaces structured ontologies

Page 40: Grids Challenged by a Web 2.0 and Multicore Sandwich

Web 2.0 v Narrow Grid II Web 2.0 has a set of major services like GoogleMaps or Flickr

but the world is composing Mashups that make new composite services• End-point standards are set by end-point owners• Many different protocols covering a variety of de-facto standards

Narrow Grids have a set of major software systems like Condor and Globus and a different world is extending with custom services and linking with workflow

Popular Web 2.0 technologies are PHP, JavaScript, JSON, AJAX and REST with “Start Page” e.g. (Google Gadgets) interfaces

Popular Narrow Grid technologies are Apache Axis, BPEL WSDL and SOAP with portlet interfaces

Robustness of Grids demanded by the Enterprise? Not so clear that Web 2.0 won’t eventually dominate other

application areas and with Enterprise 2.0 it’s invading GridsThe world does itself in large numbers!

Page 41: Grids Challenged by a Web 2.0 and Multicore Sandwich

Implication for Grid Technology of Multicore and Web 2.0 I

Web 2.0 and Grids are addressing a similar application class although Web 2.0 has focused on user interactions• So technology has similar requirements

Multicore differs significantly from Grids in component location and this seems particularly significant for data• Not clear therefore how similar applications will be• Intel RMS multicore application class pretty similar

to Grids Multicore has more stringent software requirements

than Grids as latter has intrinsic network overhead

4141

Page 42: Grids Challenged by a Web 2.0 and Multicore Sandwich

Implication for Grid Technology of Multicore and Web 2.0 II

Multicore chips require low overhead protocols to exploit low latency that suggests simplicity• We need to simplify MPI AND Grids!

Web 2.0 chooses simplicity (REST rather than SOAP) to lower barrier to everyone participating

Web 2.0 and Multicore tend to use traditional (possibly visual) (scripting) languages for equivalent of workflow whereas Grids use visual interface backend recorded in BPEL• Google MapReduce illustrates a popular Web 2.0

and Multicore approach to dataflow4242

Page 43: Grids Challenged by a Web 2.0 and Multicore Sandwich

Implication for Grid Technology of Multicore and Web 2.0 III

Web 2.0 and Grids both use SOA Service Oriented Architectures• Seems likely that Multicore will also adopt although a more

conventional object oriented approach also possible• Services should help multicore applications integrate

modules from different sources• Multicore will use fine grain objects but coarse grain

services “System of Systems”: Grids, Web 2.0 and Multicore are likely

to build systems hierarchically out of smaller systems• We need to support Grids of Grids, Webs of Grids, Grids of

Multicores etc. i.e. systems of systems of all sorts4343

Page 44: Grids Challenged by a Web 2.0 and Multicore Sandwich

Implication for Grid Technology of Multicore and Web 2.0 IV

Portals are likely to feature both Web and “desktop client” technology although it is possible that Web approach will be adopted more or less uniformly

Web 2.0 has a very active portal activity which has similar architecture to Grids

• A page has multiple user interface fragments Web 2.0 user interface integration is typically Client side using

Gadgets AJAX and JavaScript while• Grids are in a special JSR168 portal server side using Portlets WSRP and

Java

Multicore doesn’t put special constraints on portal technology but it could tend to favor non browser client or client side Web browser integrated portals

4444

Page 45: Grids Challenged by a Web 2.0 and Multicore Sandwich

The “Momentum” Effects Web 2.0 has momentum as it is driven by success of social web

sites and the user friendly protocols attracting many developers of mashups

Grids momentum driven by the success of eScience and the commercial web service thrusts largely aimed at Enterprise

• Enterprise software area not quite as dominant as in past

• Grid technical requirements are a bit soft and could be compromised if sandwiched by Web 2.0 and Multicore

• Will commercial interest in Web Services survive? Multicore driven by expectation that all servers and clients will

have many cores

• Multicore latency requirements imply cannot compromise in some technology choices

Simplicity, supporting many developers and stringent multicore requirements are the forces pressuring Grids!

Page 46: Grids Challenged by a Web 2.0 and Multicore Sandwich

The Ten areas covered by the 60 core WS-* Specifications

WS-* Specification Area Typical Grid/Web Service Examples

1: Core Service Model XML, WSDL, SOAP

2: Service Internet WS-Addressing, WS-MessageDelivery; Reliable Messaging WSRM; Efficient Messaging MOTM

3: Notification WS-Notification, WS-Eventing (Publish-Subscribe)

4: Workflow and Transactions BPEL, WS-Choreography, WS-Coordination

5: Security WS-Security, WS-Trust, WS-Federation, SAML, WS-SecureConversation

6: Service Discovery UDDI, WS-Discovery

7: System Metadata and State WSRF, WS-MetadataExchange, WS-Context

8: Management WSDM, WS-Management, WS-Transfer

9: Policy and Agreements WS-Policy, WS-Agreement

10: Portals and User Interfaces WSRP (Remote Portlets)

Page 47: Grids Challenged by a Web 2.0 and Multicore Sandwich

WS-* Areas and Web 2.0 WS-* Specification Area Web 2.0 Approach

1: Core Service Model XML becomes optional but still usefulSOAP becomes JSON RSS ATOM WSDL becomes REST with API as GET PUT etc.Axis becomes XmlHttpRequest

2: Service Internet No special QoS. Use JMS or equivalent?

3: Notification Hard with HTTP without polling– JMS perhaps?

4: Workflow and Transactions (no Transactions in Web 2.0)

Mashups, Google MapReduceScripting with PHP JavaScript ….

5: Security SSL, HTTP Authentication/Authorization, OpenID is Web 2.0 Single Sign on

6: Service Discovery http://www.programmableweb.com

7: System Metadata and State Processed by application – no system state – Microformats are a universal metadata approach

8: Management==Interaction WS-Transfer style Protocols GET PUT etc.

9: Policy and Agreements Service dependent. Processed by application

10: Portals and User Interfaces Start Pages, AJAX and Widgets(Netvibes) Gadgets

Page 48: Grids Challenged by a Web 2.0 and Multicore Sandwich

WS-* Areas and Multicore WS-* Specification Area Typical Grid/Web Service Examples

1: Core Service Model Fine grain Java C# C++ Objects and coarse grain services as in DSS. Information passed explicitly or by handles. MPI needs to be updated to handle non scientific applications as in CCR

2: Service Internet Not so important intrachip

3: Notification Publish-Subscribe for events and Interrupts

4: Workflow and Transactions Many approaches; scripting languages popular

5: Security Not so important intrachip

6: Service Discovery Use libraries

7: System Metadata and State Environment Variables

8: Management == Interaction Interaction between objects key issue in parallel programming trading off efficiency versus performance

9: Policy and Agreements Handled by application

10: Portals and User Interfaces Web 2.0 technology popular

Page 49: Grids Challenged by a Web 2.0 and Multicore Sandwich

CCR as an example of a Cross Paradigm Run Time

• Naturally supports fine grain thread switching with message passing with around 4 microsecond latency for 4 threads switching to 4 others on an AMD PC with C#. Threads spawned – no rendezvous

• Has around 50 microsecond latency for coarse grain service interactions with DSS extension which supports Web 2.0 style messaging

• MPI Collectives – Shift and Exchange vary from 10 to 20 microsecond latency in rendezvous mode

• Not as good as best MPI’s but managed code and supports Grids Web 2.0 and Parallel Computing ……

• See http://grids.ucs.indiana.edu/ptliupages/publications/CCRApril16open.pdf

Page 50: Grids Challenged by a Web 2.0 and Multicore Sandwich

PC07Intro [email protected] 50

Microsoft CCR• Supports exchange of messages between threads using named

ports• FromHandler: Spawn threads without reading ports• Receive: Each handler reads one item from a single port• MultipleItemReceive: Each handler reads a prescribed number of

items of a given type from a given port. Note items in a port can be general structures but all must have same type.

• MultiplePortReceive: Each handler reads a one item of a given type from multiple ports.

• JoinedReceive: Each handler reads one item from each of two ports. The items can be of different type.

• Choice: Execute a choice of two or more port-handler pairings• Interleave: Consists of a set of arbiters (port -- handler pairs) of 3

types that are Concurrent, Exclusive or Teardown (called at end for clean up). Concurrent arbiters are run concurrently but exclusive handlers are

• http://msdn.microsoft.com/robotics/

Page 51: Grids Challenged by a Web 2.0 and Multicore Sandwich

0

5

10

15

20

25

0 2 4 6 8 10

Tim

e

Overhead (latency) of AMD 4-core PC with 4 execution threads on MPI style Rendezvous Messaging for Shift and Exchange implemented either as two shifts or as custom CCR pattern. Compute time is 10 seconds divided by number of stages

Stages (millions)

Time Microseconds

Rendezvous exchange as two shiftsRendezvous exchange customized for MPIRendezvous Shift

Page 52: Grids Challenged by a Web 2.0 and Multicore Sandwich

0

10

20

30

40

50

60

70

80

90

0 0.2 0.4 0.6 0.8 1

Tim

e

INTELEXch INTEL Exch as 2 Shifts INTEL Shift

Overhead (latency) of INTEL 8-core PC with 8 execution threads on MPI style Rendezvous Messaging for Shift and Exchange implemented either as two shifts or as custom CCR pattern. Compute time is 15 seconds divided by number of stages

Stages (millions)

Time Microseconds

Rendezvous exchange as two shiftsRendezvous exchange customized for MPIRendezvous Shift

Page 53: Grids Challenged by a Web 2.0 and Multicore Sandwich

PC07Intro [email protected] [email protected] 5353

0

50

100

150

200

250

300

350

1 10 100 1000 10000

Round trips

Av

era

ge

ru

n t

ime

(m

icro

se

co

nd

s)

Timing of HP Opteron Multicore as a function of number of simultaneous two-way service messages processed (November 2006 DSS Release)

CGL Measurements of Axis 2 shows about 500 microseconds – DSS is 10 times better

DSS Service Measurements