Converged Web/Grid Services for Content-based Networking, Event Notification, Resource Management...

Converged Web/Grid Services for Content-based Networking, Event Notification, Resource Management and Workflow

Short Title: Event-Driven Workflows

Craig A. Lee
The Aerospace Corporation

Introduction

• Goals

– Automatically detect, ingest, and disseminate input data events

– Automatically analyze the events and known data with minimal human-in-the-loop interaction

– Automatically plan responses

– Execute distributed workflows to enact the response

• Focus: Event-Driven Workflows

– aka Dynamic Workflows

– Events delivered to decision-making elements that need to know

– Decision makers plan responses as determined by policy

– Responses executed as distributed workflows

Outline and Approach

• Motivation
– DDDAS – Dynamic Data-Driven Application Systems

• Present Top-Level Concept and Scenario
– Event Notification
– Workflow Management
– Event-Driven Workflows

• Discuss Required Technologies
– General Concepts
– State of current implementations
– Outstanding Issues

[Figure: two panels. OLD (serialized and static) and NEW PARADIGM (Dynamic Data-Driven Simulation Systems). Labels in both panels: Theory (First Principles); Simulations (Math. Modeling, Phenomenology); Experiment, Measurements, Field-Data; User. The new-paradigm panel adds Observation Modeling, Design, and a Dynamic Feedback & Control Loop. Challenges: Application Simulations Development, Algorithms, Computing Systems Support.]

What is DDDAS

Frederica Darema, NSF

Examples of Applications benefiting from the new paradigm

• Engineering (Design and Control)
– aircraft design, oil exploration, semiconductor mfg, structural eng
– computing systems hardware and software design (performance engineering)

• Crisis Management
– transportation systems (planning, accident response)
– weather, hurricanes/tornadoes, floods, fire propagation

• Medical
– customized surgery, radiation treatment, etc.
– BioMechanics/BioEngineering

• Manufacturing/Business/Finance
– Supply Chain (Production Planning and Control)
– Financial Trading (Stock Mkt, Portfolio Analysis)

DDDAS has the potential to revolutionize science, engineering, & management systems

Fire Model

• Sensible and latent heat fluxes from ground and canopy fire -> heat fluxes in the atmospheric model.

• Fire’s heat fluxes are absorbed by air over a specified extinction depth.

• 56% fuel mass -> H2O vapor

• 3% of sensible heat used to dry ground fuel.

• Ground heat flux used to dry and ignite the canopy.

Kirk Complex Fire. U.S.F.S. photo
Slide Courtesy of Cohen/NCAR

Coupled atmospheric and wildfire models

Slide Courtesy of Cohen/NCAR

AMAT Centura Chemical Vapor Deposition Reactor

Operating Conditions
Reactor Pressure: 1 atm
Inlet Gas Temperature: 698 K
Surface Temperature: 1173 K
Inlet Gas-Phase Velocity: 46.6 cm/sec

Gas Phase Reactions
SiCl3H -> HCl + SiCl2
SiCl2H2 -> SiCl2 + H2
SiCl2H2 -> HSiCl + HCl
H2ClSiSiCl3 -> SiCl4 + SiH2
H2ClSiSiCl3 -> SiCl3H + HSiCl
H2ClSiSiCl3 -> SiCl2H2 + SiCl2
Si2Cl5H -> SiCl4 + HSiCl
Si2Cl5H -> SiCl3H + SiCl2
Si2Cl6 -> SiCl4 + SiCl2

Surface Reactions
SiCl3H + 4s -> Si(B) + sH + 3sCl
SiCl2H2 + 4s -> Si(B) + 2sH + 2sCl
SiCl4 + 4s -> Si(B) + 4sCl
HSiCl + 2s -> Si(B) + sH + sCl
SiCl2 + 2s -> Si(B) + 2sCl
2sCl + Si(B) -> SiCl2 + 2s
H2 + 2s -> 2sH
2sH -> 2s + H2
HCl + 2s -> sH + sCl
sH + sCl -> 2s + HCl

Slide Courtesy of McRae/MIT

A DDDAS Model
(Dynamic, Data-Driven Application Systems)

[Figure labels: Spectrum of Physical Systems (Humans: 3 Hz.; Cosmological: 10e-20 Hz.; Subatomic: 10e+20 Hz.); sensors & actuators; Models; Computations; Discover, Ingest, Interact; Computational Infrastructure (grids, perhaps?); "Loads a behavior into the infrastructure".]

Craig Lee, IPDPS panel, 2003

Top-Level Concept

• A Combined Event Notification and Workflow Management System
– A highly flexible Event Notification system automatically delivers events of all manner to necessary recipients
– A Workflow Management system schedules and coordinates all necessary actions in response to known events

[Figure: top-level concept diagram, shown in five successive builds. Labels: Communication Domain; Decision Maker (three instances); Policy; Sensed Events; Resource Information; Service (register, discovery); Response; Abstract Plan; Concrete Action. The later builds annotate the same diagram with, in order: Content-Based Routing Domain; Persistent Decision-making Computations Determined by Policy; Grid Information Service; Dynamic Grid Workflow Management.]

Required Technologies

• Events delivered to decision-making elements that need to know
– Event Notification Service Managed by Publish/Subscribe
• Pre-defined Topics
• Publication Advertisements
• User-defined Attributes
– Content-Based Routing
• Distributed Hash Tables
• Composable Name Spaces

• Decision makers plan responses as determined by policy
– Event analysis could require rule-based or other systems for deducing the “meaning” of sets of events
– Planning requires “path construction” from current state to goal state
– Semantic Analysis and Planning are Out of Scope for this briefing

• Responses executed as distributed workflows
– Workflow Engine independently manages
• Scheduling of Data Transfer
• Scheduling of Process Execution

What Must an Event Service Provide?

• Standard event representation
– What events look like
– Extensible Metadata Schema

• Delivery Properties
– Security, Reliability

• Registry and Discovery Services
– Enable event producers and consumers to find each other
– Registries directly support Service-Oriented Architectures

• Direct Addressing
– When producers/consumers are well-known to each other

• Publish/Subscribe
– When consumers need certain types of events rather than events from a particular producer

Managing Publish/Subscribe

• Pre-defined Topics
– Topic is a well-known, named channel carrying events of a pre-defined type
– Producers must know on which channel to publish an event
– Consumers must know which channel carries desired events
– Well-known, named channel is similar to a multicast group
– Channel creation and termination relatively infrequent

• Publication Advertisements
– Event Producer advertises that it produces events of type X
– Event Consumers must discover Producers based on interesting event type
– Consumers subscribe to interesting Producers by making direct connection
– Consumers and Producers are explicitly known to each other

• User-defined Attributes
– User specifies desired events by specifying their attributes, or content
– Attributes must be sufficiently specified to get what you want
– Requires well-known attribute or meta-data schema
– Producers and Consumers do not know each other
– AKA, Content-Based Routing
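The user-defined-attribute style of subscription can be illustrated with a minimal, single-process sketch. The names Subscription, Broker, and the example attributes ("type", "acres") are hypothetical and not taken from any system named in this briefing; a real content-based router would distribute the matching across many peers.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]  # an event is a bag of named attributes


@dataclass
class Subscription:
    consumer: str
    # Hypothetical form: attribute name -> predicate on that attribute's value.
    constraints: Dict[str, Callable[[Any], bool]]

    def matches(self, event: Event) -> bool:
        # Every constrained attribute must be present and satisfy its predicate.
        return all(name in event and test(event[name])
                   for name, test in self.constraints.items())


@dataclass
class Broker:
    subscriptions: List[Subscription] = field(default_factory=list)

    def subscribe(self, sub: Subscription) -> None:
        self.subscriptions.append(sub)

    def publish(self, event: Event) -> List[str]:
        # The producer never names a consumer; content alone selects recipients.
        return [s.consumer for s in self.subscriptions if s.matches(event)]


broker = Broker()
broker.subscribe(Subscription("fire-desk", {
    "type": lambda t: t == "fire",
    "acres": lambda a: a > 100,
}))
broker.subscribe(Subscription("weather-desk", {"type": lambda t: t == "weather"}))

recipients = broker.publish({"type": "fire", "acres": 250, "source": "sensor-7"})
print(recipients)
```

Note that the producer and the consumers share only the attribute schema, which is why the slide calls a well-known schema a requirement.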

Content-Based Routing

• AKA, Message-Passing with Associative Addressing
– Requires an associative matching operation

• A fundamental and powerful capability
– Enables a number of very useful capabilities and services (as we shall see)

• But notoriously expensive to implement
– How can matching be done efficiently in a wide-area grid environment?

• Can users and apps find a “sweet-spot” where content-based routing is constrained enough to be practical and provide capabilities that can’t be accomplished any other way?
– Scale of Deployability

Uses for Content-Based Events

• A general Grid Event Service

• Resource Discovery

• Fault Tolerance

• Topology-Enhanced Communication

• Distributed Simulations

General Architecture for Content-Based Event Notification System

[Figure label: Peer-to-Peer Network]

Events are published to the P2P network and are then routed to subscribers

Subscription “signals” propagate through the P2P Network

Example: Distributed Simulations

• DMSO HLA
– Defense Modeling and Simulation Office
– High Level Architecture
– Defines several services to enable federation of distributed simulations

• Data Distribution Management
– AKA Interest Management
– Events are only distributed to those simulated entities that need the event, i.e., are interested in it
– Can greatly reduce communication volume by not broadcasting all data to all hosts
– Based on hyper-box intersection over some set of dimensions

• Content-Based Routing -- Filtering

Interest Management Based on Hyperbox Intersection Model

• Events defined over set of n attributes

• Interest defined by set of n attribute subranges

• These subranges form an n-dimensional hyperbox

• Events are conveyed when hyperboxes intersect
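The intersection test itself is simple: two axis-aligned hyperboxes overlap exactly when their subranges overlap in every dimension. The sketch below assumes closed intervals and that both boxes are defined over the same attributes; the coordinates for u1, s1, and s2 are made up for illustration.

```python
from typing import Dict, Tuple

# attribute name -> (low, high), treated as a closed interval (an assumption)
Box = Dict[str, Tuple[float, float]]


def boxes_intersect(a: Box, b: Box) -> bool:
    """True when the subranges of a and b overlap in every dimension."""
    return all(a[k][0] <= b[k][1] and b[k][0] <= a[k][1] for k in a)


# Illustrative values only: one update region and two subscriptions.
u1 = {"x": (2.0, 5.0), "y": (1.0, 4.0)}
s1 = {"x": (4.0, 8.0), "y": (3.0, 6.0)}
s2 = {"x": (6.0, 9.0), "y": (0.0, 2.0)}

deliver_to_s1 = boxes_intersect(u1, s1)  # overlaps in both x and y
deliver_to_s2 = boxes_intersect(u1, s2)  # x ranges do not overlap
print(deliver_to_s1, deliver_to_s2)
```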

[Figure: hyperboxes plotted over Attribute X and Attribute Y. Labels: Update Region U1; Subscription S1; Subscription S2. Events produced by U1 and consumed by S1.]

Tank/Jet Fighter Engagement

[Figure labels: Red Tank Platoon A; Red Tank Platoon B; Blue Airstrike. Color key: Got (Y/N) and Want (Y/N).]

DARPA Active Networks Demo, Dec. 2000, Zabele, Braden, Murphy, Lee

Fundamental Design Issues

• Two Major Issues
(1) Local Matching Problem
(2) Peer Network Design

• Local Matching as a Database Query problem
– Subscriptions are data
– Events are queries on subscriptions
– Many one- and multi-dimensional indexing schemes

• Peer Network Design
– Propagation of subscriptions, publication advertisements, and events throughout the network

• Previous systems to study
– CORBA Event Service
– Java Event Service

Middleware Design Elements

• Middleware separates
– Application message-passing and logic
– Topology construction, routing protocol management, and forwarding

• Topology Construction
– Build interconnected peer groups and peer network

• Routing Protocols
– Distributed resource name and peer associations across the peer network
– Dynamic resource discovery

• Forwarding Engine
– Hop-by-hop request (or message) forwarding from source peer to destination peer through peer network over paths established by the routing protocols

CBR Implementation Approaches

• Content-based routing attempts to deliver data using a publish/subscribe paradigm
– Define data in different packet types
– Apply different routing based on these packet types

• Avoids complexity of traditional methods, while retaining power of publish/subscribe paradigm
– No complex frameworks
– Uses events

• Two Major Implementation Approaches
– Distributed Hash Tables (DHTs)
– Composable Name Spaces

CBR Using DHTs

• Routing and construction
– d-dimensional Cartesian coordinate space on a d-torus
– This space is dynamically partitioned by nodes
– Space holds (key,value) pairs hashed to points within it
– Nodes are added by splitting existing zones, removed by joining zones

• Pros
– Scalable routing
– Scalable indexing

• Cons
– Resulting overlay may or may not observe important features of underlying physical network
– Hashing Function in use must be pre-defined
– No security considerations
• Malicious nodes can act as malicious client, router AND server
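The construction described on this slide (keys hashed to points, zones split as nodes join) can be sketched as follows. This is an illustration only: the SHA-256-based key_to_point mapping, the half-open zone intervals, and the node names are assumptions, and neighbor-to-neighbor routing across the torus is not modeled.

```python
import hashlib
from typing import Dict, Tuple

Point = Tuple[float, ...]
Zone = Tuple[Tuple[float, float], ...]  # per-dimension half-open interval [lo, hi)


def key_to_point(key: str, d: int = 2) -> Point:
    """Hash a key to a point in the unit d-dimensional space (d <= 8 here)."""
    digest = hashlib.sha256(key.encode()).digest()
    # Use 4 bytes of the digest per coordinate, scaled into [0, 1).
    return tuple(int.from_bytes(digest[4 * i:4 * i + 4], "big") / 2 ** 32
                 for i in range(d))


def contains(zone: Zone, p: Point) -> bool:
    return all(lo <= x < hi for (lo, hi), x in zip(zone, p))


def split(zone: Zone, dim: int) -> Tuple[Zone, Zone]:
    """Halve a zone along one dimension, as when a new node joins."""
    lo, hi = zone[dim]
    mid = (lo + hi) / 2
    left = zone[:dim] + ((lo, mid),) + zone[dim + 1:]
    right = zone[:dim] + ((mid, hi),) + zone[dim + 1:]
    return left, right


def owner(zones: Dict[str, Zone], key: str) -> str:
    """Return the node whose zone contains the point the key hashes to."""
    p = key_to_point(key, d=len(next(iter(zones.values()))))
    for node, zone in zones.items():
        if contains(zone, p):
            return node
    raise LookupError("zones do not cover the space")


whole: Zone = ((0.0, 1.0), (0.0, 1.0))
left, right = split(whole, 0)          # hypothetical node-b joins and takes half
zones = {"node-a": left, "node-b": right}
print(owner(zones, "weather/ca"))
```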

Distributed Hashes

[Figure label: Hashing Function]

• Distributed hashes redistribute nodes based on their IDs
– Break physical and administrative locality
– Resultant structure is dependent on logical ID assignment

• Element addressed by logical ID
– Independent from physical location

Current Work in DHT-based CBR

• Pastry
– Rowstron and Druschel
– Rice University

• Chord
– Stoica, Morris, Karger, Kaashoek and Balakrishnan
– MIT

• Tapestry
– Zhao, Kubiatowicz and Joseph
– UC Berkeley

Pastry

• Routing and construction
– Nodes are assigned 128-bit nodeID indicating position in a circular space
– Random assignment
– Nodes maintain a routing table, neighborhood set and leaf set information
– Node addition involves initializing node tables and announcing arrival
– Node removal involves using neighborhood nodes to fill in missing routing information

• Pros
– Employs locality properties
– Self-organizing

• Cons
– Security

Tapestry

• Routing and construction
– Based on Plaxton mesh using neighbor maps
– Original Plaxton mesh is a small, static data structure enabling routing across an arbitrarily-sized network

• Pros
– Fault-tolerant through redundancy
– Scalable

• Cons
– Security

Chord

• Routing and construction
– Keys are mapped to nodes using a distributed hash function
– Ring organization
– Queries are forwarded along the ring
– Node additions and removals require successors to be notified and finger tables to be updated

• Pros
– Operates even with incorrect tables and missing nodes

• Cons
– All nodes must explicitly know each other
– Security
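A minimal sketch of the ring organization follows, assuming a 4-bit identifier space (16 positions) and a global view of the node list, which a real deployment would not have. The functions chord_id, successor, and finger_table are illustrative names, not the API of any Chord implementation.

```python
import bisect
import hashlib
from typing import List

M = 4  # identifier bits; a 16-position ring is an assumption for illustration


def chord_id(name: str, m: int = M) -> int:
    """Hash a name onto the identifier ring."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big") % (2 ** m)


def successor(node_ids: List[int], key_id: int) -> int:
    """First node at or clockwise after key_id, wrapping past the top of the ring."""
    ids = sorted(node_ids)
    i = bisect.bisect_left(ids, key_id)
    return ids[i % len(ids)]


def finger_table(node_id: int, node_ids: List[int], m: int = M) -> List[int]:
    """Entry i points at the successor of node_id + 2**i on the ring."""
    return [successor(node_ids, (node_id + 2 ** i) % 2 ** m) for i in range(m)]


nodes = [0, 6, 9, 10, 15]
holder = successor(nodes, 7)          # key 7 is stored at node 9
fingers = finger_table(0, nodes)      # successors of 1, 2, 4, 8
data_home = successor(nodes, chord_id("data"))
print(holder, fingers, data_home)
```

Because each finger doubles the distance covered, a query forwarded via fingers needs far fewer hops than walking the ring one successor at a time.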

Chord Look-up Illustration

[Figure: Chord ring look-up. Labels: n0, n6, n9, n10, n15, data.]

Comparison

• Common Features
– Systems are overlays on existing networks (logical, location-independent organization)
• “Dog leg” paths possible
– Uses distributed hash tables in construction
– Tries to provide scalable wide-area infrastructure

• Differences
– Each system targets a slightly different set of optimizations to solve the general P2P problem
– Each system has slightly different strengths

CBR Based on Composable Name Spaces

• Location Independent Name Space
– Maps resource names to a set of equivalent peers

• A peer system’s name space is unique to itself
• Every resource in the peer system is uniquely named (globally unique w/in the peer system)
• Two types of names:
– Complete name: The name of a single resource
• http://www.aero.org/CSRD/ActiveNets/wombats.html
– Name space region: The name of a group of resources, indicated by a trailing wildcard
• http://www.aero.org/CSRD/*
• http://*.org/

• Name space could be as general as an XML DTD
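Membership of a complete name in a name space region can be approximated with shell-style wildcard matching. This is a rough sketch, not the matching rule of any particular system: it uses Python's fnmatch, in which * also matches "/" and the characters ? and [ are treated as pattern syntax, so names containing them would need escaping.

```python
from fnmatch import fnmatchcase


def in_region(name: str, region: str) -> bool:
    """True when a complete name falls inside a wildcarded name space region."""
    return fnmatchcase(name, region)


complete = "http://www.aero.org/CSRD/ActiveNets/wombats.html"

inside = in_region(complete, "http://www.aero.org/CSRD/*")
outside = in_region("http://www.aero.org/index.html", "http://www.aero.org/CSRD/*")
any_org = in_region("http://www.aero.org/", "http://*.org/")
print(inside, outside, any_org)
```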

FLAPPS: Forwarding Layer for Application-level Peer-to-Peer Services

Each peer sends, receives requests using service’s own resource name space

Multiple peer services operate on top of FLAPPS application sublayer

Framework approach to toolkit implementation

Individual peer’s needs determine deployed topology construction protocols, routing protocol, forwarding behaviors and directives

FLAPPS Design Elements (1/2)

• Location Independent, Service-specific Name Space
– Decomposable name space used to represent resources
• A resource is an object or function offered by a remote peer
– Name is a concatenation of name components
• Name: n1 n2 n3 ... ni
• Prefix name: n1 n2 n3 ... *
– Service provides the name decomposition function

• Peer Network and Topology Construction
– Exploits overlay network systems:
• Local peers organize into peer groups
• Interconnected peer groups create peer network
– Variety in overlay network system allows service-specific peer network topology construction

FLAPPS Design Elements (2/2)

• Routing Protocols
– Establishes local peer reachability, forwarding path to remote peer resources
– Reachability builds name to equivalent next-hop peer sets over time, dynamic resource discovery
– Reachability updates customizable, provide data for forwarding behaviors

• Forwarding
– Hop-by-hop request, message relay from source peer through transit peers to remote peer
– Next hop peer determined by longest prefix matching over decomposable name
– Forwarding behavior: Next-hop peer selection function
– Forwarding directive: sequentially applied forwarding behaviors
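Longest prefix matching over a decomposable name can be sketched as below. This is not FLAPPS code: the ForwardingTable class, the dot-separated decomposition, and the peer names are hypothetical, and the forwarding behavior is reduced to a caller-supplied selection function.

```python
from typing import Callable, Dict, List, Optional, Tuple

Name = Tuple[str, ...]


def decompose(name: str) -> Name:
    # Hypothetical service-provided decomposition: dot-separated components.
    return tuple(name.split("."))


class ForwardingTable:
    def __init__(self) -> None:
        # prefix components -> set of equivalent next-hop peers
        self.routes: Dict[Name, List[str]] = {}

    def add_route(self, prefix: str, next_hops: List[str]) -> None:
        self.routes[decompose(prefix)] = next_hops

    def lookup(self, name: str) -> Optional[List[str]]:
        """Return the next-hop set for the longest matching prefix, if any."""
        parts = decompose(name)
        for n in range(len(parts), 0, -1):
            hops = self.routes.get(parts[:n])
            if hops:
                return hops
        return None

    def forward(self, name: str,
                behavior: Callable[[List[str]], str] = lambda hops: hops[0]
                ) -> Optional[str]:
        """Apply a forwarding behavior (next-hop selection) to the match."""
        hops = self.lookup(name)
        return behavior(hops) if hops else None


table = ForwardingTable()
table.add_route("tracking", ["peer-a", "peer-b"])
table.add_route("tracking.west", ["peer-c"])

specific = table.lookup("tracking.west.obj42")   # longest prefix wins
general = table.lookup("tracking.east.obj7")     # falls back to shorter prefix
chosen = table.forward("tracking.east.obj7", behavior=lambda hops: hops[-1])
print(specific, general, chosen)
```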

Managing a Wired and Ad Hoc Grid (in the field) with a FLAPPS Namespace

[Figure labels: Persistent GRID; MDS; Ad Hoc GRID; sensor; actuator.]

• Bastion peer advertises aggregated resource names
• Manages power-aware routing and forwarding
• Understands ad hoc topology management

• Edge peers interface with persistent grid
• Utilizes MDS to manage ad hoc configuration
• Hoards ad hoc information based on activity
• Understands interest-based routing

Example names in the figure:
weather.<lat,lon:lat,lon>[1]
tracking.<lat,lon>[obj_id]

Grid Workflow Management

• Dynamic organization of computing services
– Applications typically built with task organization "hard-coded"
– Workflow enables this to be decided "on-the-fly"

• Independent scheduling of data transfer and process execution
– Key Capability for all Workflow tools
– Subsequent task may not exist when previous task completes
– Where subsequent task is to execute may not even be decided
– Output data may have to be buffered until it is needed/can be used

• “Process programming” in a distributed environment

Workflow Design Considerations

• Representation?
– DAGs
– XML

• Creation
– Eager vs. lazy binding of service to physical resources

• Discovery
– Eager vs. lazy binding of workflow to service

• Data Persistence and Lifetime
– How long does the data live where it is?

• Workflow Engine – manages the workflow
– Centralized?
– Decentralized?
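As one illustration of the DAG representation with a centralized engine, the sketch below uses Python's standard graphlib module (Python 3.9 or later). The task names and the run helper are hypothetical. Because the execute callback is only invoked when a task becomes ready, it is also the natural place for lazy binding of a task to a service or resource.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set

# task -> set of tasks it depends on (hypothetical task names)
workflow: Dict[str, Set[str]] = {
    "ingest": set(),
    "analyze": {"ingest"},
    "simulate": {"ingest"},
    "plan": {"analyze", "simulate"},
}


def run(dag: Dict[str, Set[str]], execute: Callable[[str], None]) -> List[str]:
    """Centralized engine: release each task once all its predecessors are done."""
    ts = TopologicalSorter(dag)
    ts.prepare()                      # also raises CycleError if the graph is not a DAG
    order: List[str] = []
    while ts.is_active():
        for task in sorted(ts.get_ready()):   # tasks whose inputs are all available
            execute(task)             # lazy binding to a resource could happen here
            order.append(task)
            ts.done(task)
    return order


log: List[str] = []
order = run(workflow, log.append)
print(order)
```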

Survey of Grid Projects Involving Workflow
http://www.extreme.indiana.edu/swf-survey

Workflow in GridRPC

• GridRPC is a grid-enabled Remote Procedure Call
– RPC is an established, widely used distributed programming tool

• GridRPC API supports service discovery

• Discovered service is represented by a function handle

• Data transferred between client and service or between services is represented by a data handle

• Function and data handles allow data transfer and service execution to be managed independently

• This is a centralized approach

• GridRPC under standardization at the Global Grid Forum
– GGF is the international standards body for grid computing
– www.ggf.org, forge.gridforum.org/projects/gridrpc-wg

Data Handle Proposal

• A data handle is a reference to data that may reside anywhere

• Data and data handles may be created separately

• Binding is the fundamental operation on DHs
– DHs could be bound to existing data
– DHs could be bound to where you want the data to be

• Binding can be:
– Explicit: user explicitly specifies bind operations and lifetime
– Implicit: user specifies use modes where the run-time system manages the bind (GRAAL approach)

Operations on Data Handles, v2
(General, operational semantics without using exact function signatures)

• create()

– Create a new DH and bind it to a specific machine. DHs are always created bound. DHs may be bound to the local host or to a remote host. The data referenced by the DH is not valid after this call completes.

• write()

– Write data to the machine, referenced by the DH, that is maintaining storage for it. This storage does not necessarily have to be pre-allocated nor does the length have to be known in advance. If the DH is bound to the local host, then an actual data copy is not necessary. If the DH is bound to a remote host, then the data is copied to the remote host. The data referenced by the DH is valid after this call completes.

• read()

– Read the data referenced by the DH from whatever machine is maintaining the data. While reading remote data is implicitly making a copy of this data, this copy is not guaranteed to have any persistence properties or to be remotely accessible itself. Reading on an invalid DH is an error.

• inspect()

– Allow the user to determine if the data referenced by the DH is valid, what machine is referenced, the length of the data, and possibly its structure. Could be returned as XML.

• delete_data()

– Free the data (storage) referenced by the DH.

• delete_handle()

– Free just the DH.
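The operational semantics above can be mimicked with an in-memory model. This is not the GridRPC API: the Host and DataHandle classes and their method names are assumptions made for illustration, and "remote" storage is just a dictionary held by a Host object.

```python
from typing import Dict, Optional


class Host:
    """Hypothetical stand-in for a machine that maintains storage."""
    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[int, bytes] = {}


class DataHandle:
    _next_id = 0

    def __init__(self, host: Host) -> None:
        # create(): a DH is always created bound; its data is not yet valid.
        self.host: Optional[Host] = host
        self.id = DataHandle._next_id
        DataHandle._next_id += 1

    def write(self, data: bytes) -> None:
        # Storage need not be pre-allocated; the data is valid afterwards.
        self.host.store[self.id] = bytes(data)

    def read(self) -> bytes:
        if not self.inspect()["valid"]:
            raise ValueError("read on an invalid data handle")
        return self.host.store[self.id]

    def inspect(self) -> dict:
        valid = self.host is not None and self.id in self.host.store
        return {
            "valid": valid,
            "host": self.host.name if self.host else None,
            "length": len(self.host.store[self.id]) if valid else None,
        }

    def delete_data(self) -> None:
        # Free the storage; the handle itself stays bound.
        self.host.store.pop(self.id, None)

    def delete_handle(self) -> None:
        # Free just the handle; any data stays on the host.
        self.host = None


svc_a = Host("svc-a")
dh = DataHandle(svc_a)
before = dh.inspect()["valid"]
dh.write(b"abc")
after = dh.inspect()
payload = dh.read()
dh.delete_data()
print(before, after, payload, dh.inspect()["valid"])
```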

Generic Lifecycle of Data and a Data Handle

[Figure: timeline for a DH and its Data on Machine X. Events along the time axis: Create and Bind DH (Data is Invalid); Write to Data Handle (Data is Valid); Read Data; Delete Data (Data is again Invalid); Delete Data Handle.]

Simple RPC with Data on Client

[Sequence diagram between Client and Svc A. In Data on Client; Out Data on Client.]

Client:
– create input_DH bound to local host
– create input data
– write input data to input_DH
– create output_DH bound to local host
– call( input_DH, output_DH )

Svc A:
– read input_DH (data sent)
– execute service
– write output data on output_DH
– (neither input nor output data is subsequently available on this server)

Client:
– delete input_DH, delete input data
– delete output_DH, delete output data

Simple RPC where the Input and Output Data Remain on the Server

[Sequence diagram between Client and Svc A. In Data on Svc A; Out Data on Svc A.]

Client:
– create input_DH and bind to Svc A
– create input data
– write input data to input_DH (Send input data)
– create output_DH and bind to Svc A
– call( input_DH, output_DH )

Svc A:
– execute service
– Write data on output_DH
– return output_DH
– (input and output data still available on this server)

Client:
– read output_DH (data sent)
– delete input_DH, delete input data
– delete output_DH, delete output data

Svc A:
– (data no longer available)

Two Successive RPCs on the Same Server

[Sequence diagram between Client and Svc A. In Data on Client; Out1 Data on Svc A; Out2 Data on Client.]

Client:
– create input_DH and bind to localhost
– create input data
– create output1_DH and bind to Svc A
– call( input_DH, output1_DH )

Svc A:
– read input_DH (data sent)
– execute service
– Write data on output1_DH
– return output1_DH
– (output data still available on this server)

Client:
– create output2_DH and bind to local host
– call( output1_DH, output2_DH )

Svc A:
– execute service
– write data on output2_DH

Client:
– delete all data handles
– delete all data

Two Successive RPCs on Different Servers

[Sequence diagram between Client, Svc A, and Svc B.]

Client:
– create input_DH and bind to localhost
– create input data
– create output1_DH and bind to Svc A
– call( input_DH, output1_DH )

Svc A:
– read input_DH (data sent)
– execute service
– Write data on output1_DH
– return output1_DH
– (output data still available on this server)

Client:
– create output2_DH and bind to local host
– call( output1_DH, output2_DH )

Svc B:
– read output1_DH (data sent)
– execute service
– write data on output2_DH

Client:
– delete all data handles
– delete all data

Combining Events and Workflow: Dynamic Event-Driven Workflows

• Besides events precipitating an initial response workflow, subsequent events may alter an existing workflow that is underway
– Current amount of workflow completed must be determined
– Current tasks on the “leading edge” of the workflow must be terminated or allowed to complete
– Status and disposition of data referenced by data handles must be determined
– “Classical” storage management issues recur
• Dangling references to no data or stale data
• Inaccessible data referenced by no one

• Such event-driven task mgmt is similar to fault tolerance
– Similar mechanisms could be used to detect and respond to faults (failed servers, networks, etc.)

• Directly Supports DDDAS Concept

Responding to Events under Centralized Workflow Control

[Figure labels: Client Making Decision (Centralized Control); Event Subscription; Event Notification.]

What is State of Workflow When Event Received?

Possible Actions after Event:
• Do Nothing
• Cancel Entire Workflow
• Cancel Part of Workflow
• Conditional Workflow

How is Workflow Decided?
• Client statically decides workflow services and servers prior to start-time
• Client incrementally decides services and servers during run-time

In General, Nested or Recursive Workflows will be Possible

[Figure labels: Client Making Decision (Centralized Control); Event Subscription; Event Notification.]

What is State of Workflow When Event Received?

Even if Control is Centralized, Client May Not Know Entire Workflow State

How Can a Centralized Client Control the Workflow?

• Terminate part or all of current (possibly nested or recursive) workflow
– Synchronously
• Wait for current (top-level) calls to complete
– Asynchronously
• Follow call-tree to terminate
• Use events to short-circuit

• Discard remaining part of workflow
• Conditionally execute alternative workflow

Event-Driven, Call Tree-Based Cancellation

• Cancellation could propagate down a branch• Rejection could propagate up

• Rejection up one branch could precipitate cancellation down another branch
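One way to picture these three propagation rules is a small tree of tasks. The Task class and its policy (a parent cancels all sibling branches when one child rejects) are assumptions chosen for illustration; a real workflow engine would apply its own policy and would deliver these signals as events between hosts.

```python
from typing import List, Optional


class Task:
    def __init__(self, name: str, parent: Optional["Task"] = None) -> None:
        self.name = name
        self.parent = parent
        self.children: List["Task"] = []
        self.state = "running"
        if parent is not None:
            parent.children.append(self)

    def cancel(self) -> None:
        """Cancellation propagates down this branch of the call tree."""
        if self.state != "running":
            return
        self.state = "cancelled"
        for child in self.children:
            child.cancel()

    def reject(self) -> None:
        """Rejection stops this branch and is reported up to the parent."""
        self.state = "rejected"
        for child in self.children:
            child.cancel()
        if self.parent is not None:
            self.parent.on_child_rejected(self)

    def on_child_rejected(self, child: "Task") -> None:
        # Hypothetical policy: a rejection on one branch cancels the others.
        for sibling in self.children:
            if sibling is not child:
                sibling.cancel()


root = Task("root")
a = Task("a", root)
a1 = Task("a1", a)
b = Task("b", root)
b1 = Task("b1", b)

a.reject()   # up from branch a, then down branch b
states = {t.name: t.state for t in (root, a, a1, b, b1)}
print(states)
```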

Event-Driven, Pub/Sub Group-Based Cancellation

• Number and location of descendants may not be known

• All members of group “listen” for appropriate cancellation events

• Pub/Sub “short-circuits” tree-based cancellation
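The group-based alternative can be sketched with a shared cancellation channel that every task joins when it starts, regardless of where it sits in the call tree. CancellationGroup and Worker are hypothetical names, and the in-process callback list stands in for a real publish/subscribe service.

```python
from typing import Callable, List


class CancellationGroup:
    """A named channel; members listen for cancellation events on it."""
    def __init__(self, workflow_id: str) -> None:
        self.workflow_id = workflow_id
        self._listeners: List[Callable[[str, str], None]] = []

    def join(self, callback: Callable[[str, str], None]) -> None:
        self._listeners.append(callback)

    def publish_cancel(self, reason: str) -> int:
        # The publisher does not need to know how many members exist or where.
        for callback in list(self._listeners):
            callback(self.workflow_id, reason)
        return len(self._listeners)


class Worker:
    def __init__(self, name: str, group: CancellationGroup) -> None:
        self.name = name
        self.state = "running"
        group.join(self.on_cancel)

    def on_cancel(self, workflow_id: str, reason: str) -> None:
        self.state = f"cancelled ({reason})"


group = CancellationGroup("wf-1")
workers = [Worker(f"w{i}", group) for i in range(3)]   # any depth in the tree
notified = group.publish_cancel("superseded")
print(notified, [w.state for w in workers])
```

A single publication reaches every member directly, which is the sense in which pub/sub short-circuits walking the call tree.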

Responding to Events under Decentralized Workflow Control

[Figure labels: Client Making Decision (Decentralized Control); Event Subscription; Event Notification.]

• Workflow Representation passed among workflow services
• Initiating Client does not explicitly manage each service
• Nested, recursive workflows still possible
• This approach similar to Active Agents

Summary and Conclusions

System Design and Implementation Issues

• CBR Metadata Design
– Enterprise-wide schema?
– Ontologies to facilitate interoperable schemas?

• Scalable (Large Scale) Deployment
– Constrain/aggregate application event space to hit performance goals
– Lossy event specs
– Per application basis? Per instance basis?

• Fault Tolerance & Survivability are crucial for large, distributed systems
– Even when not under attack, such systems will have failures somewhere
– Fault recovery will require sophisticated planning and plan repair
• Mechanisms may be highly domain-dependent
• Use of soft-state to enhance fault tolerance
– Event-driven Workflow Repair
• Domain events and Fault Detection events

• Security
– Integrated at all levels

Advanced System Design Concepts

• Dynamic, Event-driven Workflows directly support DDDAS
– Incorporates many concepts, including Active Agents

• Web Service Resource Framework (WSRF)
– Converged Grid/Web framework
– Under development by Global Grid Forum and OASIS

• Representational State Transfer (REST)
– Representational State Transfer developed as model to understand agency and latency
• Latency introduces “light cone” of knowable universe
• Centralized, Distributed, Estimated, and Decentralized state
– Abstract design approach under development at UC Irvine

• Both WSRF and REST allow the independent management of state and service
– WSRF as an implementation approach for REST and dynamic workflows?

• Event distribution in RESTful regimes
– ARREST (Asynchronous Routed REST)

Questions?
