Converged Web/Grid Services for Content-based Networking, Event Notification, Resource Management...
Date posted: 20-Dec-2015
Converged Web/Grid Services for Content-based Networking, Event Notification, Resource Management and Workflow
Short Title: Event-Driven Workflows
Craig A. Lee, The Aerospace Corporation
Introduction
• Goals
– Automatically detect, ingest, and disseminate input data events
– Automatically analyze the events and known data with minimal human-in-the-loop interaction
– Automatically plan responses
– Execute distributed workflows to enact the response
• Focus: Event-Driven Workflows
– aka Dynamic Workflows
– Events delivered to decision-making elements that need to know
– Decision makers plan responses as determined by policy
– Responses executed as distributed workflows
Outline and Approach
• Motivation
– DDDAS – Dynamic Data-Driven Application Systems
• Present Top-Level Concept and Scenario
– Event Notification
– Workflow Management
– Event-Driven Workflows
• Discuss Required Technologies
– General Concepts
– State of current implementations
– Outstanding Issues
[Figure: OLD (serialized and static): Theory (First Principles), Simulations (Math. Modeling, Phenomenology), and Experiment / Measurements / Field-Data / User. NEW PARADIGM (Dynamic Data-Driven Simulation Systems): Theory (First Principles), Simulations (Math. Modeling, Phenomenology, Observation Modeling, Design), and Experiment / Measurements / Field-Data / User, joined by a Dynamic Feedback & Control Loop. Challenges: Application Simulations Development, Algorithms, Computing Systems Support.]
What is DDDAS
Frederica Darema, NSF
Examples of Applications benefiting from the new paradigm
• Engineering (Design and Control)
– aircraft design, oil exploration, semiconductor mfg, structural eng
– computing systems hardware and software design (performance engineering)
• Crisis Management
– transportation systems (planning, accident response)
– weather, hurricanes/tornadoes, floods, fire propagation
• Medical
– customized surgery, radiation treatment, etc.
– BioMechanics / BioEngineering
• Manufacturing/Business/Finance
– Supply Chain (Production Planning and Control)
– Financial Trading (Stock Mkt, Portfolio Analysis)
DDDAS has the potential to revolutionize science, engineering, & management systems
Fire Model
• Sensible and latent heat fluxes from ground and canopy fire -> heat fluxes in the atmospheric model.
• Fire’s heat fluxes are absorbed by air over a specified extinction depth.
• 56% fuel mass -> H2O vapor
• 3% of sensible heat used to dry ground fuel.
• Ground heat flux used to dry and ignite the canopy.
Kirk Complex Fire. U.S.F.S. photo
Slide Courtesy of Cohen/NCAR
Coupled atmospheric and wildfire models
Slide Courtesy of Cohen/NCAR
AMAT Centura Chemical Vapor Deposition Reactor
Operating Conditions
• Reactor Pressure: 1 atm
• Inlet Gas Temperature: 698 K
• Surface Temperature: 1173 K
• Inlet Gas-Phase Velocity: 46.6 cm/sec
Gas Phase Reactions
• SiCl3H -> HCl + SiCl2
• SiCl2H2 -> SiCl2 + H2
• SiCl2H2 -> HSiCl + HCl
• H2ClSiSiCl3 -> SiCl4 + SiH2
• H2ClSiSiCl3 -> SiCl3H + HSiCl
• H2ClSiSiCl3 -> SiCl2H2 + SiCl2
• Si2Cl5H -> SiCl4 + HSiCl
• Si2Cl5H -> SiCl3H + SiCl2
• Si2Cl6 -> SiCl4 + SiCl2
Surface Reactions
• SiCl3H + 4s -> Si(B) + sH + 3sCl
• SiCl2H2 + 4s -> Si(B) + 2sH + 2sCl
• SiCl4 + 4s -> Si(B) + 4sCl
• HSiCl + 2s -> Si(B) + sH + sCl
• SiCl2 + 2s -> Si(B) + 2sCl
• 2sCl + Si(B) -> SiCl2 + 2s
• H2 + 2s -> 2sH
• 2sH -> 2s + H2
• HCl + 2s -> sH + sCl
• sH + sCl -> 2s + HCl
Slide Courtesy of McRae/MIT
A DDDAS Model (Dynamic, Data-Driven Application Systems)
[Figure: Spectrum of Physical Systems, from Cosmological (10e-20 Hz) through Humans (3 Hz) to Subatomic (10e+20 Hz), connected through sensors & actuators to Models and Computations that Discover, Ingest, and Interact, running on a Computational Infrastructure (grids, perhaps?). Caption: Loads a behavior into the infrastructure.]
Craig Lee, IPDPS panel, 2003
Top-Level Concept
• A Combined Event Notification and Workflow Management System
– A highly flexible Event Notification system automatically delivers events of all manner to necessary recipients
– A Workflow Management system schedules and coordinates all necessary actions in response to known events
[Figure, shown in five successive builds: Sensed Events, a Communication Domain, Decision Makers governed by Policy, a Resource Information Service (register, discovery), and a Response with an Abstract Plan and a Concrete Action. The later builds add one label each: Content-Based Routing Domain; Persistent Decision-making Computations Determined by Policy; Grid Information Service; Dynamic Grid Workflow Management.]
Required Technologies
• Events delivered to decision-making elements that need to know
– Event Notification Service Managed by Publish/Subscribe
  • Pre-defined Topics
  • Publication Advertisements
  • User-defined Attributes
– Content-Based Routing
  • Distributed Hash Tables
  • Composable Name Spaces
• Decision makers plan responses as determined by policy
– Event analysis could require rule-based or other systems for deducing the “meaning” of sets of events
– Planning requires “path construction” from current state to goal state
– Semantic Analysis and Planning are Out of Scope for this briefing
• Responses executed as distributed workflows
– Workflow Engine independently manages
  • Scheduling of Data Transfer
  • Scheduling of Process Execution
What Must an Event Service Provide?
• Standard event representation
– What events look like
– Extensible Metadata Schema
• Delivery Properties
– Security, Reliability
• Registry and Discovery Services
– Enable event producers and consumers to find each other
– Registries directly support Service-Oriented Architectures
• Direct Addressing
– When producers/consumers are well-known to each other
• Publish/Subscribe
– When consumers need certain types of events rather than events from a particular producer
Managing Publish/Subscribe
• Pre-defined Topics
– Topic is a well-known, named channel carrying events of a pre-defined type
– Producers must know on which channel to publish an event
– Consumers must know which channel carries desired events
– Well-known, named channel is similar to a multicast group
– Channel creation and termination relatively infrequent
• Publication Advertisements
– Event Producer advertises that it produces events of type X
– Event Consumers must discover Producers based on interesting event type
– Consumers subscribe to interesting Producers by making direct connection
– Consumers and Producers are explicitly known to each other
• User-defined Attributes
– User specifies desired events by specifying their attributes, or content
– Attributes must be sufficiently specified to get what you want
– Requires well-known attribute or meta-data schema
– Producers and Consumers do not know each other
– AKA, Content-Based Routing
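The user-defined-attribute style of publish/subscribe can be illustrated with a minimal single-process sketch. The `Broker` and `Subscription` classes, the attribute names (`kind`, `acres`), and the consumer names are all hypothetical; a real service would distribute the matching across a network rather than scan a list.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]              # an event is a set of named attributes
Predicate = Callable[[Any], bool]   # a constraint on one attribute value


@dataclass
class Subscription:
    consumer: str
    constraints: Dict[str, Predicate]

    def matches(self, event: Event) -> bool:
        # Every constrained attribute must be present and satisfy its predicate.
        return all(name in event and test(event[name])
                   for name, test in self.constraints.items())


@dataclass
class Broker:
    subscriptions: List[Subscription] = field(default_factory=list)

    def subscribe(self, consumer: str, **constraints: Predicate) -> None:
        self.subscriptions.append(Subscription(consumer, constraints))

    def publish(self, event: Event) -> List[str]:
        # The producer names no recipient; the event content selects them.
        return [s.consumer for s in self.subscriptions if s.matches(event)]


broker = Broker()
broker.subscribe("fire-desk", kind=lambda k: k == "fire", acres=lambda a: a > 100)
broker.subscribe("weather-desk", kind=lambda k: k == "wind")
delivered = broker.publish({"kind": "fire", "acres": 250, "county": "Kern"})
```

Neither side names the other here: the producer publishes attributes, and consumers are selected by the constraints they registered.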
Content-Based Routing
• AKA, Message-Passing with Associative Addressing
– Requires an associative matching operation
• A fundamental and powerful capability
– Enables a number of very useful capabilities and services (as we shall see)
• But notoriously expensive to implement
– How can matching be done efficiently in a wide-area grid environment?
• Can users and apps find a “sweet-spot” where content-based routing is constrained enough to be practical and still provides capabilities that can’t be accomplished any other way?
– Scale of Deployability
Uses for Content-Based Events
• A general Grid Event Service
• Resource Discovery
• Fault Tolerance
• Topology-Enhanced Communication
• Distributed Simulations
General Architecture for Content-Based Event Notification System
[Figure: Peer-to-Peer Network]
Events are published to the P2P network and are then routed to subscribers
Subscription “signals” propagate through the P2P Network
Example: Distributed Simulations
• DMSO HLA
– Defense Modeling and Simulation Office
– High Level Architecture
– Defines several services to enable federation of distributed simulations
• Data Distribution Management
– AKA Interest Management
– Events are only distributed to those simulated entities that need the event, i.e., are interested in it
– Can greatly reduce communication volume by not broadcasting all data to all hosts
– Based on hyper-box intersection over some set of dimensions
• Content-Based Routing -- Filtering
Interest Management Based on Hyperbox Intersection Model
• Events defined over set of n attributes
• Interest defined by set of n attribute subranges
• These subranges form an n-dimensional hyperbox
• Events are conveyed when hyperboxes intersect
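The intersection test itself is small. The sketch below assumes axis-aligned boxes with inclusive bounds and uses illustrative attribute names `x` and `y`; the regions `u1`, `s1`, and `s2` are made-up values, not data from the demo.

```python
from typing import Dict, Tuple

# A hyperbox maps each attribute name to an inclusive (low, high) subrange.
Box = Dict[str, Tuple[float, float]]


def intersects(update: Box, subscription: Box) -> bool:
    """True when the boxes overlap on every attribute the subscription constrains."""
    for attr, (s_lo, s_hi) in subscription.items():
        if attr not in update:
            return False
        u_lo, u_hi = update[attr]
        if u_hi < s_lo or s_hi < u_lo:   # disjoint on this dimension
            return False
    return True


u1 = {"x": (2.0, 5.0), "y": (1.0, 4.0)}   # update region
s1 = {"x": (4.0, 8.0), "y": (3.0, 6.0)}   # overlaps u1 on both axes
s2 = {"x": (6.0, 9.0), "y": (0.0, 2.0)}   # disjoint from u1 on x

deliver_to_s1 = intersects(u1, s1)
deliver_to_s2 = intersects(u1, s2)
```

A single disjoint dimension is enough to suppress delivery, which is what lets interest management cut communication volume.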
[Figure: Update Region U1 and Subscriptions S1 and S2 plotted over Attribute X and Attribute Y. Events produced by U1 and consumed by S1.]
Tank/Jet Fighter Engagement
[Figure: Red Tank Platoon A, Red Tank Platoon B, and Blue Airstrike. Color key: Got / Want, each Y or N.]
DARPA Active Networks Demo, Dec. 2000, Zabele, Braden, Murphy, Lee
Fundamental Design Issues
• Two Major Issues
(1) Local Matching Problem
(2) Peer Network Design
• Local Matching as a Database Query problem
– Subscriptions are data
– Events are queries on subscriptions
– Many one- and multi-dimensional indexing schemes
• Peer Network Design
– Propagation of subscriptions, publication advertisements, and events throughout the network
• Previous systems to study
– CORBA Event Service
– Java Event Service
Middleware Design Elements
• Middleware separates
– Application message-passing and logic
– Topology construction, routing protocol management, and forwarding
• Topology Construction
– Build interconnected peer groups and peer network
• Routing Protocols
– Distributed resource name and peer associations across the peer network
– Dynamic resource discovery
• Forwarding Engine
– Hop-by-hop request (or message) forwarding from source peer to destination peer through peer network over paths established by the routing protocols
CBR Implementation Approaches
• Content-based routing attempts to deliver data using a publish/subscribe paradigm
– Define data in different packet types
– Apply different routing based on these packet types
• Avoids complexity of traditional methods, while retaining power of publish/subscribe paradigm
– No complex frameworks
– Uses events
• Two Major Implementation Approaches
– Distributed Hash Tables (DHTs)
– Composable Name Spaces
CBR Using DHTs
• Routing and construction
– d-dimension Cartesian coordinate space on a d-torus
– This space is dynamically partitioned by nodes
– Space holds (key,value) pairs hashed to points within it
– Nodes are added by splitting existing zones, removed by joining zones
• Pros
– Scalable routing
– Scalable indexing
• Cons
– Resulting overlay may or may not observe important features of underlying physical network
– Hashing Function in use must be pre-defined
– No security considerations
  • Malicious nodes can act as malicious client, router AND server
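A minimal sketch of the construction just described, assuming a unit square (d = 2) and SHA-256 as the pre-defined hashing function: keys hash to points, zones are half-open boxes, and a joining node takes half of an existing zone. The node names and the key are hypothetical, and routing across the torus is not modeled.

```python
import hashlib
from typing import Dict, Tuple

Point = Tuple[float, ...]
Zone = Tuple[Tuple[float, float], ...]   # per-dimension half-open range [low, high)


def key_to_point(key: str, d: int = 2) -> Point:
    """Hash a key to a point in the unit d-cube (4 digest bytes per coordinate, d <= 8)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return tuple(int.from_bytes(digest[4 * i:4 * i + 4], "big") / 2**32
                 for i in range(d))


def contains(zone: Zone, point: Point) -> bool:
    return all(lo <= x < hi for (lo, hi), x in zip(zone, point))


def split(zone: Zone, dim: int) -> Tuple[Zone, Zone]:
    """Halve a zone along one dimension, as happens when a node joins."""
    lo, hi = zone[dim]
    mid = (lo + hi) / 2
    return (zone[:dim] + ((lo, mid),) + zone[dim + 1:],
            zone[:dim] + ((mid, hi),) + zone[dim + 1:])


def owner(zones: Dict[str, Zone], point: Point) -> str:
    for node, zone in zones.items():
        if contains(zone, point):
            return node
    raise LookupError("no zone covers this point")


whole: Zone = ((0.0, 1.0), (0.0, 1.0))
zone_a, zone_b = split(whole, 0)          # node-B joins and takes half of node-A's zone
zones = {"node-A": zone_a, "node-B": zone_b}
point = key_to_point("weather/kern-county")
responsible = owner(zones, point)
```

Because ownership depends only on the hash, the node responsible for a key has no necessary relationship to where the data originated, which is the locality concern listed under Cons.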
Distributed Hashes
[Figure: Hashing Function]
• Distributed hashes redistribute nodes based on their IDs
– Break physical and administrative locality
– Resultant structure is dependent on logical ID assignment
• Element addressed by logical ID
– Independent from physical location
Current Work in DHT-based CBR
• Pastry
– Rowstron and Druschel
– Rice University
• Chord
– Stoica, Morris, Karger, Kaashoek and Balakrishnan
– MIT
• Tapestry
– Zhao, Kubiatowicz and Joseph
– UC Berkeley
Pastry
• Routing and construction
– Nodes are assigned 128-bit nodeID indicating position in a circular space
– Random assignment
– Nodes maintain a routing table, neighborhood set and leaf set information
– Node addition involves initializing node tables and announcing arrival
– Node removal involves using neighborhood nodes to fill in missing routing information
• Pros
– Employs locality properties
– Self-organizing
• Cons
– Security
Tapestry
• Routing and construction
– Based on Plaxton mesh using neighbor maps
– Original Plaxton mesh is a small, static data structure enabling routing across an arbitrarily-sized network
• Pros
– Fault-tolerant through redundancy
– Scalable
• Cons
– Security
Chord
• Routing and construction
– Keys are mapped to nodes using a distributed hash function
– Ring organization
– Queries are forwarded along the ring
– Node additions and removals require successors to be notified and finger tables to be updated
• Pros
– Operates even with incorrect tables and missing nodes
• Cons
– All nodes must explicitly know each other
– Security
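The ring organization can be sketched as follows, assuming a 4-bit identifier space and a ring held as one sorted list. This is a simplified illustration: finger tables are omitted, `walk` has global knowledge that a real node would not have, and the helper names are hypothetical.

```python
import hashlib
from bisect import bisect_left
from typing import List

M = 4  # identifier bits, so IDs lie in 0..15 (an assumption for this sketch)


def chord_id(key: str, m: int = M) -> int:
    """Map a key onto the identifier ring with a hash function."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big") % (2 ** m)


def successor(ring: List[int], key_id: int) -> int:
    """First node at or clockwise after key_id; wraps past the largest node ID."""
    return ring[bisect_left(ring, key_id) % len(ring)]


def walk(ring: List[int], start: int, key_id: int) -> List[int]:
    """Forward a query node by node around the ring until it reaches the key's owner."""
    target = successor(ring, key_id)
    i = ring.index(start)
    path = [ring[i]]
    while ring[i] != target:
        i = (i + 1) % len(ring)
        path.append(ring[i])
    return path


ring = [0, 6, 9, 10, 15]          # sorted node IDs
owner_of_8 = successor(ring, 8)   # key 8 is stored at node 9
path = walk(ring, 15, 8)          # a query that starts at node 15
```

Forwarding one successor at a time takes a number of hops linear in the ring size; finger tables exist to shorten that path.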
Chord Look-up Illustration
[Figure: Chord ring showing nodes n0, n6, n9, n10, and n15 and a data item.]
Comparison
• Common Features
– Systems are overlays on existing networks (logical, location-independent organization)
  • “Dog leg” paths possible
– Uses distributed hash tables in construction
– Tries to provide scalable wide-area infrastructure
• Differences
– Each system targets a slightly different set of optimizations to solve the general P2P problem
– Each system has slightly different strengths
CBR Based on Composable Name Spaces
• Location Independent Name Space
– Maps resource names to a set of equivalent peers
• A peer system’s name space is unique to itself
• Every resource in the peer system is uniquely named (globally unique w/in the peer system)
• Two types of names:
– Complete name: The name of a single resource
  • http://www.aero.org/CSRD/ActiveNets/wombats.html
– Name space region: The name of a group of resources, indicated by a trailing wildcard
  • http://www.aero.org/CSRD/*
  • http://*.org/
• Name space could be as general as an XML DTD
FLAPPS: Forwarding Layer for Application-level Peer-to-Peer Services
Each peer sends, receives requests using service’s own resource name space
Multiple peer services operate on top of FLAPPS application sublayer
Framework approach to toolkit implementation
Individual peer’s needs determine deployed topology construction protocols, routing protocol, forwarding behaviors and directives
FLAPPS Design Elements (1/2)
• Location Independent, Service-specific Name Space
– Decomposable name space used to represent resources
  • A resource is an object or function offered by a remote peer
– Name is a concatenation of name components
  • Name: n1 n2 n3 ... ni
  • Prefix name: n1 n2 n3 ... *
– Service provides the name decomposition function
• Peer Network and Topology Construction
– Exploits overlay network systems to:
  • Organize local peers into peer groups
  • Interconnect peer groups into a peer network
– Variety in overlay network system allows service-specific peer network topology construction
FLAPPS Design Elements (2/2)
• Routing Protocols
– Establishes local peer reachability, forwarding path to remote peer resources
– Reachability builds name to equivalent next-hop peer sets over time, dynamic resource discovery
– Reachability updates customizable, provide data for forwarding behaviors
• Forwarding
– Hop-by-hop request, message relay from source peer through transit peers to remote peer
– Next hop peer determined by longest prefix matching over decomposable name
– Forwarding behavior: Next-hop peer selection function
– Forwarding directive: sequentially applied forwarding behaviors
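Longest prefix matching over a decomposable name can be sketched as below. This is not the FLAPPS API: the `ForwardingTable` class, the dot-separated decomposition, and the peer and resource names are assumptions made for illustration. A prefix such as `weather` stands for the prefix name `weather *`.

```python
from typing import Dict, List, Optional, Tuple

Name = Tuple[str, ...]


def decompose(name: str) -> Name:
    """Service-supplied decomposition function; here names split on '.'."""
    return tuple(name.split("."))


class ForwardingTable:
    def __init__(self) -> None:
        self._routes: Dict[Name, List[str]] = {}

    def advertise(self, prefix: str, peer: str) -> None:
        """Record that `peer` can reach every resource under `prefix`."""
        self._routes.setdefault(decompose(prefix), []).append(peer)

    def next_hops(self, name: str) -> List[str]:
        """Equivalent next-hop peers for the longest advertised prefix of `name`."""
        parts = decompose(name)
        for length in range(len(parts), 0, -1):
            peers = self._routes.get(parts[:length])
            if peers:
                return peers
        return []


def first_peer(peers: List[str]) -> Optional[str]:
    """A trivial forwarding behavior: choose the first equivalent peer."""
    return peers[0] if peers else None


table = ForwardingTable()
table.advertise("weather", "bastion-1")
table.advertise("weather.socal", "edge-7")
specific = table.next_hops("weather.socal.kern")
general = table.next_hops("weather.norcal")
chosen = first_peer(specific)
```

A forwarding directive in this picture would be a sequence of such selection functions applied in order to the equivalent peer set.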
Managing a Wired and Ad Hoc Grid (in the field) with a FLAPPS Namespace
[Figure: a Persistent GRID with MDS and an Ad Hoc GRID of sensors and actuators.]
• Bastion peer advertises aggregated resource names
• Manages power-aware routing and forwarding
• Understands ad hoc topology management
• Edge peers interface with persistent grid
• Utilizes MDS to manage ad hoc configuration
• Hoards ad hoc information based on activity
• Understands interest-based routing
Example resource names:
weather.<lat,lon:lat,lon>[1]
weather.<lat,lon:lat,lon>[2]
weather.<lat,lon:lat,lon>[12]
tracking.<lat,lon>[obj_id]
Grid Workflow Management
• Dynamic organization of computing services
– Applications typically built with task organization “hard-coded”
– Workflow enables this to be decided “on-the-fly”
• Independent scheduling of data transfer and process execution
– Key Capability for all Workflow tools
– Subsequent task may not exist when previous task completes
– Where subsequent task is to execute may not even be decided
– Output data may have to be buffered until it is needed/can be used
• “Process programming” in a distributed environment
Workflow Design Considerations
• Representation?
– DAGs
– XML
• Creation
– Eager vs. lazy binding of service to physical resources
• Discovery
– Eager vs. lazy binding of workflow to service
• Data Persistence and Lifetime
– How long does the data live where it is?
• Workflow Engine – manages the workflow
– Centralized?
– Decentralized?
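A DAG representation with lazy binding can be sketched with the standard-library `graphlib` module (Python 3.9+). The task names, the server names, and the round-robin binding rule are hypothetical; a real engine would bind through service discovery and run tasks remotely.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set, Tuple

# Workflow as a DAG: each task maps to the set of tasks it depends on.
workflow: Dict[str, Set[str]] = {
    "ingest": set(),
    "analyze": {"ingest"},
    "plan": {"analyze"},
    "notify": {"analyze"},
    "execute": {"plan"},
}


def run(dag: Dict[str, Set[str]], servers: List[str]) -> List[Tuple[str, str]]:
    """Centralized engine: bind each task to a server only when it becomes ready."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()
    schedule: List[Tuple[str, str]] = []
    while sorter.is_active():
        for task in sorted(sorter.get_ready()):            # sorted for a repeatable order
            server = servers[len(schedule) % len(servers)]  # lazy, round-robin binding
            schedule.append((task, server))
            sorter.done(task)                               # pretend the task ran to completion
    return schedule


schedule = run(workflow, ["svc-a", "svc-b"])
order = [task for task, _ in schedule]
```

Eager binding would instead assign every task a server before the first task starts, which is simpler but cannot react to resources that appear or fail during the run.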
Survey of Grid Projects Involving Workflow
http://www.extreme.indiana.edu/swf-survey
Workflow in GridRPC
• GridRPC is a grid-enabled Remote Procedure Call
– RPC is an established, widely used distributed programming tool
• GridRPC API supports service discovery
• Discovered service is represented by a function handle
• Data transferred between client and service or between services is represented by a data handle
• Function and data handles allow data transfer and service execution to be managed independently
• This is a centralized approach
• GridRPC under standardization at the Global Grid Forum
– GGF is the international standards body for grid computing
– www.ggf.org, forge.gridforum.org/projects/gridrpc-wg
Data Handle Proposal
• A data handle is a reference to data that may reside anywhere
• Data and data handles may be created separately
• Binding is the fundamental operation on DHs
– DHs could be bound to existing data
– DHs could be bound to where you want the data to be
• Binding can be:
– Explicit: user explicitly specifies bind operations and lifetime
– Implicit: user specifies use modes where the run-time system manages the bind (GRAAL approach)
Operations on Data Handles, v2
(General, operational semantics without using exact function signatures)
• create()
– Create a new DH and bind it to a specific machine. DHs are always created bound. DHs may be bound to the local host or to a remote host. The data referenced by the DH is not valid after this call completes.
• write()
– Write data to the machine, referenced by the DH, that is maintaining storage for it. This storage does not necessarily have to be pre-allocated nor does the length have to be known in advance. If the DH is bound to the local host, then an actual data copy is not necessary. If the DH is bound to a remote host, then the data is copied to the remote host. The data referenced by the DH is valid after this call completes.
• read()
– Read the data referenced by the DH from whatever machine is maintaining the data. While reading remote data is implicitly making a copy of this data, this copy is not guaranteed to have any persistence properties or to be remotely accessible itself. Reading on an invalid DH is an error.
• inspect()
– Allow the user to determine if the data referenced by the DH is valid, what machine is referenced, the length of the data, and possibly its structure. Could be returned as XML.
• delete_data()
– Free the data (storage) referenced by the DH.
• delete_handle()
– Free just the DH.
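These operational semantics can be mimicked in a single process. The sketch below is not the GridRPC API: the `DataHandle` class, the `STORAGE` dictionary standing in for per-machine storage, and the machine name are assumptions. In this model, delete_handle() corresponds to simply dropping the Python reference.

```python
from typing import Any, Dict

# Simulated storage: machine name -> {handle id -> bytes}. Purely illustrative.
STORAGE: Dict[str, Dict[int, bytes]] = {}


class InvalidDataError(Exception):
    """Raised when reading through a DH whose data is not valid."""


class DataHandle:
    _next_id = 0

    def __init__(self, machine: str) -> None:
        # create(): the DH is bound at creation, but its data is not yet valid.
        self.machine = machine
        self.handle_id = DataHandle._next_id
        DataHandle._next_id += 1
        STORAGE.setdefault(machine, {})

    def write(self, data: bytes) -> None:
        # Storage need not be pre-allocated; the data is valid after this call.
        STORAGE[self.machine][self.handle_id] = bytes(data)

    def read(self) -> bytes:
        try:
            return STORAGE[self.machine][self.handle_id]
        except KeyError:
            raise InvalidDataError("data handle has no valid data") from None

    def inspect(self) -> Dict[str, Any]:
        data = STORAGE[self.machine].get(self.handle_id)
        return {"valid": data is not None,
                "machine": self.machine,
                "length": None if data is None else len(data)}

    def delete_data(self) -> None:
        # Free the storage only; the handle stays bound to its machine.
        STORAGE[self.machine].pop(self.handle_id, None)


dh = DataHandle("svc-a")
before = dh.inspect()
dh.write(b"wind=30")
after = dh.inspect()
payload = dh.read()
dh.delete_data()
```

Keeping delete_data() separate from the handle's own lifetime is what allows a handle to outlive its data, or data to remain on a server after the client has dropped its handle.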
Generic Lifecycle of Data and a Data Handle
[Figure: timeline on Machine X. Create and Bind DH (Data is Invalid) -> Write to Data Handle (Data is Valid) -> Read Data -> Delete Data (Data is again Invalid) -> Delete Data Handle.]
Simple RPC with Data on Client
[Sequence diagram: Client and Svc A. In Data on Client; Out Data on Client.]
• Client: create input_DH bound to local host; create input data; write input data to input_DH
• Client: create output_DH bound to local host
• Client: call( input_DH, output_DH )
• Svc A: read input_DH (data sent)
• Svc A: execute service
• Svc A: write output data on output_DH (neither input nor output data is subsequently available on this server)
• Client: delete input_DH; delete input data; delete output_DH; delete output data
Simple RPC where the Input and Output Data Remain on the Server
[Sequence diagram: Client and Svc A. In Data on Svc A; Out Data on Svc A.]
• Client: create input_DH and bind to Svc A; create input data
• Client: write input data to input_DH (send input data)
• Client: create output_DH and bind to Svc A
• Client: call( input_DH, output_DH )
• Svc A: execute service
• Svc A: write data on output_DH
• Svc A: return output_DH (input and output data still available on this server)
• Client: read output_DH (data sent)
• Client: delete input_DH; delete input data; delete output_DH; delete output data (data no longer available)
Two Successive RPCs on the Same Server
[Sequence diagram: Client and Svc A. In Data on Client; Out1 Data on Svc A; Out2 Data on Client.]
• Client: create input_DH and bind to localhost; create input data
• Client: create output1_DH and bind to Svc A
• Client: call( input_DH, output1_DH )
• Svc A: read input_DH (data sent)
• Svc A: execute service
• Svc A: write data on output1_DH
• Svc A: return output1_DH (output data still available on this server)
• Client: create output2_DH and bind to local host
• Client: call( output1_DH, output2_DH )
• Svc A: execute service
• Svc A: write data on output2_DH
• Client: delete all data handles; delete all data
Two Successive RPCs on Different Servers
[Sequence diagram: Client, Svc A, and Svc B.]
• Client: create input_DH and bind to localhost; create input data
• Client: create output1_DH and bind to Svc A
• Client: call( input_DH, output1_DH )
• Svc A: read input_DH (data sent)
• Svc A: execute service
• Svc A: write data on output1_DH
• Svc A: return output1_DH (output data still available on this server)
• Client: create output2_DH and bind to local host
• Client: call( output1_DH, output2_DH )
• Svc B: read output1_DH (data sent)
• Svc B: execute service
• Svc B: write data on output2_DH
• Client: delete all data handles; delete all data
Combining Events and Workflow: Dynamic Event-Driven Workflows
• Besides events precipitating an initial response workflow, subsequent events may alter an existing workflow that is underway
– Current amount of workflow completed must be determined
– Current tasks on the “leading edge” of the workflow must be terminated or allowed to complete
– Status and disposition of data referenced by data handles must be determined
– “Classical” storage management issues recur
  • Dangling references to no data or stale data
  • Inaccessible data referenced by no one
• Such event-driven task mgmt is similar to fault tolerance
– Similar mechanisms could be used to detect and respond to faults (failed servers, networks, etc.)
• Directly Supports DDDAS Concept
Responding to Events under Centralized Workflow Control
[Figure: Client Making Decision (Centralized Control), with Event Subscription and Event Notification.]
What is State of Workflow When Event Received?
Possible Actions after Event:
• Do Nothing
• Cancel Entire Workflow
• Cancel Part of Workflow
• Conditional Workflow
How is Workflow Decided?
• Client statically decides workflow services and servers prior to start-time
• Client incrementally decides services and servers during run-time
In General, Nested or Recursive Workflows will be Possible
Even if Control is Centralized, Client May Not Know Entire Workflow State
How Can a Centralized Client Control the Workflow?
• Terminate part or all of current (possibly nested or recursive) workflow
– Synchronously
  • Wait for current (top-level) calls to complete
– Asynchronously
  • Follow call-tree to terminate
  • Use events to short-circuit
• Discard remaining part of workflow
• Conditionally execute alternative workflow
Event-Driven, Call Tree-Based Cancellation
• Cancellation could propagate down a branch
• Rejection could propagate up
• Rejection up one branch could precipitate cancellation down another branch
Event-Driven, Pub/Sub Group-Based Cancellation
• Number and location of descendants may not be known
• All members of group “listen” for appropriate cancellation events
• Pub/Sub “short-circuits” tree-based cancellation
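The two cancellation styles can be contrasted in a small in-process sketch. The `Task` and `CancellationTopic` classes and the task names are hypothetical; in a deployed system each task would be a remote service and the topic would be an event channel.

```python
from typing import List, Optional


class Task:
    def __init__(self, name: str, parent: Optional["Task"] = None) -> None:
        self.name = name
        self.children: List["Task"] = []
        self.cancelled = False
        if parent is not None:
            parent.children.append(self)

    def cancel_subtree(self) -> int:
        """Call-tree style: cancel this task, then recurse down each branch.

        Returns the number of tasks cancelled."""
        self.cancelled = True
        return 1 + sum(child.cancel_subtree() for child in self.children)


class CancellationTopic:
    """Pub/sub style: members listen on a group; one event reaches all of them."""

    def __init__(self) -> None:
        self.members: List[Task] = []

    def subscribe(self, task: Task) -> None:
        self.members.append(task)

    def publish_cancel(self) -> int:
        for task in self.members:
            task.cancelled = True
        return len(self.members)


root = Task("root")
a = Task("a", root)
a1 = Task("a1", a)
b = Task("b", root)

cancelled_in_branch = a.cancel_subtree()   # reaches a and a1 only

topic = CancellationTopic()
for task in (root, a, a1, b):
    topic.subscribe(task)
cancelled_in_group = topic.publish_cancel()
```

The tree walk needs every parent to know its children, while the topic needs only that each task subscribed when it started, which is why it still works when descendants are unknown to the canceller.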
Responding to Events under Decentralized Workflow Control
[Figure: Client Making Decision (Decentralized Control), with Event Subscription and Event Notification.]
• Workflow Representation passed among workflow services
• Initiating Client does not explicitly manage each service
• Nested, recursive workflows still possible
• This approach similar to Active Agents
Summary and Conclusions
System Design and Implementation Issues
• CBR Metadata Design
– Enterprise-wide schema?
– Ontologies to facilitate interoperable schemas?
• Scalable (Large Scale) Deployment
– Constrain/aggregate application event space to hit performance goals
– Lossy event specs
– Per application basis? Per instance basis?
• Fault Tolerance & Survivability are crucial for large, distributed systems
– Even when not under attack, such systems will have failures somewhere
– Fault recovery will require sophisticated planning and plan repair
  • Mechanisms may be highly domain-dependent
  • Use of soft-state to enhance fault tolerance
– Event-driven Workflow Repair
  • Domain events and Fault Detection events
• Security
– Integrated at all levels
Advanced System Design Concepts
• Dynamic, Event-driven Workflows directly support DDDAS
– Incorporates many concepts, including Active Agents
• Web Service Resource Framework (WSRF)
– Converged Grid/Web framework
– Under development by Global Grid Forum and OASIS
• Representational State Transfer (REST)
– Representational State Transfer developed as model to understand agency and latency
  • Latency introduces “light cone” of knowable universe
  • Centralized, Distributed, Estimated, and Decentralized state
– Abstract design approach under development at UC Irvine
• Both WSRF and REST allow the independent management of state and service
– WSRF as an implementation approach for REST and dynamic workflows?
• Event distribution in RESTful regimes
– ARREST (Asynchronous Routed REST)
Questions?