NaradaBrokering

Santhosh Kumar Saminathan Indiana University Bloomington

Abstract

NaradaBrokering is a content distribution infrastructure for the publish/subscribe architecture, developed by the Community Grids Lab at Indiana University. The system places no limits on the size, rate and scope of the information encapsulated within data streams, or on the number of entities within the system. NaradaBrokering provides support for the scalable and efficient dissemination of these data streams. The substrate incorporates capabilities to mitigate network-induced effects and to ensure that these streams are secure, reliable, ordered and jitter-reduced. All components within the system utilize globally synchronized timestamps. This survey paper describes the NaradaBrokering system, including its framework, topology, matching engine and supported web services. We primarily focus on the analysis of the system.

1. Introduction

NaradaBrokering is a brokering system for the publish/subscribe messaging pattern. In general, a publish/subscribe architecture provides a selective topic-filtering mechanism so that end nodes receive only the data they require. The pattern is popular because it decouples senders from receivers, which allows greater scalability in a distributed environment than traditional RPC methods. The system does not need to maintain state for publishers and subscribers, since any of them can come up at any point in time and send or receive data. The receiver can be selective about the data it needs. This selectivity is achieved through a message-filtering mechanism that creates a separate class whenever a new topic is uploaded.

Like traditional distributed systems, NaradaBrokering runs on multiple nodes. In this respect it is similar to DNS: distributing the work reduces the chance of failure and improves overall efficiency.

2. Overview

Intersystem communication is established through a messaging broker, which also hosts the information of the system. A dedicated queuing system allocates an independent queue for every message class. Once data is placed in its allocated queue the sender may go down, but the broker retains the data until it is explicitly asked to remove it from the queue.
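The queuing behaviour described above can be illustrated with a minimal in-memory sketch. This is not NaradaBrokering code: the class `ToyBroker` and its methods `publish`, `peek_all` and `remove` are hypothetical names chosen for illustration, and the sketch assumes that reading a queue does not consume its contents.

```python
from collections import defaultdict, deque


class ToyBroker:
    """Hypothetical in-memory broker: one independent queue per topic class."""

    def __init__(self):
        # Each topic class gets its own queue, created on first use.
        self._queues = defaultdict(deque)

    def publish(self, topic_class, payload):
        # The publisher only hands data to the broker; it may disconnect afterwards.
        self._queues[topic_class].append(payload)

    def peek_all(self, topic_class):
        # Reading does not remove anything: data stays until remove() is called.
        return list(self._queues[topic_class])

    def remove(self, topic_class, payload):
        # Explicit removal is the only way data leaves the queue in this sketch.
        self._queues[topic_class].remove(payload)


broker = ToyBroker()
broker.publish("sports", "score update")
broker.publish("weather", "rain at noon")
# The publisher could go offline here; the broker still holds both messages.
first_read = broker.peek_all("sports")
second_read = broker.peek_all("sports")
broker.remove("sports", "score update")
after_removal = broker.peek_all("sports")
```

Keeping one queue per class means that removing or reading data for one class never touches the data held for another.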

NaradaBrokering is one such brokering system, designed to send large volumes of data at high rates. It supports a variety of protocols including TCP/IP, UDP, Multicast, SSL, HTTP, RTP and HHMS. In addition to this flexibility, it provides a number of useful features, such as a reliable storage interface and a mechanism to compress and decompress messages internally. Because publisher and subscriber state is not stored, the abrupt crash of a single node, caused for example by imbalanced load distribution, does not affect the data stored at the other broker nodes. The system is arranged as a cluster topology containing many broker nodes, which reduces the load on the overall network. Broker nodes are arranged hierarchically in a tree structure, with each node holding a map (the Broker Network Map) to facilitate traversal through the nodes.

Events are kept so that matched events can still be delivered after a client reconnects. NaradaBrokering lets a client subscribe to events that match an event template. Event consumers may create consumer constraints that specify properties of event delivery, such as reliable and ordered delivery. Similarly, a producer can specify constraints related to the publishing of messages, including message compression, fragmentation and security. The Time To Live (TTL) field is one of the most important parameters: it must hold a finite, countable value to avoid the count-to-infinity problem and to prevent unwanted deadlocks.

Like most brokering systems, NaradaBrokering also supports message queue models in its API. An example of such an API is the Java Message Service (JMS). NaradaBrokering features such as scalability, resilience and a distributed solution in place of a single server are made available to JMS clients. It also insulates JMS clients from knowledge of other brokers. Broker locators are used to find valid brokers.
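The role of the TTL field mentioned above can be sketched as follows. The sketch assumes that TTL counts remaining broker hops, which the text does not state explicitly; the `Event` class, its field names and the `forward` function are hypothetical and are not part of the NaradaBrokering API.

```python
from dataclasses import dataclass


@dataclass
class Event:
    topic: str
    payload: str
    ttl: int  # remaining broker hops (assumed meaning of TTL in this sketch)


def forward(event, hops):
    """Pass the event through up to `hops` brokers, decrementing TTL at each one.

    Returns the number of brokers that actually relayed the event.
    """
    relayed = 0
    for _ in range(hops):
        if event.ttl <= 0:
            break  # TTL exhausted: drop the event instead of relaying it
        event.ttl -= 1
        relayed += 1
    return relayed


event = Event(topic="/Uniformity/Recursive", payload="hello", ttl=3)
# Even if the route would loop through 10 brokers, the event stops after 3 relays.
relayed = forward(event, hops=10)
```

Because the TTL is finite, an event caught in a routing loop is eventually dropped, however long the loop is.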

3. BROKER LOCATING AND ROUTING

As mentioned earlier, the messaging broker is made up of a number of broker nodes arranged in a tree. Each broker knows how to locate the other brokers. The client, however, must still locate a broker node and connect to it before it can download data.

A separate broker locator program propagates location-specific details to the client. These details include the IP address of the broker, the port on which it is listening, and the transport protocol that it supports and that will be used. This is all the information the client needs to establish a dedicated connection. Each connection is dedicated to a single broker, but a client can maintain connections to several brokers at once, which provides load balancing.
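As a rough illustration of the information a broker locator hands to a client, the sketch below models each advertisement as an (IP, port, transport) record and picks the first broker whose transport the client supports. The names `BrokerInfo` and `choose_broker`, the addresses and the port number are all made up for this example; the selection rule is an assumption and not the locator's actual policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BrokerInfo:
    ip: str         # IP address of the broker
    port: int       # port on which the broker is listening
    transport: str  # transport protocol the broker supports, e.g. "tcp"


def choose_broker(candidates, supported_transports):
    """Return the first advertised broker whose transport the client supports."""
    for info in candidates:
        if info.transport in supported_transports:
            return info
    return None  # no advertised broker is usable by this client


# Hypothetical advertisements (documentation-range IP addresses, arbitrary port).
advertised = [
    BrokerInfo("192.0.2.10", 5045, "udp"),
    BrokerInfo("192.0.2.11", 5045, "tcp"),
]
chosen = choose_broker(advertised, supported_transports={"tcp", "ssl"})
```

A client that wants several simultaneous connections could call `choose_broker` repeatedly on the remaining candidates.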

Two types of subscriptions must be maintained: broker level and client level. Client-level subscriptions refer to the clients that have subscribed to different brokers, and broker-level subscriptions refer to the information maintained at a particular broker node. Client subscriptions are also maintained hierarchically within the system. The broker is in charge of the client subscriptions, whereas the cluster controller handles the broker subscriptions.

A matching engine matches received events against the profiles and computes the destinations. Each node maintains its own profile, which is built from its interest in certain events. Each profile includes a constraint that an event must satisfy before it is routed. Every profile has a unique ID and a matching destination, which together are most commonly recognized as a subscription. Such subscriptions can accumulate up the hierarchy and overload a super-super-cluster, but this can be handled by adding more cluster controllers. The choice of constraints in a profile greatly affects matching performance, as will be seen later in this paper. The cluster controller manages broker destinations, while the broker manages client destinations. Each broker has a set of destinations, and the best path to each destination is calculated from the number of broker hops.

4. MATCHING ENGINES

The matching engine forms the crux of the system: it matches the requesting client to the required information at a broker node. A node may have more than one matching engine. As mentioned earlier, the efficiency of matching depends greatly on the constraints specified in the profile of the node. Several types of constraints can be specified on subscriptions: string-based topics separated by a ‘/’, integer topics, regular expressions, tag-value pairs and XPath queries.

String-based constraint:
* The constraint is written as a string whose parts are separated by a ‘/’ character.
* E.g. /Uniformity/Recursive
* Subscribing to a particular topic in this chain does not guarantee subscription to all of its subtopics. This must be taken into account.
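The string-based constraint above can be sketched as a segment-by-segment comparison. This is a minimal illustration under the assumption that matching is exact, so a subscription to a parent topic does not cover its subtopics; `split_topic` and `matches` are hypothetical helper names, not NaradaBrokering functions.

```python
def split_topic(topic):
    """Split a '/'-separated topic string into its non-empty segments."""
    return [segment for segment in topic.split("/") if segment]


def matches(subscription, event_topic):
    """Exact match: a subscription does not implicitly cover subtopics."""
    return split_topic(subscription) == split_topic(event_topic)


# Subscribing to the full topic matches events published on it.
exact = matches("/Uniformity/Recursive", "/Uniformity/Recursive")
# Subscribing only to the parent does not match events on the subtopic.
parent_only = matches("/Uniformity", "/Uniformity/Recursive")
```

Here `exact` is true and `parent_only` is false, which mirrors the caveat that a client must subscribe to each subtopic it needs.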

Tag-value pair constraint:
* This is similar to the string constraint, except that the constraint is expressed as tag-value pairs.
* This mechanism also allows a * wildcard that matches all tag-value pairs.

XPath constraint:
* These constraints are mainly used to search XML documents, and can identify individual parts of a document if needed.
* They are also used to search for XML advertisements across the brokers.
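The tag-value and XPath constraints above can be sketched with the Python standard library. The helper names `tags_match` and `xpath_match`, the sample tags and the sample XML are hypothetical, and the wildcard is assumed here to mean "any value for this tag". Note that `xml.etree.ElementTree` supports only a subset of XPath, which is sufficient for this illustration.

```python
import xml.etree.ElementTree as ET


def tags_match(constraint, event_tags):
    """Every constrained tag must be present; '*' accepts any value."""
    for tag, wanted in constraint.items():
        if tag not in event_tags:
            return False
        if wanted != "*" and event_tags[tag] != wanted:
            return False
    return True


def xpath_match(xpath, xml_text):
    """True if the XPath expression selects at least one element."""
    root = ET.fromstring(xml_text)
    return len(root.findall(xpath)) > 0


event_tags = {"type": "video", "codec": "h261"}
tag_hit = tags_match({"type": "video", "codec": "*"}, event_tags)
tag_miss = tags_match({"type": "audio"}, event_tags)

event_xml = "<event><stream kind='video'><rate>30</rate></stream></event>"
xpath_hit = xpath_match("./stream[@kind='video']", event_xml)
xpath_miss = xpath_match("./stream[@kind='audio']", event_xml)
```

In both cases the subscription is a predicate over the event, and the matching engine only needs to evaluate it to decide whether to route.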

Regular expressions constraint:
* This is used to make searching effective for text-based content.

5. Narada Transport Framework

The transport framework plays a vital role in NaradaBrokering. Broker nodes communicate over communication links, and different underlying transport protocols provide the service. The transport interface provides links to the broker node by consulting the link factory through a transport handler. Based on the links obtained, the administrative link chooses an optimal transport link. In this way the system provides a reliable communication mechanism.

6. Naradabroker Capability

NaradaBrokering provides a range of capabilities for working with existing technologies. It supports interfaces for several systems. Filters can be used to modify the properties of messages so that they suit devices such as PDAs and palmtops. Portability is one of its most attractive features. Another capability is its compatibility with P2P grids. In the P2P case a deployment starts out without broker nodes, using only file-based web services. Extra brokers are added later if a performance gain is needed or if the initial architecture proves successful.

7. Analysis

We begin by monitoring the performance of NaradaBrokering. Running it against varying constraints in the profile of each client and broker node, we found that as the constraints in each profile increased in size, the CPU cost increased proportionally. The system remains feasible for real-time applications, as the delay between nodes increased by only one second when the string size doubled. An average-delay scatter chart for video clients, tested against the Java Media Framework (JMF), showed that NaradaBrokering performs considerably better than JMF.

A three-dimensional plot of transmit delay against the size and rate of the data sent was produced using 22 brokers, 102 clients and varying matching frequencies. It shows clearly that the transmit delay increases with the match rate.

8. Conclusion

NaradaBrokering has no significant technical disadvantage compared to other technologies of similar use in the domain of distributed computing. The fact that the system is not regularly updated to keep pace with other technologies such as Apache ActiveMQ and RabbitMQ does put it at a slight disadvantage. However, on comparing performance with other technologies, there was no significant gain or loss for byte-oriented data between NaradaBrokering and its competitors.

Platform dependence on Windows is still an issue that is being worked out, and it is another reason why complete portability of existing code has not been achieved. A few stability and firewall weaknesses are being addressed to improve reliability and security.

Taking all factors into consideration, it is fair to say that even with these minor disadvantages NaradaBrokering is a technology that is on par with other systems of its kind.

9. References

[1] The Naradabroker: A flexible messaging infrastructure, Rahim Lakhoo, Distributed Systems Group, DSG Seminar, April 2004.
[2] http://en.wikipedia.com
[3] NaradaBrokering: Grid Computing: Making the global infrastructure a reality, Dr. Russ Miller, Chapter 22
[4] NaradaBrokering User’s Guide, Community Grids Lab, Indiana University
[5] Deploying the NaradaBrokering substrate in aiding efficient web and grid service interactions, Geoffrey Fox and Shrideep Pallickara
[6] NaradaBrokering User’s Guide, Community Grids Lab, Indiana University
[7] NaradaBrokering and its Applications, Community Grids Laboratory (CGL), Indiana University, http://grids.ucs.indiana.edu/ptliupages/
[8] Support for high performance real-time collaboration within the NaradaBrokering Substrate, Shrideep Pallickara, Hasan Bulut, Pete Burnap, Geoffrey Fox, Ahmet Uyar, David Walker, Community Grids Lab, Indiana University