Web Service Clustering Building Homogenous Service Communities Wei Liu Wilson Wong.

Web Service Clustering: Building Homogenous Service Communities

Wei Liu, Wilson Wong


2 SOCASE@AAMAS2008

22-Jun-05

Outline

• A brief introduction to
– Web services
– Text mining

• Web Service Clustering
– The motivation
– The challenges
– The process
– The results


What are Web Services

• It is software designed to be used by other software via Internet protocols and formats. (Forrester)

• Web services are self-describing components that can discover and engage other web services or applications to complete complex tasks over the Internet. (Sun Microsystems, Inc)

• Web Services are loosely coupled software components delivered over the Internet via standards-based technologies like XML, and SOAP. (Gartner)

• Self-describing, self-contained, modular unit of application logic that provides some business functionality to other applications through an Internet connection… (UDDI.org)

• Web services are Internet-based, modular applications that perform a specific business task and conform to a particular technical format. (IBM)

• A web service is application logic that is programmatically available, exposed using the Internet. (Microsoft)


The Web Service Triangle

• Web services are applications accessible via the Web to be consumed by clients.

• Clients of a web service are usually referred to as service requesters.

• Technologies standardized by the W3C and OASIS to support web service applications are:
– Web Services Description Language (WSDL)
– Simple Object Access Protocol (SOAP)
– Universal Description, Discovery, and Integration (UDDI)


What is Web Service Discovery

• Broadly defined as “the act of locating a machine-processable description of a web service that may have been unknown and that meets certain functional criteria”

• Originated from the agent match-making paradigm (middle agents and brokers), later moved on to UDDI [2]

• The discovery mechanisms differ according to which language is used to describe the service (WSDL or OWL-S)

[2] Garofalakis, J., Panagis, Y., Sakkopoulos, E., Tsakalidis, A.: Web service discovery mechanisms: Looking for a needle in a haystack? In: International Workshop on Web Engineering, Hypermedia Development and Web Engineering Principles and Techniques: Put them in use, in conjunction with ACM Hypertext, Santa Cruz (2004)


Ill-fated Registry Based Structure

• Static and not scalable
– The registry can become a bottleneck
– New services have to be added through a laborious process to ensure “correct” categorisation, which deters people from using it

• Search is keyword based
– Ontology-supported semantic search is only available for agent and semantic web services


What we propose

• Make use of the WSDL files collected by Google

• Automatically cluster these files into functionally similar groups using text mining methods
– Linguistic analysis and statistical techniques combined

• The resulting clusters will help service discovery by reducing the size of the haystacks


Challenges

• Traditional Information Retrieval and Document Clustering techniques cannot be borrowed directly, because of the following observations:
– Web service files do not usually contain a sufficiently large number of words for use as index terms or features.
– Moreover, the few words present in the web service files are erratic and unreliable.
– Related web pages that describe the WSDL service were also considered, using the Google API to discover web page referrals or citations. However, most of the WSDL files do not have related web pages that provide hyperlinks to them. The few that do are typically examples teaching how to program in a service-oriented paradigm.

• These observations agree with those in [9].


System Architecture


Collected WSDL File


Obtaining Content and Context

• Content
– Parse the WSDL file for service descriptions in natural language

• Context
– Relate documents by looking at parent/grandparent directories
– Tokenising, stemming
– Remove function words*
– Remove programming terms*
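The content and context steps on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes WSDL 1.1 files whose natural-language descriptions sit in `documentation` elements, and the helper names `extract_content`, `parent_directory_urls` and `tokenise` are hypothetical. Stemming and the two removal steps are left out.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Assumption: the files follow WSDL 1.1, which uses this namespace.
WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"


def extract_content(wsdl_text):
    """Return the text of every <documentation> element in a WSDL document."""
    root = ET.fromstring(wsdl_text)
    descriptions = []
    for doc in root.iter("{%s}documentation" % WSDL_NS):
        text = "".join(doc.itertext()).strip()
        if text:
            descriptions.append(text)
    return descriptions


def parent_directory_urls(wsdl_url):
    """Return the URLs of the parent and grandparent directories of a WSDL file."""
    parts = urlparse(wsdl_url)
    segments = [s for s in parts.path.split("/") if s]
    dirs = segments[:-1]  # drop the file name itself
    base = f"{parts.scheme}://{parts.netloc}"
    urls = []
    for depth in (len(dirs), len(dirs) - 1):  # parent first, then grandparent
        if depth >= 0:
            urls.append(base + "/" + "".join(d + "/" for d in dirs[:depth]))
    return urls


def tokenise(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())
```

Pages found under the returned directory URLs would supply the context text, which can then be passed through `tokenise` in the same way as the content text.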


Content Words vs. Function Words

• One of the properties of content words is that they tend to “clump”, that is, to re-occur once they have appeared [10].

• On the other hand, the occurrences of function words tend to be independent of one another.

• Very often, this contrast can be captured through the inability of the Poisson distribution to model the occurrences of content words in documents [11].

• In other words, unlike content words, function words tend to be Poisson distributed.
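One common way to make this contrast concrete is to compare a word's observed document frequency with the document frequency a single-parameter Poisson model would predict. The sketch below assumes that formulation; the function names and the idea of reporting a ratio are illustrative choices, not details taken from the slides. With rate λ = (total occurrences) / (number of documents), the Poisson model expects a fraction 1 − e^(−λ) of documents to contain the word at least once.

```python
import math


def poisson_expected_df(collection_freq, n_docs):
    """Document frequency predicted by a Poisson model with rate cf / N."""
    lam = collection_freq / n_docs
    return n_docs * (1.0 - math.exp(-lam))


def clumping_ratio(docs, word):
    """Ratio of Poisson-predicted to observed document frequency.

    `docs` is a list of token lists. The more a word clumps into few
    documents, the more the Poisson model overestimates its document
    frequency, and the larger the ratio becomes.
    """
    n_docs = len(docs)
    collection_freq = sum(doc.count(word) for doc in docs)
    observed_df = sum(1 for doc in docs if word in doc)
    if observed_df == 0:
        return None  # the word never occurs, so the ratio is undefined
    return poisson_expected_df(collection_freq, n_docs) / observed_df
```

On a real collection, words with a high ratio would be kept as content-word candidates, while words whose ratio stays low behave as the Poisson model predicts and can be treated as function words. Any threshold between the two is a tuning decision that the slides do not specify.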


Remove Function Words


A segment of the output during content-word recognition performed on the word tokens in the web service context set for the service QuranService.

(single-parameter Poisson distribution)


Remove Programming Terms


Programming-term clusters are identified with our featureless term clustering method, Tree-Traversing Ants [12], which is based on Normalised Google Distance.
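For reference, Normalised Google Distance between two terms x and y is usually defined from search-engine page counts as (max(log f(x), log f(y)) − log f(x, y)) / (log N − min(log f(x), log f(y))). The sketch below computes only that distance from counts supplied by the caller. It does not query any search engine and does not reproduce the Tree-Traversing Ants clustering itself. The function and parameter names are hypothetical.

```python
import math


def normalised_google_distance(fx, fy, fxy, n):
    """Normalised Google Distance from page counts.

    fx, fy : number of pages containing term x, term y
    fxy    : number of pages containing both terms
    n      : total number of pages indexed (an assumed constant)
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        # Terms that never occur, or never co-occur, are treated as
        # infinitely far apart.
        return math.inf
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))
```

Terms that almost always appear together get a distance near 0, so programming terms such as "runtime" and "module" would be expected to fall close to one another and form a cluster that can then be removed.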


Clustering Results for QuranService


A small oracle: runtime, webservice, developer, module, data


The service host and the service name

• The service host is the second-level and top-level portion of the domain name (i.e. a segment of the authority part of the URI) of the host containing the WSDL file.

• The service name is the name of the WSDL file.

• As one may note, the four features (content, context, service host and service name) are by no means the best or the only ones available for describing a web service.

• However, they are the most accessible and feasible ones to use in this case.
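A rough sketch of how these two features might be read off a WSDL URL is given below. It makes two simplifying assumptions that are not in the slides: the last two labels of the host name stand in for the second-level and top-level domain (which is wrong for suffixes such as `.co.uk`), and the file extension is dropped from the service name. The function names are hypothetical.

```python
import posixpath
from urllib.parse import urlparse


def service_host(wsdl_url):
    """Return the second-level and top-level labels of the URL's host name."""
    hostname = urlparse(wsdl_url).hostname or ""
    labels = hostname.split(".")
    # Assumption: the registered domain is always the last two labels.
    return ".".join(labels[-2:])


def service_name(wsdl_url):
    """Return the WSDL file name without its extension."""
    filename = posixpath.basename(urlparse(wsdl_url).path)
    stem, _extension = posixpath.splitext(filename)
    return stem
```

For a URL such as `http://www.example.org/services/QuranService.wsdl`, the two functions give `example.org` and `QuranService`.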


Combining the four features


Web Service Clusters


Conclusions

• The paper presented techniques for automatic discovery of web services with similar functionality.

• We term such service clusters homogeneous service communities. If the crawling and clustering processes run continuously, as a typical search engine does, the approach has the potential to enable self-organisation of the Web as proposed in [3].

• The proposed web service clustering approach assumes no registries, and can automatically and effectively reduce the search space of web services. Therefore, it can be seen as a precursor to web service discovery.

• This paper gathers real service description files from the Web instead of working on hypothetical examples.

• The resulting clusters provide not only a useful glimpse of what services are out there, but also an insight into the types of technologies that have proliferated in this area.


The Web Service “Hype”

• Web services have become a new trend for doing business online.

• U.S.: 65% of companies will work, or have been working, on web service projects.

• 2003: $3 billion; 2008: $15.8 billion

• Web services help in e-business and e-commerce development.

“Just as the Web revolutionized how users talk to applications, XML transforms how applications talk to each other.” (Bill Gates)

“Web services are expected to revolutionize our life in much the same way as the Internet has during the past decade or so.” (Gartner)


Why IBM, Microsoft and SAP stopped UBR

• The UDDI Business Registry (UBR) was part of the UDDI Project announced in September 2000.

• The project goals were to define a set of specifications to enable description, discovery and integration and to prove interoperability through operational experience.

• The UBR ran for 5 years, demonstrating live, industrial-strength UDDI implementations managing over 50,000 replicated entries.


Is Popfly service-oriented?


Thank You