[IEEE 2006 International Conference on Advanced Computing and Communications - Mangalore, India...



2006 International Conference on Advanced Computing and Communications, Mangalore, India, 20-23 December 2006

A Fuzzy-Inference System Based Approach for the Prediction of Quality of Reusable Software Components

Parvinder Singh Sandhu
Assistant Professor, Computer Science & Engineering
Guru Nanak Dev Engg. College, Ludhiana (Punjab) India
parvinder.sandhu@gmail.com

Hardeep Singh
Professor, Computer Science & Engineering Deptt.
Guru Nanak Dev University, Amritsar (Punjab) India
hardeep_gndu@rediffmail.com

Abstract- The need to improve software productivity has promoted research on software metrics. There are metrics for identifying the quality of reusable components. If these metrics are evaluated in the design phase, or even in the coding phase, they can help reduce rework by improving the quality of reuse of the component, and hence improve productivity through a probable increase in the reuse level. A suite of metrics can be used to assess the reusability of modules, and the reusability can be obtained with the help of fuzzy logic. An algorithm is proposed in which Cyclomatic Complexity, Volume, Regularity, Reuse-Frequency and Coupling are given as inputs to a fuzzy system, and the output is obtained in terms of reusability.

Keywords- Halstead Metric, Cyclomatic Complexity, Fuzzy System, Sugeno Fuzzy Model.

I. INTRODUCTION

The aim of software metrics is to predict the quality of software products. The attributes that determine software quality include maintainability, defect density, fault proneness, normalized rework, understandability and reusability. The requirement today is to relate these quality attributes to the metrics. To achieve both the quality and the productivity objectives, software reuse is generally recommended: it not only saves the time needed to develop the product from scratch but also delivers almost error-free code, since the code has already been tested many times during its earlier reuse.

A great deal of research over the past several years has been devoted to methodologies for creating reusable software components and component libraries, where creating a reusable component from scratch involves an additional cost. That additional cost could be avoided by identifying and extracting reusable components from the large inventory of existing systems. But the issue of how to identify good reusable components in existing systems has remained relatively unexplored. The literature suggests that identifying reusable components within existing software systems is expensive and yields only limited benefits, since the identified components are not truly useful [2], which is quite a discouraging fact. Our approach to the identification of reusable software is based on software models and metrics. A fuzzy-logic-based approach is proposed for economically determining the quality of reusable components in existing systems, as well as of reusable components that are still in the design phase.

II. PROPOSED APPROACH

Fuzzy inference is the process of formulating the mapping from a given input to an output using fuzzy logic. The mapping then provides a basis for decision-making. The process involves membership functions, fuzzy logic operators and if-then rules. Two types of fuzzy inference systems can be implemented in the Matlab Fuzzy Logic Toolbox: Mamdani-type and Sugeno-type. In the proposed scheme, Sugeno's fuzzy inference method has been used.

Mamdani's fuzzy inference method is the most commonly seen fuzzy methodology. It expects the output membership functions to be fuzzy sets. After the aggregation process, there is a fuzzy set for each output variable that needs defuzzification. It is possible, and in many cases much more efficient, to use a single spike as the output membership function rather than a distributed fuzzy set. This is the approach taken by the Sugeno method; it enhances the efficiency of the defuzzification process because it greatly simplifies the computation required by the more general Mamdani method.

A fuzzy system can be considered to be a parameterized nonlinear map, called f, which can be expressed as shown in (1).

$$f(x) = \frac{\sum_{l} y^{l} \prod_{i=1}^{m} \mu_{A_i^l}(x_i)}{\sum_{l} \prod_{i=1}^{m} \mu_{A_i^l}(x_i)} \tag{1}$$

where the sums run over the rules $l$, and $y^l$ is the place of the output singleton if Mamdani reasoning is applied or a constant if Sugeno reasoning is applied. The membership function $\mu_{A_i^l}(x_i)$ corresponds to the input $x = [x_1, x_2, x_3, \ldots, x_m]$ of rule $l$. The "and" connective in the premise is carried out by a product, and defuzzification by the center-of-gravity method. Consider a Sugeno-type fuzzy system having the rule base

Rule 1: If x is $A_1$ and y is $B_1$, then $f_1 = p_1 x + q_1 y + r_1$
Rule 2: If x is $A_2$ and y is $B_2$, then $f_2 = p_2 x + q_2 y + r_2$

Let the membership functions of the fuzzy sets $A_i$, $B_i$, $i = 1, 2$, be $\mu_{A_i}$, $\mu_{B_i}$. Evaluating the rule premises results in (2).

$$w_i = \mu_{A_i}(x)\,\mu_{B_i}(y) \tag{2}$$

1-4244-0716-8/06/$20.00 ©2006 IEEE.


where $i = 1, 2$ for the rules stated above. Evaluating the implication and the rule consequences gives (3)-(6).

$$f(x, y) = \frac{w_1(x, y)\,f_1(x, y) + w_2(x, y)\,f_2(x, y)}{w_1(x, y) + w_2(x, y)} \tag{3}$$

Or

$$f = \frac{w_1 f_1 + w_2 f_2}{w_1 + w_2} \tag{4}$$

Let

$$\bar{w}_i = \frac{w_i}{w_1 + w_2} \tag{5}$$

Then f can be written as

$$f = \bar{w}_1 f_1 + \bar{w}_2 f_2 \tag{6}$$

All these computations can be presented in diagram form, as shown in Fig. 1.

Figure 1. A two-input first-order Sugeno fuzzy model with two rules
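The computation in (2)-(6) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the Gaussian membership shape and every numeric parameter below are assumptions chosen only to make the example runnable.

```python
import math


def gaussian(x, center, sigma):
    """Gaussian membership function (shape assumed; the paper does not fix one)."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def sugeno_first_order(x, y, rules):
    """Evaluate a two-input first-order Sugeno model.

    Each rule is (mu_a, mu_b, (p, q, r)): the two premise membership
    functions and the coefficients of the consequent f_i = p*x + q*y + r.
    """
    weights = [mu_a(x) * mu_b(y) for mu_a, mu_b, _ in rules]   # eq. (2)
    outputs = [p * x + q * y + r for _, _, (p, q, r) in rules]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no rule fires for this input")
    normalized = [w / total for w in weights]                  # eq. (5)
    return sum(wb * f for wb, f in zip(normalized, outputs))   # eq. (6)


# Two hypothetical rules; all centers, widths and coefficients are made up.
rules = [
    (lambda x: gaussian(x, 0.0, 1.0), lambda y: gaussian(y, 0.0, 1.0), (1.0, 1.0, 0.0)),
    (lambda x: gaussian(x, 2.0, 1.0), lambda y: gaussian(y, 2.0, 1.0), (2.0, 0.0, 1.0)),
]

print(sugeno_first_order(1.0, 1.0, rules))
```

At the point (1, 1) both rules fire with equal strength, so the output is the plain average of the two consequents, $f_1 = 2$ and $f_2 = 3$.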

Figure 2. Fuzzy Inference System with five inputs and one output

III. QUALITY ATTRIBUTES

The following major steps are taken to determine the quality of reusable parts of existing software systems:
* Select the software system that needs to be processed.
* Parse the software system to generate the meta information related to that software (if the design information is not directly available).
* The meta information acts as input to the reusability model, which uses certain metrics to generate the values that are given as input to the inference engine.
* The fuzzy system determines the reusability value of the software components. Based on the reusability value, a component can be extracted and put into the Reusable Software Reservoir for future reuse.

A. Metric Suite Selection

The following metrics are used for extracting the structural properties of a reusable software component.

1) Cyclomatic Complexity Using McCabe's Measure: According to McCabe [4], the value of Cyclomatic Complexity can be obtained using (7).

$$\text{Cyclomatic Complexity} = \text{Predicate Nodes} + 1 \tag{7}$$

where predicate nodes are the nodes of the directed graph constructed for the component at which decisions are made, i.e. a predicate node has more than one arrow coming out of it.

If the complexity is low, reuse of the component will not repay the cost. A high value of complexity, on the other hand, indicates poor quality, high development cost, low readability, poor testability and proneness to errors, i.e. a high rate of failure. Hence the Cyclomatic Complexity of a software component should lie between lower and upper bounds to contribute towards reusability. If Cyclomatic Complexity is high with high regularity of implementation, then there exists high functional usefulness.

2) Regularity. The notion behind Regularity is "how well we can predict length based on some regularity assumptions". The actual length N is the sum of $N_1$ and $N_2$. The estimated length is shown in (8).

$$\text{Estimated Length} = N' = \eta_1 \log_2 \eta_1 + \eta_2 \log_2 \eta_2 \tag{8}$$

The closeness of the estimate is a measure of the Regularity (r) of the component coding, calculated using (9).

$$\text{Regularity} = 1 - \frac{N - N'}{N} = \frac{N'}{N} \tag{9}$$

The above derivation indicates that Regularity is the ratio of the estimated length to the actual length. A high value of Regularity indicates high readability, low modification cost and non-redundancy of the component implementation [1]. Hence there should be some minimum level of Regularity for a component to be considered reusable.
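The two metrics in (7)-(9) can be sketched as follows. The control-flow graph representation (a dictionary from node name to successor list) and all sample counts are assumptions made for illustration; the paper does not describe how its parser stores this information.

```python
import math


def cyclomatic_complexity(cfg):
    """Eq. (7): number of predicate nodes plus one.

    cfg maps each node to its list of successors; a predicate node is one
    with more than one distinct outgoing arrow.
    """
    predicate_nodes = sum(1 for succ in cfg.values() if len(set(succ)) > 1)
    return predicate_nodes + 1


def estimated_length(eta1, eta2):
    """Eq. (8): Halstead length estimate from distinct operators/operands."""
    return eta1 * math.log2(eta1) + eta2 * math.log2(eta2)


def regularity(eta1, eta2, n1_total, n2_total):
    """Eq. (9): ratio of estimated length N' to actual length N = N1 + N2."""
    return estimated_length(eta1, eta2) / (n1_total + n2_total)


# Hypothetical component: one if/else followed by one loop.
cfg = {
    "entry": ["if"],
    "if": ["then", "else"],
    "then": ["loop"],
    "else": ["loop"],
    "loop": ["body", "exit"],
    "body": ["loop"],
    "exit": [],
}

print(cyclomatic_complexity(cfg))   # two predicate nodes: "if" and "loop"
print(regularity(8, 8, 30, 30))     # N' = 48, N = 60
```

Note that (9) yields a value above 1 whenever the estimate exceeds the actual length; how such cases are mapped onto the 0 to 1 input range is not stated in the text.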

3) Halstead Software Science Indicator. According to this metric, the volume [3] of the source code of the software component is given by (10).

$$\text{Volume} = (N_1 + N_2) \log_2(\eta_1 + \eta_2) \tag{10}$$

where $\eta_1$ is the number of distinct operators that appear in the program, $\eta_2$ is the number of distinct operands that appear in the program, $N_1$ is the total number of operator occurrences and $N_2$ is the total number of operand occurrences.

If the volume is high, the software component incurs higher maintenance, correctness and modification costs. On the other hand, low volume increases the extraction cost, the cost of identification from the repository and the packaging cost of the component. So the volume of a reusable component should lie between the two extremes.
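As a small worked instance of (10), assuming the four Halstead counts have already been extracted by a parser (the sample counts below are invented):

```python
import math


def halstead_volume(eta1, eta2, n1_total, n2_total):
    """Eq. (10): program length times log2 of the vocabulary size."""
    return (n1_total + n2_total) * math.log2(eta1 + eta2)


# 8 distinct operators, 8 distinct operands, 30 occurrences of each kind.
print(halstead_volume(8, 8, 30, 30))  # 60 * log2(16)
```

The raw volume is unbounded, whereas the fuzzy input for Volume is defined on the range 0 to 10; the text does not say how the raw value is scaled, so any normalization step would be an additional assumption.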

4) Reuse Frequency. Reuse frequency is calculated by comparing the number of static calls addressed to the component whose reusability is to be measured with the number of calls addressed to the standard environment components. Let the N user-defined components in the system be $X_1, X_2, \ldots, X_N$, and let $S_1, S_2, \ldots, S_M$ be the standard environment components, e.g. printf in the C language.

$$\text{Reuse Frequency} = \frac{\eta(C)}{\frac{1}{M}\sum_{i=1}^{M} \eta(S_i)} \tag{11}$$

where $\eta(\cdot)$ denotes the number of static calls addressed to a component and C is the component being measured.

Equation (11) shows that Reuse Frequency is a measure of the functional usefulness of a component. Hence there should be some minimum value of Reuse Frequency for a software component to be really reusable [1].
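A minimal sketch of the reuse-frequency calculation, assuming it is the ratio of the static calls addressed to the component under study to the average number of static calls addressed to the standard environment components, as in Caldiera and Basili [1]. The call counts are hypothetical.

```python
def reuse_frequency(calls_to_component, calls_to_standard):
    """Static calls to component C divided by the mean static calls to S_1..S_M."""
    if not calls_to_standard:
        raise ValueError("at least one standard component is required")
    mean_standard = sum(calls_to_standard) / len(calls_to_standard)
    return calls_to_component / mean_standard


# Component called 12 times; three standard routines called 10, 20 and 30 times.
print(reuse_frequency(12, [10, 20, 30]))
```

A value near or above 1 means the component is called about as often as a typical standard routine, which is the sense in which the metric indicates functional usefulness.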

5) Coupling. Functions/methods that are loosely bound tend to be easier to remove and use in other contexts than those that depend heavily on other functions or non-local data. Different types of coupling affect reusability to different extents. Depending on the type of interface between two functions, coupling can be classified into the following categories [5]:

* Data Coupling: Data coupling exists between two functions when they communicate using elementary data items that are passed as parameters between the two.

* Stamp Coupling: When two functions communicate using a composite data item, e.g. a structure in the C language, the coupling is called Stamp Coupling.

* Control Coupling: If data from one function is used to direct the order of instruction execution in another function, then Control Coupling exists between those functions. In other words, the functions share data items upon which control decisions are made.

* Common Coupling: In the case of Common Coupling, the two functions share global data items.
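The four categories can be illustrated with small Python functions. The names and the rectangle example are hypothetical and serve only to show what crosses each interface.

```python
from dataclasses import dataclass


# Data coupling: only elementary values are passed as parameters.
def area(width: float, height: float) -> float:
    return width * height


# Stamp coupling: a composite record is passed, although only part is used.
@dataclass
class Rect:
    width: float
    height: float
    label: str


def rect_area(rect: Rect) -> float:
    return rect.width * rect.height


# Control coupling: a value supplied by the caller selects the execution path.
def measure(rect: Rect, mode: str) -> float:
    if mode == "area":
        return rect.width * rect.height
    return 2 * (rect.width + rect.height)


# Common coupling: both functions depend on the same global data item.
SCALE = 1.0


def set_scale(value: float) -> None:
    global SCALE
    SCALE = value


def scaled_area(rect: Rect) -> float:
    return SCALE * rect.width * rect.height
```

Extracting `area` for reuse requires nothing else, whereas `scaled_area` cannot be moved without also carrying the global `SCALE` and every function that writes to it.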

B. Design of Fuzzification Module

Linguistic variables are assigned to the input parameters based on their values. The assignment of the linguistic variables depends on the range of the input measurement.

1) Linguistic Variables of Complexity. Values are assigned in terms of the complexity of the software component. Cyclomatic Complexity is assigned three linguistic variables, LOW, MEDIUM and HIGH, in the range 0 to 10.

2) Linguistic Variables of Regularity. Values are assigned in terms of the level of regularity of the software component under consideration. Regularity is assigned two linguistic variables, LOW and HIGH, in the range 0 to 1.

3) Linguistic Variables of Volume. Values are assigned in terms of the volume of the software module. Volume is assigned three linguistic variables, LOW, MEDIUM and HIGH, in the range 0 to 10.

4) Linguistic Variables of Reuse-Frequency. Values are assigned in terms of the number of times the software module is reused. Reuse-Frequency is assigned two linguistic variables, LOW and HIGH, in the range 0 to 10.

5) Linguistic Variables of Coupling. Values are assigned in terms of the level of coupling of the software module with other modules, i.e. the level of dependency of the software module on other modules. Coupling is assigned three linguistic variables, LOW, MEDIUM and HIGH, in the range 0 to 10.

6) Linguistic Variables of Reusability. Values are assigned in terms of how reusable the software module is. As the output membership functions can only be linear or constant for Sugeno-type fuzzy inference, Reusability is assigned six linguistic variables, PERFECT, HIGH, MEDIUM, LOW, VERY-LOW and NIL, as constants in the range 0 to 100.
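Fuzzification of one input can be sketched as below. The text gives only the number of linguistic variables and the input ranges, so the trapezoidal shape and every breakpoint here are assumptions.

```python
def trapezoid(x, a, b, c, d):
    """Membership rises on [a, b], equals 1 on [b, c] and falls on [c, d].

    Setting a == b or c == d gives a left or right shoulder.
    """
    if x < a or x > d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x > c:
        return (d - x) / (d - c)
    return 1.0


# Hypothetical breakpoints for Complexity on the range 0 to 10.
COMPLEXITY_TERMS = {
    "LOW": (0.0, 0.0, 2.0, 5.0),
    "MEDIUM": (2.0, 5.0, 5.0, 8.0),
    "HIGH": (5.0, 8.0, 10.0, 10.0),
}


def fuzzify(x, terms):
    """Return the degree of membership of x in each linguistic variable."""
    return {name: trapezoid(x, *params) for name, params in terms.items()}


print(fuzzify(3.5, COMPLEXITY_TERMS))
```

A crisp complexity of 3.5 belongs partly to LOW and partly to MEDIUM under these breakpoints, so rules mentioning either term would fire with partial strength.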

C. Design of Knowledge Base

The knowledge base is then designed. It is a rule base formed from if-then rules, in which logical operators such as 'or' and 'and' are used. There are five inputs in total; three of them have three membership functions each and the other two have two membership functions each, so taking all possible combinations the size of the rule base is 3*3*3*2*2 = 108. For example, the rules could be:

If (Coupling is LOW) and (Volume is MEDIUM) and (Complexity is MEDIUM) and (Regularity is HIGH) and (Reuse-Frequency is HIGH) then (Reusability is PERFECT)

If (Coupling is HIGH) and (Volume is HIGH) and (Complexity is HIGH) and (Regularity is LOW) and (Reuse-Frequency is LOW) then (Reusability is NIL)

And so forth.
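The size of the rule base can be checked by enumerating every combination of input terms. In this sketch only the two consequents quoted in the text are filled in; the remaining 106 are not given in the paper and are left unset.

```python
from itertools import product

# Linguistic variables per input, as listed in the fuzzification module.
TERMS = {
    "Coupling": ["LOW", "MEDIUM", "HIGH"],
    "Volume": ["LOW", "MEDIUM", "HIGH"],
    "Complexity": ["LOW", "MEDIUM", "HIGH"],
    "Regularity": ["LOW", "HIGH"],
    "Reuse-Frequency": ["LOW", "HIGH"],
}
INPUTS = list(TERMS)

# Every combination of input terms is one rule antecedent.
antecedents = list(product(*(TERMS[name] for name in INPUTS)))

# Consequents known from the text, keyed by antecedent in INPUTS order.
consequents = {
    ("LOW", "MEDIUM", "MEDIUM", "HIGH", "HIGH"): "PERFECT",
    ("HIGH", "HIGH", "HIGH", "LOW", "LOW"): "NIL",
}


def format_rule(antecedent, output):
    """Render one rule in the if-then form used in the text."""
    clauses = " and ".join(
        f"({name} is {term})" for name, term in zip(INPUTS, antecedent)
    )
    return f"If {clauses} then (Reusability is {output})"


print(len(antecedents))
for antecedent, output in consequents.items():
    print(format_rule(antecedent, output))
```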

D. Defuzzification

In a Sugeno fuzzy inference system, all output membership functions are singleton spikes rather than distributed fuzzy sets. These are sometimes known as singleton output membership functions and can be thought of as pre-defuzzified fuzzy sets. They enhance the efficiency of the defuzzification process because they greatly simplify the computation: rather than integrating across a two-dimensional function to find the centroid (as in Mamdani-type inference systems), we use the weighted average of a few data points. The implication method is simply multiplication, and the aggregation operator just includes all of the singletons. The value obtained after defuzzification represents the quality of the software module in terms of its reusability.
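The weighted-average step can be sketched as follows. The text states only that the six Reusability terms are constants in the range 0 to 100, so the evenly spaced values below are an assumption, as are the sample membership degrees.

```python
import math

# Hypothetical singleton positions for the six output terms.
REUSABILITY = {
    "PERFECT": 100.0,
    "HIGH": 80.0,
    "MEDIUM": 60.0,
    "LOW": 40.0,
    "VERY-LOW": 20.0,
    "NIL": 0.0,
}


def firing_strength(memberships):
    """'And' of the five premise memberships, carried out as a product."""
    return math.prod(memberships)


def defuzzify(fired):
    """Weighted average of singletons; fired is a list of (strength, constant)."""
    total = sum(w for w, _ in fired)
    if total == 0.0:
        raise ValueError("no rule fired")
    return sum(w * z for w, z in fired) / total


# Two rules fire: one concluding PERFECT, one concluding NIL.
w_perfect = firing_strength([1.0, 1.0, 0.75, 1.0, 1.0])
w_nil = firing_strength([0.5, 0.5, 1.0, 1.0, 1.0])
print(defuzzify([(w_perfect, REUSABILITY["PERFECT"]), (w_nil, REUSABILITY["NIL"])]))
```

No integration over an output fuzzy set is needed: each fired rule contributes one point, which is why the Sugeno form is cheaper to evaluate than the Mamdani centroid.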

IV. EXPERIMENTAL RESULTS

All five parameters (Coupling, Volume, Regularity, Reuse-Frequency and Complexity) are related to Reusability: the higher the Coupling, Volume and Complexity, the lower the Reusability, and the higher the Regularity and Reuse-Frequency, the higher the Reusability of the software module. After creating the rule base to reflect this, the results shown as surface plots in Fig. 3 were obtained. The plots show the surface view of Reusability against the five input parameters, obtained after implication of the rules from the rule base. The monotonicity of the surface depends on the rule base.

Figure 3. Surface plots between Coupling, Complexity and Reusability; between Coupling, Volume and Reusability; and between Volume, Complexity and Reusability

V. TESTING OF THE PROPOSED SYSTEM

The algorithm is tested with the rule viewer: a crisp value is given to each of the five input parameters and the output is obtained. Consider a software module whose value for Coupling is 2.54, for Complexity is 5, for Volume is 5.28, for Regularity is 0.864 and for Reuse-Frequency is 8.06. To determine how reusable the software module is, these values are fed to the rule viewer, which then appears as shown in Fig. 4. The Reusability of the module is 67.7, i.e. on the scale of 0 to 100 the Reusability Index of the software module is 67.7 points. This value is obtained after firing the rule base and after defuzzification.

Figure 4. Rule viewer

VI. CONCLUSION

This research shows how metrics can be used to determine the quality attributes of a software component using a Sugeno fuzzy inference system. Different metrics can be given to the system as fuzzy inputs and, depending on the metrics, reusability can be calculated with the fuzzy toolbox. Other quality attributes can also be calculated using different combinations of relevant metrics, e.g. MOOD metrics, CK metrics etc., as an extension to this project. The Sugeno system lends itself to the use of adaptive techniques for constructing fuzzy models. These adaptive techniques can be used to customize the membership functions so that the fuzzy system best models the data.

REFERENCES

[1] G. Caldiera and V. R. Basili, "Identifying and Qualifying Reusable Software Components," IEEE Computer, February 1991.

[2] J. C. Esteva, "Automatic Identification of Reusable Components," IEEE, 1995.

[3] M. H. Halstead, Elements of Software Science, Elsevier North-Holland, New York, 1977.

[4] T. McCabe, "A Software Complexity Measure," IEEE Trans. Software Engineering, vol. SE-2, December 1976, pp. 308-320.

[5] R. S. Pressman, Software Engineering: A Practitioner's Approach, McGraw-Hill, 2004.

[6] R. W. Selby, "Empirically Analyzing Software Reuse in a Production Environment," in Software Reuse: Emerging Technology, W. Tracz, ed., IEEE Computer Society Press, 1988.
