https://isma13in.wordpress.com
Software Cost Estimation Using Data Mining: Review
The Joint 13th CSI/IFPUG International Software Measurement & Analysis (ISMA13) Conference
Mumbai (India) – March 6, 2017
1. Miss Sumera W. Ahmad, Department of Computer Science & Engg, P.R.M.I.T & R, Badnera, Amravati, India. Email: [email protected]
2. Dr. G.R. Bamnote, Department of Computer Science & Engg, P.R.M.I.T & R, Badnera, Amravati, India. Email: [email protected]
Objectives of Software Cost Estimation using Data Mining
To estimate project cost accurately using data from past projects whose cost or effort is known.
To improve the efficiency of software cost estimation with the help of data mining.
To apply classification algorithms from data mining, together with machine learning techniques, to software cost estimation models in order to improve their performance.
Introduction to Software Cost Estimation using Data Mining
Software cost estimation is the process of predicting the probable cost of a software project, and the resources it requires, on the basis of available historical information.
Successful estimation is critical for the software industry.
Overestimation: kills promising projects.
Underestimation: wastes the entire effort.
Software Estimation
Accurate estimation is very important: in software development, poor estimation may lead to project failure.
The estimation process includes size estimation, effort estimation, developing initial project schedules and, finally, estimating the overall cost of project development.
Fundamental Estimation Questions
How much effort is required to complete an activity?
How much calendar time is needed to complete an activity?
What is the total cost of an activity?
Project estimation and scheduling are interleaved management activities.
Software Estimation
Many estimation methods have been proposed in the last 30 years, and almost all of them require quantitative information about productivity, project size and other important factors that affect the project.
Estimation Techniques
Algorithmic cost modelling
Expert judgement
Estimation by analogy
Parkinson's Law
Pricing to win
Algorithmic (Parametric) Model Examples:
COCOMO (COnstructive COst MOdel)
▪ Developed by Boehm in 1981
▪ Became one of the most popular and most transparent cost models
▪ Mathematical model
COCOMO II
▪ Published in 1995
▪ Addresses non-sequential and rapid development process models, reengineering, reuse-driven approaches, object-oriented approaches, etc.
▪ Has three submodels – application composition, early design and post-architecture
The Scale Drivers of COCOMO II
Each scale driver is rated to describe the project; together, the scale drivers determine the exponent used in the effort equation.
The COCOMO II model uses five scale drivers, and each plays an important role in the estimate.
The five scale drivers are Precedentedness, Development Flexibility, Architecture/Risk Resolution, Team Cohesion and Process Maturity.
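As a rough illustration of how the five scale drivers set the exponent, the sketch below assumes the COCOMO II.2000 form E = B + 0.01 × (sum of scale-factor ratings) with B = 0.91, together with the nominal ratings from that calibration. Neither the constant nor the ratings appear on the slides, and the function name `exponent` is made up for this example.

```python
# Nominal scale-factor ratings (assumed from the COCOMO II.2000 calibration,
# not taken from the slides).
NOMINAL_SCALE_FACTORS = {
    "PREC": 3.72,  # Precedentedness
    "FLEX": 3.04,  # Development Flexibility
    "RESL": 4.24,  # Architecture/Risk Resolution
    "TEAM": 3.29,  # Team Cohesion
    "PMAT": 4.68,  # Process Maturity
}

B = 0.91  # base exponent (assumed COCOMO II.2000 calibration)


def exponent(scale_factors):
    """Return E = B + 0.01 * (sum of the scale-factor ratings)."""
    return B + 0.01 * sum(scale_factors.values())


print(round(exponent(NOMINAL_SCALE_FACTORS), 4))
```

With all five drivers at their nominal ratings the exponent comes out at 1.0997; higher ratings push the exponent up, so effort grows faster with size.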
In the COCOMO II model, the cost drivers are divided into four groups:
Personnel Factors: Analyst Capability, Programmer Capability, Applications Experience, Platform Experience, Personnel Continuity
Product Factors: Required Software Reliability, Database Size, Required Reusability, Documentation Match to Life-Cycle Needs, etc.
Platform Factors: Execution Time Constraint, Platform Volatility
Project Factors: Use of Software Tools, Required Development Schedule, Multisite Development, etc.
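The ratings of the individual cost drivers are combined by multiplication into a single effort adjustment factor (EAF). The sketch below shows only that product; the driver names and multiplier values are illustrative placeholders rather than the calibrated COCOMO II tables, and `effort_adjustment_factor` is a hypothetical helper name.

```python
from math import prod


def effort_adjustment_factor(multipliers):
    """Multiply the effort multipliers of all rated cost drivers.

    A nominal driver has multiplier 1.0 and leaves the product unchanged.
    """
    return prod(multipliers.values())


# Illustrative values only: above 1.0 increases effort, below 1.0 reduces it.
ratings = {
    "required_software_reliability": 1.10,  # hypothetical "high" rating
    "analyst_capability": 0.85,             # hypothetical "high" rating
    "platform_volatility": 1.00,            # nominal
}

print(round(effort_adjustment_factor(ratings), 3))
```

Because the factor is a plain product, a project with every driver rated nominal has an EAF of exactly 1.0.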
COCOMO Effort Equation
The COCOMO II model estimates the required effort (in person-months) from the project size (in KSLOC, thousands of source lines of code):
$\text{Effort} = 2.94 \times \text{EAF} \times (\text{KSLOC})^{E}$
where EAF is the effort adjustment factor derived from the cost drivers and E is the exponent derived from the five scale drivers.
Example: with nominal cost drivers and scale drivers, EAF = 1.0 and E = 1.0997. Assuming 9,000 SLOC (9 KSLOC), COCOMO II estimates
$\text{Effort} = 2.94 \times 1.0 \times 9^{1.0997} \approx 32.9$ person-months.
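A minimal sketch of the effort equation follows, assuming the constant 2.94 and the nominal exponent 1.0997 quoted on this slide as defaults. The function name `cocomo2_effort` and its parameter names are chosen for illustration. Note that size enters in thousands of lines, so 9,000 SLOC is passed as 9.0, and that the exponent is a power, not a multiplier.

```python
def cocomo2_effort(ksloc, eaf=1.0, exponent=1.0997, a=2.94):
    """Return effort in person-months: a * EAF * KSLOC ** E."""
    if ksloc <= 0:
        raise ValueError("size in KSLOC must be positive")
    return a * eaf * ksloc ** exponent


# 9,000 source lines = 9 KSLOC, nominal cost drivers and scale drivers.
print(f"{cocomo2_effort(9.0):.1f} person-months")
```

Passing a non-nominal `eaf` or `exponent` shows how strongly the cost and scale drivers move the estimate for the same size.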
Data Mining
Data mining, at its core, is the transformation of large amounts of data into meaningful patterns and rules.
Data mining helps to classify past project data and extract valuable information from it.
Applied in cost estimation models, this knowledge yields an approximate estimate based on past project data.
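The slides do not name a specific mining algorithm at this point. As one simple way to picture estimation from past project data, the sketch below averages the effort of the most similar past projects (a nearest-neighbour form of estimation by analogy). The project records, the two features and the function name `estimate_by_analogy` are all hypothetical.

```python
from math import dist

# Hypothetical past projects: (size in KSLOC, team size) -> actual effort
# in person-months.
PAST_PROJECTS = [
    ((5.0, 3), 18.0),
    ((9.0, 4), 34.0),
    ((12.0, 5), 47.0),
    ((30.0, 9), 130.0),
]


def estimate_by_analogy(features, history, k=2):
    """Average the effort of the k past projects closest to `features`."""
    if not 0 < k <= len(history):
        raise ValueError("k must be between 1 and the number of past projects")
    # Sort by Euclidean distance in feature space and keep the k nearest.
    nearest = sorted(history, key=lambda item: dist(item[0], features))[:k]
    return sum(effort for _, effort in nearest) / k


print(estimate_by_analogy((10.0, 4), PAST_PROJECTS))
```

In practice the features would need to be normalised so that no single attribute dominates the distance; that step is left out here to keep the sketch short.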
Aim
To use data mining to classify past project data and generate valuable information.
To develop powerful means of analysing and extracting interesting knowledge that can help in accurately estimating software projects.
Author Name | Year | Description | Title
1) Dharmesh, Mahesh | 1997 | Reviewed the performance of different ANNs and compared the results of various ANN models in effort estimation. | "Artificial Neural Networks for Software Effort Estimation: A Review"
2) Adriano L.I. Oliveira | 2006 | Compared SVR and RBFN; SVR outperformed RBFN. Source: journals.elsevier.com | "Estimation of software project effort with support vector regression"
3) Lefley & Shepperd | 2003 | Applied genetic programming to improve software cost estimation on public datasets with great success. Source: Genetic and Evolutionary Computation (GECCO) | "Using Genetic Programming to Improve Software Effort Estimation Based on General Data Sets"
4) Moløkken & Jørgensen | 2003 | Summarizes estimation results from surveys on software estimation; no structured review of the estimation surveys had been conducted until then. | "A Review of Surveys on Software Effort Estimation"
5) Barbara & Magne | 2004 | Compared evidence-based software engineering with evidence-based medicine. Source: IEEE Int. Conf. | "Evidence-based Software Engineering"
6) Chen et al. | 2005 | Proposed that cost modelers should perform data-pruning experiments after data collection and before model building. Source: IEEE Computer Society | "Finding the right data for software cost modeling"
7) A.F. Sheta | 2006 | Presented two new model structures, using Genetic Algorithms (GAs), to estimate the effort required to develop software projects. Source: Journal of Computer Science | "Estimation of the COCOMO Model Parameters Using Genetic Algorithms for NASA Software Projects"
8) Galorath & Evans | 2006 | Performed an intensive search of 2,100 internet sites and found 500 reasons for software failure. | "Successful Software Planning, Measurement and Control"
9) Magne Jørgensen and Martin Shepperd | 2007 | A systematic review of previous work. The review identifies 304 software cost estimation papers in 76 journals and classifies them according to research topic, estimation approach, research approach, study context and data set. Source: IEEE | "A Systematic Review of Software Development Cost Estimation Studies"
10) K. Vinaykumar, V. Ravi, M. Carr and N. Rajkiran | 2008 | Used a wavelet neural network to predict software cost. Source: Journal of Systems and Software | "Software cost estimation using wavelet neural networks"
11) Lum et al. | 2008 | 2CEE is a tool for developing new software cost estimation models using data mining techniques. The accuracy of these models has been validated internally through leave-one-out cross-validation. Source: IEEE | "The Effects of Data Mining Techniques on Software Cost Estimation"
12) Andreou & Efi Papatheocharous | 2008 | Used fuzzy decision trees to predict the required effort and software size in cost estimation. | "Software Cost Estimation using Fuzzy Decision Trees"
13) Reddy & Raju | 2009 | Found that the Gaussian membership function performs better than the trapezoidal function, as it gives a smoother transition between intervals, and the results were closer to the actual effort. | "An Improved Fuzzy Approach for COCOMO's Effort Estimation Using Gaussian Membership Function"
14) Prasad & Reddy | 2010 | Proposed a model for software cost estimation using Multi Objective (MO) Particle Swarm Optimization. The model was observed to give better results than the standard COCOMO model. | "Data Mining for Secure Software Engineering – Source Code Management Tool Case Study"
15) J.S. Pahariya, V. Ravi, M. Carr, M. Vasu | 2010 | Proposed new computational intelligence sequential hybrid architectures involving Genetic Programming (GP) and Group Method of Data Handling (GMDH), viz. GP-GMDH and GMDH-GP, and a recurrent architecture for Genetic Programming (GP), for software cost estimation. | "Computational Intelligence Hybrids Applied to Software Cost Estimation"
16) Dejaeger et al. | 2012 | Compared techniques that induce tree/rule-based models such as M5 and CART, linear models such as various types of linear regression, nonlinear models (MARS, multilayered perceptron neural networks, radial basis function networks, and least squares support vector machines), and estimation techniques that do not explicitly induce a model. | "Data Mining Techniques for Software Effort Estimation: A Comparative Study"
17) Sweta & Pushkar | 2013 | Provides a comparative study of support vector regression (SVR), Intermediate COCOMO and the Multiple Objective Particle Swarm Optimization (MOPSO) model. | "Performance Analysis of the Software Cost Estimation Methods: A Review"
18) Batra & Trivedi | 2013 | Describes an algorithmic software cost estimation model that uses a mathematical (regression) formula to predict cost from project size, number of engineers and other process and product factors, with parameters derived from historical project data and current project characteristics. | "A Fuzzy Approach For Effort Estimation"
Summary of Related Work
The researchers above have proposed various ways of estimating cost using different software engineering models.
This research will focus on designing an algorithm that investigates the systemic cost estimation problems identified so far and the best-performing machine learning techniques.
It also seeks to determine the prediction accuracy of the new model developed using data mining.
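The slides do not say how prediction accuracy will be measured. Two measures commonly used in the effort-estimation literature, MMRE (mean magnitude of relative error) and PRED(25) (share of estimates within 25% of the actual value), are assumed in the sketch below; the effort figures are hypothetical.

```python
def mmre(actual, predicted):
    """Mean magnitude of relative error: mean(|actual - predicted| / actual)."""
    errors = [abs(a - p) / a for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


def pred(actual, predicted, level=0.25):
    """Fraction of estimates whose relative error is at most `level`."""
    hits = sum(abs(a - p) / a <= level for a, p in zip(actual, predicted))
    return hits / len(actual)


# Hypothetical actual vs. estimated effort, in person-months.
actual = [20.0, 40.0, 100.0, 50.0]
estimated = [25.0, 38.0, 60.0, 50.0]

print(round(mmre(actual, estimated), 3), pred(actual, estimated))
```

A lower MMRE and a higher PRED(25) both indicate a better model; reporting the two together guards against a few large misses being hidden by many small ones.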
Proposed Work
Identify the common cost drivers and scale drivers used by the COCOMO II model: the scale drivers determine the exponent in the effort equation, and the cost drivers are multiplicative factors that determine the effort required to complete the project.
Extract the scale drivers and cost drivers of past projects, which are stored in software repositories, through a data mining algorithm.
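The slides leave the mining algorithm unspecified. As a simple stand-in for deriving model parameters from past projects, the sketch below fits the coefficient and exponent of an effort equation of the form effort = a × KSLOC^e by least squares on log-transformed data. The function name `fit_power_law` and the synthetic history are assumptions made for this example, not the authors' method.

```python
from math import exp, log


def fit_power_law(sizes_ksloc, efforts_pm):
    """Least-squares fit of log(effort) = log(a) + e * log(size)."""
    xs = [log(s) for s in sizes_ksloc]
    ys = [log(v) for v in efforts_pm]
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two past projects")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("project sizes must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    e = sxy / sxx                    # slope in log space = exponent
    a = exp(mean_y - e * mean_x)     # intercept in log space = log(a)
    return a, e


# Synthetic history generated from the nominal equation (a = 2.94, e = 1.0997),
# so the fit is expected to recover those two values.
sizes = [2.0, 8.0, 9.0, 50.0]
efforts = [2.94 * s ** 1.0997 for s in sizes]
a, e = fit_power_law(sizes, efforts)
print(round(a, 2), round(e, 4))
```

On real repository data the fitted values would differ from the nominal ones, which is the point of calibrating against an organisation's own history.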
IMPLICATION
It will encourage practitioners to move away from archaic estimation methods and to select estimation tools that incorporate risk and uncertainty into the estimate.
References:
[Abbasi & Soleimanian, 2011] Zeinab Abbasi Khalifehloua, Farhad Soleimanian Gharehchopogh, "A Survey of Data Mining Techniques in Software Cost Estimation", AWER Procedia Information Technology and Computer Science, 2nd World Conference on Information Technology (WCIT-2011).
[Albrecht & Gaffney, 1983] A.J. Albrecht and J.E. Gaffney, "Software Function, Source Lines of Code, and Development Effort Prediction: A Software Science Validation", IEEE Transactions on Software Engineering, pp. 639–647, 1983.
[Batra & Trivedi, 2013] Geetika Batra and Mahima Trivedi, "A Fuzzy Approach For Effort Estimation", International Journal on Cybernetics & Informatics (IJCI), Vol. 2, No. 1, February 2013.
[Benala et al., 2012] T.R. Benala, S. Dehuri, S.Ch. Satapathy, Madhurakshara, "Genetic Algorithm for Optimizing Functional Link Artificial Neural Network Based Software Cost Estimation", Springer-Verlag Berlin Heidelberg, pp. 75–82, 2012.
[Chen et al., 2005] Z. Chen, T. Menzies, D. Port, B. Boehm, "Finding The Right Data For Software Cost Modeling", IEEE Software, vol. 22, no. 6, pp. 38–46, Nov.-Dec. 2005.
[Das et al., 2011] Anupama Das, Kaberi Das, B. Puthal, "Improving Software Development Process through Data Mining Techniques Embedding Alitheia Core Tool", International Journal of Computer Science and Information Technologies, Vol. 2 (2), pp. 629–632, 2011.
[Dejaeger et al., 2011] Karel Dejaeger, Wouter Verbeke, David Martens, Bart Baesens, "Data Mining Techniques for Software Effort Estimation: A Comparative Study", IEEE Transactions on Software Engineering, vol. 38, no. 2, pp. 375–397, March-April 2012, doi:10.1109/TSE.2011.55.
[Dizaji et al., 2014] Zahra Ashegi Dizaji, Reza Ahmadi, Hojjat Gholizadeh, Farhad Soleimanian, "A Bee Colony Optimization Algorithm Approach for Software Cost Estimation", International Journal of Computer Applications (0975 – 8887), Volume 104, No. 12, October 2014.
[Matson et al., 1994] J.E. Matson, B.E. Barrett and J.M. Mellichamp, "Software Development Cost Estimation Using Function Points", IEEE Transactions on Software Engineering, pp. 275–287, 1994.
[Prasad & RamaKrishna, 2010] A.V. Krishna Prasad, S. RamaKrishna, "Data Mining for Secure Software Engineering – Source Code Management Tool Case Study", International Journal of Engineering Science and Technology, Vol. 2 (7), pp. 2667–2677, 2010.
[Putnam, 1978] L.H. Putnam, "A General Empirical Solution To The Macro Software Sizing And Estimation Problem", IEEE Transactions on Software Engineering, pp. 345–361, July 1978.
[Sharma & Litoriya, 2012] Narendra Sharma, Ratnesh Litoriya, "Incorporating Data Mining Technique on Software Cost Estimation: Validation and Improvement", Volume 2, Issue 3, March 2012.
[Sheta, 2006] A.F. Sheta, "Estimation of the COCOMO Model Parameters Using Genetic Algorithms for NASA Software Projects", Journal of Computer Science, vol. 2, pp. 118–123, 2006.
[Taylor & Giraud, 2010] Quinn Taylor, Christophe Giraud-Carrier, "Applications of Data Mining in Software Engineering", International Journal of Data Analysis Techniques and Strategies, Vol. 2, No. 3, 2010.
[Hassan et al., 2012] Hassan Najadat, Izzat Alsmadi, Yazan Shboul, "Predicting Software Projects Cost Estimation Based on Mining Historical Data", International Scholarly Research Network, ISRN Software Engineering, Volume 2012, Article ID 823437, 8 pages, doi:10.5402/2012/823437.
[Katholieke et al., 2012] K. Dejaeger, W. Verbeke, D. Martens, B. Baesens, "Data Mining Techniques for Software Effort Estimation: A Comparative Study", IEEE Transactions on Software Engineering, vol. 38, no. 2, pp. 375–397, March-April 2012.
[Khatibi & Dayang, 2010] Vahid Khatibi, Dayang, "Software Cost Estimation Methods: A Review", Journal of Emerging Trends in Computing and Information Sciences, Volume 2, No. 1, ISSN 2079-8407, 2010-11.
[Kocaguneli et al., 2012] E. Kocaguneli, T. Menzies, A. Bener, J.W. Keung, "Exploiting the Essential Assumptions of Analogy-Based Effort Estimation", IEEE Transactions on Software Engineering, vol. 38, no. 2, pp. 425–438, March-April 2012.
[Kumari & Pushkar, 2013] Sweta Kumari, Shashank Pushkar, "Performance Analysis of the Software Cost Estimation Methods: A Review", International Journal of Advanced Research in Computer Science and Software Engineering, Volume 3, Issue 7, July 2013.
[Lum et al., 2008] Karen T. Lum, Daniel R. Baker and Jairus M. Hihn, "The Effects of Data Mining Techniques on Software Cost Estimation", IEEE International Engineering Management Conference (IEMC Europe 2008), pp. 1–5, 28-30 June 2008.
Thank You…!!!