Towards Detecting Performance Anti-patterns Using Classification Techniques
Manjala Peiris and James H. Hill
1st International Workshop on Machine Learning & Information Retrieval for Software Evolution
Nov 11, 2013, Silicon Valley, California, USA.
Motivation: Software Performance Anti-Patterns
• Common design choices that have negative consequences
• Focus solely on the performance of the system (e.g., throughput, response time)
• Suggest solutions and refactorings
• Examples: One Lane Bridge, Excessive Dynamic Allocations, God Class
One Lane Bridge (Smith et al.)
Reasons for Anti-Pattern
• Lack of concurrency
• Limited number of resources
• Not utilizing available resources
Consequences
• Low system throughput
• High latency
• High response time
Only one or a few processes/threads are allowed to execute concurrently
Excessive Dynamic Allocations (Smith et al.)
Reason for Anti-Pattern
• Objects are created when they are first accessed and then destroyed when no longer needed.
Consequences
• The cost of dynamic allocations
N: number of calls; Sc, Sd: costs of an object creation and deletion
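The formula that used these symbols did not survive in this transcript. The sketch below assumes the simplest model consistent with the definitions, cost = N × (Sc + Sd), where every call pays for one creation and one deletion. The function name and the example values are hypothetical.

```python
def dynamic_allocation_cost(n_calls, s_create, s_delete):
    """Assumed model: each call pays one object creation and one deletion."""
    return n_calls * (s_create + s_delete)

# Hypothetical numbers: 1,000,000 calls, 2 us per creation, 1 us per deletion.
total_us = dynamic_allocation_cost(1_000_000, 2.0, 1.0)
print(total_us)
```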
Why Automatic Detection of Performance Anti-Patterns
• It is difficult to manually analyze large amounts of performance data
• Make sense of large amounts of performance data rather than just showing it to users
• Gives system designers intuition about where refactoring is required
Current Approaches for Anti-Pattern Detection
Approach based on software design artifacts
1. Annotate the software design
2. Run simulations and gather performance data
3. Apply rules
Approaches based on runtime data
• Architecture dependent (e.g., J2EE anti-patterns)
• Require architecture-specific deployment details
Non-intrusive Performance Anti-Pattern Detector (NiPAD)
• Collect system performance metrics
  • Software execution with a performance anti-pattern (Class 0)
  • Software execution without the performance anti-pattern (Class 1)
• Normalize the data
• Train a classifier
  • Naïve Bayes, Logistic Regression, FLD, SVM (Linear), SVM (RBF)
• Predict the class of new performance data whose label is unknown
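The steps above can be sketched end to end. This is a minimal illustration, not the NiPAD implementation. It assumes min-max normalization and a hand-written Gaussian Naïve Bayes, and it runs on synthetic two-feature data in which the anti-pattern run (Class 0) is assumed to leave the CPU mostly idle. All function names and numbers are hypothetical.

```python
import math
import random

def fit_minmax(rows):
    """Return per-column (min, max) learned from the training rows."""
    return [(min(c), max(c)) for c in zip(*rows)]

def apply_minmax(rows, bounds):
    """Scale each column with the training bounds (test values may leave [0, 1])."""
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, (lo, hi) in zip(row, bounds)] for row in rows]

def train_gaussian_nb(rows, labels):
    """Estimate a prior plus per-feature mean and variance for each class."""
    model = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        stats = []
        for col in zip(*members):
            mean = sum(col) / len(col)
            var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-6
            stats.append((mean, var))
        model[cls] = (len(members) / len(rows), stats)
    return model

def predict(model, row):
    """Pick the class with the highest log posterior."""
    def log_posterior(cls):
        prior, stats = model[cls]
        score = math.log(prior)
        for v, (mean, var) in zip(row, stats):
            score += -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)

# Synthetic stand-in for collected metrics: [cpu_idle, cpu_user] per epoch.
rng = random.Random(0)

def sample(label, n):
    idle_center = 70.0 if label == 0 else 20.0  # assumption, not measured data
    return [[rng.gauss(idle_center, 5.0), rng.gauss(90.0 - idle_center, 5.0)]
            for _ in range(n)]

train_x = sample(0, 100) + sample(1, 100)
train_y = [0] * 100 + [1] * 100
test_x = sample(0, 50) + sample(1, 50)
test_y = [0] * 50 + [1] * 50

bounds = fit_minmax(train_x)
model = train_gaussian_nb(apply_minmax(train_x, bounds), train_y)
predictions = [predict(model, r) for r in apply_minmax(test_x, bounds)]
accuracy = sum(p == y for p, y in zip(predictions, test_y)) / len(test_y)
print(f"accuracy on synthetic data: {accuracy:.2f}")
```

The normalization bounds are learned from the training records only and then reused on the unseen records, so no information from the test data reaches the classifier.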
System-Level Metrics
Metric: Description
CPU Idle Time: Time the CPU is idle, not doing any work
CPU User Time: CPU utilization for user applications
CPU System Time: CPU utilization for system-level programs
Free Memory: Total free memory when invoking the application
Cached Memory: Total cached memory available when invoking the application
Total Commits: Total number of commits
• Metrics are collected in 1-second epochs
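The slides do not say how the metrics were gathered. As one possible sketch, assuming a Linux host, the three CPU metrics can be derived from two readings of the aggregate `cpu` line in `/proc/stat` taken one epoch apart. The helper names and sample values are hypothetical. Folding `nice` time into user time and ignoring the remaining counters (iowait, irq, and so on) are simplifications.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of Linux /proc/stat into tick counters."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    user, nice, system, idle = (int(x) for x in fields[1:5])
    return {"user": user + nice, "system": system, "idle": idle}

def cpu_shares(before, after):
    """Fraction of the epoch spent in each state, from two samples."""
    delta = {k: after[k] - before[k] for k in before}
    total = sum(delta.values())
    if total == 0:
        return dict.fromkeys(delta, 0.0)
    return {k: d / total for k, d in delta.items()}

# Two hypothetical samples taken one second apart.
t0 = parse_cpu_line("cpu  1000 0 500 8500 0 0 0 0 0 0")
t1 = parse_cpu_line("cpu  1030 0 520 8550 0 0 0 0 0 0")
shares = cpu_shares(t0, t1)
print(shares)
```

A live collector would read the first line of `/proc/stat` once per epoch and pass consecutive readings to `cpu_shares`.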
Figure: CPU times with One Lane Bridge
Figure: CPU times without One Lane Bridge
Experiments with Apache Web Server
• Emulating the One Lane Bridge anti-pattern
• Use Apache Benchmark to generate a load
• Server configurations
  • One Lane Bridge: 300 concurrent clients sending 1 million requests, server has 150 threads
  • Without One Lane Bridge: 300 concurrent clients sending 1 million requests, server has 300 threads
200 records for training, 400 records for testing
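The client load described above could be generated with ApacheBench (`ab`) roughly as follows. The slides do not give the exact command, so this is only a sketch. The target URL is a placeholder, and only the `-n` (total requests) and `-c` (concurrent clients) options are used. The helper name is hypothetical.

```python
def ab_command(url, total_requests=1_000_000, concurrency=300):
    """Build an ApacheBench argv list; pass it to subprocess.run to execute."""
    return ["ab", "-n", str(total_requests), "-c", str(concurrency), url]

cmd = ab_command("http://localhost:80/")  # placeholder URL, not from the slides
print(" ".join(cmd))
```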
Classification Results for One Lane Bridge
Figure: bar chart of accuracy (axis 0 to 1) per classifier for One Lane Bridge: Naïve Bayes, Logistic Regression, FLD, SVM (Linear), SVM (RBF)
Classification Results for One Lane Bridge with Noise
Figure: bar chart of accuracy (axis 0 to 0.8) per classifier for One Lane Bridge with noise: Naïve Bayes, Logistic Regression, FLD, SVM (Linear), SVM (RBF)
Experiments with Apache Web Server
• Emulating the Excessive Dynamic Allocations anti-pattern
• Server configurations
  • Excessive Dynamic Allocations: 300 concurrent clients sending 1 million requests, server has 300 threads, memory pool size of 1 KB
  • Without Excessive Dynamic Allocations: 300 concurrent clients sending 1 million requests, server has 300 threads, memory pool size of 1 MB
200 records for training, 400 records for testing
Classification Results for Excessive Dynamic Allocations
Figure: bar chart of accuracy (axis 0 to 0.7) per classifier for Excessive Dynamic Allocations: Naïve Bayes, Logistic Regression, FLD, SVM (Linear), SVM (RBF)
Reason for poor classification performance
• Emery et al. show that custom memory allocation techniques do not offer much advantage
Cost Analysis for One Lane Bridge
• The positive class is the one that does not have the anti-pattern
  • The classifiers predict these situations more accurately
• Cost of misclassification depends on the nature of the software and the software development cost
• This technique will eliminate unnecessary software testing
• Not suitable for real-time software systems

Classifier     Sensitivity  Specificity  Precision  Accuracy
Logistic       0.95         0.62         0.53       0.76
FLD            0.95         0.66         0.56       0.70
Naïve Bayes    0.92         0.28         0.38       0.50
SVM (Linear)   0.98         0.92         0.84       0.94
SVM (RBF)      0.96         0.70         0.61       0.75
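For reference, the four columns follow the standard confusion-matrix definitions, with "positive" meaning an execution without the anti-pattern. The sketch below shows how they are computed. The counts are hypothetical and are not taken from the experiments.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard definitions; positive = execution without the anti-pattern."""
    return {
        "sensitivity": tp / (tp + fn),  # positives correctly recognized
        "specificity": tn / (tn + fp),  # anti-pattern runs correctly recognized
        "precision": tp / (tp + fp),    # predicted positives that are real
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=90, fp=20, tn=80, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Low specificity means that runs containing the anti-pattern are often labeled as clean, which is the costly kind of error when a missed anti-pattern is expensive to fix later.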
Concluding Remarks
Limitations
• System-level performance metrics may not show enough variation (e.g., Excessive Dynamic Allocations)
• Bad performance may have other causes (e.g., configuration errors, bad user inputs)
Future work
• Currently including the behavior of the software application in the analysis
• Applying this technique to other software applications
Questions