, Sujuan - NDSS Symposium
Zhen Li¹, Deqing Zou¹, Shouhuai Xu², Xinyu Ou¹, Hai Jin¹, Sujuan Wang¹, Zhijun Deng¹, Yuyi Zhong¹
¹ Huazhong University of Science and Technology (HUST), Wuhan, China
² University of Texas at San Antonio (UTSA), San Antonio, USA
Automatic Software Vulnerability Detection
• Automatic detection of software vulnerabilities is an important research problem
• Static vulnerability detection tools and studies: RATS, VUDDY (SP’17), ReDeBug, …, VulPecker (ACSAC’16)
Drawbacks of Existing Approaches
• First, imposing intense labor of human experts
  ✓ Define features
• Second, incurring high false negative rates
  ✓ Two most recent vulnerability detection systems
    • VUDDY (SP’17): false negative rate = 18.2% for Apache HTTPD 2.4.23
    • VulPecker (ACSAC’16): false negative rate = 38% with respect to 455 vulnerability samples
Research Problem
• Given the source code of a target program, how can we determine whether or not the target program is vulnerable and, if so, where the vulnerabilities are?
  ✓ Without asking human experts to manually define features
  ✓ Without incurring a high false negative rate or false positive rate
Our Main Contribution
Vulnerability Deep Pecker (VulDeePecker):
A deep learning-based system for automatically detecting vulnerabilities in programs (source code)
Outline
• Guiding Principles
• Design of VulDeePecker
• Experiments and Results
• Limitations
• Conclusion
Guiding Principles: three questions
Q1: How to represent software programs for deep learning-based vulnerability detection?
Q2: What is the appropriate granularity for deep learning-based vulnerability detection?
Q3: How to select a specific neural network for vulnerability detection?
Guiding Principles
Q1: How to represent software programs for deep learning-based vulnerability detection?
Preserve the semantic relationships between the programs’ elements (e.g., data-flow and control-flow information).
Guiding Principles
Q2: What is the appropriate granularity for deep learning-based vulnerability detection?
Represent programs at a finer granularity than treating a whole program or a function as a unit.
Guiding Principles
Q3: How to select a specific neural network for vulnerability detection?
Neural networks that can cope with contexts may be suitable for vulnerability detection.
[Figure: candidate neural networks. CNN, DBN, DNN, …; RNN variants: traditional RNN, LSTM, GRU, …; LSTM variants: unidirectional LSTM, bidirectional LSTM (this paper)]
Overview of VulDeePecker
The Concept of Code Gadget
• A unit for vulnerability detection
• A number of program statements that are semantically related to each other in terms of data dependency or control dependency
• Example: vulnerabilities related to library/API function calls
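The code gadget idea can be illustrated with a toy backward slice: start from a library call and collect the earlier statements it depends on through data flow. This is only a sketch under strong assumptions. The `Stmt` type, the hand-written `defs`/`uses` sets, and the `backward_slice` helper are hypothetical names introduced here; the sketch handles straight-line code only, ignores control dependency and aliasing, and is not the slicing tooling used in the work.

```python
from dataclasses import dataclass, field


@dataclass
class Stmt:
    line: int
    text: str
    defs: set = field(default_factory=set)  # variables written by the statement
    uses: set = field(default_factory=set)  # variables read by the statement


def backward_slice(stmts, call_line):
    """Collect the statements that the call at `call_line` depends on via data flow.

    Toy version: definitions are never "killed", so the result over-approximates.
    """
    by_line = {s.line: s for s in stmts}
    wanted = set(by_line[call_line].uses)  # variables whose definitions are still needed
    gadget = [by_line[call_line]]
    for s in sorted(stmts, key=lambda s: s.line, reverse=True):
        if s.line >= call_line:
            continue
        if s.defs & wanted:
            gadget.append(s)
            wanted |= s.uses  # also track what this statement reads
    return sorted(gadget, key=lambda s: s.line)


# Hypothetical snippet; defs/uses are annotated by hand for illustration.
program = [
    Stmt(1, "char buf[8];", defs={"buf"}),
    Stmt(2, "int count = 0;", defs={"count"}),
    Stmt(3, "char *src = argv[1];", defs={"src"}, uses={"argv"}),
    Stmt(4, "count++;", defs={"count"}, uses={"count"}),
    Stmt(5, "strcpy(buf, src);", defs={"buf"}, uses={"buf", "src"}),
]

gadget = backward_slice(program, call_line=5)
gadget_lines = [s.line for s in gadget]
```

Here the statements that only touch `count` are left out, because the `strcpy` call does not depend on them.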
Step I: Generating Code Gadgets
[Figure: a code gadget corresponding to strcpy()]
Step II: Generating Ground Truth Labels
• Each code gadget is labeled as “1” (i.e., vulnerable) or “0” (i.e., not vulnerable).
  ✓ According to the diff files
  ✓ According to the vulnerable statements
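One way to picture labeling from a diff file is sketched below. The rule used here (a gadget is labeled 1 if it contains any line that the patch deletes) is an assumption made for illustration, as are the helper names `removed_lines_from_diff` and `label_gadget` and the sample patch. The parser assumes a unified diff that covers a single file.

```python
import re

# Hunk header of a unified diff, e.g. "@@ -3,3 +3,3 @@"; captures the old-file start line.
HUNK = re.compile(r"^@@ -(\d+)(?:,\d+)? \+\d+(?:,\d+)? @@")


def removed_lines_from_diff(diff_text):
    """Return old-file line numbers of the lines a single-file unified diff deletes."""
    removed, old_line = [], None
    for line in diff_text.splitlines():
        match = HUNK.match(line)
        if match:
            old_line = int(match.group(1))
        elif old_line is None:
            continue  # file headers before the first hunk
        elif line.startswith("-"):
            removed.append(old_line)
            old_line += 1
        elif line.startswith("+"):
            continue  # added lines do not exist in the old file
        else:
            old_line += 1  # context line
    return removed


def label_gadget(gadget_lines, flagged_lines):
    """Label a gadget 1 (vulnerable) if it touches any flagged line, else 0."""
    return 1 if set(gadget_lines) & set(flagged_lines) else 0


# Hypothetical patch, written as a list so the leading spaces of context lines survive.
diff_text = "\n".join([
    "--- a/copy.c",
    "+++ b/copy.c",
    "@@ -3,3 +3,3 @@",
    " char *src = argv[1];",
    "-strcpy(buf, src);",
    "+strncpy(buf, src, sizeof(buf) - 1);",
    " return 0;",
])

flagged = removed_lines_from_diff(diff_text)
label_a = label_gadget([1, 3, 4], flagged)  # contains the deleted strcpy line
label_b = label_gadget([1, 2], flagged)     # does not touch the patch
```

When the vulnerable statements are already known, the same `label_gadget` call can take their line numbers directly instead of lines recovered from a diff.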
Step III: Transforming Code Gadgets into Vectors
• Transform code gadgets into their symbolic representations
• Encode the symbolic representations into vectors
[Figure annotation: 7 tokens]
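The two sub-steps can be sketched roughly as follows. This is an illustrative approximation: the `VAR`/`FUN` placeholder scheme, the tiny keyword and library-call sets, and the `symbolize`/`encode` helpers are assumptions introduced here, and the integer ids stand in for whatever learned embedding the real system uses.

```python
import re

# Small illustrative sets; a real tool would need complete lists.
C_KEYWORDS = {"char", "int", "return", "if", "else", "for", "while", "sizeof", "void"}
LIBRARY_CALLS = {"strcpy", "strncpy", "memcpy", "malloc", "free"}
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")
IDENT = re.compile(r"[A-Za-z_]\w*")


def symbolize(statements):
    """Rename user-defined identifiers to VAR1, VAR2, ... and FUN1, FUN2, ..."""
    var_names, fun_names, out = {}, {}, []
    for stmt in statements:
        tokens = TOKEN.findall(stmt)
        renamed = []
        for i, tok in enumerate(tokens):
            if not IDENT.fullmatch(tok) or tok in C_KEYWORDS or tok in LIBRARY_CALLS:
                renamed.append(tok)  # punctuation, numbers, keywords, library calls
            elif i + 1 < len(tokens) and tokens[i + 1] == "(":
                renamed.append(fun_names.setdefault(tok, f"FUN{len(fun_names) + 1}"))
            else:
                renamed.append(var_names.setdefault(tok, f"VAR{len(var_names) + 1}"))
        out.append(renamed)
    return out


def encode(token_lists):
    """Flatten the gadget and map each token to an integer id in first-seen order."""
    vocab, ids = {}, []
    for tokens in token_lists:
        for tok in tokens:
            ids.append(vocab.setdefault(tok, len(vocab)))
    return ids, vocab


# Hypothetical gadget; get_input is a made-up user-defined function.
gadget = ["char buf[8];", "char *src = get_input(argv);", "strcpy(buf, src);"]
symbolic = symbolize(gadget)
ids, vocab = encode(symbolic)
```

With this tokenizer the call statement becomes `strcpy ( VAR1 , VAR2 ) ;`, which is seven tokens. Renaming identifiers keeps project-specific names from leaking into the vocabulary while preserving which variable flows where.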
Step IV: Training the BLSTM Neural Network
• The training process for learning the BLSTM neural network is standard
Steps V-VII: Detection Phase
Research Questions
RQ1: Can VulDeePecker deal with multiple types of vulnerabilities at the same time?
RQ2: Can human intelligence (other than defining features) improve the effectiveness of VulDeePecker?
RQ3: How effective is VulDeePecker when compared with other approaches?
• Metrics for evaluation
  ✓ False positive rate (FPR), false negative rate (FNR), recall, precision, F-measure
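For reference, the five metrics can be computed from confusion-matrix counts using their standard definitions, as in the sketch below. The function name `detection_metrics` and the sample counts are made up for illustration and are not results from the evaluation.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute FPR, FNR, recall, precision and F-measure from raw counts."""

    def ratio(num, den):
        return num / den if den else 0.0  # treat an empty denominator as 0.0

    recall = ratio(tp, tp + fn)
    precision = ratio(tp, tp + fp)
    return {
        "FPR": ratio(fp, fp + tn),   # non-vulnerable samples flagged as vulnerable
        "FNR": ratio(fn, tp + fn),   # vulnerable samples that were missed
        "recall": recall,            # equals 1 - FNR
        "precision": precision,
        "F-measure": ratio(2 * precision * recall, precision + recall),
    }


# Made-up counts, for illustration only.
metrics = detection_metrics(tp=80, fp=10, tn=90, fn=20)
```

A low FNR matters most for the drawback named earlier (missed vulnerabilities), while the F-measure balances that against false alarms.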
Preparing Input to VulDeePecker
• Programs collection for answering the RQs
  ✓ Two sources of vulnerability data
    • 19 C/C++ open source products whose vulnerabilities are described in NVD, and C/C++ test cases in SARD
  ✓ Collect 520 open source software program files and 8,122 test cases for the buffer error vulnerability (i.e., CWE-119), and 320 open source software program files and 1,729 test cases for the resource management error vulnerability (i.e., CWE-399)
• Training programs vs. target programs
  ✓ Randomly choose 80% of the programs we collect as training programs and the remaining 20% as target programs
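The random 80%/20% split of programs can be sketched as below. The helper name `split_programs`, the fixed seed, and the placeholder file names are assumptions for illustration; the slides do not say how the random choice was implemented.

```python
import random


def split_programs(programs, train_fraction=0.8, seed=0):
    """Shuffle the programs and split them into training and target sets."""
    shuffled = list(programs)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder program names, for illustration only.
programs = [f"program_{i}.c" for i in range(10)]
training, target = split_programs(programs)
```

Splitting at the program level, before any gadgets are extracted, means that gadgets from one program do not end up on both sides of the split.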
Learning BLSTM Neural Networks
• Datasets for answering the RQs
  ✓ Code Gadget Database (CGD): 61,638 code gadgets
  ✓ Six datasets of CGD
BE: Buffer error vulnerabilities
RM: Resource management vulnerabilities
HY: Hybrid of the above two types of vulnerabilities
ALL: All library/API function calls
SEL: Manually selected library/API function calls
RQ1
RQ1: Can VulDeePecker deal with multiple types of vulnerabilities at the same time?
RM: 16 function calls related to vulnerabilities
BE: 124 function calls related to vulnerabilities
• Insight: VulDeePecker can detect multiple types of vulnerabilities, but the effectiveness is sensitive to the amount of data (which is common to deep learning).
RQ2
RQ2: Can human intelligence (other than defining features) improve the effectiveness of VulDeePecker?
• Insight: Human expertise can be used to select function calls to improve the effectiveness of VulDeePecker.
RQ3: VulDeePecker vs. Static Analysis Tools
RQ3: How effective is VulDeePecker when compared with other approaches?
• Insight: A deep learning-based vulnerability detection system can be more effective by taking advantage of the data-flow information.
RQ3: VulDeePecker vs. Code Similarity-Based Approaches
RQ3: How effective is VulDeePecker when compared with other approaches?
• Insight: VulDeePecker is more effective than code similarity-based approaches
Using VulDeePecker in Practice
• VulDeePecker detected 4 vulnerabilities that were not reported in the NVD but were “silently” patched by the vendors.
• These vulnerabilities are missed by most of the other vulnerability detection systems mentioned above
Limitations and Open Problems
• Present design
  ✓ Assuming source code is available
  ✓ Only dealing with C/C++ programs
  ✓ Only dealing with vulnerabilities related to library/API function calls
  ✓ Only accommodating data-flow information, but not control-flow information
  ✓ Using some heuristics
• Present implementation
  ✓ Limited to the BLSTM neural network
• Present evaluation
  ✓ The dataset only contains vulnerabilities about buffer errors and resource management errors
Conclusion
• We initiate the study of using deep learning for vulnerability detection, and discuss some preliminary guiding principles
• We present VulDeePecker, and evaluate it from 3 perspectives
• We present the first dataset for evaluating deep learning-based vulnerability detection systems
  • https://github.com/CGCL-codes/VulDeePecker
New Results (after finishing the paper; in submission)
• Cope with all kinds of vulnerabilities (including those related to library/API function calls)
• Accommodate both data dependency and control dependency
• Detect 7 (potential) 0-day vulnerabilities and 8 silently patched vulnerabilities from 4 software products
• Some deep neural networks are more powerful than others
Takeaways
• The first deep learning-based vulnerability detection system using a finer-granularity unit, the code gadget
• Guiding principles for deep learning-based vulnerability detection
• The first dataset for evaluating deep learning-based vulnerability detection systems