Ariu - Workshop on Multiple Classifier Systems 2011


Page 1: Ariu - Workshop on Multiple Classifier Systems 2011

A modular architecture for the analysis of HTTP payloads based

on Multiple Classifiers

Davide Ariu

Giorgio Giacinto

Department of Electrical and Electronic Engineering

University of Cagliari

Pattern Recognition and Applications Group http://prag.diee.unica.it

This research was sponsored by the Autonomous Region of Sardinia through a grant financed with the "Sardinia PO FSE 2007-2013" funds and provided according to the L.R. 7/2007

Naples, 17 June 2011

Page 2: Ariu - Workshop on Multiple Classifier Systems 2011

Outline

•  Motivations

•  The proposed system

•  Experimental Setup and Results

•  Conclusions


Page 3: Ariu - Workshop on Multiple Classifier Systems 2011

The objective

Design of an anomaly based Intrusion Detection System for the protection of Web Servers and Applications.

The HTTP traffic toward the web servers is inspected by a multiple classifier system.


Page 4: Ariu - Workshop on Multiple Classifier Systems 2011

Why Web Applications?


Page 5: Ariu - Workshop on Multiple Classifier Systems 2011

Why Anomaly Detection?


Page 6: Ariu - Workshop on Multiple Classifier Systems 2011

A legitimate Payload...

    GET /pra/ita/home.php HTTP/1.1
    Host: prag.diee.unica.it
    Accept: text/*, text/html
    User-Agent: Mozilla/4.0

Page 7: Ariu - Workshop on Multiple Classifier Systems 2011

A legitimate Payload...

Request Line:
    GET /pra/ita/home.php HTTP/1.1

    Host: prag.diee.unica.it
    Accept: text/*, text/html
    User-Agent: Mozilla/4.0

Page 8: Ariu - Workshop on Multiple Classifier Systems 2011

A legitimate Payload...

Request Line:
    GET /pra/ita/home.php HTTP/1.1

Request Headers:
    Host: prag.diee.unica.it
    Accept: text/*, text/html
    User-Agent: Mozilla/4.0
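The split into Request Line and Request Headers shown above can be sketched in a few lines of Python. This is only an illustration: the function name `split_payload` and the dictionary key `"Request-Line"` are assumptions made for the example, not names taken from the presented system.

```python
def split_payload(payload: str) -> dict[str, str]:
    """Split a raw HTTP request into its request line and named headers.

    Hypothetical helper: only the head of the message (everything before
    the first blank line) is parsed; a message body, if any, is ignored.
    """
    head = payload.replace("\r\n", "\n").split("\n\n", 1)[0]
    lines = [ln for ln in head.split("\n") if ln.strip()]
    if not lines:
        return {}
    parts = {"Request-Line": lines[0].strip()}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:  # skip lines that are not "Name: value" headers
            parts[name.strip()] = value.strip()
    return parts


payload = (
    "GET /pra/ita/home.php HTTP/1.1\r\n"
    "Host: prag.diee.unica.it\r\n"
    "Accept: text/*, text/html\r\n"
    "User-Agent: Mozilla/4.0\r\n"
    "\r\n"
)
parts = split_payload(payload)
```

Each entry of `parts` is one line of the payload, which is the unit the rest of the talk works with.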

Page 9: Ariu - Workshop on Multiple Classifier Systems 2011

...and some attacks

•  Long Request Buffer Overflow
    HEAD / aaaaaaa…aaaaaaaaaaaa

•  URL Decoding Error
    GET /d/winnt/sys32/cmd.exe?/c+dir HTTP/1.0
    Host: www
    Connection: close


Page 10: Ariu - Workshop on Multiple Classifier Systems 2011

Why Payload Analysis?

•  Detection of Web-based attacks can be based on
    –  Analysis of the Request-Line
        •  Allows detecting only attacks that exploit input-validation flaws
        •  Used by e.g. Spectrogram [Song,2009] and HMM-Web [Corona,2009]
    –  HTTP Payload Analysis
        •  Takes into account the whole HTTP request, and thus can (in principle) detect any kind of attack


Page 11: Ariu - Workshop on Multiple Classifier Systems 2011

State of the Art - Payload Analysis

•  PAYL [Wang,2004]
    –  n-grams to represent byte statistics
•  McPAD [Perdisci,2009]
    –  Ensemble of one-class SVMs trained on ν-grams
•  Spectrogram [Song,2009]
    –  Ensemble of Markov chains to analyze the Request-Line
•  HMMPayl [Ariu,2011]
    –  Ensemble of HMMs to analyze sequences of bytes from the whole payload

None of the above techniques represented the structure of the payload
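To make "n-grams to represent byte statistics" concrete, here is a minimal sketch that computes relative frequencies of byte n-grams over a payload. It illustrates only the representation: it is not the model used by any of the systems listed above, and the function name `byte_ngram_freq` is an assumption made for the example.

```python
from collections import Counter


def byte_ngram_freq(payload: bytes, n: int = 1) -> dict[bytes, float]:
    """Relative frequency of each byte n-gram found in the payload."""
    total = len(payload) - n + 1  # number of n-gram windows
    if total <= 0:
        return {}
    counts = Counter(payload[i:i + n] for i in range(total))
    return {gram: c / total for gram, c in counts.items()}


# A shortened stand-in for the long-request example shown earlier.
freq1 = byte_ngram_freq(b"HEAD / aaaaaaaa", n=1)
freq2 = byte_ngram_freq(b"HEAD / aaaaaaaa", n=2)
```

Note that the whole payload is treated as one flat byte string here: the result says nothing about which line a byte came from, which is the limitation pointed out above.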


Page 12: Ariu - Workshop on Multiple Classifier Systems 2011

The proposed system: Basic Idea

•  We propose to take into account the structure of HTTP payloads
    –  For each line of the payload, an ensemble of HMMs is used to model the sequences of bytes.
    –  The final decision is obtained by using the HMM outputs as features. The payload is thus classified by a one-class classifier trained on the outputs of the HMM ensembles.
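The idea above can be sketched as follows. This is a minimal illustration under stated assumptions, not the presented implementation: the HMMs are randomly initialised rather than trained (Baum-Welch training is omitted), the ensemble output is taken to be the mean per-byte log-likelihood (the slides do not say how the HMM outputs are combined or scaled), and all function names are hypothetical.

```python
import math
import random


def random_hmm(n_states: int, n_symbols: int, rng: random.Random):
    """Untrained HMM (pi, A, B) with random row-stochastic parameters."""
    def row(k):
        w = [rng.random() + 1e-3 for _ in range(k)]
        s = sum(w)
        return [x / s for x in w]
    pi = row(n_states)
    A = [row(n_states) for _ in range(n_states)]   # state transitions
    B = [row(n_symbols) for _ in range(n_states)]  # byte emissions
    return pi, A, B


def log_likelihood(hmm, seq: bytes) -> float:
    """Scaled forward algorithm: log P(seq | hmm)."""
    if not seq:
        return 0.0
    pi, A, B = hmm
    n = len(pi)
    alpha = [pi[i] * B[i][seq[0]] for i in range(n)]
    loglik = 0.0
    for t in range(len(seq)):
        if t > 0:
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][seq[t]]
                     for j in range(n)]
        scale = sum(alpha)          # rescale to avoid numeric underflow
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik


def ensemble_score(hmms, line: bytes) -> float:
    """Assumed combination rule: mean per-byte log-likelihood over the HMMs."""
    if not line:
        return 0.0
    return sum(log_likelihood(h, line) for h in hmms) / (len(hmms) * len(line))


rng = random.Random(0)
# One ensemble per payload line; sizes (3 HMMs, 3 states) are arbitrary here.
ensembles = {name: [random_hmm(3, 256, rng) for _ in range(3)]
             for name in ("Request-Line", "Host")}
request = {"Request-Line": b"GET /pra/index.php HTTP/1.1",
           "Host": b"prag.diee.unica.it"}
features = {name: ensemble_score(ensembles[name], line)
            for name, line in request.items()}
```

The `features` dictionary plays the role of the feature vector handed to the one-class classifier: one number per payload line.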


Page 13: Ariu - Workshop on Multiple Classifier Systems 2011

The proposed system: A scheme

[Figure: the HTTP payload is split into its lines, and each line is scored by a dedicated HMM ensemble (Request-Line, User-Agent, Host, Accept-Encoding, Accept-Language). The ensemble outputs (0.62, -1, 0.53, 0.34, 0.49 in the example) form the feature vector of a one-class classifier, whose output score or class label is the decision of the IDS.]

Example HTTP payload:
    GET /pra/index.php HTTP/1.1
    Host: prag.diee.unica.it
    User-Agent: Mozilla/5.0
    Accept-Encoding: gzip, deflate

Page 14: Ariu - Workshop on Multiple Classifier Systems 2011

Missing Features

•  Each request typically does not contain all the headers
    –  Training phase: the value of the feature related to a missing header has been set to the average value
    –  Testing phase: the value of the feature related to a missing header has been set to -1
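The two rules above can be sketched as follows. Two points are assumptions of this example rather than statements from the slides: "average value" is read as the mean of that feature over the training requests that do contain the header, and a header never seen in training falls back to 0.0. The names `MISSING`, `fill_training` and `fill_test` are hypothetical.

```python
MISSING = None  # marker for a header that is absent from the request


def fill_training(rows):
    """Training phase: replace missing entries with the column average.

    Assumption: the average is taken over the requests where the header
    is present; a column with no observed value falls back to 0.0.
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        present = [r[j] for r in rows if r[j] is not MISSING]
        means.append(sum(present) / len(present) if present else 0.0)
    return [[means[j] if r[j] is MISSING else r[j] for j in range(n_cols)]
            for r in rows]


def fill_test(row):
    """Testing phase: a missing header is encoded as -1."""
    return [-1.0 if v is MISSING else v for v in row]


# Made-up feature values, one column per header.
train = [[0.6, 0.5, MISSING],
         [0.4, MISSING, 0.3],
         [0.5, 0.7, 0.5]]
train_filled = fill_training(train)
test_filled = fill_test([0.62, MISSING, 0.53])
```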


Page 15: Ariu - Workshop on Multiple Classifier Systems 2011

Experimental Setup - 1

•  2 Datasets of "Real" legitimate traffic
    –  DIEE, collected at the University of Cagliari
    –  GT, collected at Georgia Tech


Page 16: Ariu - Workshop on Multiple Classifier Systems 2011

Experimental Setup - 2 

•  3 Datasets of "Real" Attacks
    –  Generic, 66 Attacks
    –  Shell-code, 11 Attacks
    –  XSS-SQL Injection, 38 Attacks

•  Training: 1 day of traffic
•  Test: the remaining traffic plus attacks
    –  K-fold CV

Page 17: Ariu - Workshop on Multiple Classifier Systems 2011

Experimental Setup - 3

•  4 One-class classification algorithms with default setting of parameters
    –  Gauss: Gaussian distribution
    –  Mog: Mixture of Gaussians
    –  Parzen: Parzen density estimator
    –  SVM: SVM with RBF Kernel

•  Performance evaluated using the "Partial AUC"
    –  Computed in the false-positive (FP) rate range [0, 0.1]
    –  Normalized by dividing by 0.1
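A minimal sketch of the partial AUC described above: the ROC curve is integrated with the trapezoidal rule over the false-positive range [0, 0.1] and the area is divided by 0.1, so a perfect detector scores 1. The convention that a higher score means "more anomalous", the function names and the sample scores are assumptions made for this example.

```python
def roc_points(scores_legit, scores_attack):
    """ROC points (FPR, TPR) obtained by sweeping a threshold over the scores.

    Assumption: a request is flagged as an attack when score >= threshold.
    """
    thresholds = sorted(set(scores_legit) | set(scores_attack), reverse=True)
    points = [(0.0, 0.0)]
    for th in thresholds:
        fpr = sum(s >= th for s in scores_legit) / len(scores_legit)
        tpr = sum(s >= th for s in scores_attack) / len(scores_attack)
        points.append((fpr, tpr))
    return points


def partial_auc(points, max_fpr=0.1):
    """Area under the ROC curve for FPR in [0, max_fpr], divided by max_fpr."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Linearly interpolate the TPR at the cut-off FPR.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2  # trapezoid
    return area / max_fpr


# Made-up scores: every attack scores above every legitimate request.
legit = [0.1, 0.2, 0.3, 0.4]
attacks = [0.8, 0.9]
pauc_separable = partial_auc(roc_points(legit, attacks))
# A detector on the ROC diagonal, for comparison.
pauc_diagonal = partial_auc([(0.0, 0.0), (1.0, 1.0)])
```

With this normalization a detector lying on the ROC diagonal obtains 0.05 rather than 0.5, so the values are not directly comparable to a full AUC.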


Page 18: Ariu - Workshop on Multiple Classifier Systems 2011

Experimental Results: Partial AUC – DIEE Dataset

Page 19: Ariu - Workshop on Multiple Classifier Systems 2011

Experimental Results: Multiple HMM – DIEE Dataset – Shellcode Attacks

Page 20: Ariu - Workshop on Multiple Classifier Systems 2011

Experimental Results: Partial AUC – GT Dataset

Page 21: Ariu - Workshop on Multiple Classifier Systems 2011

Experimental Results: Comparison with similar IDS

Page 22: Ariu - Workshop on Multiple Classifier Systems 2011

Computational Cost


Page 23: Ariu - Workshop on Multiple Classifier Systems 2011

Conclusions

•  We proposed an anomaly-based IDS for the protection of Web Servers and Web Applications

•  We exploited the MCS paradigm
    –  To analyze the structure of the HTTP payload
    –  By combining the outputs through a one-class classifier

•  Compared to similar systems, our proposal
    –  Provides high performance in attack detection
    –  Is fast


Page 24: Ariu - Workshop on Multiple Classifier Systems 2011

Thank You!