High Performance Deep Packet Inspection


  • The Raymond and Beverly Sackler Faculty of Exact Sciences

    The Blavatnik School of Computer Science

    High Performance Deep Packet Inspection

    Thesis submitted for the degree of Doctor of Philosophy

    by

    Yaron Koral

    This work was carried out under the supervision of

    Professor Yehuda Afek and Doctor Anat Bremler-Barr

    Submitted to the Senate of Tel Aviv University

    September 2012

  • 2012

    Copyright by Yaron Koral

    All Rights Reserved

  • This work is dedicated to the pursuit of a safe and secure world.

  • Acknowledgements

    First and foremost, I would like to thank my advisors, Yehuda Afek and Anat
    Bremler-Barr, for their continued support and guidance throughout my Ph.D. I have
    learned a lot from you, whether in doing research, writing papers, or giving
    presentations. Above all, you taught me how to walk in the world of science and
    think sharply.

    I had the pleasure of working with the following people: David Hay, Yotam Harchol,
    Shimrit Tzur-David and Victor Zigdon. I thank you for your companionship and
    support. Working with you was both enriching and a great delight.

    Last, and most of all, I thank my family: my beloved wife Keren; my charming kids
    Omer, Ofri, Romi and Yarden; and my parents Akiva and Rahel for their unfailing
    love, encouragement and support.

    The work in this thesis was partially supported by the European Research Council
    under the European Union's Seventh Framework Programme (FP7/2007-2013)/ERC
    Grant agreement no. 259085.

  • Abstract

    Deep packet inspection (DPI) is a form of network packet filtering that searches a
    packet's content for the presence of certain patterns. The inspected content
    includes headers and data-protocol structures as well as the message payload. DPI
    enables advanced network management, user service, and security functions as well
    as Internet data mining, eavesdropping, and censorship. It is currently used by
    enterprises, service providers, and governments in a wide range of applications.

    DPI may be implemented by a wide range of pattern matching algorithms. The general
    problem of pattern matching is considered fundamental in computer science and has
    been researched thoroughly over the last decades. Still, when applied to the
    network domain of recent years, the traditional algorithms fail to meet current
    challenges. The first challenge is the continual increase in Internet traffic
    rates, which requires a design that scales in terms of speed and memory usage. The
    second challenge arises from the increase in Web traffic compression due to the
    increasing popularity of Web surfing over mobile devices. The security device is
    forced to decompress this traffic prior to inspection, which in turn incurs
    processing and space penalties. The third challenge is the requirement for a
    solution that is resilient to attacks that overload the security device. We
    address these challenges here. Moreover, we apply several technological advances
    to boost the performance of the traditional algorithms, including, for example,
    the presence of Ternary Content Addressable Memory (TCAM) elements in network
    devices and the availability of multi-core platforms for the DPI task.

    The work presented in this thesis focuses on DPI algorithms and techniques that
    relate to network security elements. In Chapter 3, we provide an algorithm for a
    scalable design of a DPI engine. Our design reduces the problem of pattern
    matching to the well-studied problem of Longest Prefix Match (LPM), which can be
    solved either in TCAM, in IP-lookup chips, or in software.
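
    To make the target of that reduction concrete, the following is a minimal sketch
    of a longest-prefix-match lookup in software. It only illustrates what an LPM
    query returns; it is not the CompactDFA encoding, and the rule table, its labels,
    and the function name `longest_prefix_match` are hypothetical choices made for
    this illustration.

```python
from typing import Dict, Optional


def longest_prefix_match(table: Dict[str, str], key: str) -> Optional[str]:
    """Return the value stored under the longest prefix of `key`, or None."""
    # Try the full key first, then ever shorter prefixes down to the empty one.
    for length in range(len(key), -1, -1):
        value = table.get(key[:length])
        if value is not None:
            return value
    return None


# Hypothetical rule table: bit-string prefixes mapped to illustrative labels.
rules = {"": "default", "10": "A", "1011": "B"}

print(longest_prefix_match(rules, "101101"))  # both "10" and "1011" match; longest wins
print(longest_prefix_match(rules, "100000"))  # only "10" and the empty prefix match
print(longest_prefix_match(rules, "010000"))  # only the empty prefix matches
```

    A TCAM or an IP-lookup chip answers the same kind of query in hardware; the linear
    scan over prefix lengths here is chosen for readability, not speed.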

    Next we deal with the challenge of DPI over compressed traffic. Chapters 4 and 5
    focus on reducing the space and time penalties resulting from the compressed
    traffic. These works show that, by using the meta-data generated during the
    compression stage, pattern matching over compressed traffic can be accelerated
    significantly as compared to traditional pattern matching over non-compressed
    traffic, and that the space penalty can be reduced by a factor of six as compared
    to current designs. Chapter 6 introduces an algorithm for scanning traffic
    compressed by SDCH compression, which is the compression scheme used by Google.
    Our design gains a performance boost of over 40%.
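
    For context, the baseline that these chapters improve on, decompressing the whole
    payload and only then scanning it, can be sketched as follows. This is an
    illustrative assumption about the naive approach, not the SOP, SPC, or SDCH
    algorithms of the thesis; the function name `scan_gzip_payload`, the sample
    payload, and the patterns are hypothetical.

```python
import gzip
from typing import List, Tuple


def scan_gzip_payload(compressed: bytes, patterns: List[bytes]) -> List[Tuple[bytes, int]]:
    """Decompress the payload fully, then return (pattern, first offset) per hit."""
    # The full plaintext must be held in memory before any pattern can be checked:
    # this is the space and time penalty of the decompress-then-scan baseline.
    plain = gzip.decompress(compressed)
    hits = []
    for pattern in patterns:
        offset = plain.find(pattern)
        if offset != -1:
            hits.append((pattern, offset))
    return hits


# Hypothetical HTTP-like payload that repeats a signature three times.
body = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n" + b"attack-signature " * 3
packet = gzip.compress(body)

print(scan_gzip_payload(packet, [b"attack-signature", b"absent"]))
```

    The sketch scans each pattern separately with `bytes.find`; a real engine would
    use a multi-pattern matcher, which is outside the scope of this illustration.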

    Finally, we address the challenge of performing DPI when the system is under a
    denial-of-service attack mounted through algorithmic complexity. We provide a
    system design that takes advantage of commercial multi-core platforms to
    efficiently mitigate complexity attacks of varying intensity.

    The algorithms and techniques presented in this thesis provide a suitable DPI
    solution that confronts today's network challenges.

  • Contents

    1 Introduction
    1.1 Method
    1.2 Overview of Results
    1.2.1 CompactDFA
    1.2.2 SOP Algorithm
    1.2.3 SPC Algorithm
    1.2.4 SDCH
    1.2.5 MCA²
    1.3 Related Work
    1.3.1 DFA Compression
    1.3.2 Compressed Web-Traffic
    1.3.3 DPI Using Multi-Core Platforms
    1.3.4 Denial-of-Service Mitigation
    1.4 Significance

    2 Background
    2.1 DFA based Pattern Matching
    2.2 Compressed Web-Traffic
    2.2.1 Gzip Compression
    2.2.2 SDCH Compression
    2.3 Complexity attack

    3 CompactDFA
    3.1 The CompactDFA Scheme
    3.1.1 CompactDFA Output
    3.1.2 CompactDFA Algorithm
    3.1.3 The Aho-Corasick Algorithm-like Properties
    3.1.4 Stage I: State Grouping
    3.1.5 Stage II: Common Suffix Tree
    3.1.6 Stage III: State and Node Encoding
    3.2 CompactDFA for total memory minimizations
    3.3 CompactDFA for DFA with strides
    3.4 Implementing CompactDFA using IP-lookup Solutions
    3.4.1 Implementing CompactDFA with non-TCAM IP-lookup solutions
    3.4.2 Implementing CompactDFA with TCAM
    3.5 Experimental Results

    4 Space Efficient DPI of Compressed Web Traffic
    4.1 SOP Packing technique
    4.1.1 Buffer Packing: Swap Out of boundary Pointers (SOP)
    4.1.2 Huffman Coding Scheme
    4.1.3 Unpacking the Buffer: Gzip Decompression
    4.2 Combining SOP with ACCH algorithm
    4.3 Experimental Results
    4.3.1 Experimental Environment
    4.3.2 Data Set
    4.3.3 Space and Time Results
    4.3.4 Time Results Analysis
    4.3.5 DPI of Compressed Traffic

    5 Shift-based Pattern Matching for Compressed Traffic
    5.1 The Modified Wu-Manber Algorithm
    5.2 Shift-based Pattern matching for Compressed traffic (SPC)
    5.3 Experimental Results
    5.3.1 Data Set
    5.3.2 Pattern Set
    5.3.3 SPC Characteristics Analysis
    5.3.4 SPC Run-Time Performance
    5.3.5 SPC Storage Requirements

    6 Decompression-Free Inspection
    6.1 Our Decompression-Free algorithm
    6.1.1 Motivating Example
    6.1.2 Correctness
    6.1.3 Optimizations

    6.1.4 Dealing with Gzip over SDCH . . . . . . . . .