
Security Enhanced Communications in Cognitive Networks

Qiben Yan

Dissertation submitted to the Faculty of the Virginia Polytechnic Institute and State University

in partial fulfillment of the requirements for the degree of

Doctor of Philosophy in

Computer Science and Applications

Wenjing Lou (Chair)
Y. Thomas Hou
Ing-Ray Chen
Danfeng Yao
Sushil Jajodia

June 23, 2014
Falls Church, Virginia

Keywords: cognitive network security, cognitive radio network, reactive jamming attack, network monitoring, botnet detection.

© Copyright 2014, Qiben Yan


ABSTRACT

With the advent of ubiquitous computing and the Internet of Things (IoT), potentially billions of devices will create a broad range of data services and applications, which will require communication networks to efficiently manage the increasing complexity. The cognitive network has been envisioned as a new paradigm to address this challenge. It has the capability of reasoning, planning and learning by incorporating cutting-edge technologies including knowledge representation, context awareness, network optimization and machine learning. A cognitive network spans the entire communication system, including the core network and wireless links, across the entire protocol stack. The Cognitive Radio Network (CRN) is the part of the cognitive network over wireless links, which endeavors to better utilize the spectrum resources. The core network provides a reliable backend infrastructure to the entire communication system. However, CR communication and the core network infrastructure have attracted various security threats, which become increasingly severe in pace with the growing complexity and adversity of the modern Internet.

The focus of this dissertation is to exploit the security vulnerabilities of state-of-the-art cognitive communication systems, and to provide detection, mitigation and protection mechanisms to allow security enhanced cognitive communications, including wireless communications in CRNs and wired communications in core networks. In order to provide secure and reliable communications in CRNs, first, we incorporate security mechanisms into fundamental CRN functions, such as secure spectrum sensing techniques that ensure trustworthy reporting of spectrum readings. Second, as no security mechanism can completely prevent all potential threats from entering CRNs, we design a systematic passive monitoring framework, SpecMonitor, based on unsupervised machine learning methods to strategically monitor the network traffic and operations in order to detect abnormal and malicious behaviors. Third, highly capable cognitive radios allow more sophisticated reactive jamming attacks, which impose a serious threat to CR communications. By exploiting MIMO interference cancellation techniques, we propose jamming resilient CR communication mechanisms to survive in the presence of reactive jammers. Finally, we focus on protecting the core network from botnet threats by applying cognitive technologies to detect network-wide Peer-to-Peer (P2P) botnets, which leads to the design of a data-driven botnet detection system called PeerClean. In all four research thrusts, we present thorough security analysis, extensive simulations and testbed evaluations based on real-world implementations. Our results demonstrate that the proposed defense mechanisms can effectively and efficiently counteract sophisticated yet powerful attacks.

To my beloved wife, Luna Le Lu, and my parents Yaqin Chen and Shifu Yan


Acknowledgments

To have reached this point in my life, I have been offered so much guidance and support from people who have changed my life. I owe a debt of gratitude to them all, and I would really like to express my deepest appreciation here.

First and foremost, I would like to express my sincere gratitude to my advisor Dr. Wenjing Lou. Her research attitude, sense of responsibility and pursuit of excellence have really inspired me over the past years. Dr. Lou was instrumental in helping me develop my research skills and presentation skills, providing me with invaluable perspective and encouragement along the way. I can’t thank her enough for giving me the opportunity to learn from and work with her, for patiently meeting with me, talking about my ideas, answering my questions and proofreading my papers. Her logical way of thinking and her keen sense of future technology have been of great value to me. Dr. Lou not only guided my research in the past few years, but she also cared for my life and personal growth with great thoughtfulness. I feel fortunate to have found an advisor who has always shown respect for my own interests, and done everything she could to help make me successful. I am grateful for what she has done for me.

I am also extremely grateful to Dr. Thomas Hou, who gave me invaluable guidance and advice on my life, research and career. Dr. Hou worked closely with me throughout my Ph.D. studies. He has been an inspiration to me with his great interest in and passion for research. His detail-oriented, determined and hardworking style encourages me to follow my heart, never lose hope and keep pursuing my dream.

I am also thankful to Dr. Ing-Ray Chen, Dr. Danfeng Yao and Dr. Sushil Jajodia for serving on my dissertation committee. Their insightful questions and comments about my research greatly contributed to improving this dissertation.

I would like to thank Dr. Ming Li. As a collaborator, he always provided helpful comments and advice on my research. Dr. Li worked closely with me in the past few years, discussing research ideas with me and inspiring me with his broad knowledge and keen insights. I am grateful to Dr. Zhenyu Yang and Dr. Ning Cao, two previous colleagues, who discussed ideas with me in various stages of my research and offered generous help in my research and life.

I also want to thank Dr. Liang Xiao for her helpful discussions on wireless security and privacy. I am thankful to Dr. Feng Chen for helping me embrace the world of machine learning and artificial intelligence. Feng gave me inspiration on applying and developing machine learning mechanisms for security purposes.

I wish to thank my former colleague, Hanfei Zhao, and my labmates in the Complex Networks and Security Research (CNSR) lab at the Northern Virginia Center: Yao Zheng, Ning Zhang, Bing Wang, Changlai Du and Wenhai Sun, for creating an intellectual and enjoyable atmosphere, and making my Ph.D. a memorable journey. I benefited tremendously from the discussions and interactions with my labmates. I also want to thank my former and current labmates in the CNSR lab at Blacksburg: Dr. Canming Jiang, Dr. Liguang Xie, Huacheng Zeng and Xu Yuan, who shared their knowledge on wireless networking and operations research with me. I am especially indebted to Yao Zheng and Huacheng Zeng, who put great effort into improving and implementing our research ideas.

I am greatly indebted to my father Shifu Yan, and my mother Yaqin Chen. They always understand me and support my choices with endless love. They have done so much for me professionally and personally. They have sacrificed so much to support me along this wonderful journey. I truly will never be able to repay them for what they have done for me.

More importantly, I am especially indebted to my wife Luna Lu. In my mind, her life and her dream are what I am living for. I truly would not be pursuing a Ph.D. degree without her support and understanding. My wife has been, and always will be, my best friend. At my best time, and more importantly, at my worst time, I know that she will always stand by my side, celebrating my achievements, and giving me love and hope during the frustrating times. I am fortunate to have a wife who cares for me more than herself. Her inspiring words can always cheer me up, and her beautiful smile can always calm me down. Although we were separated during these years by the whole North American continent, she always believed in me, and encouraged me to work hard and play harder. I would surely not have been able to reach this far without her support and encouragement. I cannot even imagine where I would be today were it not for the people I love. Thank you so much for always believing in me.


Contents

Abstract
Dedication
Acknowledgments
List of Figures
List of Tables

1 Introduction
1.1 Cognitive Networks
1.2 Security Challenges in Cognitive Networks
1.2.1 Spectrum Sensing Security in CRNs
1.2.2 Network Security in CRNs
1.2.3 Reliable CR Communications with Adversarial Software Radios
1.2.4 Botnet Threats to Cognitive Communications in Core Networks
1.3 Research Contributions
1.4 Organization

2 Secure Distributed Consensus-based Spectrum Sensing in Cognitive Radio Networks
2.1 Related Work
2.2 System Model
2.2.1 Network Model
2.2.2 Distributed Consensus-based Spectrum Sensing
2.3 Vulnerability Analysis of Distributed Consensus-based Spectrum Sensing
2.3.1 Disruption of Sensing Operation
2.3.2 Stealthy Manipulation of Sensing Results
2.4 Protection of Distributed Consensus-based Spectrum Sensing
2.4.1 Robust Distributed Outlier Detection with Adaptive Local Threshold
2.4.2 Hash-based Computation Verification of Neighbor State Update
2.5 Evaluation
2.5.1 Impact of Covert Adaptive Data Injection Attacker
2.5.2 Effectiveness of Robust Distributed Outlier Detection with Adaptive Local Threshold
2.5.3 Security Analysis of Hash-based Computation Verification Approach
2.5.4 Cost Evaluation of Hash-based Computation Verification Approach
2.6 Summary

3 Non-Parametric Passive Traffic Monitoring in Cognitive Radio Networks
3.1 Related Work
3.2 System Model
3.2.1 Monitoring System Model
3.2.2 Channel Access Model
3.3 User Channel Access Prediction
3.3.1 Primary User Detection
3.3.2 Secondary User Channel Access Model
3.4 Near-Optimal Monitoring Mechanism
3.4.1 Frame-level Quality-of-Monitoring Optimization
3.4.2 User-level Quality-of-Monitoring Optimization
3.4.3 Numerical Analysis
3.4.4 Complexity Analysis
3.5 Evaluation
3.5.1 Performance of Secondary User Channel Access Model
3.5.2 Real-Time Monitoring Performance
3.5.3 Frame Capturing Performance
3.5.4 User Capturing Performance
3.6 Summary

4 MIMO-based Jamming Resilient OFDM Communications in Wireless Networks
4.1 Related Work
4.2 Problem Formulation
4.2.1 System Model
4.2.2 Attack Model
4.2.3 MIMO Interference Cancellation and OFDM Basics
4.3 Impact of Reactive Jamming Attack to MIMO-OFDM Communications
4.4 Defense Mechanisms of Reactive Jamming Attack
4.4.1 Defense Mechanism Overview
4.4.2 Decoding the Signal of Interest
4.4.3 Detecting the Jamming Signal
4.4.4 Enhanced Defense Mechanism
4.4.5 Defending Against Reactive Jamming In a Multi-hop Network
4.4.6 Dealing with Other Types of Jammers
4.4.7 Discussion
4.5 Implementation
4.6 Evaluation
4.6.1 Impact of Received Signal Direction
4.6.2 Impact of Channel Coherence Time
4.6.3 Jamming Attack and Defense Performance
4.6.4 Overhead Analysis
4.7 Summary

5 Unveiling Peer-to-Peer Botnets through Dynamic Group Behavior Analysis
5.1 Related Work
5.2 Overview of PeerClean
5.3 System Design
5.3.1 Flow Statistical Features
5.3.2 P2P Host Clustering
5.3.3 Dynamic Group Behavior Analysis
5.3.4 Training and Classification
5.3.5 Refined Bot Identification
5.3.6 Putting Them All Together
5.3.7 Evasion Mechanisms and Limitations
5.4 Discussion
5.5 Evaluation
5.5.1 Data Collection
5.5.2 Clustering P2P Hosts
5.5.3 Identifying Bot-included Clusters via Classification
5.5.4 Refined Bot Identification Performance
5.5.5 Comparisons with Other Detection Approaches
5.5.6 System Scalability
5.6 Summary

6 Conclusion and Future Research
6.1 Research Summary
6.2 Future Research Directions

Bibliography

List of Figures

1.1 Cognitive network concept

2.1 The distributed hash-based verification of neighbor state update.
2.2 Performance of covert adaptive data injection attack to distributed consensus-based spectrum sensing protected by traditional detection scheme with detection threshold -56 dB.
2.3 Performance of robust distributed outlier detection with adaptive local threshold
2.4 Different collusion styles (solid points represent attackers, hollow points represent honest SUs)

3.1 An overview of SpecMonitor system
3.2 Monitoring system architecture for WhiteFi network inside a monitoring area
3.3 The percentage of frames in active slots (20 ms slot length)
3.4 Frame/Active slot interarrival time distribution (20 ms sensing slot, 2 ms sensing period)
3.5 Sliding window method (X axis denotes the interarrival time data, Y axis denotes the channels)
3.6 Normalized objective (with respect to the computed upper bound) for 20 channels and 10 sniffers with α=0.3
3.7 Normalized objective (with respect to the computed upper bound) for 30 channels and 20 sniffers with α=0.3
3.8 Normalized objectives for different α
3.9 Number of captured active slots using our algorithm vs. upper bound for 20 channels and 10 sniffers with α=0.3
3.10 Performance of secondary user channel access model
3.11 Performance with different methods using synthetic time series data (5 channels, 3 sniffer antennas)
3.12 Performance with varied number of sniffer antennas using different methods (α=0.25, 20 channels)
3.13 Performance using real-world traffic (7 channels, 4 sniffer antennas)
3.14 Average frame capture rate comparison of multi-slot updating and per-slot updating with 20 channels and 10 sniffer antennas
3.15 User capture performance with 20 channels

4.1 Reactive jammer starts jamming after certain reaction time
4.2 1×2 MIMO-OFDM link attacked by a jammer
4.3 Different two-dimensional received signal spaces
4.4 Jamming attack performance by approaching the sender’s location (in this experiment, the device works on 2.45 GHz central frequency with a half wavelength $\lambda/2 = c/(2f) \approx 6.12$ cm)
4.5 A flow chart of proposed defense mechanisms (solid box: modules of the defense mechanism, dashed box: modules of enhanced defense mechanism)
4.6 Extended frame structure
4.7 Soft error based jamming detection
4.8 Burst of packets
4.9 Packet delivery rate performance with different angles between two received signals
4.10 Autocorrelation of the channel phase in an indoor environment (tested using 500 kHz bandwidth communications)
4.11 Testbed. The receiver is placed at A, while the sender and jammer are placed at the selected locations 1 to 9.
4.12 Packet delivery rate with and without jammer in 1×2 link
4.13 Jamming attack and defense performance
4.14 PDR performance with different jamming powers
4.15 PDR performance with different types of jamming signals
4.16 Throughput performance using our defense mechanisms

5.1 PeerClean system flow
5.2 Cluster connectivity feature
5.3 (a): The shared neighbor ratio of one Emule host pair compared with that of one ZeroAccess host pair (b): Group shared neighbor ratio
5.4 (a): Significant connection feature (b): Significant connection feature of Kelihos and ZeroAccess bots in 24 hours
5.5 Significant connection volatility
5.6 Box plot of Bot Compactness Ratio
5.7 Classification performance with different percentages of training data
5.8 Running time of AP clustering

List of Tables

2.1 Simulation parameters
2.2 Costs of hash-based computation verification scheme at one node for each iteration (N is the number of neighbors, P is the length of state in bytes, h is the length of hash in bytes.)

3.1 Summary of symbols and notations
3.2 Parameters
3.3 Average running time with 20 channels and 10 sniffer antennas (in ms)

4.1 Default system setup
4.2 Testbed setup

5.1 Flow size statistical features
5.2 Host access pattern features
5.3 Group-level feature summary
5.4 Traffic summary (‘P2P in campus’ denotes the traffic flows of the campus network after P2P host identification)
5.5 Clustering result using UDP traffic (BSR1, BSR2 denote the BSRs of Sality and ZeroAccess bots respectively)
5.6 Clustering result using TCP traffic (BSR3 denotes the BSR of Kelihos bots)
5.7 Classification accuracy when trained on one type of feature. Shared neighbor feature and significant connection feature present the best classification accuracy. The classifier achieves the best performance when combining all the features. Accuracy=(TP+TN)/all; Precision=TP/(TP+FP); Recall=TP/(TP+FN).
5.8 Refined bot identification performance (the percentages in the parentheses denote the bot detection rate and false alarm rate respectively)
5.9 Performance comparison with method A and method B (threshold: 0.2).

Chapter 1

Introduction

The area of data communication technologies is one of the fastest changing areas, with numerous services and applications having enormous impact on different aspects of modern society, including economic growth, inter-human relationships, scientific development, education and entertainment. Therefore, the development of a reliable and robust, yet flexible and extensible communication infrastructure is of utmost importance to facilitate these human-to-human as well as human-to-machine communications, providing services such as e-health, e-commerce, e-banking, e-payment and e-learning. Future networks will be more complex and diversified, extending towards ubiquitous computing by involving various types of connected devices including mobile devices, wearable devices, and smart home appliances. To effectively manage the complexity faced by future networks and to provide optimized and secure end-to-end data communications, a new networking paradigm, called the cognitive network [1–3], has recently been proposed. In this chapter, we introduce the definition of cognitive networks, expose the security challenges in cognitive communications, and present the major research contributions of this dissertation.


1.1 Cognitive Networks

According to the dictionary [4], being cognitive involves conscious intellectual activity such as thinking, reasoning or remembering, and is based on or capable of being reduced to empirical factual knowledge. A cognitive network is a network that has a knowledge representation of the devices, systems, networks and events, and uses a cognitive process (or cycle) that can perceive current network conditions, and then plan, decide and act on those conditions. The network can learn from the consequences of its actions to make future decisions, all while following end-to-end goals. Knowledge representation and the cognitive cycle are the two main elements of cognitive networks. Knowledge representation is a prerequisite for achieving self-awareness. A knowledge-empowered network builds up a model, based on which the network can perform reasoning and updating to configure itself, explain itself and repair itself. The cognitive cycle, also called the cognition cycle, allows cognitive systems to adjust their functions by perceiving the environment. The cognition cycle as described by Mitola [5] contains the following states: observe, orient, plan, decide, act and learn. Similarly, the cognitive cycle defined in [3] consists of six states: sense, plan, decide, act, learn and policy. In general, the cognitive network uses sensors to sense the environment (sense), and plans the strategies (plan) based on observations and policies (policy). Then, a learning module is employed to learn from the observations and possible strategies (learn) to aid the decision making (decide). Finally, the actuators execute the appropriate changes to the cognitive network (act). Empowered with a knowledge representation mechanism and the cognitive cycle, a cognitive network becomes self-aware, analytical and adaptive.
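The cycle described above can be illustrated with a small sketch. This is only a toy under stated assumptions, not the architecture of [3] or [5]: the class `CognitiveNode`, its fields (`power`, `max_power`, `target_margin`, `estimate`) and the margin-tracking goal are all hypothetical names and choices introduced for illustration. Each call to `cycle` walks through sense, plan (filtered by policy), learn, decide and act in the order given in the text.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CognitiveNode:
    """Toy node that runs one sense/plan/learn/decide/act pass per cycle."""
    power: float = 5.0                # current configuration (arbitrary units)
    max_power: float = 10.0           # policy: hypothetical cap on transmit power
    target_margin: float = 4.0        # goal: keep power minus interference near this
    estimate: Optional[float] = None  # learned interference estimate (knowledge)
    history: list = field(default_factory=list)  # record of past actions

    def sense(self, environment: dict) -> float:
        # sense: read the interference level observed in the environment
        return environment["interference"]

    def plan(self) -> list:
        # plan: propose small power changes, keeping only those the policy allows
        candidates = [self.power - 1.0, self.power, self.power + 1.0]
        return [p for p in candidates if 0.0 <= p <= self.max_power]

    def learn(self, observation: float) -> float:
        # learn: blend the new observation into a running interference estimate
        if self.estimate is None:
            self.estimate = observation
        else:
            self.estimate = 0.5 * self.estimate + 0.5 * observation
        return self.estimate

    def decide(self, candidates: list) -> float:
        # decide: pick the candidate whose predicted margin is closest to the goal
        return min(candidates,
                   key=lambda p: abs((p - self.estimate) - self.target_margin))

    def act(self, power: float) -> None:
        # act: apply the chosen configuration and remember it
        self.power = power
        self.history.append(power)

    def cycle(self, environment: dict) -> float:
        observation = self.sense(environment)
        candidates = self.plan()
        self.learn(observation)
        self.act(self.decide(candidates))
        return self.power


node = CognitiveNode()
for _ in range(3):
    node.cycle({"interference": 3.0})
print(node.power, node.history)
```

With a constant interference of 3.0 and a target margin of 4.0, the node steps its power up by one unit per cycle until it reaches 7.0; lowering `max_power` shows how the policy state constrains the plan.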

Cognitive networks span the CRNs and core networks, as illustrated in Fig. 1.1. CRNs are formed by high-performance software defined radios (SDRs) [6, 7], which are inherently programmable and flexible. CRNs take advantage of the flexible capabilities of SDRs to engage in dynamic spectrum access: they can learn the spectrum availability and radio propagation environment, and access the spectrum in a correct and cooperative manner. The cognitive core network utilizes advanced tools from data analytics and machine learning to ensure the reliability and robustness of data delivery.

Figure 1.1: Cognitive network concept

1.2 Security Challenges in Cognitive Networks

In this section, we lay out the alarming security challenges in cognitive networks, including security in CRNs and security in core networks. In cognitive networks, not only does the cognitive functionality of the network have vulnerabilities that can be exploited by adversaries, but the network itself also suffers from highly capable and adaptive adversaries who may learn from the environment to adjust their strategies.

The security goals we want to achieve in cognitive networks include confidentiality, integrity, availability, authenticity and trustworthiness [8]. According to [9], cognitive network security can be defined as the provisions and policies adopted by a network administrator to prevent and monitor unauthorized access, misuse, modification, or denial of computer network and network-accessible resources, in order to satisfy the security requirements. In this dissertation, we attempt to enhance the security of cognitive networks in terms of achieving four security goals: first, we focus on enforcing sensing report integrity in CR communications; second, we aim at monitoring unauthorized or malicious network access and spectrum misuse to enforce the trustworthiness of CR communications; third, we try to prevent denial of service caused by advanced jamming attacks to enforce the availability of CR communications; fourth, we strive to achieve confidentiality, authenticity, trustworthiness and availability of cognitive communications in core networks by detecting and eliminating infected machines, or bots.

Although this dissertation does not address all the security issues in cognitive networks (e.g., we do not deal with the data privacy issue to achieve confidentiality in CRNs), it contributes to enhancing the security of current cognitive networks in various aspects. In the following sections, we list the security challenges that this dissertation addresses.

1.2.1 Spectrum Sensing Security in CRNs

Cognitive radio (CR) [10,11] has emerged as a key technology for enabling the use of licensed spectrum bands from incumbents, also known as primary users (PUs), when they are idle. An important challenge in CR technology is reliable spectrum sensing [12], by which cognitive radio devices, also known as secondary users (SUs), detect and exploit a spectrum band when it is unused, but vacate the channel immediately upon detecting the presence of primary users. Cooperative spectrum sensing, which exploits the cooperation of multiple SUs and leverages the spatial diversity among those location-dispersed SUs, has shown significant advantages in achieving reliable spectrum sensing results [13].


Cooperation in spectrum sensing can be achieved in two models: centralized or distributed. The former uses a common receiver (i.e., a fusion center) to collect sensing results from all SUs and to make the final spectrum sensing decision. It relies on a centralized infrastructure which may be unavailable in ad hoc CR networks. In contrast, a distributed approach allows SUs to share individual sensing results with their neighbors, and to make their own sensing decisions. Therefore, the distributed spectrum sensing model is more suitable for CR ad hoc networks [14].
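The difference between the two cooperation models can be made concrete with a minimal sketch. It assumes, purely for illustration, that each SU reports a single energy reading, that readings are fused by plain averaging, and that a PU is declared present when the average exceeds a threshold; the function names, the readings and the threshold are hypothetical and are not taken from the schemes cited above.

```python
def fusion_center_decision(reports, threshold):
    """Centralized model: one receiver averages every SU report and decides for all."""
    average = sum(reports.values()) / len(reports)
    return average > threshold


def distributed_decisions(reports, neighbors, threshold):
    """Distributed model: each SU averages its own and its neighbors' reports
    and makes its own decision, with no common receiver involved."""
    decisions = {}
    for su, own_reading in reports.items():
        local = [own_reading] + [reports[n] for n in neighbors[su]]
        decisions[su] = sum(local) / len(local) > threshold
    return decisions


# Hypothetical energy readings (arbitrary units) for four SUs on a line a-b-c-d.
reports = {"a": 12, "b": 9, "c": 2, "d": 11}
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}

print(fusion_center_decision(reports, threshold=8))
print(distributed_decisions(reports, neighbors, threshold=8))
```

In this toy topology a single round of neighbor exchange leaves the SUs with different local decisions, which is one reason distributed schemes typically iterate the exchange until the local states agree.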

Despite the many benefits that cooperative spectrum sensing offers, it is vulnerable to many potential attacks. Attackers may generate a false primary user signal to launch a primary user emulation (PUE) attack [15] in order to gain an unfair share of the spectrum usage, or they can manipulate SUs’ sensing reports by various means in order to invert the detection results (i.e., presence or absence of PUs), which is often termed a data falsification attack [16]. Current research in securing cooperative spectrum sensing has focused on addressing these attacks under the centralized model [17, 18]. Similar security threats exist in the distributed schemes but have been left under-addressed thus far. In fact, a distributed scheme is even more vulnerable to such attacks due to its distributed and cooperative nature. As an example, a bio-inspired consensus-based distributed spectrum sensing algorithm has recently been proposed [19,20]. It is based merely on measuring and exchanging localized sensing status, and is thus very efficient and scalable. However, due to the distributed and cooperative nature of the protocol, the impact of an attacker's malicious behavior, if not defended against properly, would propagate through the whole network [21], causing long-term, widespread impacts. More involved attacks that would undermine the distributed spectrum sensing mechanisms without being detected are also possible. Thus, we focus on the protection of distributed spectrum sensing in CR ad hoc networks in Chapter 2.


1.2.2 Network Security in CRNs

Malicious network activities, such as false data injection attacks [22], Denial of Service

(DoS) attacks [23–26], and spectrum misuse [27, 28], have posed serious threats to CRNs, which

will result in a significant performance degradation of CR communications. As no security

mechanism can completely prevent all potential threats from entering CRNs, it is of utmost

importance to detect and then prevent these potential threats. However, the detection of

these activities in CRNs remains largely untouched in the literature. Passive monitoring has

been used to measure Wi-Fi networks [29–31] using a dedicated set of hardware devices, called

sniffers. It has been shown to complement the wire side monitoring by gathering detailed

PHY/MAC information. Passive monitoring serves as the basis of numerous applications

ranging from network forensics, fault diagnosis to resource management. As the quality of

those applications mainly depends on that of traffic monitoring, it is non-trivial to build

a traffic monitoring framework with excellent monitoring performance. Passive monitoring

is particularly important to CRNs, because: (1) cognitive radios are programmable and

difficult to manage; (2) the interference requirement in CRN is mandatory and extremely

high. In this dissertation, we consider the construction of a passive monitoring framework

for Wi-Fi like CR networks, or “WhiteFi” networks for short.

However, passive monitoring becomes a challenging task in WhiteFi networks. First, WhiteFi

networks have a much wider spectrum (50MHz-698MHz) than traditional wireless networks,

which makes it infeasible to deploy one sniffer for each channel. As a result, the sniffers

have to decide which subsets of channels they will operate on, referred to as sniffer channel

assignment problem. Second, SUs have to vacate the channels immediately once PUs start

transmissions on the corresponding channels. Such inevitable channel switching behavior

potentially complicates the sniffers’ traffic monitoring strategies. Last but not least,

network traffic on each channel typically comes from multiple SUs, who share the spectrum by


following a certain medium access control (MAC) mechanism. Thus, traffic patterns observed

by the sniffers are highly dynamic, further complicating the sniffers’ monitoring strategies.

To meet these challenges, Chapter 3 elaborates the design of a monitoring framework for

CRNs, SpecMonitor, which utilizes a non-parametric density estimation method to model

SUs’ channel usage pattern.

1.2.3 Reliable CR Communications with Adversarial Software Radios

Jamming has been a major denial-of-service attack to wireless communications [23, 32]. By

intentionally emitting jamming signals, adversaries can disturb network communications,

resulting in throughput degradation, network partition, or a complete zero connectivity

scenario. Reactive jamming is one of the most effective jamming attacks. A reactive jammer

continuously listens for activity on the channel and emits jamming signals whenever

it detects activity; otherwise, it stays quiet while the sender is idle. Reactive jamming is

regarded as one of the most effective, stealthy, and energy-efficient jamming strategies [25,33].

Recent advances in highly programmable software defined radio (SDR) have made

such sophisticated but powerful jamming attacks very realistic: [34,35] demonstrated that a

reactive jammer is readily implementable and that the jamming results are devastating. Furthermore,

the reactive jamming can be triggered rapidly on any field of the packet, making it a realistic

threat for wireless communications.

On the other hand, orthogonal frequency-division multiplexing (OFDM) has developed into

a popular scheme for broadband wireless communications. Modern wireless communica-

tion systems, such as WLAN, digital TV systems and cellular communication systems, all

adopt OFDM as one of the primary technologies. While OFDM systems are robust against


multipath fading and can cope with severe interference and noise, they are not ideal for

environments where adversaries try to intentionally jam communications. The increasingly

severe hostile environments with advanced jamming threats prompt the development of secu-

rity extensions to the OFDM systems. Multi-input multi-output (MIMO) has also emerged

as a key technology for wireless networks. New wireless devices are equipped with a growing

number of antennas. MIMO can be exploited to obtain diversity and spatial multiplexing

gains, and lead to an increase in the network capacity. More importantly, recent advances

in MIMO interference cancellation (IC) techniques [36–38] have greatly enhanced MIMO communication

capability under multiple concurrent transmissions. This inspires us to ask

whether it is possible to exploit IC techniques in MIMO to mitigate jamming attacks targeting

OFDM systems, in particular, SDR based reactive jamming attacks. In Chapter 4,

we try to answer this question by examining the jammer’s capability in disrupting OFDM-

MIMO communications, and devising MIMO-based defense mechanisms by utilizing MIMO

technology coupled with IC and transmit precoding techniques.

1.2.4 Botnet Threats to Cognitive Communications in Core Networks

Botnets have become a major threat to the health of modern core networks. Through large-scale

compromise of end hosts, botmasters can commit organized cyber-crimes, such as

launching distributed denial-of-service (DDoS) attacks to cause unavailable connections,

sending spam and performing click fraud to violate the authenticity and trustworthiness

of communications, or stealing sensitive information to violate confidentiality of involved

entities.

The C&C channel is one of the most essential components of a botnet, through which a


botmaster manages the bot army of compromised end hosts. One common type of C&C

infrastructure relies on a central C&C server, which has recently drawn a great deal of attention

from security researchers and law enforcement. From the attackers’ point of

view, such centralized architecture suffers from a single point of failure problem, because if

the C&C server is identified and taken down, the botmaster will lose control over the whole

botnet. As a response, sophisticated botnet developers attempt to build more advanced and

resilient P2P C&C infrastructure [39]. P2P C&C allows the bots to exchange C&C messages

via their connected peers in a P2P manner. As a result, despite numerous takedown

attempts [40], P2P botnets have kept reviving. Some notable examples of active P2P botnets

include Sality [41], ZeroAccess [42], and Kelihos [43], which have survived in the wild for a

long time and will likely continue to be alive in the near future [44].

To date, a few solutions have been proposed to detect P2P botnets [45–48]. Host-level mal-

ware detection techniques such as traditional signature-based approaches and more recently

behavior-based approaches [49] have been designed. However, these approaches are subject

to advanced malware obfuscation or polymorphism and require host-side installation, and

thus appear unappealing to the network administrators aiming at uncovering a network-wide

botnet. Alternatively, network-level techniques have been proposed to correlate the traffic

patterns of suspicious bots [46,47,50–53] or collect network communication graphs to identify

P2P bots [45,54]. Some of them apply deep packet inspection (DPI), which is not only com-

putationally expensive, but is also evadable through message encryption. Other approaches

are based upon network flow traffic analysis. For instance, Yen et al. [46] described an algorithm

to differentiate P2P file sharing applications from P2P bots based on network traffic

features such as traffic volume, peer churn rate and interstitial time distribution. Recently,

Zhang et al. [47] developed a botnet detection system to extract statistical fingerprints for

every host, and identify the bots based on a set of traffic features such as communication


persistency, fingerprint similarity, and number of shared contacts. However, the traffic features

used in these approaches are not robust enough to identify bots in a dynamic network, as

observed from our experiments. On the other hand, the communication graph-based ap-

proach [45] seems more reliable, but it can only identify structural P2P subgraphs regardless

of whether the subgraphs contain bots. Also, it requires a list of honeypot hosts to bootstrap

its detection algorithm, limiting its practicality.

In Chapter 5, we jointly consider two sets of features: flow traffic statistics and network

connection behaviors, to reveal the presence of P2P bots within a monitored network, such

as a campus network or an ISP network. We introduce PeerClean to utilize the best of

these feature sets via a novel combination of clustering and classification. The bot detection

performance of PeerClean relies on a dynamic group behavior analysis (DGBA) method

which investigates the group-level connection behaviors of the infected machines.

1.3 Research Contributions

This dissertation research uncovers emerging security issues in the main functions of CRNs,

including the potential security vulnerabilities in the spectrum sensing and spectrum oppor-

tunity exploitation [55]. We provide a variety of defense mechanisms to enhance the security

of CRNs. Furthermore, we design jamming resilient communication mechanisms to enable

reliable CR communications in the presence of SDR based reactive jammers. Finally, as

core networks are increasingly sabotaged by sophisticated malware and botnets, we

propose to enhance the self-awareness of cognitive core networks by unveiling P2P botnets

through intelligent traffic analysis empowered by dynamic feature analysis and machine

learning. We make the following contributions from this research.


1. Distributed Outlier Detection based Spectrum Sensing: we analyzed the vul-

nerabilities of distributed consensus-based spectrum sensing by proposing several novel

attacks. They include naive ones that aim at causing disruption to sensing operations,

and more sophisticated covert adaptive data injection attack, which is capable of ad-

justing attack strategies via learning through perceived environments and stealthily

manipulating the sensing results without being caught by traditional detection schemes.

The latter is the first adaptive attack with learning capability in the area of secure spec-

trum sensing. We also discussed advanced collusion attacks that are hard to defend

against. We presented several protection mechanisms corresponding to the various at-

tacks we have identified. In particular, we proposed a novel robust distributed outlier

detection algorithm with adaptive local threshold to defend against covert adaptive

data injection attack, and a hash-based computation verification scheme to defend

against colluding attackers. Through extensive simulation and analysis, we showed

the severe impacts of covert adaptive attacks to distributed spectrum sensing. We

also presented the effectiveness of our detection mechanisms under various detection

parameters, network topologies and sensing data variances. Moreover, the costs of

proposed protection mechanisms are shown to be low. This part of the dissertation is

published in part in [56].

2. Non-Parametric Prediction based Traffic Monitoring: we designed a gener-

al framework to monitor the WhiteFi networks, which jointly considers the channel

availability and secondary user access pattern. In particular, we designed an online

non-parametric density estimation mechanism to model the secondary user channel ac-

tivity, which is able to support dynamic and complex access patterns. We formulated

the sniffer channel assignment problems as integer programming problems by incor-

porating the channel switching costs with the QoM objective, for which we provided


algorithms to maximize two different levels of QoMs respectively. We conducted ex-

tensive simulations and experiments to validate our statistical model and monitoring

framework. This part of the dissertation is published in part in [57].

3. Interference Cancellation based Jamming Resilient Communication: we ex-

ploited the MIMO IC and transmit precoding techniques to counteract reactive jam-

ming attacks for securing OFDM wireless communications. We proposed two novel

mechanisms: iterative channel tracking and sender signal enhancement (SSE) to effec-

tively sustain acceptable throughput under reactive jamming attack. We implemented

the jamming attack and defense mechanisms using USRP radios, and conducted ex-

tensive experiments to evaluate the performance in terms of packet delivery rate and

throughput. The experimental results showed that in the presence of various types of

reactive jammers with different power levels, the packet delivery rate improves signif-

icantly using our defense mechanisms with IC and SSE. This part of the dissertation

is published in part in [58].

4. Group Behavior Analysis based Botnet Detection: we proposed a novel botnet

detection framework, PeerClean, using the high-level features extracted from network

flow data based on the flow-level traffic statistics and dynamic network connection

patterns. It explores the best of these different features with a novel combination

of unsupervised (clustering) and supervised (classification) machine learning methods.

We designed a dynamic group behavior analysis method to automatically extract the

collective connection features from P2P host clusters. We showed through experiments

that the extracted group features are robust, reliable and effective in identifying various

types of P2P botnets. Moreover, a prototype system is designed to evaluate the system

using network traces from various real-world botnet families (including Sality, Kelihos

and ZeroAccess), as well as background traces from a large campus network. We


demonstrated through experiments that PeerClean can identify different types of bots

with up to 95.8% accuracy, and negligible false positive rate.

1.4 Organization

This dissertation is organized as follows. Chapter 2 presents the vulnerability analysis and

protection mechanisms for securing distributed consensus-based spectrum sensing in CRNs.

In Chapter 3, we describe the design and performance evaluation of a passive monitoring

framework for CRNs, with the objective of maximizing quality of monitoring. In Chapter 4,

we investigate the impacts of reactive jammer to wireless OFDM communications and explore

the use of MIMO technology for enabling jamming resilient communications. Chapter 5

depicts a system design based on machine learning methods to detect and track down P2P

botnets, which have recently imposed serious threats to core networks but are extremely hard

to track down. Finally, Chapter 6 summarizes the research achievements, concludes this

dissertation and points out several future research directions.

Chapter 2

Secure Distributed Consensus-based Spectrum Sensing in Cognitive Radio Networks

Cooperative spectrum sensing is key to the success of CRNs. Recently, fully distributed co-

operative spectrum sensing has been proposed for its high performance benefits particularly

in cognitive radio ad hoc networks. However, the cooperative and fully distributed nature

of such protocols makes them highly vulnerable to malicious attacks, and makes the defense very

difficult. In this chapter, we analyze the vulnerabilities of distributed sensing architecture

based on a representative distributed consensus-based spectrum sensing algorithm. We find

that such distributed algorithm is particularly vulnerable to a novel form of attack called

covert adaptive data injection attack. The vulnerabilities are even magnified under multiple

colluding attackers. We further propose effective protection mechanisms, which include a ro-

bust distributed outlier detection scheme with adaptive local threshold to thwart the covert

adaptive data injection attack, and a hash-based computation verification approach to cope

with collusion attacks. Through simulation and analysis, we demonstrate the destructive

power of the attacks, and validate the efficacy and efficiency of our proposed protection

mechanisms.


2.1 Related Work

Existing work on secure cooperative spectrum sensing has mainly focused on the centralized

secondary network model. The vulnerabilities in the centralized model lead to two types of

attacks. The spectrum sensing data falsification attack is considered in [15–17,59–70]. Li et

al. [59] proposed dependent attack to deviate the OR rule at fusion center, in which case the

attackers know the reports from other secondary users. In [60], Fatemieh et al. presented

omniscient attack to stealthily manipulate the average power, by which coordinated attackers

have the measurements of the whole network. However, these attackers are incapable of

adapting their attack strategies for each sensing period. In contrast, our proposed attacks

have the following differences: 1) our attacks employ adaptive attack strategies with learning

capability, by which the attackers can adjust their strategies according to their perceived

local environments and sensing algorithm; 2) we consider distributed model, with only local

information available; 3) our attacks focus on consensus-based spectrum sensing algorithm

[19] to disrupt the consensus operation or covertly deviate the sensing result. Another

existing attack termed as primary user emulation attack is proposed in [15, 59, 71–74], in

which an attacker may mimic a primary user’s signal to evict secondary users, which is

complementary to our attacks.

For defense mechanisms, outlier detection is widely used, either by statistics-based meth-

ods [16] or signal propagation-based methods [17, 60]. Our outlier detection mechanism

depends on a signal propagation model for classifying original measurements from different

SUs. However, our approach further introduces an adaptive detection algorithm with

varying parameters, which differs from all existing work based on fixed defense strategies.

Furthermore, as opposed to current detection mechanisms, ours ensures the correctness of

consensus operations by integrating hash-based computation verification at each node that is

able to testify the integrity of data involved in its neighbors’ computations. The verification


process is different from existing hash-based scheme [75] that only attests the data from the

node itself.

2.2 System Model

In this section, we describe the network model considered in this work, and we briefly review

the distributed consensus-based spectrum sensing algorithm.

2.2.1 Network Model

We consider a CR network where PUs and SUs coexist. PUs are located far away from

the secondary network, but usually have high transmission power, so we assume the whole

secondary network is within the PUs’ transmission range. Different PUs working under

the same spectrum are separated far enough apart to reduce interference. As a result, in

one secondary network, the entire spectrum can be regarded as multiple disjoint primary

channels and each SU needs to sense all of them. Without loss of generality, we consider the

scenario that all the SUs in a secondary network are detecting the incumbent existence in

one primary channel.

SUs form a CRAHN. We assume the collective coverage range of the SUs that form the

CRAHN is small compared with the coverage range of a PU, while the distance between two

SUs is long enough for spatial diversity exploitation. We assume the primary and secondary

network topologies remain unchanged during one sensing period, which starts from each SU

measuring its local received PU power level and ends upon achieving a unanimous decision

by all SUs. We also assume the communication links between SUs employ some reliable

communication protocol so the communications are error-free.


We adopt the energy detection spectrum sensing method. The sensing output of each SU $n_i$ is

the received power of the PU, $P_i$, which can be expressed by following the signal propagation

model [76] as follows:

$$P_i = P_0 - \left(10\alpha \log_{10}(d_i/d_0) + S_i + M_i\right) \ \text{(dB)}, \tag{2.1}$$

where $P_0$ is the transmit power of the PU, $\alpha$ is the path-loss exponent, $d_0$ is the reference

distance (in this work, $d_0 = 1$ m), $d_i$ denotes the distance from the SU to the measured PU,

$S_i$ represents power loss due to shadow fading, and $M_i$ represents the multi-path fading

effect. We adopt the widely used log-normal shadowing model [76], by which $S_i$ is modeled as

a Gaussian random variable with $S_i \sim N(0, \sigma^2)$. We consider $M_i$ as negligible. Therefore, $P_i$

can be modeled as a Gaussian distribution $P_i \sim N(\mu_i, \sigma^2)$, in which $\mu_i = P_0 - 10\alpha \log_{10}(d_i)$.

For simplicity, we assume $\sigma$ is independent of the distance to the PU, so that each SU experiences i.i.d.

Gaussian shadowing and fading.
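As an illustration of the measurement model in Eq. (2.1), the following minimal Python sketch draws one sensing output for an SU with the multi-path term neglected. The function names, the example transmit power, and the parameter values are assumptions made for illustration only; they are not taken from the evaluation in this chapter.

```python
import math
import random

def mean_power_db(p0_db, alpha, d_i, d0=1.0):
    """Mean received power mu_i = P0 - 10*alpha*log10(d_i/d0), in dB."""
    return p0_db - 10.0 * alpha * math.log10(d_i / d0)

def received_power_db(p0_db, alpha, d_i, sigma, d0=1.0, rng=random):
    """One sample of P_i from Eq. (2.1), with M_i treated as negligible."""
    shadowing = rng.gauss(0.0, sigma)  # S_i ~ N(0, sigma^2), log-normal shadowing in dB
    return mean_power_db(p0_db, alpha, d_i, d0) - shadowing

# Hypothetical example: P0 = 80 dB, alpha = 3, SU located 1000 m from the PU.
example_mean = mean_power_db(80.0, 3.0, 1000.0)
example_sample = received_power_db(80.0, 3.0, 1000.0, sigma=4.0, rng=random.Random(1))
```

Because the shadowing term is zero-mean, averaging many samples at a fixed distance approaches `mean_power_db`, which is the property the consensus average relies on.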

2.2.2 Distributed Consensus-based Spectrum Sensing

In a nutshell, a distributed consensus-based spectrum sensing algorithm works as follows.

It starts a sensing period by each SU taking the local measurement of the received PU’s

signal using an energy detection mechanism. The SUs then exchange their local sensing

measurements with their direct neighbors. Each SU, upon receiving the updates from all

its neighbors, updates its sensing state following a state update algorithm; if the differences

among these new sensing states are above a certain threshold, an update is necessary and the

SU will send a state update message to all its neighbors. This process continues iteratively

till no more update is triggered and the sensing state at every node in the network has

reached the consensus. In what follows, we briefly review a distributed consensus-based


spectrum sensing algorithm focusing on the state update algorithm and distributed state

update protocol.

The secondary network can be modeled as an undirected graph, denoted by a pair $(\mathcal{N}, \mathcal{E})$,

where $\mathcal{N} = \{n_1, n_2, \ldots, n_m\}$ denotes the set of secondary nodes and $\mathcal{E} \subseteq \mathcal{N}^2$ denotes the set of

undirected edges. We will use secondary nodes and SUs interchangeably in the following

sections. The performance of consensus algorithm is associated with the connectivity of

secondary network, which can be represented by an adjacency matrix of the network graph.

The consensus-based spectrum sensing algorithm can be expressed using a discrete-time state

equation:

$$x_i(k+1) = x_i(k) + \epsilon \sum_{j \in \mathcal{N}_i} \left(x_j(k) - x_i(k)\right), \tag{2.2}$$

where the initial state $x_i(0)$ is the original sensing measurement of node $n_i$, $x_i(k)$ is the

updated state at time step $k$, $\mathcal{N}_i = \{j \mid (j, i) \in \mathcal{E}\} \subseteq \mathcal{N}$ denotes the neighbor set of node

$n_i$, and $\epsilon$ is a consensus parameter. State update occurs at discrete time $k = 0, 1, 2, \ldots$ for

each node locally. With some constraints on network connectivity and parameter $\epsilon$ [77], the

final average consensus result $\alpha = \left[\sum_{i=1}^{m} x_i(0)\right]/m$ is asymptotically reached for all nodes.
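The state update in Eq. (2.2) can be sketched in a few lines of Python. This is a minimal illustration, not the implementation evaluated in this chapter: the four-node ring topology, the initial measurements, the stopping tolerance, and the choice of ε = 0.3 (below the reciprocal of the maximum node degree, a commonly used sufficient condition) are all assumptions.

```python
def consensus_step(x, neighbors, eps):
    """One synchronous application of Eq. (2.2) to every node."""
    return [
        x[i] + eps * sum(x[j] - x[i] for j in neighbors[i])
        for i in range(len(x))
    ]

def run_consensus(x0, neighbors, eps, tol=1e-9, max_iter=10000):
    """Iterate until no state changes by more than tol; return (states, steps)."""
    x = list(x0)
    for k in range(max_iter):
        x_new = consensus_step(x, neighbors, eps)
        if max(abs(a - b) for a, b in zip(x, x_new)) < tol:
            return x_new, k + 1
        x = x_new
    return x, max_iter

# Hypothetical 4-node ring; each node has degree 2, so eps = 0.3 < 1/2.
RING = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
MEASUREMENTS = [-90.0, -80.0, -85.0, -95.0]  # x_i(0), in dB

final_states, steps = run_consensus(MEASUREMENTS, RING, eps=0.3)
```

On an undirected graph the update preserves the sum of the states, so every entry of `final_states` approaches the average of `MEASUREMENTS`.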

The final sensing decision at each node is made by comparing the consensus result with a

primary detection threshold γ as follows:

$$\alpha = \left[\sum_{i=1}^{m} x_i(0)\right]/m \ \underset{H_0}{\overset{H_1}{\gtrless}} \ \gamma, \tag{2.3}$$

where $H_1$ and $H_0$ denote the hypotheses corresponding to the presence or the absence of

PU. The primary detection threshold $\gamma$ is determined by performance requirements. For

instance, if we want to keep the primary miss detection rate $P_{MD} = P(\alpha < \gamma \mid H_1)$ below a

threshold, while minimizing the primary false alarm rate $P_{FA} = P(\alpha > \gamma \mid H_0)$, the threshold

$\gamma$ is given as follows [78]:

$$\gamma = \frac{\sigma}{\sqrt{m}} Q^{-1}(1 - P_{MD}) + \mu_\alpha, \tag{2.4}$$

where $\alpha \sim N(\mu_\alpha, \sigma^2/m)$, in which $\mu_\alpha = \left(\sum_{i=1}^{m} \mu_i\right)/m$, and $Q^{-1}(\cdot)$ is the inverse of the well-known

Q-function.
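The threshold in Eq. (2.4) can be evaluated with the Python standard library, using the identity $Q^{-1}(p) = \Phi^{-1}(1 - p)$ for the standard normal CDF $\Phi$. The sketch below is illustrative only; the function names and the example values (mean −85 dB, σ = 4 dB, m = 16, target miss detection rate 0.05) are assumptions.

```python
import math
from statistics import NormalDist

def q_inv(p):
    """Inverse Q-function: Q(x) = 1 - Phi(x), hence Q^{-1}(p) = Phi^{-1}(1 - p)."""
    return NormalDist().inv_cdf(1.0 - p)

def detection_threshold(mu_alpha, sigma, m, p_md):
    """Eq. (2.4): gamma = (sigma / sqrt(m)) * Q^{-1}(1 - P_MD) + mu_alpha."""
    return sigma / math.sqrt(m) * q_inv(1.0 - p_md) + mu_alpha

# Hypothetical example values, chosen only to exercise the formula.
gamma = detection_threshold(mu_alpha=-85.0, sigma=4.0, m=16, p_md=0.05)

# Sanity check of the definition: under H1, alpha ~ N(mu_alpha, sigma^2 / m),
# so P(alpha < gamma | H1) should equal the target miss detection rate.
achieved_p_md = NormalDist(-85.0, 4.0 / math.sqrt(16)).cdf(gamma)
```

For a target miss detection rate below 0.5 the threshold falls below the mean consensus value under $H_1$, and it moves closer to that mean as the number of SUs $m$ grows.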

Compared to the centralized cooperative spectrum sensing schemes, distributed consensus

protocols are fully distributed and scalable, and have an exponential convergence rate. They are well suited to

the CRAHN.

2.3 Vulnerability Analysis of Distributed Consensus-based Spectrum Sensing

Although the distributed nature of the consensus-based protocols gives such protocols significant

performance benefits, it also exposes such protocols to a number of security threats.

In this section, we identify and evaluate the vulnerabilities of consensus-based protocols

under various potential attacks. We consider both passive and active attackers. We also

consider both insider and outsider attackers. An outsider attacker is one who may intercept

other nodes’ states, inject false states, perform replay attack, camouflage other honest nodes

with their captured identities, etc., but who does not possess valid security keying material.

An insider attacker is a compromised SU that knows all the keying material

stored in the SU node, if any, and is capable of manipulating its sensing measurements/states and

then disseminating the spoofed information. Note that another type of attacker is

faulty nodes who may output measurements/states with a large deviation due to hardware

or software failure. We do not differentiate intentional attackers from faulty nodes as the

consequence caused is the same.


To facilitate our analysis, we classify the attacks into two categories based on the intended

objectives and consequences: disruption of sensing operation and stealthy manipulation of

sensing results. Attackers in the former category have limited information and capabilities,

and typically launch arbitrary attacks with the objective to disrupt the sensing operation.

In contrast, attackers in the latter category are more capable. To avoid being detected, they

can adapt their attack strategies to the perceived environment, and can collude with each

other.

Note that we do not consider the Sybil attack, in which a node fabricates multiple identities, or

the radio-jamming attack. Those are considered outside the scope of this work.

2.3.1 Disruption of Sensing Operation

We identify two types of attacks that can lead to disruption of sensing operation. The attacks

in this category may come from insider attackers, outsider attackers or faulty nodes. We

also analyze their harmful impacts on consensus-based spectrum sensing algorithm.

Blocking attack

A blocking attack refers to the unexpected cessation of information transmission from an SU. This is

the weakest attack in the sense that the only induced damage is the isolation of the SU

and possible partition of the network graph. Intuitively, if the network graph is divided into

several subgraphs, the consensus can only be reached for each isolated subgraph. Its impacts

are stated in the following theorem:

Theorem 1. Let $A \in M_{n \times n}$ be the adjacency matrix of a secondary network. After the attackers block

several secondary users, if the adjacency matrix of the remaining network

$\tilde{A} \in M_{n \times n}$ satisfies

$$(I + \tilde{A})^{n-1} > 0,$$

the attackers achieve no more benefit than defeating several secondary users. Otherwise, the

whole secondary network is partitioned so that a global decision can never be attained.

Proof. The well-known Perron-Frobenius theory [79] states that if a non-negative matrix

$A \in M_{n \times n}$ is irreducible, the graph based on adjacency matrix $A$ has full connectivity.

Furthermore, based on the theorem about irreducible matrices [79], we learn that if

$(I + A)^{n-1} > 0$, the matrix $A$ is irreducible. Thus, if the condition of Theorem 1 is satisfied, the remaining graph is still

connected after eliminating the abnormal SUs, in which case the attackers gain no advantage

except stalling these SUs. Otherwise, the remaining graph is disconnected.
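The condition in Theorem 1 can be checked numerically. The following Python sketch is a minimal illustration under two assumptions: the remaining network's adjacency matrix is taken as the submatrix over the unblocked nodes, and entries are clipped to 0/1 during the matrix power, which keeps the positivity pattern of a non-negative matrix unchanged while avoiding large numbers. The path topology and function names are hypothetical.

```python
def remaining_adjacency(adj, blocked):
    """Adjacency matrix of the network after removing the blocked nodes."""
    keep = [i for i in range(len(adj)) if i not in blocked]
    return [[adj[i][j] for j in keep] for i in keep]

def satisfies_theorem1_condition(adj):
    """Return True if (I + A)^(n-1) > 0 entrywise, i.e., the graph is connected."""
    n = len(adj)
    base = [[1 if (i == j or adj[i][j]) else 0 for j in range(n)] for i in range(n)]
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # identity
    for _ in range(n - 1):
        # Boolean matrix product: only the zero/non-zero pattern matters here.
        power = [
            [1 if any(power[i][k] and base[k][j] for k in range(n)) else 0
             for j in range(n)]
            for i in range(n)
        ]
    return all(all(row) for row in power)

# Hypothetical path network 0 - 1 - 2 - 3.
PATH = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

edge_blocked = satisfies_theorem1_condition(remaining_adjacency(PATH, {3}))
middle_blocked = satisfies_theorem1_condition(remaining_adjacency(PATH, {1}))
```

Blocking the end node 3 leaves a connected path, so the attackers only remove that SU. Blocking the interior node 1 isolates node 0 from nodes 2 and 3, which corresponds to the partitioned case of the theorem.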

Arbitrary False Data Injection Attack

Here, the attacker injects forged data during each consensus iteration. Two forms of such

attack, differing in their ways of data injection, are presented as follows.

• Constant data injection: the attacker ignores the state update algorithm and keeps

transmitting a constant value in each state update message. The following theorems

describe the impacts of the single-attacker scenario and the multiple-attacker scenario

separately:

Theorem 2. In a connected undirected graph with one arbitrary false data injection attacker

sending constant data, a consensus can be asymptotically reached which equals the constant

value injected by the attacker, for any set of initial states.

Proof. Assume node $n_t$ is the attacker. We can write the consensus algorithm at this malicious

node as follows:

$$x_t(k+1) = x_t(k). \tag{2.5}$$

Thus, the graph around the attacker becomes a directed graph with zero input-degree and

multiple output-degree. With multiple output-degree, the attacker injects constants to its

neighbors. Therefore, we can consider the manipulated graph has a spanning tree¹ rooted

from the attacker. Notice that the attacker converts the Perron matrix $P$ into $\tilde{P}$ with the

$t$-th row becoming an all-zero vector except the only ‘1’ at the $t$-th entry. However, $\tilde{P}$ is

still a stochastic matrix with positive diagonal entries. As proven in [80], we know that if

the matrix has a spanning tree and positive diagonal entries, the discrete-time consensus

algorithm with Eq. (2.2) and Eq. (2.5) achieves consensus asymptotically. Unfortunately,

the whole network will converge to a static value, which is the constant value injected by

the attacker.

Theorem 3. In a connected undirected graph with more than one arbitrary false data injection

attacker sending constant data at different values, a consensus cannot be reached for

the whole network.

Proof. The graph with more than one attacker will have no spanning tree, because more

than one node is disseminating different information with zero input-degree. As proven

in [80], if the directed graph has no spanning tree, the discrete-time consensus algorithm will

not converge.
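Theorems 2 and 3 can be illustrated with a small simulation in which honest nodes follow Eq. (2.2) and each attacker follows Eq. (2.5) by holding a fixed value. This is a sketch under assumed values: the four-node ring, ε = 0.3, the injected constants, and the iteration count are hypothetical and chosen only to make the behavior visible.

```python
def run_with_constant_attackers(x0, neighbors, eps, attackers, iters=500):
    """Honest nodes apply Eq. (2.2); attackers keep their constant (Eq. (2.5)).

    attackers maps a node index to the constant value it injects.
    """
    x = list(x0)
    for node, value in attackers.items():
        x[node] = value
    for _ in range(iters):
        x = [
            attackers[i] if i in attackers
            else x[i] + eps * sum(x[j] - x[i] for j in neighbors[i])
            for i in range(len(x))
        ]
    return x

RING = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
MEASUREMENTS = [-90.0, -80.0, -85.0, -95.0]

# Theorem 2: a single attacker (node 0) drags every state to its constant.
single = run_with_constant_attackers(MEASUREMENTS, RING, 0.3, {0: -120.0})

# Theorem 3: two attackers with different constants prevent a common value.
double = run_with_constant_attackers(MEASUREMENTS, RING, 0.3, {0: -120.0, 2: -60.0})
spread = max(double) - min(double)
```

In the single-attacker run all states settle at the injected constant regardless of the honest measurements. In the two-attacker run the honest nodes settle between the two injected values, and the spread across the network never shrinks below the gap between the constants.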

Intuitively, the consensus algorithm with a single attacker will take a much longer time to

converge than the normal case, because the information flow is sourced from a single node,

while in the normal case information is more evenly distributed across the whole network

and all the nodes cooperatively propagate the information flow. Therefore, such an attack also

delays the consensus reaching time.

¹ A spanning tree of a directed graph is a directed tree formed by graph edges that connect all the nodes of the graph.

• Random data injection: the attacker injects random values into its neighborhood at

each iteration. The impacts of random data injection attack are hard to analyze, but we

can conclude that unstopped random value injection will disrupt the consensus algorithm by

causing network divergence in most cases or converging to an arbitrary value.
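Although the random data injection attack is hard to analyze in closed form, its disruptive effect is easy to observe in simulation. The sketch below is illustrative only: the uniform injection range, the ring topology, ε = 0.3, and the fixed seed are assumptions, and the attacker model (a fresh random value broadcast at every iteration) is one simple reading of the description above.

```python
import random

def run_with_random_attacker(x0, neighbors, eps, attacker, low, high, iters, seed=0):
    """Honest nodes apply Eq. (2.2); the attacker broadcasts a fresh random
    value in [low, high] at every iteration. Returns the state history."""
    rng = random.Random(seed)
    x = list(x0)
    history = []
    for _ in range(iters):
        x[attacker] = rng.uniform(low, high)  # forged state for this iteration
        x = [
            x[i] if i == attacker
            else x[i] + eps * sum(x[j] - x[i] for j in neighbors[i])
            for i in range(len(x))
        ]
        history.append(list(x))
    return history

RING = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
MEASUREMENTS = [-90.0, -80.0, -85.0, -95.0]

history = run_with_random_attacker(
    MEASUREMENTS, RING, eps=0.3, attacker=0, low=-120.0, high=-60.0, iters=300
)
# Fluctuation of an honest neighbor (node 1) over the last 50 iterations.
tail = [state[1] for state in history[-50:]]
fluctuation = max(tail) - min(tail)
```

As long as the injection continues, the honest neighbors keep moving with the forged values, so the stopping condition of the consensus protocol is not met in this setup.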

2.3.2 Stealthy Manipulation of Sensing Results

The goal of stealthy manipulation of sensing results is to reverse the consensus result of a

PU’s status, either from absence to presence, i.e. exploitation objective, or from presence

to absence, i.e. vandalism objective [60]. With the exploitation objective, attackers can

evacuate other SUs to obtain exclusive usage of the available spectrum, while with vandalism

objective, attackers will cause severe interference to the primary network. We propose two

novel attacks: (1) covert adaptive data injection attack that achieves stealthy manipulation

of sensing results with independent attacker(s), and (2) covert adaptive data injection attack

with node collusion.

Covert Adaptive Data Injection Attack

This attack has two major features: “adaptive” means the attacker can adapt its strategy

based on neighbors’ state update information, with prior knowledge about the detection

algorithm; “covert” reflects the attacker’s goal of covertly manipulating the sensing results,

without being detected by the detection mechanism. Outsider attackers can be effectively

expelled from the network with an authentication mechanism. In this work, we focus on

insider attackers that reside in legitimate nodes.

Outlier detection algorithms are commonly used to defend against insider attackers performing data injection attacks. Generally, the current outlier detection algorithms all rely on a static attack detection threshold λ (see footnote 2) to classify honest SUs and attackers. Suppose we directly apply these detection algorithms to the distributed system, i.e., at each iteration, a detection algorithm automatically expels the abnormal SUs. According to Kerckhoffs's principle, the attacker must be assumed to know the detection algorithm and the detection threshold λ, in which case a covert adaptive data injection attacker, defined below, is capable of bypassing the traditional detection algorithms.

Attack strategy. (1) At each iteration, the attacker first collects all its neighbors’ states

normally.

(2) The attacker then computes a maximal acceptable deviated state based on all its neighbors' submitted states and the threshold in the detection algorithm. The difference between the deviated state and the genuine state indicates the attack strength $a(k)$, where $k \in [0, k_{stop}]$ is the attack time.

(3) Finally, it injects the forged state into its neighborhood.

To illustrate the attacker's strategy of computing a maximal acceptable deviated state, we give an example of a simple detection scheme with a threshold $\lambda_d$, by which a node $n_a$ is flagged as an attacker whenever any of its neighbors detects its abnormality. Assume the attacker has the vandalism objective and its neighbors' states are $\{st_1, \ldots, st_{|N_a|}\}$; then the maximal acceptable deviated state can be

$$\max_{i=1,\ldots,|N_a|}(st_i) - \lambda_d$$

to avoid being detected.
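The computation in this example can be sketched in a few lines of Python. This is only an illustration and is not taken from the dissertation's simulation code: the function and parameter names are hypothetical, the vandalism branch follows the expression above, and the exploitation branch (the minimum neighbor state plus the threshold) is a symmetric assumption that the text does not state explicitly.

```python
def max_acceptable_deviated_state(neighbor_states, lambda_d, objective="vandalism"):
    """Most extreme state the attacker can submit under the simple detector."""
    if not neighbor_states:
        raise ValueError("the attacker needs at least one neighbor state")
    if objective == "vandalism":
        # Push the state as low as possible: the neighbor holding the largest
        # state must still see a gap of at most lambda_d.
        return max(neighbor_states) - lambda_d
    if objective == "exploitation":
        # Assumed mirror image: push the state as high as possible.
        return min(neighbor_states) + lambda_d
    raise ValueError("objective must be 'vandalism' or 'exploitation'")


def is_flagged(state, neighbor_states, lambda_d):
    """Simple detector of the example: flagged if any neighbor sees a gap above lambda_d."""
    return any(abs(state - s) > lambda_d for s in neighbor_states)
```

With neighbor states of -44, -45 and -46 dBm and a threshold of 5 dB, the vandalism branch yields -49 dBm, which no neighbor flags, while any lower value is flagged by the neighbor holding -44 dBm.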

In practice, if the attacker is unable to collect all its neighbors' states before sending its own state, then at the current iteration it replaces its previous state with a maximal acceptable deviated state calculated from its neighbors' previous states. It then updates its current state with the spoofed previous state and sends it to the neighbors. In this case, the attack strength enforced at the current state will directly affect the next state of the network.

2 This threshold is for detecting the presence of attacks, and it is different from the primary detection threshold γ mentioned before. In later sections, unless otherwise noted, the detection threshold denotes the attack detection threshold.

For an attacker, the knowledge of the attack stop time $k_{stop}$ is crucial for launching a successful attack. If $k_{stop} \to \infty$, the consensus protocol will not converge. The following inequalities show the basic principle in terms of the amount of change an attacker has to inject in order to fulfill the exploitation objective and the vandalism objective, respectively:

$$x + \frac{\sum_{i=0}^{k_{stop}} a(i)}{m} > \gamma, \quad a(i) \ge 0, \quad \text{exploitation objective,} \tag{2.6}$$

$$x + \frac{\sum_{i=0}^{k_{stop}} a(i)}{m} < \gamma, \quad a(i) \le 0, \quad \text{vandalism objective,} \tag{2.7}$$

where x is the average value of the original measurements of the whole network, which is also the legitimate consensus result, and m is the number of nodes in the network. Because the consensus algorithm keeps the average invariant [77], the impact of each attack strength a(i) can be quantified as shifting the final consensus result by a(i)/m. However, the attackers have no way of knowing x without global knowledge. Therefore, we propose an iterative stop strategy. Let xmin(k) and xmax(k) be the minimal and maximal states among the neighbors of the attacker at the k-th iteration. The attacker injects forged states only when xmin(k) < γ for the exploitation objective, or xmax(k) > γ for the vandalism objective. Otherwise, the attacker follows the consensus protocol. The proposed stop strategy guarantees that the attacker's neighborhood achieves the objective first; its deviated states then spread through the whole network to reverse the consensus result.
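The a(i)/m relation in Eqs. (2.6) and (2.7) can be checked numerically with a small sketch. Several things here are assumptions rather than the dissertation's setup: the update rule x_i(k+1) = x_i(k) + ε Σ_j (x_j(k) − x_i(k)) is the standard linear consensus step (the actual update equation is defined outside this section), the 6-node ring, ε = 0.2 and the constant attack strength are hypothetical values, and no detection scheme is modeled.

```python
def consensus_step(states, neighbors, eps):
    """One synchronous update x_i <- x_i + eps * sum_j (x_j - x_i) (assumed rule)."""
    return [
        x + eps * sum(states[j] - x for j in neighbors[i])
        for i, x in enumerate(states)
    ]


def run_with_injection(states, neighbors, eps, attacker, strengths, total_iters):
    """Run consensus; at iteration k < len(strengths) the attacker adds strengths[k]."""
    states = list(states)
    for k in range(total_iters):
        states = consensus_step(states, neighbors, eps)
        if k < len(strengths):
            # The forged state differs from the genuine one by a(k).
            states[attacker] += strengths[k]
    return states


# Hypothetical 6-node ring; measurements in dBm.
m = 6
neighbors = {i: [(i - 1) % m, (i + 1) % m] for i in range(m)}
initial = [-44.0, -45.5, -46.0, -44.5, -45.0, -45.0]
strengths = [-3.0] * 10  # vandalism objective: a(i) <= 0, stop after 10 iterations

final = run_with_injection(initial, neighbors, 0.2, 0, strengths, 500)
honest_average = sum(initial) / m
predicted = honest_average + sum(strengths) / m
```

Because the symmetric update preserves the sum of the states, every entry of `final` ends up at `predicted`, i.e. the honest average shifted by the total injected strength divided by m.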

Multiple attackers with the covert adaptive attack strategy can jointly set their attack

strengths, so that the protocol converges faster to their desired consensus objectives. Covert

adaptive data injection attack is effective in evading traditional outlier detection. In section

2.4.1, we will present a novel detection mechanism to invalidate such attack.

Covert Adaptive Data Injection Attack with Node Collusion

The covert adaptive data injection attack becomes even more powerful and harder to defend against when nodes start to collude. Not only can such an attack converge faster to the attackers' desired objective, but it can also evade the computation verification scheme proposed in Section 2.4.2. Both insider and outsider attackers can perform such a collusion attack.

When we involve the protection mechanism with computation verification to check the legitimacy of the consensus operation, collusion attackers will avoid being caught by sending forged verifications to cover up each other's false data. Moreover, stronger collusion attackers are capable of employing more malicious neighbors for false validations. In Section 2.4.2, we will present a hash-based computation verification mechanism to defend against collusion attacks.

2.4 Protection of Distributed Consensus-based Spectrum Sensing

The vulnerabilities of the distributed consensus-based spectrum sensing algorithm demand effective protection mechanisms. In this section, we first present a robust distributed outlier detection mechanism with an adaptive local threshold to counter the covert adaptive data injection attack. Then, we put forward a hash-based computation verification mechanism that ensures the correctness of a neighbor's state update process and thwarts collusion attacks through common neighbor cross-validation.

2.4.1 Robust Distributed Outlier Detection with Adaptive Local Threshold

The goal of this protection mechanism is to detect and eliminate the abnormal states injected

by attackers. As described in section 2.3.2, the traditional outlier detection mechanisms

used in the existing spectrum sensing rely on a fixed and known global detection threshold,

which enables an attacker to have strengthened attacking capabilities throughout the whole

consensus process.

According to the consensus algorithm, the maximum state of the network is monotonically

decreasing, while the minimum state of the network is monotonically increasing until reaching

consensus [81]. The updated states of each node are bounded by the maximum and minimum

states, which gradually converge on the same value, while the differences among updated

states of various SUs are bounded by the differences between the maximum and minimum

states, which gradually diminish until reaching zero (see footnote 3). So the main idea of our detection

algorithm is to use localized detection threshold at each node, and adapt the threshold

with the diminishing behavior of state differences. The benefits of adaptive local threshold

are twofold: (1) it becomes more difficult for attackers to guess all the instant detection

thresholds of their neighbors without two-hop information; (2) the detection thresholds drop

with the shrinking of variances among different states. Especially at the final consensus

state, the detection thresholds reach zero, which gives zero tolerance to the attackers (see footnote 4).

3 Although the differences between the updated states of any two SUs are not monotonically decreasing, the diminishing trends are guaranteed.

4 Even if the attacker has global knowledge to guess all the instant detection thresholds and computes a deviated state to bypass the detection at each iteration, the influence on the final results will be limited due to the shrinking thresholds.

To illustrate the detection mechanism, we assume a common communication range $d_{cr}$ for each SU. To compute the threshold at an honest SU $n_h$, consider two honest neighbors $n_i$ and $n_j$ in its neighborhood, at distances $d_i$ and $d_j$ from the PU, with $d_j = d_i + \Delta d_{ij}$ and $0 < \Delta d_{ij} \le 2d_{cr}$. We use the method in [17] to find a detection threshold $\lambda_{h0}$ for SU $n_h$ at the starting time, such that $x_i(0) - x_j(0) \le \lambda_{h0}$ holds with high probability (e.g., 0.99). According to Eq. (2.1), the difference is distributed as follows:

$$x_i(0) - x_j(0) \sim \mathcal{N}\left(10\alpha\log_{10}\frac{d_i + \Delta d_{ij}}{d_i},\ 2\sigma^2\right). \tag{2.8}$$

For a fixed $d_i$, $\Delta d_{ij} = 2d_{cr}$ maximizes the mean of the distribution. We assume a large attenuation factor $\alpha$ to reduce the influence of the uncertainty of $\alpha$. Then, $d_i$ is estimated through a robust statistic estimation. The median estimate is used in [17], while the biweight estimate [82] is another good candidate, which has a higher efficiency in terms of the variance of estimation. To trade off overhead and performance, we use the median and biweight estimates comparatively to determine the estimate of $x_i(0)$: $x_{est} = \mathrm{median}(x_k(0))$ or $\mathrm{biweight}(x_k(0))$ for $k \in N_h$. Thus, $d_i \approx d_{est} = 10^{(P_0 - x_{est})/(10\alpha)}$. Now we have the following distribution of the difference:

$$x_i(0) - x_j(0) \sim \mathcal{N}\left(10\alpha\log_{10}\frac{d_{est} + 2d_{cr}}{d_{est}},\ 2\sigma^2\right). \tag{2.9}$$

Assume $\Pr(x_i(0) - x_j(0) < \lambda_{h0}) > 1 - \mu$, where $\mu$ is the detection parameter (typically $\mu = 0.01$). Since the standard deviation in Eq. (2.9) is $\sqrt{2}\sigma$, we can calculate $\lambda_{h0}$ as

$$\lambda_{h0} = \sqrt{2}\,\sigma\,Q^{-1}(\mu) + 10\alpha\log_{10}\frac{d_{est} + 2d_{cr}}{d_{est}}.$$

Up to now, we have obtained the detection threshold at the starting time. The updated threshold of node $n_h$ at the k-th iteration is denoted as $\lambda_{hk}$. From the above deduction, we notice that the implicit meaning of the detection threshold is the maximal acceptable difference between two honest SUs in the neighborhood. In order to adapt the detection threshold according to the shrinking difference, we calculate the robust statistic estimate of the differences, $estdif_{hk} = \mathrm{median}(|x_{j_1}(k) - x_{j_2}(k)|)$ or $\mathrm{biweight}(|x_{j_1}(k) - x_{j_2}(k)|)$, where $n_{j_1}, n_{j_2} \in N_h$. Therefore, we propose the following updating algorithm:

$$\lambda_{h(k+1)} = \frac{estdif_{h(k+1)}}{estdif_{hk}}\,\lambda_{hk}. \tag{2.10}$$

As $estdif_{hk} \to 0$ when $k \to k_{final}$, we can guarantee $\lambda_{hk} \to 0$ when $k \to k_{final}$, finally revealing zero tolerance to attackers. To prevent attackers from forging an alarm to exclude legitimate nodes, we adopt the majority rule to dispute any suspicious attacker. The whole protocol is described as follows:

• Every node computes its detection threshold at each iteration according to Eqs. (2.9) and

(2.10), and then identifies suspicious attackers in its neighborhood.

• Once a node discovers a suspicious attacker, it broadcasts a primitive alarm to its neighbors;

this alarm is not forwarded.

• Assume the number of common neighbors between a node and the suspicious attacker is B. If the node collects no fewer than ⌈B/2⌉ primitive alarms from the common neighbors, it will broadcast a confirmed alarm and forward it to the remaining network.

• Finally, once the presence of covert adaptive data injection attackers is disclosed, it is

straightforward to handle them or eliminate their impacts.
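The threshold computation and the majority rule in the protocol above can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the function names are hypothetical, only the median estimate is shown (the biweight estimate is omitted), the path-loss model x = P0 − 10α·log10(d) is inferred from the expression for d_est, and the initial threshold uses the standard deviation √2·σ implied by the variance 2σ² in Eq. (2.9).

```python
import math
from itertools import combinations
from statistics import NormalDist, median


def initial_threshold(neighbor_states, p0, alpha, sigma, d_cr, mu=0.01):
    """Starting threshold derived from Eq. (2.9), using the median estimate."""
    x_est = median(neighbor_states)               # robust estimate of x_i(0)
    d_est = 10 ** ((p0 - x_est) / (10 * alpha))   # invert the assumed path-loss model
    q_inv = NormalDist().inv_cdf(1 - mu)          # Q^{-1}(mu) of the standard normal
    mean_gap = 10 * alpha * math.log10((d_est + 2 * d_cr) / d_est)
    return math.sqrt(2) * sigma * q_inv + mean_gap


def est_dif(neighbor_states):
    """Median of pairwise absolute differences (needs at least two states)."""
    return median(abs(a - b) for a, b in combinations(neighbor_states, 2))


def update_threshold(lambda_k, est_dif_k, est_dif_next):
    """Eq. (2.10): scale the threshold by the ratio of successive estimates."""
    if est_dif_k == 0:
        return 0.0  # the neighborhood already agrees: zero tolerance
    return lambda_k * est_dif_next / est_dif_k


def confirm_alarm(primitive_alarms, common_neighbors):
    """Majority rule: confirm once at least ceil(B/2) common neighbors raised an alarm."""
    b = len(common_neighbors)
    votes = len(set(primitive_alarms) & set(common_neighbors))
    return b > 0 and votes >= math.ceil(b / 2)
```

With the values of Table 2.1 (P0 = 66 dBm, α = 3, σ = 3 dB, d_cr = 300 m) and neighbors about 5 km from the PU, the starting threshold evaluates to roughly 11.3 dB.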

Assume: every SU in the network owns a unique ID and shares a unique key with every neighbor; every pair of neighbor nodes has at least one common neighbor; there exists a secure neighborhood discovery mechanism, with which each node can obtain two-hop neighborhood information. In the k-th iteration (k > 0) (when k = 0, every node first submits its measurement to its neighbors using authenticated broadcast):

• Update Commit: (First round) after a node collects all its neighbors' submissions, it computes and disseminates an updated submission using authenticated broadcast, containing a hash commitment of its inputs together with its own data. Therefore, node $n_v$ receives a collection of updated submissions from its neighbors.

• Distributed Verification: (Second round) every node disseminates all its neighbors' data collected at the beginning of the first round using authenticated broadcast, so node $n_v$ receives a collection of data from the two-hop neighbors. Then node $n_v$ performs the following verification: (1) it checks that the IDs in the collection are consistent with its stored neighbor IDs; (2) it checks whether its own data and the common neighbors' data are incorporated in each updated state by recomputing each updated state and hash commitment; (3) whenever one of the verifications for node $n_p$ fails, multiple $MAC_{K_{vi}}(ERR, ID_p)$ will spread through the whole network to stop the state update process, where ERR is a unique message identifier and $K_{vi}$ is the shared key between $n_v$ and its neighbor $n_i$.

Figure 2.1: The distributed hash-based verification of neighbor state update.

2.4.2 Hash-based Computation Verification of Neighbor State Update

The above outlier detection mechanism detects abnormal node measurements/states in the

statistical sense. It is only effective when the majority of nodes in a neighborhood are honest.

When malicious colluding nodes exist, the statistical outlier detection methods become less

effective. In order to defend against collusion attacks, each node must ensure: (1) the

authenticity and integrity of the updated states sent by neighbor nodes; (2) the state update

algorithm has been followed truthfully at a neighbor node nv, i.e. it has correctly executed

the update algorithm using all nv’s neighbors’ data.

To realize the above additional goals, we propose a hash-based approach with common neighbor cross-validation to counter collusion attacks with honest common neighbors and provide computation verification. To provide sensing data legitimacy check, data authenticity/integrity and computation verification simultaneously, we can combine our outlier detection with the hash-based verification mechanism. Next we will focus on the hash-based verification mechanism.

We assume each node has a unique identifier and shares a unique secret symmetric key with

every neighbor. In addition, every node uses authenticated broadcast such as µTESLA [83]

to send messages to its neighbors to enforce message authenticity and integrity. In the fully

distributed CRAHN, every node should keep a unique one-way key chain and send the key

chain commitments to every neighbor.

The main goal of this approach is to ensure that each neighbor node performs trustworthy state updates. We adapt the idea of common neighbor cross-validation in traditional secure data aggregation techniques [75] to counter collusion attacks. In our scheme, each SU is responsible for checking not only its own contributions, but also the common neighbors' contributions incorporated into the updated states. The details of our proposed scheme are illustrated in Fig. 2.1.

We assume there exists a secure neighbor discovery mechanism [84], by which each node can securely discover its two-hop neighbors during the network initialization process. Each iteration contains two phases: update commit and distributed verification. We give an example to illustrate how the scheme works. In the k-th iteration, the initial submission of node $n_h$ has the format $\langle k, ID_h, value \rangle$, where value is the power measurement. In the update commit phase, node $n_h$ collects the following data from its neighbors: $d_1^{(k-1)}, d_2^{(k-1)}, \ldots, d_q^{(k-1)}$, and its updated submission $s_h^{(k)}$ has the format:

$$\left\langle k,\ ID_h,\ state^{(k)},\ H\left(k \,\|\, ID_h \,\|\, state^{(k)} \,\|\, ID_1 \,\|\, d_1^{(k-1)} \,\|\, ID_2 \,\|\, d_2^{(k-1)} \,\|\, \ldots \,\|\, ID_q \,\|\, d_q^{(k-1)}\right)\right\rangle, \quad k > 0.$$

Table 2.1: Simulation parameters

Parameter   Value    Description
N           50       Number of secondary users
Rs          1km      Length and width of secondary network
dsp         5km      Distance between primary user and center of secondary network
P0          66dBm    Transmission power in dBm
dcr         300m     Communication range of secondary user
α           3        Path-loss exponent
σ           3dB      Standard variance for fading and shadowing
N0          -80dB    Noise power
ε           0.05     Consensus parameter

The hash digest in each submission is called the hash commitment and is used for the computation integrity check. In the distributed verification phase, $n_h$ disseminates its neighbors' data $\langle ID_1, d_1^{(k-1)}, ID_2, d_2^{(k-1)}, \ldots, ID_q, d_q^{(k-1)} \rangle$ using authenticated broadcast. Each node in its neighborhood can then recompute the updated states based on Eq. (2), regenerate the hash commitments, and compare the updated states and hash commitments with the received ones to verify the computation. This approach enables each honest node to check whether its neighbor nodes have performed the consensus-based state update algorithm correctly. In addition, as long as each pair of colluding attackers shares one honest common neighbor, this scheme can also detect colluding attacks. We provide a detailed security analysis in Section 2.5.3.
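The two phases can be sketched in Python as follows. This is an illustration under stated assumptions rather than the dissertation's implementation: SHA-256 stands in for the unspecified hash function H, the byte encoding of the fields is hypothetical, the consensus update is the standard linear rule with ε = 0.05 (the actual update equation is defined outside this section), and authenticated broadcast (e.g., µTESLA) is not modeled.

```python
import hashlib
import struct


def consensus_update(own_state, neighbor_data, eps=0.05):
    """Assumed update rule x <- x + eps * sum_j (d_j - x), evaluated in ID order."""
    return own_state + eps * sum(d - own_state for _, d in sorted(neighbor_data))


def hash_commitment(k, node_id, state, neighbor_data):
    """H(k || ID_h || state || ID_1 || d_1 || ... || ID_q || d_q), here with SHA-256."""
    h = hashlib.sha256()
    h.update(struct.pack(">IId", k, node_id, state))
    for nid, d in sorted(neighbor_data):  # fixed order so both sides hash the same bytes
        h.update(struct.pack(">Id", nid, d))
    return h.digest()


def make_submission(k, node_id, own_prev_state, neighbor_data):
    """Update commit phase: the new state together with its hash commitment."""
    state = consensus_update(own_prev_state, neighbor_data)
    return (k, node_id, state, hash_commitment(k, node_id, state, neighbor_data))


def verify_submission(submission, own_prev_state, disclosed_neighbor_data):
    """Distributed verification phase: recompute the state and commitment and compare."""
    k, node_id, state, commitment = submission
    expected_state = consensus_update(own_prev_state, disclosed_neighbor_data)
    expected = hash_commitment(k, node_id, expected_state, disclosed_neighbor_data)
    return state == expected_state and commitment == expected
```

A verifier that finds its own previous value misreported in the disclosed data, or a submitted state that does not match the recomputation, gets `False` and would raise the error message described in Fig. 2.1.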

[Figure 2.2 panels: (a) No attacker case; (b) One covert adaptive attacker case; (c) Ten covert adaptive attackers case; (d) Impact of attacker population. Panels (a)-(c) plot node states versus iteration step for nodes 1-10; panel (d) plots attack stop time and convergence time versus the number of attackers.]

Figure 2.2: Performance of covert adaptive data injection attack to distributed consensus-based spectrum sensing protected by traditional detection scheme with detection threshold -56dB.

2.5 Evaluation

In this section, we first evaluate the vulnerabilities of distributed consensus-based spectrum

sensing. We then demonstrate the effectiveness and present the security analysis of our

proposed protection mechanisms, followed by a numerical analysis on their efficiency. Table

2.1 lists the system parameters used in our MATLAB simulation. The performance

results are averaged over 1000 simulation runs.

2.5.1 Impact of Covert Adaptive Data Injection Attacker

Fig. 2.2(a)-Fig. 2.2(d) show how vulnerable the consensus-based spectrum sensing with the traditional outlier detection scheme [17] is under covert adaptive data injection attacks. Fig. 2.2(a) depicts the normal behavior in terms of protocol convergence without attackers. In fewer than 10 iterations, the difference among all the nodes becomes less than 1dB, which means a consensus has been reached. Fig. 2.2(b) shows the effect of a single attacker launching the attack. It stealthily deviates its states in order to subvert the consensus result for the vandalism objective. In around 60 iterations, the nodes' states in the attacker's neighborhood have been dragged lower than the threshold, so the attacker temporarily stops injecting false states and starts to follow the algorithm properly. After that, the attacker repeatedly enforces attack strength for several iterations whenever it finds that the maximal state in its neighborhood stays higher than γ, until the whole network reaches consensus. At around the 100th iteration, the network reaches a consensus, but it is a wrong one (i.e., the opposite one). When there are multiple attackers working for the same objective, the consensus can be reached much faster, as shown in Fig. 2.2(c). Finally, Fig. 2.2(d) demonstrates the connection between the attack stop time and the convergence time (see footnote 5) with respect to the attacker population. We observe that as the attacker population grows, both times decrease gradually, indicating an increasing attack power. In general, we observe that the proposed attack can be very effective in bypassing the traditional outlier detection mechanism and manipulating the final spectrum sensing result.

5 The convergence time is defined as the number of iterations before the network-wide consensus is reached. Reaching a consensus means the difference among all the node states falls below 0.5dB.

2.5.2 Effectiveness of Robust Distributed Outlier Detection with Adaptive Local Threshold

We evaluate our proposed outlier detection scheme by comparing it with the existing outlier detection scheme [17]. We study the impacts of the detection parameter, network topology and sensing data variance on two primary detection performance criteria, the primary miss detection rate P_MD and the primary false alarm rate P_FA, determined by attack capabilities under the protection mechanisms. Fig. 2.3(a)-Fig. 2.3(c) show the effectiveness of our detection scheme, which can successfully eliminate the impacts of attackers. We observe from Fig. 2.3(a) that the detection performance is insensitive to µ, because the attackers know the values of the detection threshold and µ, and employ them to bypass the detection scheme. Fig. 2.3(b) shows that when the SU's communication range increases, and the density of the neighborhood increases correspondingly, both P_MD and P_FA of the existing scheme increase. However, with our detection scheme, P_MD and P_FA both decrease, owing to the increasing attacker detection rate with more neighbors. We notice that when the communication range is extremely small, the existing scheme outperforms our scheme. This is because when a neighborhood has a small population, cross-validation is less effective. Some honest nodes may be mistaken for attackers, which potentially amplifies the impacts of uncovered attackers on the remaining network. Fig. 2.3(c) indicates that as the data variance increases, the attackers have a much larger influence on the existing detection method, but our method effectively impedes the influence. To further evaluate attacker detection performance, we involve two more criteria: P^a_D as the attacker detection rate (see footnote 6) and P^a_MI as the attacker miss identification rate (see footnote 7). Fig. 2.3(d) shows that both P^a_D and P^a_MI grow steadily with increasing data variance under our scheme. The increase of P^a_MI is a side effect of our scheme caused by the adaptive threshold, but it will not degrade the primary detection performance.

6 This rate is defined as the probability of detecting one attacker.

7 This rate is defined as the probability of mistakenly identifying a legitimate node as an attacker.

[Figure 2.3 panels: (a) Impact of detection parameter µ; (b) Impact of communication range; (c) Impact of sensing data variance; (d) Attacker detection performance. Panels (a)-(c) plot P_MD and P_FA for the existing scheme and our scheme against µ, the communication range of secondary users, and the variance of channel fading and shadowing, respectively; panel (d) plots P^a_D and P^a_MI against the variance of channel fading and shadowing.]

Figure 2.3: Performance of robust distributed outlier detection with adaptive local threshold

2.5.3 Security Analysis of Hash-based Computation Verification Approach

We first consider the case where there is a single attacker A in the network. The security of the hash-based computation verification scheme is based on the following. (1) The secure neighborhood discovery scheme utilized in the network initialization process ensures that each node learns two-hop neighborhood information securely, so that the attacker can neither discard the state value of a legitimate neighbor node nor include a forged state value while updating its state. (2) The message authenticity and integrity are guaranteed by the broadcast authentication. (3) Whether the attacker A uses a state value $d'^{(k-1)}_B$ different from what was submitted by a neighbor node B for the state update computation and includes $d'^{(k-1)}_B$ in the message sent to B in the verification phase, or computes a wrong $state'^{(k)}$ using the correct neighbor states, the inconsistency will be detected by B. (4) Otherwise, based on the collision-resistant property of the hash function, it is computationally infeasible to generate a valid hash commitment

$$H\left(k \,\|\, A \,\|\, state'^{(k)} \,\|\, ID_1 \,\|\, d'^{(k-1)}_1 \,\|\, \cdots \,\|\, ID_n \,\|\, d'^{(k-1)}_n\right) = H\left(k \,\|\, A \,\|\, state^{(k)} \,\|\, ID_1 \,\|\, d^{(k-1)}_1 \,\|\, \cdots \,\|\, ID_n \,\|\, d^{(k-1)}_n\right),$$

where at least one of the primed state values does not equal the authentic one.

[Figure 2.4 panels: (a) Pairwise collusion attack; (b) Collusion attack with honest common neighbors; (c) Collusion attack without honest common neighbor; (d) Neighborhood collusion attack.]

Figure 2.4: Different collusion styles (solid points represent attackers, hollow points represent honest SUs)

Next, we discuss the security of state update algorithm with colluding attackers. In Fig. 2.4,

we identify four types of collusion attacks based on their increasing colluding capabilities:

• Pairwise collusion: this collusion (in Fig. 2.4(a)) emphasizes the collusion by two neighboring attackers.

• Collusion attack with honest common neighbors: this collusion (in Fig. 2.4(b))

involves more neighbors of two pairwise attackers as their collusion companions, but every

pair of nodes has at least one honest common neighbor.

• Collusion attack without honest common neighbor: this collusion (in Fig. 2.4(c))

involves all the common neighbors of two pairwise attackers as their collusion companions.

• Neighborhood collusion: if all the neighbors of an attacker are also malicious attackers,

they form a neighborhood collusion (in Fig. 2.4(d)).

We show that the hash-based computation verification scheme is able to deal with the former two types of collusion attacks, but not the latter two. In cases where pairwise attackers exist (including Fig. 2.4(a) and Fig. 2.4(b)), a malicious neighbor node can cover up the forgery of the central node. However, the inconsistency will still be discovered by an honest common neighbor of these colluding nodes, because the honest neighbor overhears the colluding attacker's input state value.

However, when there exist collusion attackers without an honest common neighbor (Fig. 2.4(c)), one node can arbitrarily deviate its malicious neighbor's states without being detected. Note that neighborhood collusion in Fig. 2.4(d) cannot be addressed by distributed computation verification schemes. The central attacker in the colluded neighborhood is regarded as a hidden attacker, whose malicious behavior is the most difficult to detect, because none of its neighbors is honest and follows the verification mechanism. Thus, the misbehavior of a hidden node will continue, and eventually the entire network will inevitably be controlled by the hidden attacker.
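The condition that separates the detectable collusion styles from the undetectable ones can be expressed as a small graph check. The sketch below is only illustrative: the helper names and the adjacency representation (a dict mapping each node to the set of its neighbors) are hypothetical, and the attacker set is assumed to be known, which is only possible in an offline analysis.

```python
def honest_common_neighbors(adj, a, b, attackers):
    """Honest nodes adjacent to both a and b."""
    return (adj[a] & adj[b]) - set(attackers)


def is_hidden_attacker(adj, node, attackers):
    """Neighborhood collusion: every neighbor of the attacker is also an attacker."""
    attackers = set(attackers)
    return node in attackers and bool(adj[node]) and adj[node] <= attackers


def undetectable_pairs(adj, attackers):
    """Neighboring attacker pairs that share no honest common neighbor."""
    attackers = set(attackers)
    return sorted(
        (a, b)
        for a in attackers
        for b in adj[a] & attackers
        if a < b and not honest_common_neighbors(adj, a, b, attackers)
    )
```

On a triangle of three mutually adjacent nodes, two colluding attackers still share the honest third node, so `undetectable_pairs` is empty; once the third node also colludes, every pair is returned and each attacker is a hidden attacker.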

2.5.4 Cost Evaluation of Hash-based Computation Verification Approach

Finally, we evaluate the efficiency of our proposed hash-based computation verification scheme through its computational and communication costs. We skip the discussion of the overhead for authenticated broadcast. We measure the computational cost by the number of hash operations at one node in each iteration, and assess the communication cost in terms of the number of transmitted/received bytes. Another important metric is key storage, which is defined as the number of keys stored by each node. The costs are listed in Table 2.2, where we estimate both the iteration number and the node ID as one byte each. The computational cost of the hash operation depends on the number of inputs, which in turn depends on the number of neighbors. Therefore, the computational cost of this approach is determined by the number of neighbors. The table shows that the computational and communication costs are both acceptable.

Table 2.2: Costs of hash-based computation verification scheme at one node for each iteration (N is the number of neighbors, P is the length of state in bytes, h is the length of hash in bytes.)

Computation     O(N + 1)
Communication   N(NP + N + h + 1)
Key Storage     N
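To give a feel for the magnitudes, the expressions of Table 2.2 can be evaluated for concrete values. The numbers below are hypothetical and not taken from the dissertation: 10 neighbors, a 4-byte state, and a 32-byte hash (the digest length of SHA-256).

```python
def verification_costs(n_neighbors, state_bytes, hash_bytes):
    """Per-node, per-iteration costs, evaluated from the expressions in Table 2.2."""
    n, p, h = n_neighbors, state_bytes, hash_bytes
    return {
        "hash_operations_order": n + 1,                  # O(N + 1)
        "communication_bytes": n * (n * p + n + h + 1),  # N(NP + N + h + 1)
        "keys_stored": n,                                # one shared key per neighbor
    }


costs = verification_costs(n_neighbors=10, state_bytes=4, hash_bytes=32)
```

For these values the communication cost is 10 × (40 + 10 + 32 + 1) = 830 bytes per iteration, and each node stores 10 keys.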

2.6 Summary

In this chapter, for the first time, we investigated the vulnerability and protection of consensus-based spectrum sensing. We proposed various attacks that can disrupt the consensus algorithm or stealthily subvert the sensing results, especially the covert adaptive attacks with learning capability. We also developed a robust distributed outlier detection scheme with adaptive local threshold to counter covert adaptive attacks by exploiting the state convergence property. In addition, a hash-based computation verification scheme is presented to effectively defend against colluding attackers. Our simulation results demonstrated the severe vulnerabilities of distributed spectrum sensing, and also showed that our protection mechanisms are secure, robust, and efficient.

Chapter 3

Non-Parametric Passive Traffic Monitoring in Cognitive Radio Networks

Passive monitoring by distributed wireless sniffers has been used to strategically capture

the network traffic, as the basis of automatic network diagnosis. However, the traditional

monitoring techniques fall short in CRNs due to the much larger number of channels to be

monitored, and the secondary users’ channel availability uncertainty imposed by primary

user activities. To better serve CRNs, in this chapter, we propose a systematic passive

monitoring framework, SpecMonitor, for traffic collection using a limited number of sniffers

in Wi-Fi like CRNs. We utilize a non-parametric density estimation method to model SUs’

channel usage pattern. This method makes no assumptions on the unknown distribution

of channel access pattern, thus offers accurate and flexible models which can be updated

in an online fashion with little complexity. Moreover, we design a sliding window method

to perform online learning of data dynamics, and an accumulative combination method to

further improve modeling accuracy. Then, SpecMonitor takes inputs from SUs’ channel

usage model to construct near-optimal monitoring strategies.

We consider two levels of monitoring objectives: frame-level and user-level, to serve different

network diagnostic problems. The frame-level objective can be interpreted as maximizing

the frame-level quality-of-monitoring (FL-QoM), defined as the amount of captured MAC

frames of interest, due to their significance for the subsequent aggregated traffic analysis

[31]. The user-level objective is to maximize the user-level quality-of-monitoring (UL-QoM),

defined as the expected number of active users monitored, which can facilitate user behavior

analysis [85]. We cast the monitoring optimization problem as a sniffer channel assignment

problem with the objective of maximizing the corresponding QoMs.

3.1 Related Work

Passive Monitoring in Traditional Wireless Networks: Passive monitoring in wireless

networks has been an active research area. Yeo et al. were the first to use dedicated sniffers to

passively measure a Wi-Fi network, successfully identifying protocol anomalies and malicious

WLAN usages [29]. Cheng et al. presented Jigsaw, which is a large-scale passive monitoring

infrastructure to collect and dissect wireless traffic for cross-layer network diagnosis in a

large enterprise Wi-Fi network [30, 31]. While the above works focused on developing the

monitoring infrastructure, some recent works investigated the problem of optimal sniffer

channel assignment to maximize the amount of monitored information. Shin et al. [86]

sought optimal strategies for selecting a limited number of sniffers to monitor

multiple channels in wireless mesh networks; they formulated the sniffer channel

assignment problem as a maximal coverage problem and designed approximation algorithms

to solve this problem. In [87], Chhetri et al. further extended the preceding work by taking

into account the users’ access patterns. They proposed two monitoring models: user-centric

model and sniffer-centric model. However, they assumed that the statistics of different users’
activities are known. Recently, Arora et al. [88] proposed using a multi-armed bandit formulation to
perform sequential learning of the unknown channel statistics, which can be used to facilitate
optimal channel assignments. However, the multi-armed bandit approach is too complex to be used for

online and efficient channel assignments. In this chapter, we present an efficient online

channel assignment mechanism without any prior knowledge of channel access statistics,

which can provide optimized channel assignments in real-time. Note that all the above

works only considered maximizing the number of active users covered by the sniffers, while

we further address the problem of maximizing the number of captured frames.

Spectrum Monitoring in Cognitive Radio Networks: Chen et al. studied the frame
capturing problem for network forensics in CRNs [89], in which the support vector regression
(SVR) method is employed to predict the frame arrival time to guide channel assignments.
Their objectives are similar to ours; however, our method has the following advantages: 1)
the SVR method requires a time-consuming training phase, while we utilize density estimation to
produce new estimates in an online fashion, avoiding the expensive training and retraining

phases; 2) SVR method falls short of dealing with interleaved traffic from multiple users,

which corresponds to dynamic traffic statistics, while our scheme can adapt promptly to the

traffic dynamics; 3) their monitoring framework has poor performance when the monitored

channels carry high data rate traffic, because of frequent channel switching behavior induced

by the heuristic channel assignments. In contrast, as we jointly consider channel switching

costs and frame capturing gains to optimize channel assignments, our method can achieve

better performance with fast traffic flows. Recently, Yi et al. formulated the secondary user
data capturing problem as a multi-armed bandit problem [90], which requires a long
learning period before yielding an accurate estimate of the user access pattern. Thus, their
method is not efficient enough to capture adaptive and interleaved traffic patterns either.


Figure 3.1: An overview of SpecMonitor system

3.2 System Model

3.2.1 Monitoring System Model

In this section, we describe the monitoring system model for CR networks. We consider CR

networks with coexisting PUs and SUs. The most common PUs are TV towers and wireless

microphones (WMs). As PUs’ networks are regulated by service providers or specific WM
users, they are outside the scope of our monitoring system. Instead, our monitoring system
is interested in the network traffic from SUs, including APs and clients that form a WhiteFi
network, as illustrated in Fig. 3.2. In WhiteFi networks, multiple clients share their working
channels, decided by the APs, using the widely adopted CSMA/CA mechanism. Interested readers
may refer to [91] for the design and implementation details of the WhiteFi system.

Intuitively, one may consider the APs in the WhiteFi system as appropriate monitoring

devices, because all the traffic of SUs will directly go through APs. However, passive
monitoring using sniffers has multiple advantages over AP-side monitoring: first, APs are

secondary users, who may be compromised by the adversaries; second, the sniffers are able

to monitor several WhiteFi networks operating on different channels concurrently; third, the

monitoring system can reveal the detailed PHY/MAC information, such as the physical layer

header information including the signal strength, noise level for every individual packet, etc.,

which is important for security monitoring [29], yet unattainable at AP side.

Figure 3.2: Monitoring system architecture for WhiteFi network inside a monitoring area

Figure 3.3: The percentage of frames in active slots (20 ms slot length)

For a multi-hop network, we segment the whole network region into small regions, called
monitoring areas, and assign a certain number of sniffers to monitor the traffic for each
monitoring area. Each sniffer may be equipped with multiple antennas, which allow it
to sense and capture traffic over multiple channels at one time. We assume different AP-client

pairs in the monitoring area pick different working channels to avoid interference, and each

sniffer can overhear all the inbound and outbound traffic from any secondary device inside its

monitoring area if they tune into the same channel. Similar to [89], some sniffers are used as

dedicated inspection sniffers that periodically sense channels to gain channel usage statistics,

while other sniffers called operation sniffers are responsible for capturing information. All

the sniffers are connected to a sniffer center for centralized decision making. The monitoring

system architecture is demonstrated in Fig. 3.2. Each inspection sniffer is assigned multiple

channels to scan. A sensing slot is a period during which the inspection sniffer scans through

all the assigned channels. In the following, a slot stands for the sensing slot unless otherwise

noted.

An overview of the SpecMonitor system model is shown in Fig. 3.1. The sniffers first detect
the PUs’ activities in order to identify available channels. Then, the inspection sniffer scans
the available channels to build SUs’ channel access model. The channel access probability for
each channel can be derived based on the model. Finally, utilizing the channel access probability,
FL-QoM optimization or UL-QoM optimization problems can be solved to provide
optimized channel assignments for the operation sniffers in the forthcoming slots.

Figure 3.4: Frame/Active slot interarrival time distribution (20 ms sensing slot, 2 ms sensing period)

3.2.2 Channel Access Model

The channel access/usage model, which captures the patterns of secondary user activities, is
constructed from the sensing outcomes of the inspection sniffers and is thus closely related to the
duration of a sensing slot. A sensing slot is composed of channel sensing and channel switching
time, and its length depends on the number of channels to be scanned. Typically, a
channel sensing period is approximately 1 ms per channel using energy detection [92], while
channel switching for a commodity 802.11b/g network card takes about 1-5 ms [93]. Since
a longer sensing period benefits frame capturing, as shown below, we assume each inspection
sniffer spends 2 ms sensing one channel and an additional 2 ms switching to another
channel. If each antenna of an inspection sniffer is assigned 5 channels to scan, one sensing
slot will be 5 × (2 ms + 2 ms) = 20 ms long.

During each slot, the inspection sniffer scans 5 channels to reveal their channel states
(active/idle). Here, an active slot is a slot during which the sniffer spots SUs’ traffic
after channel sensing, while an idle slot is the opposite. As the channel sensing period is
shorter than a full slot, the discovered active/idle state may not reflect the genuine state
of a slot. Let $X_k^i$ be the state of channel $i$ at the $k$-th sensing slot, which takes only the binary values
1/0, corresponding to the active/idle state (in the following, we omit the channel index $i$
for brevity). Then, the sequential data $X_k$ ($k = 1, 2, \ldots$) are used to calculate the active slot
interarrival time for each channel, defined as the time interval between two consecutive
active slots. In our design, inspection sniffers produce active slot interarrival times as
their sensing outcomes, which are used as inputs to build the channel access model, as
explained in Section 3.3.
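The conversion from per-slot sensing results to active slot interarrival times can be sketched as follows. This is a minimal illustration, not the dissertation's implementation; the function name `active_slot_interarrivals` and the example sequence are assumptions made here for clarity.

```python
from typing import List


def active_slot_interarrivals(states: List[int], slot_ms: float) -> List[float]:
    """Return the time gaps (in ms) between consecutive active slots.

    states[k] is 1 if the channel was sensed active in slot k, and 0 otherwise.
    """
    gaps = []
    last_active = None  # index of the most recent active slot seen so far
    for k, x in enumerate(states):
        if x == 1:
            if last_active is not None:
                gaps.append((k - last_active) * slot_ms)
            last_active = k
    return gaps


# 5 channels x (2 ms sensing + 2 ms switching) gives the 20 ms slot used in the text.
SLOT_MS = 5 * (2 + 2)
# Hypothetical sensing sequence X_1..X_7 for one channel.
print(active_slot_interarrivals([1, 1, 0, 0, 1, 0, 1], SLOT_MS))
```

Idle slots before the first active slot produce no sample, since an interarrival time needs two active slots to be defined.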

Note that one straightforward way of meeting the frame capturing objectives is to predict SUs’
frame arrival time by modeling the frame arrival pattern. However, it is infeasible to derive
optimized channel assignments with dynamic frame length and unslotted frame transmission [89].
Instead of directly modeling the frame arrival pattern, we model the active slot interarrival
pattern as the basis of our monitoring framework, in which monitoring an active slot implies
capturing all the frames in the slot. To justify the adoption of active slot interarrival
time, we performed a real-world experiment using traffic from different types
of applications (e.g., Web browsing, BitTorrent, FTP) in an operational 802.11g WLAN. The
experiment settings are described in Section 3.5.3. Fig. 3.3 shows the percentage of frames
in active slots for different sensing periods, from which we notice that most of the
frames reside in the identified active slots, especially when the sensing period is longer than
2 ms. In other words, by capturing frames in active slots, we are able to collect most of the
frames. Fig. 3.4 plots the histograms of frame interarrival time versus active slot interarrival
time, with each bar showing the percentage of frames or active slots whose interarrival time
is indicated by the x-axis. The two distributions appear very similar, with
most of the frames concentrated within the small interarrival time region, indicating that the active slot
interarrival time characterizes the channel usage pattern well.

Symbol          Description
$K$             Number of time slots
$N$             Number of channels
$\mathcal{N}$   Channel index set
$M$             Number of antennas of all operation sniffers
$S_{op}$        Operation sniffer antenna index set
$X_k$           Sensing results of inspection sniffers at slot $k$
$Z(k)$          Current data set at slot $k$
$Z_{in}$        Input data set for density estimation
$T_{int}$       Active slot interarrival time
$W$             Sliding window size for online estimation
$n_w$           Number of previous windows considered for combination
$t_w$           Number of different samples in the previous window for combination
$f$             Probability density estimate
$F$             Cumulative density estimate
$SCAP_i$        Slotted channel access probability of channel $i$
$\Delta$        Sensing slot length
IdleCount       Number of idle slots counted
$y_i$           Vector of binary random variables for channel $i$
$z_{s,i}$       Vector of binary random variables indicating whether sniffer $s$ is assigned to channel $i$
$\alpha$        Switching cost weight
$U_i$           Number of distinct users in channel $i$

Table 3.1: Summary of symbols and notations

Intuitively, if the slot length becomes shorter, active slot interarrival pattern will approximate

frame interarrival pattern more closely. However, channel scanning with a shorter sensing slot

requires more inspection sniffers/antennas or faster channel sensing and switching operation.

Additionally, we infer from Fig. 3.4 that an operational 802.11g WLAN has a high traffic load,
since most frame interarrival times are rather short. For ease of reference, the commonly

used notations are summarized in Table 3.1.

3.3 User Channel Access Prediction

In this section, we propose a unified model to estimate the secondary user channel access
pattern, which serves as the front-end of SpecMonitor. To build the unified model, we first study
the primary user detection issue, and then we design an online non-parametric density
estimation mechanism to predict the SUs’ slotted channel access probability (SCAP) pertaining to
each sensing slot. As its name suggests, the slotted channel access probability is defined as the
probability of SUs’ channel access during each slot.

3.3.1 Primary User Detection

To enable CR communications, SUs need real-time knowledge of PUs’ activity to identify
available spectrum. Similarly, the sniffers are also required to detect PUs’ activity in order
not to waste time and energy listening on primary-occupied channels. Primary user
detection can be achieved either by spectrum sensing or by querying a geo-location white
space database over the Internet. Spectrum sensing is expensive in cost, energy consumption
and hardware complexity. On the other hand, the database approach is easier to
implement: devices report their locations to a web server, which returns a list
of available channels at that location. However, the database approach suffers from utilization
inefficiency, since it uses propagation models to decide the available spectrum and hence is
conservative in the channels it returns for a given location. Either of these two approaches
can be applied to our monitoring framework.

Feature detection is one popular spectrum sensing method for the sniffers to detect PUs’

appearance. The feature detection algorithms described in [94] can be used to sample the

UHF spectrum to detect the presence of TV broadcasts and wireless microphone signals,

which can effectively differentiate between the SUs’ and PUs’ signals. Then, the sniffers can

directly perform feature detection at the beginning of every slot to sense the availability of

monitored channels.

The database approach allows both sniffers and SUs to query the database for spectrum

availability at a certain location. After querying the database, the SUs begin operating


on a set of available channels, while inspection sniffers tune onto these available channels to

monitor SUs’ traffic patterns, and operation sniffers are assigned to the SU-occupied channels

correspondingly. In SpecMonitor, we adopt the database approach for simplicity.

3.3.2 Secondary User Channel Access Model

In this section, we propose a framework to estimate the secondary users’ SCAP at each slot

by modeling the active slot interarrival time distribution. The SUs’ channel access pattern

in WhiteFi networks is complicated, mainly due to the dynamics brought by time-evolving

mixed traffic from multiple SUs with channel switching behavior.

Non-parametric Density Estimation Model

Instead of assuming a specific active slot interarrival time distribution for quantifying SUs’
traffic pattern, we propose an SU channel usage model using the non-parametric density
estimation method to better capture SUs’ traffic dynamics. As mentioned previously, unlike
support vector machine and neural network based methods, the density estimation method
does not involve a time-consuming training phase, which makes it appropriate for online
prediction. More importantly, this non-parametric approach provides greater flexibility
and accuracy in modeling a given data set than parametric approaches.

Currently, one of the most popular non-parametric density estimation approaches is the Kernel
Density Estimator (KDE) with a Gaussian kernel function [95]. Given $n$ independent
realizations $X_i$ ($i = 1, 2, \ldots, n$) drawn from an unknown probability density function (pdf) $f(x)$,
the Gaussian KDE with bandwidth $\sigma$ is defined as:

$$f(x;\sigma) := \frac{1}{n}\sum_{i=1}^{n} K_G(x, X_i, \sigma), \quad x \in \mathbb{R}, \tag{3.1}$$

where

$$K_G(x, X_i, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x - X_i)^2/(2\sigma^2)}, \tag{3.2}$$

from which we can see that the Gaussian KDE is essentially the normalized sum of Gaussian kernels
centered at the locations $X_i$ with an equal bandwidth $\sigma$.

In fact, the setting of $\sigma$ is of utmost importance for the density estimation performance. A
classic criterion for determining the optimal $\sigma$ is the Mean Integrated Squared Error (MISE):

$$\mathrm{MISE}\{f\}(\sigma) := \mathbb{E}\int [f(x;\sigma) - f(x)]^2\,dx, \tag{3.3}$$

where $f(x)$ is the underlying genuine distribution. Assuming a large sample set, we can
obtain an asymptotic approximation to MISE, denoted as asymptotic MISE (AMISE), written
as [95]:

$$\mathrm{AMISE}\{f\}(\sigma) = \frac{1}{4}\sigma^4\|f''(x)\|^2 + \frac{1}{2n\sqrt{\pi}\,\sigma}, \tag{3.4}$$

where $f''(x)$ is the second derivative of $f(x)$, and $\|\cdot\|$ denotes the standard $L^2$ norm on $\mathbb{R}$.
Thus, the asymptotically optimal value $\sigma^*$ is obtained by minimizing AMISE:

$$\sigma^* = \left(\frac{1}{2n\sqrt{\pi}\,\|f''(x)\|^2}\right)^{1/5}. \tag{3.5}$$

In order to compute $\sigma^*$ from Eq. (3.5), we need to approximate $\|f''(x)\|^2$ by estimating
the general form $\|f^{(j)}(x)\|^2$ for arbitrary $j$. The corresponding optimal solution
$\sigma_j^* = \left(\frac{1}{2n\sqrt{\pi}\,\|f^{(j)}(x)\|^2}\right)^{1/5}$ with a generalized term of $\|f^{(j)}(x)\|^2$ can be solved in a recursive form,
namely $\sigma_j^* = \gamma_j(\sigma_{j+1}^*)$, where $\gamma_j$ is a complicated formula given in [95]. Then, a fixed point
iteration method is employed to compute $\sigma_2^*$, which is equivalent to the target value $\sigma^*$. This
KDE algorithm provides a viable means of automatically selecting optimal bandwidths with
superior density estimation performance.
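As a quick sanity check on Eq. (3.5), consider the special case in which $f$ is assumed to be a normal density with standard deviation $s$. This Gaussian assumption is introduced here purely for illustration; the data-driven selector of [95] is designed precisely to avoid it. Under that assumption the roughness term has a closed form, and Eq. (3.5) reduces to the familiar normal reference rule:

$$\|f''(x)\|^2 = \int_{\mathbb{R}} \left(f''(x)\right)^2 dx = \frac{3}{8\sqrt{\pi}\,s^5}
\quad\Longrightarrow\quad
\sigma^* = \left(\frac{8\sqrt{\pi}\,s^5}{3\cdot 2n\sqrt{\pi}}\right)^{1/5} = \left(\frac{4}{3n}\right)^{1/5} s \approx 1.06\, s\, n^{-1/5}.$$

For a window of $n = 50$ samples this gives $\sigma^* \approx 0.48\,s$, which shows how slowly the bandwidth shrinks as the sample size grows.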


Modeling Active Slot Interarrival Time Distribution

The KDE collects the data set of active slot interarrival time measured by inspection sniffers

to generate the density estimates. Since the distribution of collected data sets may vary over

time, the modeling accuracy of the KDE will be degraded if outdated
historical data are taken into account. Therefore, only the most recent data should be imported into the modeling
process. On the other hand, the modeling accuracy also largely depends on the size of
the input data sets. If we only consider the most recent observations by discarding all the
historical ones, the modeling accuracy will drop significantly. Furthermore,
the amount of input to the KDE has a great impact on its computational efficiency. Generally

speaking, KDE with a small data set runs more efficiently than that with a large data set.

Therefore, the major issue of this model is to decide how much historical data should be

incorporated for density estimation, in order to produce an accurate and efficient model.

Now we present our proposed online non-parametric density estimation protocol. The basic
idea is to use a sliding window method to perform online updating of the density estimates, and
to incorporate additional historical data sets to improve the estimation accuracy. The whole
protocol is presented in Algorithm 1, which is repeated for each channel. Whenever a new
observation arrives, the online estimation model only takes the data in a sliding window of
size $W$, i.e., the data sets exported to the KDE hold only the most recent $W$ observations.
The setting of the window size $W$ depends on the data dynamics and is thus chosen empirically. A
simple guideline would be: first, we set an initial value for the sliding window size and run
KDE; second, we move the sliding window forward to see whether the estimated distribution
changes over time; third, if the change is significant, we decrease the window size; otherwise,
we increase it, until we reach a satisfactory window size. Specifically, in the WhiteFi network
scenario, we set a relatively small $W$ of 50 data samples, since the data distribution
changes more dynamically than in a traditional wireless network.

Figure 3.5: Sliding window method (X axis denotes the interarrival time data, Y axis denotes the channels)

One of the most favorable features of the sliding window method is its support for
online learning of density estimates. As time advances, our density estimator takes the newest
set of data falling inside the sliding window to compute the latest estimate, as illustrated
in Fig. 3.5. Therefore, our model enables effective characterization of the time-evolving
active slot interarrival distribution, and allows us to update the density estimates with every
newly arrived observation.

However, the major drawback of the sliding window method is that restricting the input data
to the sliding window also degrades the accuracy of the KDE, because the
window size limits the number of observations (only $W$). Hence, we need to
improve the estimates by expanding the input data size.

As depicted in Algorithm 1, we propose to combine the data sets from multiple sliding
windows according to some well-defined criteria, in order to enlarge the sample space. How
to define such criteria for merging the sample space is crucial to the ultimate estimation
performance. At first glance, more recent windows of data sets should have higher relevance
to the current window. Therefore, one intuitive method to achieve more accurate estimation
is to combine the most recent density estimates from the latest windows to capture the data
freshness [96]. However, because of the uncertain channel availability and the underlying MAC
protocol, multiple clients may generate interleaved traffic due to alternating channel
accesses. Therefore, the most recent windows may not necessarily best reflect the underlying density
of the current window, while some earlier historical data originating from the same clients
as the current window might. Accordingly, we propose an accumulative combination
method that decides whether to merge historical data based on the statistical correlation
among the samples. As shown in Algorithm 1, we simplify the computation of statistical
correlations by employing the Kolmogorov-Smirnov test (KS test). The KS test is
a non-parametric inferential statistical method: it makes no assumption about the
distributions of the samples and is thus completely data-driven. The Kolmogorov-Smirnov statistic is
defined as follows:

Definition 1. Consider two sets of observations $Z_1$ and $Z_2$, with $n_1 = |Z_1|$ and $n_2 = |Z_2|$
samples. The Kolmogorov-Smirnov statistic is defined as:

$$D_{z_1,z_2} = \sup_x |F_1(x) - F_2(x)|,$$

where $F_1$ and $F_2$ represent the empirical cumulative distribution functions (cdfs) of the
samples in $Z_1$ and $Z_2$, respectively.

Then, given $D_{z_1,z_2}$, we accept that the two sample sets are drawn from the same distribution at a
certain significance level $\beta$ if $\sqrt{\frac{n_1 n_2}{n_1 + n_2}}\, D_{z_1,z_2} \le K_\beta$, where $K_\beta$ can be set according to a
well-defined table [97]. Note that the cdf is a byproduct of the KDE, denoted as $F(k)$ in Algorithm 1.

After the KS tests, we combine all the data sets passing the tests into one single data set, which
is provided to the KDE to update the density estimates $f(k)$ for the current slot. To trade off
performance improvement against computational overhead, we limit the number of KS tests
by preserving only the previous $n_w$ windows of data sets for each channel. Meanwhile, two
consecutive windows differ by only one data point, so it is more beneficial to test
windows at an interval of $t_w$ samples. In this way, every previous window passing the KS test
can export $t_w$ more samples into the merged data set (see line 10 of Algorithm 1).

Algorithm 1 Online non-parametric density estimation protocol
 1: Input: $W$, $n_w$, $t_w$, current sensing result $X_k$ at the $k$-th sensing slot.
 2: if $X_k \neq 0$ then
 3:     Calculate the new observed active slot interarrival time $T_{int}(k)$;
 4:     Update the current data set $Z(k) = \{T_{int}(k), \ldots, T_{int}(k - W + 1)\}$;
 5:     Update the input data set $Z_{in} = Z(k)$;
 6:     Update the current density estimate $[F(k), f(k)] = \mathrm{KDE}(Z_{in})$;
 7:     for $i \leftarrow 1$ to $n_w$ do
 8:         Perform KS test: $\mathrm{KStest}(Z(k), Z(k - i \cdot t_w))$;
 9:         if pass KS test then
10:             Update the input data set $Z_{in} = Z_{in} \cup Z(k - i \cdot t_w)$;
11:         end if
12:     end for
13:     Update the current density estimate $[F(k), f(k)] = \mathrm{KDE}(Z_{in})$;
14: else
15:     return.
16: end if

Consequently, we derive an accurate density estimate for the active slot interarrival time
distribution at each channel, by online learning of data dynamics and accumulative combination of
historical data.
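The window merging step of Algorithm 1 can be sketched in a few lines of standard-library Python. This is an illustrative sketch under stated assumptions, not the dissertation's code: the helper names (`ks_statistic`, `passes_ks`, `build_input_set`) are hypothetical, observations are indexed by arrival order, the KS statistic is computed from raw empirical cdfs rather than from the KDE byproduct $F(k)$, and the threshold `k_beta = 1.36` is the commonly tabulated value for a 0.05 level, assumed here in place of the table in [97].

```python
from bisect import bisect_right
from math import sqrt
from typing import List, Sequence


def ks_statistic(z1: Sequence[float], z2: Sequence[float]) -> float:
    """Two-sample KS statistic sup_x |F1(x) - F2(x)| over empirical cdfs."""
    a, b = sorted(z1), sorted(z2)
    # The supremum is attained at one of the observed sample points.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def passes_ks(z1: Sequence[float], z2: Sequence[float], k_beta: float = 1.36) -> bool:
    """Accept 'same distribution' if sqrt(n1*n2/(n1+n2)) * D <= K_beta."""
    n1, n2 = len(z1), len(z2)
    return sqrt(n1 * n2 / (n1 + n2)) * ks_statistic(z1, z2) <= k_beta


def build_input_set(t_int: List[float], W: int, n_w: int, t_w: int) -> List[float]:
    """Return Z_in for the newest observation (lines 4-12 of Algorithm 1)."""
    k = len(t_int)
    start = max(0, k - W)
    current = t_int[start:k]          # Z(k): the most recent W observations
    included = set(range(start, k))   # indices of samples merged into Z_in
    for i in range(1, n_w + 1):
        end = k - i * t_w             # right edge of the i-th earlier window
        if end < W:
            break                     # not enough history for a full window
        if passes_ks(current, t_int[end - W:end]):
            included.update(range(end - W, end))
    return [t_int[j] for j in sorted(included)]


# Hypothetical trace: client A, then client B, then client A again (values in ms).
pattern_a, pattern_b = [20.0, 40.0, 60.0], [200.0, 220.0, 240.0]
trace = ([pattern_a[j % 3] for j in range(50)]
         + [pattern_b[j % 3] for j in range(50, 100)]
         + [pattern_a[j % 3] for j in range(100, 150)])
z_in = build_input_set(trace, W=50, n_w=10, t_w=10)
print(len(z_in))
```

Tracking sample indices rather than values keeps overlapping windows from contributing the same observation twice, which matches the remark that each passing window contributes about $t_w$ new samples. In the example, the oldest window (same client as the current one) is merged even though the windows dominated by the other client are rejected.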

Computing Slotted Channel Access Probability

The problem we address in this section is how to estimate the SCAP based on
the predicted distribution of the active slot interarrival time. As mentioned before, $SCAP(k+1)$
represents the probability that the $(k+1)$-th slot is active. In theory, the predicted secondary user
SCAP at the $(k+1)$-th slot should be represented as $SCAP(k+1) = \Pr(X_{k+1} = 1 \mid X_1, \ldots, X_k)$,
whose computation appears intractable since it takes all the historical channel states into
consideration. However, we can take advantage of the updated active slot interarrival time
distribution to simplify the computation. We note that if the $(k+1)$-th slot is active, the
corresponding active slot interarrival time will be the time period between that slot and the
most recent active slot. Consequently, the probability $SCAP(k+1)$ that the $(k+1)$-th slot is active
can be interpreted as the probability that the active slot interarrival time equals the time
period between the $(k+1)$-th slot and the preceding active slot. If the preceding
active slot is $k$, then $SCAP(k+1) = \Pr(X_{k+1} = 1 \mid X_k = 1)$, with the active slot interarrival time
being $\Delta$. If the preceding active slot is $j < k$, then
$SCAP(k+1) = \Pr(X_{k+1} = 1 \mid X_k = 0, \ldots, X_{j+1} = 0, X_j = 1)$, with the active slot interarrival time being $(k+1-j)\cdot\Delta$. Therefore,
the predicted $SCAP(k+1)$ can be written as follows:

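$$
SCAP(k+1) =
\begin{cases}
\Pr(X_{k+1} = 1 \mid X_k = 1), & \text{if } X_k = 1, \\
\Pr(X_{k+1} = 1 \mid X_k = 0, \ldots, X_{j+1} = 0, X_j = 1), & \text{if } X_k = 0
\end{cases}
=
\begin{cases}
\int_{0}^{\Delta} f(k)\,dt, & \text{if } X_k = 1, \\
\int_{(k-j)\cdot\Delta}^{(k+1-j)\cdot\Delta} f(k)\,dt, & \text{if } X_k = 0,
\end{cases}
\tag{3.6}
$$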

where ∆ is defined as the sensing slot length. The algorithm to compute the SCAP for

each slot is given in Algorithm 2. SCAP provides an appropriate measure for quantifying

the secondary user channel access pattern, which takes into account the channel availability,

SUs’ current activity, and SUs’ traffic pattern learnt from their past activities. The major

goal of the inspection sniffers is to predict SCAP (k + 1) that guides the operation sniffers’

channel assignment strategies, which is the main focus of the following section.
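Because the density estimate is a Gaussian KDE, the integrals in Eq. (3.6) can be evaluated in closed form as differences of the KDE cdf. The sketch below illustrates this under assumptions made here: the names `kde_cdf` and `scap_next` are hypothetical, the bandwidth `sigma` is supplied by the caller rather than selected by the method of [95], and `idle_count` plays the role of IdleCount in Algorithm 2 (that is, $k + 1 - j$).

```python
from math import erf, sqrt
from typing import Sequence


def kde_cdf(t: float, samples: Sequence[float], sigma: float) -> float:
    """Cdf of the Gaussian KDE of Eq. (3.1): the mean of per-sample normal cdfs."""
    return sum(
        0.5 * (1.0 + erf((t - x) / (sigma * sqrt(2.0)))) for x in samples
    ) / len(samples)


def scap_next(samples: Sequence[float], sigma: float,
              slot_len: float, idle_count: int) -> float:
    """SCAP(k+1) per Eq. (3.6), integrating the KDE over one slot-wide interval.

    idle_count = 1 means slot k was active; each further idle slot adds one.
    """
    lo = (idle_count - 1) * slot_len
    hi = idle_count * slot_len
    return kde_cdf(hi, samples, sigma) - kde_cdf(lo, samples, sigma)


# Hypothetical interarrival samples (ms) for one channel, 20 ms slots.
samples = [20.0, 20.0, 40.0, 20.0, 60.0]
for idle_count in (1, 2, 3):
    print(idle_count, round(scap_next(samples, 5.0, 20.0, idle_count), 4))
```

Using the cdf difference avoids numerical integration, so the per-slot cost is linear in the number of samples in the merged data set.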

3.4 Near-Optimal Monitoring Mechanism

The monitoring mechanism of SpecMonitor addresses the problem of sniffer channel
assignment to maximize two different levels of QoMs, which is carried out by the sniffer center. In
particular, at the $k$-th slot, the sniffer center collects all the channel usage information gathered
by the inspection sniffers to produce a prediction set of $SCAP(k+1)$ for all the channels
simultaneously. This set of predicted SCAP is then leveraged to provide optimized channel
assignments for the forthcoming slot.

Algorithm 2 The Computation of Slotted Channel Access Probability
Input: current density estimate $f(k)$, current sensing result $X_k$, the sensing slot length $\Delta$.
Initialization: IdleCount = 1
if $X_k \neq 0$ then
    Compute $SCAP(k+1) = \int_{0}^{\Delta} f(k)\,dt$;
    Reset IdleCount = 1;
else
    Update IdleCount = IdleCount + 1;
    Compute $SCAP(k+1) = \int_{(\mathrm{IdleCount}-1)\cdot\Delta}^{\mathrm{IdleCount}\cdot\Delta} f(k)\,dt$;
end if

Although channel switching enables the sniffers to capture channel dynamics adaptively, its
negative effects should not be neglected in computing QoMs, especially in CRNs, where
channel availability varies. Channel switching produces non-negligible
overhead in terms of frame losses in practice. For example, as mentioned previously,
802.11b/g wireless cards take approximately 1-5 ms for one channel switching operation.
If one slot lasts 20 ms, up to 1/4 of the frames in the slot will be missed during the channel
switching operation, which constitutes a non-negligible fraction of frames. To be more specific,
the typical frame interarrival time in a 10 Mbps wireless network is 0.1 ms (assuming 1000-bit
frames), so a 20 ms slot carries about 200 frames. Then, during a 5 ms channel switch, we may lose 50 frames, i.e., 1/4 of the
total frames in this slot. In addition, frequent channel switching also raises energy
costs. These are the reasons why we integrate channel switching costs into the optimization
objectives. In the following, we present our formulation of the sniffer channel assignment problem
for each of the two levels of QoM.


3.4.1 Frame-level Quality-of-Monitoring Optimization

The goal of FL-QoM optimization is to maximize the number of captured frames, given a
set of channels and operation sniffers inside one monitoring area. In Section 3.2, we showed
that the active slot interarrival pattern is closely associated with the frame arrival pattern, so that
the number of captured frames during $K$ slots from a certain channel can be written as
$N_f = \sum_{k=0}^{K} (I_k \cdot n_f^{(k)})$, where $I_k$ is an indicator of whether the $k$-th slot is active, and $n_f^{(k)}$
denotes the number of frames inside the $k$-th slot. Therefore, instead of directly maximizing
the number of captured frames, we transform FL-QoM into an objective of maximizing
the number of active slots captured. For notational convenience, let us define the index sets
$i \in \mathcal{N} = \{1, \ldots, N\}$ and $s \in S_{op} = \{1, \ldots, M\}$ for indexing channels and operation sniffer
antennas, respectively. The optimization problem can be formulated as the following integer
programming (IP) problem:

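$$
\begin{aligned}
\text{maximize} \quad & \sum_{i=1}^{N} SCAP_i(k+1)\cdot y_i(k+1) - \alpha \sum_{s=1}^{M}\sum_{i=1}^{N} \frac{1}{2}\left[z_{s,i}(k+1) - z_{s,i}(k)\right]^2 && (3.7) \\
\text{subject to} \quad & \sum_{i=1}^{N} z_{s,i}(k) \le 1, \quad \forall s \in S_{op},\ \forall k && (3.8) \\
& \sum_{s=1}^{M} z_{s,i}(k) \le 1, \quad \forall i \in \mathcal{N},\ \forall k && (3.9) \\
& y_i(k) = \sum_{s=1}^{M} z_{s,i}(k), \quad \forall i \in \mathcal{N},\ \forall k && (3.10) \\
& y_i(k),\ z_{s,i}(k) \in \{0, 1\}, \quad \forall s \in S_{op},\ i \in \mathcal{N},\ \forall k. && (3.11)
\end{aligned}
$$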

Each operation sniffer antenna in the set $S_{op}$ is associated with a binary decision vector
$z_{s,i}(k) \in \{0, 1\}$, $i \in \mathcal{N}$, which is called the sniffer channel assignment indicator, with $z_{s,i}(k) = 1$
if the sniffer is assigned to channel $i$ at slot $k$, and 0 otherwise. $y_i(k+1)$ is the binary variable
indicating whether or not channel $i$ is monitored by some sniffer in the $(k+1)$-th slot. The IP
formulation is meant to be solved iteratively: at the $k$-th slot, after obtaining $z_{s,i}(k)$ and the predicted
$SCAP_i(k+1)$, we can acquire $y_i(k+1)$ and $z_{s,i}(k+1)$ by solving the IP problem. Clearly,
the sniffer channel assignment is updated once every slot, which allows our mechanism to
quickly adapt to the traffic dynamics.

Note that the objective function Eq. (3.7) comprises two parts: the positive part
represents the expected number of captured active slots, while the negative part represents
the channel switching costs. For simplicity, we use the number of channel switches between
every two consecutive slots to approximate the channel switching costs. In addition, we set a
switching cost weight $\alpha$ to represent the relative significance of channel switching costs w.r.t.
the gains obtained from captured slots, which is a constant value within [0, 1].

Here, we define α as the ratio of channel switching duration to the slot duration. In the

previous example with 5ms channel switching and 20ms slot, we have α = 1/4. However,

the definition of α can be further extended to incorporate more sophisticated metrics for

channel switching costs. For instance, we can further incorporate the probability that the

current channel will be idle in the next slot, because sniffer’s channel switching does not

incur frame loss overhead if the sniffer listens on an expected idle channel.
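To make the formulation concrete, the following sketch evaluates the objective of Eq. (3.7) and solves a tiny instance by exhaustive search. It is only an illustration for small $N$ and $M$: the function names are hypothetical, the SCAP values are made up, and brute-force enumeration is used here instead of the approximation algorithm discussed in this chapter, since the search space grows exponentially with the number of antennas.

```python
from itertools import product
from typing import Optional, Sequence, Tuple

Assignment = Tuple[Optional[int], ...]  # channel index per antenna, None = unassigned


def objective(scap: Sequence[float], prev: Assignment,
              new: Assignment, alpha: float) -> float:
    """Value of Eq. (3.7): expected captured active slots minus switching cost."""
    gain = sum(scap[c] for c in {c for c in new if c is not None})
    cost = 0.0
    for p, n in zip(prev, new):
        if p != n:
            # (1/2) * sum_i [z_{s,i}(k+1) - z_{s,i}(k)]^2: each changed entry adds 1/2.
            cost += 0.5 * ((p is not None) + (n is not None))
    return gain - alpha * cost


def best_assignment(scap: Sequence[float], prev: Assignment,
                    alpha: float) -> Tuple[Assignment, float]:
    """Enumerate all assignments satisfying (3.8)-(3.9) and keep the best one."""
    choices = [None] + list(range(len(scap)))
    best, best_val = None, float("-inf")
    for cand in product(choices, repeat=len(prev)):
        used = [c for c in cand if c is not None]
        if len(used) != len(set(used)):
            continue  # violates (3.9): at most one antenna per channel
        val = objective(scap, prev, cand, alpha)
        if val > best_val:
            best, best_val = cand, val
    return best, best_val


# Hypothetical instance: 4 channels, 2 antennas currently on channels 1 and 3.
scap = [0.9, 0.2, 0.6, 0.5]
assignment, value = best_assignment(scap, prev=(1, 3), alpha=0.25)
print(assignment, round(value, 4))
```

With $\alpha = 1/4$, moving only the first antenna to the busiest channel is preferred over moving both antennas to the two busiest channels, because the second switch would cost more than the extra SCAP it gains.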

The constraints (3.8), (3.10) arise from the facts that one sniffer antenna can monitor only
one channel at a time, and that each channel is best covered by a single sniffer antenna inside the
monitoring area. In particular, we impose the second constraint because, if we allowed
multiple antennas to listen on the same channel in the same area, their captured frames
would provide duplicate information. This IP problem can be shown to be NP-hard
following the proof in [87]; thus, we need an approximation algorithm to solve the IP
problem.


The LP rounding algorithm has been adopted to solve the IP problem [86, 87]. This algorithm
solves the LP relaxation of the IP formulation, and then rounds the fractional results into
integral solutions using, for example, the probabilistic rounding algorithm (PRA) [98].
However, this algorithm is only applicable to linear programs, while in our formulation
the objective function contains quadratic terms. We therefore reformulate the objective
function to remove the nonlinear terms. As $z_{s,i}(k)^2 = z_{s,i}(k)$ when $z_{s,i}(k) \in \{0, 1\}$, the
objective function Eq. (3.7) can be rewritten in a linear form as follows:

$$\sum_{i=1}^{N} SCAP_i(k+1)\cdot y_i(k+1) - \alpha\sum_{s=1}^{M}\sum_{i=1}^{N}\frac{1}{2}\left[z_{s,i}(k+1) + z_{s,i}(k) - 2\,z_{s,i}(k)\cdot z_{s,i}(k+1)\right]$$

Note that $z_{s,i}(k)$ is already known before the optimization problem is solved. The PRA has been proven [98] to produce a (1 − 1/e)-optimal sniffer channel assignment in linear time. However, the PRA disregards constraint (3.10) completely. Hence, the channel assignment obtained from the PRA cannot prevent multiple antennas from listening on the same channel. We call this the channel conflict problem, and we call a set of sniffer antennas assigned to the same channel a conflict sniffer set.
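The linearization of the switching-cost term can be checked by brute force over the binary values. The sketch below assumes that the quadratic term in Eq. (3.7) has the form 0.5 * (z(k+1) - z(k))^2, which is what the linear expression above expands from; the function names are illustrative only.

```python
from itertools import product


def switching_term(z_prev, z_next):
    """Quadratic switching term for one (sniffer, channel) pair (assumed form)."""
    return 0.5 * (z_next - z_prev) ** 2


def linear_switching_term(z_prev, z_next):
    """Linearized term; valid only for binary inputs, where z**2 == z."""
    return 0.5 * (z_next + z_prev - 2 * z_prev * z_next)


def linear_coefficients(z_prev):
    """With z_prev known, the term equals slope * z_next + const."""
    return 0.5 * (1 - 2 * z_prev), 0.5 * z_prev


# Exhaustive check over the four binary combinations.
for z_prev, z_next in product((0, 1), repeat=2):
    slope, const = linear_coefficients(z_prev)
    assert switching_term(z_prev, z_next) == linear_switching_term(z_prev, z_next)
    assert linear_switching_term(z_prev, z_next) == slope * z_next + const
```

Because z(k) is a known constant when slot k+1 is optimized, each term contributes only a fixed coefficient on z(k+1), which is what makes the LP relaxation applicable.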

In response, we propose a heuristic sniffer fixing strategy to address the channel conflict

problem, which takes the following steps:

(1) Find all the conflict sniffer sets in the solution obtained from the PRA algorithm;

(2) Pick one sniffer antenna in each conflict sniffer set randomly, and fix it to the conflicted

channel;

(3) Run LP rounding algorithm again to get a new solution;

(4) Test whether the new solution contains any conflict sniffer set: if yes, go to step (1); otherwise, return the solution.
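The four steps above can be sketched as a loop around a rounding routine. This is only an illustration: `toy_lp_round` is a hypothetical stand-in for the LP relaxation plus PRA (it samples channels in proportion to positive weights and, like PRA, ignores the one-sniffer-per-channel constraint), and the sketch assumes that a channel with a fixed sniffer is excluded for all other sniffers and that there are at least as many channels as sniffers.

```python
import random
from collections import defaultdict


def find_conflicts(assignment):
    """Return {channel: [sniffers]} for every channel monitored by more than one sniffer."""
    by_channel = defaultdict(list)
    for sniffer, channel in assignment.items():
        by_channel[channel].append(sniffer)
    return {c: s for c, s in by_channel.items() if len(s) > 1}


def toy_lp_round(weights, num_sniffers, fixed, rng):
    """Hypothetical stand-in for LP relaxation + PRA; may return conflicting channels."""
    taken = set(fixed.values())
    free = [c for c in range(len(weights)) if c not in taken]
    assignment = dict(fixed)
    for s in range(num_sniffers):
        if s not in fixed:
            assignment[s] = rng.choices(free, weights=[weights[c] for c in free])[0]
    return assignment


def sniffer_fixing(weights, num_sniffers, rng):
    """Repeat rounding, fixing one sniffer per conflict set, until no conflict remains."""
    fixed = {}
    assignment = toy_lp_round(weights, num_sniffers, fixed, rng)
    while True:
        conflicts = find_conflicts(assignment)            # step (1)
        if not conflicts:                                 # step (4)
            return assignment
        for channel, sniffers in conflicts.items():       # step (2)
            fixed[rng.choice(sniffers)] = channel
        assignment = toy_lp_round(weights, num_sniffers, fixed, rng)  # step (3)


result = sniffer_fixing([0.9, 0.5, 0.4, 0.2, 0.1], num_sniffers=3, rng=random.Random(0))
```

Every round that finds a conflict fixes at least one additional sniffer, so the loop in this sketch ends after at most as many rounds as there are sniffers.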

The above heuristic fixes one sniffer antenna in each conflict sniffer set to its channel in every round by adding constraints. It is therefore guaranteed to produce a feasible channel assignment for all the sniffers within a number of LP rounding runs that is linear in the number of sniffer antennas (see Section 3.4.4), and the result turns out to be near-optimal for the sniffer channel assignment problem, as shown in Section 3.4.3. All conflicts are resolved after a finite sequence of LP rounding runs, which guarantees the convergence of the algorithm.

We call the channels to be assigned potential channels. The resulting channel assignment strategy provides the sniffers with the potential channels for the next slot. The sniffer center then checks every potential channel to determine whether it is already being monitored: if so, it skips this channel; if not, it selects a sniffer antenna that is not listening on any other potential channel to monitor it. In this way, the channel switching costs are further reduced.
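The mapping from potential channels to sniffer antennas described above can be sketched as follows. The data layout (a dictionary from sniffer id to its current channel) and the function name are assumptions made for illustration.

```python
def map_potential_channels(current, potential):
    """Assign potential channels to sniffers while switching as few sniffers as possible.

    current:   dict sniffer -> channel monitored in the current slot
    potential: iterable of channels selected for the next slot
    """
    potential = list(potential)
    wanted = set(potential)
    next_assignment = {}
    covered = set()
    # Sniffers already listening on a potential channel stay where they are.
    for sniffer, channel in current.items():
        if channel in wanted and channel not in covered:
            next_assignment[sniffer] = channel
            covered.add(channel)
    # Remaining potential channels go to sniffers that are not on any potential channel.
    idle = [s for s in current if s not in next_assignment]
    for channel in potential:
        if channel not in covered and idle:
            next_assignment[idle.pop(0)] = channel
            covered.add(channel)
    return next_assignment


# Sniffers 1 and 2 keep channels 7 and 5; only sniffer 0 has to switch.
example = map_potential_channels({0: 3, 1: 7, 2: 5}, [7, 2, 5])
```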

3.4.2 User-level Quality-of-Monitoring Optimization

The objective of UL-QoM optimization is to maximize the expected number of active users monitored. To capture user-level information, it is indispensable to identify the source of each frame, even for encrypted frames. Let $U_i(k)$ for i ∈ N denote the number of active users operating in channel i at the k-th slot. We assume that once a sniffer is tuned to a channel, it covers all the active users operating in this channel. We do not consider the channel switching costs in this case: a single user typically sends multiple frames, so the loss of a small number of frames due to channel switching has little impact on the number of users measured. The UL-QoM optimization problem can be cast as the following IP problem:

\[
\text{maximize} \quad \sum_{i=1}^{N} U_i(k) \cdot SCAP_i(k+1) \cdot y_i(k+1) \qquad (3.12)
\]
\[
\text{subject to} \quad (3.8)-(3.11).
\]

The above optimization problem can be solved using exactly the same approximation algorithm described in the previous section, so the details are omitted here. Note that $U_i(k)$ can be measured by counting the number of distinct MAC addresses in frames passing through the AP running in channel i within time slot k. In practice, $U_i(k)$ may not be available at the beginning of the k-th slot, so it can be approximated by the measurement of $U_i(k-1)$, assuming users keep operating in the same channel in the next time slot. Small errors in estimating $U_i(k)$ do not affect the performance much. In the extreme case where attackers insert false MAC addresses, a more sophisticated approach is required. For instance, machine learning methods for Internet traffic classification [99] can be used to differentiate users based on their identified traffic types. This is beyond the scope of this chapter.
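Counting distinct source MAC addresses per channel and slot, and reusing the previous slot's count as the estimate, can be sketched as below. The frame tuple layout `(timestamp_ms, channel, source_mac)` and the function names are assumptions for illustration; the 20 ms slot length follows Table 3.2.

```python
from collections import defaultdict


def estimate_active_users(frames, slot_length_ms=20.0):
    """Return {(slot, channel): number of distinct source MAC addresses}."""
    seen = defaultdict(set)
    for timestamp_ms, channel, source_mac in frames:
        slot = int(timestamp_ms // slot_length_ms)
        seen[(slot, channel)].add(source_mac.lower())  # MAC text is case-insensitive
    return {key: len(macs) for key, macs in seen.items()}


def users_for_slot(counts, channel, k):
    """Approximate the user count of slot k by the measurement from slot k - 1."""
    return counts.get((k - 1, channel), 0)


frames = [
    (1.0, 1, "aa:bb:cc:00:00:01"),
    (5.0, 1, "AA:BB:CC:00:00:01"),   # same station, different letter case
    (12.0, 1, "aa:bb:cc:00:00:02"),
    (25.0, 1, "aa:bb:cc:00:00:01"),
    (3.0, 6, "aa:bb:cc:00:00:03"),
]
counts = estimate_active_users(frames)
```

As the text notes, this estimate is only as trustworthy as the MAC addresses themselves; spoofed addresses would inflate the count.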

3.4.3 Numerical Analysis

In this section, we present numerical results for our approximation algorithm and compare

them to the upper bound of the problem. Without loss of generality, we focus on the FL-QoM

optimization problem.

Deriving an Upper Bound: The complexity of the optimization problem formulated in Section 3.4.1 stems from the binary variables $y_i(k)$ and $z_{s,i}(k)$, for all k. To derive an upper bound for the problem, we relax the integer (binary) requirement on $y_i(k)$ and $z_{s,i}(k)$ to $0 \le y_i(k) \le 1$ and $0 \le z_{s,i}(k) \le 1$. The relaxed problem is a standard LP problem, whose solution can be obtained in polynomial time. Since the relaxation enlarges the feasible region, the solution to the relaxed LP problem yields an upper bound for the original optimization problem.

Figure 3.6: Normalized objective (with respect to the computed upper bound) for 20 channels and 10 sniffers with α=0.3

Figure 3.7: Normalized objective (with respect to the computed upper bound) for 30 channels and 20 sniffers with α=0.3

Numerical Results: We consider N = 20 or 30 channels and M = 10 or 20 sniffer antennas. SCAP values are randomly generated for every channel over 1000 slots. We first present the simulation results for 20 channels and 10 sniffer antennas. We use the PRA approximation and sniffer fixing algorithms to determine a feasible solution, which serves as a lower bound, and compare the corresponding objective value with the upper bound. Fig. 3.6 shows the normalized objective values with respect to the computed upper bound (i.e., feasible solution/upper bound) for 20 channels and 10 sniffer antennas. The average normalized objective value over the 1000 slots is 0.95, with a standard deviation of 0.03. Fig. 3.7 shows the normalized objective for 30 channels and 20 sniffer antennas, with an average of 0.96 and a standard deviation of 0.01. We further vary the switching cost weight α to examine the variations of the normalized objective values. Fig. 3.8 shows that the gap between the achieved solution and the upper bound remains small for different α.

Figure 3.8: Normalized objectives for different α

Since the actual optimal value lies between the feasible solution value and the upper bound, the solution value of our approximation algorithm must be even closer to the optimal value than the foregoing normalized ratio (normalized objective value) indicates, which confirms the near-optimality of the algorithm. Finally, we run an experiment to compare the average number of captured active slots using our algorithm with the upper bound; the result is shown in Fig. 3.9. The difference between the monitoring solution and the upper bound stays small over time, which supports the near-optimality of our approximation algorithm from the experimental perspective.
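The bracketing argument can be written out explicitly. As a sketch, let $f$ denote the objective value of the feasible solution, $f^{*}$ the (unknown) optimum and $f_{UB}$ the LP upper bound in a given slot, and assume all three are positive (an assumption that the text does not state):

```latex
\[
f \le f^{*} \le f_{UB}
\quad\Longrightarrow\quad
\frac{f}{f^{*}} \;\ge\; \frac{f}{f_{UB}} .
\]
```

Under that assumption, a normalized objective of 0.95 in a slot means that the feasible solution attains at least 95% of the optimal objective value in that slot.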

Figure 3.9: Number of captured active slots using our algorithm vs. upper bound for 20 channels and 10 sniffers with α=0.3

3.4.4 Complexity Analysis

In this section, we analyze the complexity of the above approximation algorithms. We first analyze the complexity of the LP rounding algorithm, which involves two major steps: (1) solving the LP relaxation, and (2) executing the PRA. The above two IP problems contain (N + MN) unknown variables. Therefore, the complexity of solving the LP relaxation of the IP formulation is $O((N + MN)^3 / \log(N + MN))$ [86], which is determined by the complexity of the LP solver. The PRA, on the other hand, has linear complexity $O(MN)$, governed by the input vector size MN [98]. Thus, the LP rounding algorithm runs with polynomial time complexity $O((N + MN)^3 / \log(N + MN))$.

Second, the heuristic sniffer fixing strategy solves the channel conflict problem by running a series of LP rounding algorithms. In the worst case, it invokes LP rounding M times. Hence, the sniffer channel assignment problem can be solved with an overall worst-case complexity of $O(M (N + MN)^3 / \log(N + MN))$. The efficiency evaluation of the algorithm implementation is presented in Section 3.5.2.
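To give a feel for the problem sizes behind these bounds, the short sketch below counts the quantities that enter the analysis for the two configurations used in Section 3.4.3. The function name and dictionary keys are illustrative only.

```python
def problem_size(num_channels, num_sniffers):
    """Sizes that drive the complexity bounds, for N channels and M sniffer antennas."""
    return {
        # One y_i per channel plus one z_{s,i} per (sniffer, channel) pair.
        "variables": num_channels + num_sniffers * num_channels,
        # The PRA works on the M * N vector of fractional z values.
        "pra_input": num_sniffers * num_channels,
        # Worst case: the sniffer fixing strategy invokes LP rounding M times.
        "max_lp_rounding_runs": num_sniffers,
    }


small = problem_size(num_channels=20, num_sniffers=10)
large = problem_size(num_channels=30, num_sniffers=20)
```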


3.5 Evaluation

In this section, we conduct extensive simulations and experiments to evaluate the performance of SpecMonitor for CRNs. The simulations use synthetic traces, which allow us to vary the number of channels and sniffers, as well as the traffic patterns of different users. We also carry out experiments and test the performance of SpecMonitor on the real traces collected from them. Besides the proposed SpecMonitor framework, we implement two baseline algorithms and one previously proposed algorithm, listed as follows, for performance comparison.

• Random channel assignment: the sniffer channels are randomly assigned.

• Greedy channel assignment: the sniffers are always assigned to the predicted busiest channels based on SCAP at every sensing slot, i.e., the channels with the largest SCAP.

• Support Vector Regression (SVR) channel assignment: the sniffers are assigned to the channels in which the next frame is predicted to arrive within a short period, based on frame interarrival time prediction using the SVR method [89].
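The two baselines can be sketched in a few lines. The sketch assumes that the random baseline gives each sniffer a distinct channel, which the description above does not specify; the SVR baseline is omitted because it depends on the predictor of [89].

```python
import random


def random_assignment(num_channels, num_sniffers, rng):
    """Random baseline: distinct channels drawn uniformly at random (assumed distinct)."""
    return rng.sample(range(num_channels), num_sniffers)


def greedy_assignment(scap, num_sniffers):
    """Greedy baseline: the channels with the largest predicted SCAP values."""
    ranked = sorted(range(len(scap)), key=lambda i: scap[i], reverse=True)
    return ranked[:num_sniffers]


scap = [0.2, 0.9, 0.5, 0.7, 0.1]          # hypothetical SCAP values for 5 channels
greedy = greedy_assignment(scap, 3)
rand = random_assignment(len(scap), 3, random.Random(0))
```

Neither baseline accounts for the channel switching cost, which is the main difference from the optimization in Section 3.4.1.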

We assume the PUs’ presence can be detected promptly by both inspection and operation sniffers, as illustrated in Section 3.3.1. In the following sections, we first evaluate the performance of the proposed secondary user channel access model, and then evaluate the real-time monitoring performance of SpecMonitor. Finally, we examine the frame capturing performance and user capturing performance. The default system parameters used in the evaluation are shown in Table 3.2.

Table 3.2: Parameters

Parameter                                         Value
W                                                 50
n_w                                               10
t_w                                               25
Sensing slot length ∆                             20 ms
Sensing period                                    2 ms
Multi-slot updating                               5 slots
Gaussian mean values of synthetic traces          [3, 42]
Gaussian standard deviation of synthetic traces   2

3.5.1 Performance of Secondary User Channel Access Model

The proposed secondary user channel access model uses online non-parametric density estimation to obtain an accurate estimate of the distribution of the active slot interarrival time. As mentioned in Section 3.3.2, our channel model uses a sliding window technique to track the dynamic traffic patterns of SUs. We generate a data set with a changing data distribution (the pdf of the data set changes from $e^{-x}$ to $\frac{1}{5}e^{-x/5}$). The tracking performance is shown in Fig. 3.10(a) with W=50, n_w=10, t_w=25 (unless otherwise noted, these parameters are set as default), from which we can see that the estimated distribution gradually changes from the original data distribution to the current data distribution. The transition begins when the current window takes data from both distributions and completes when the whole current window contains only data from the new distribution. The tracking speed is determined by the window size, i.e., a larger window size causes more delay during the transition. We can select the window size according to the variability of traffic patterns, as mentioned in Section 3.3.2.
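The tracking behaviour of a sliding window can be illustrated with a small kernel density estimator. This is a simplified sketch, not the estimator used in the experiments: it uses a Gaussian kernel with Silverman's rule-of-thumb bandwidth in place of the fixed point iteration of [95], and it takes the window to hold 50 samples on the assumption that W denotes the window size. The class and variable names are illustrative.

```python
import math
import random
import statistics
from collections import deque


class SlidingWindowKDE:
    """Gaussian KDE over the most recent `window_size` samples."""

    def __init__(self, window_size=50):
        self.window = deque(maxlen=window_size)  # old samples fall out automatically

    def update(self, sample):
        self.window.append(sample)

    def bandwidth(self):
        # Silverman's rule of thumb (assumption; needs at least two samples).
        n = len(self.window)
        return 1.06 * statistics.stdev(self.window) * n ** (-1 / 5)

    def pdf(self, x):
        h = self.bandwidth()
        n = len(self.window)
        total = sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in self.window)
        return total / (n * h * math.sqrt(2 * math.pi))


rng = random.Random(1)
kde = SlidingWindowKDE(window_size=50)
old = [rng.expovariate(1.0) for _ in range(200)]        # samples with pdf e^{-x}
new = [rng.expovariate(1.0 / 5.0) for _ in range(200)]  # samples with pdf (1/5) e^{-x/5}
for sample in old + new:
    kde.update(sample)
```

Once 50 samples of the new distribution have arrived, the window no longer contains any sample from the old one, which corresponds to the completed transition described above.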

However, too small a window size degrades the modeling accuracy, which motivates the combination of historical samples. Recall that only the data sets passing the KS tests can be combined to improve the estimation accuracy. Fig. 3.10(b) compares the density estimates obtained with and without data combining. We can see the significant improvement brought by the accumulative data combination.

Figure 3.10: Performance of secondary user channel access model. (a) The tracking performance when the traffic pattern is changing (x-axis: Data Value; y-axis: Probability Density; curves: Estimated Density with Original Data, Estimated Density during the Change, Estimated Density After the Change). (b) Performance improvement with data combination (x-axis: Data Value; y-axis: Probability Density; curves: Real Density of the Data Set, Estimated Density with Uncombined Data, Estimated Density with Combined Data).
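The idea of combining only those historical data sets that pass a KS test can be sketched with a two-sample Kolmogorov-Smirnov statistic. The acceptance rule below (asymptotic critical value at the 5% level) and the function names are assumptions for illustration; the exact test configuration is the one defined in Section 3.3.2.

```python
import math


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def passes_ks_test(a, b, c_alpha=1.358):
    """Accept 'same distribution' at the 5% level (asymptotic critical value)."""
    n, m = len(a), len(b)
    return ks_statistic(a, b) <= c_alpha * math.sqrt((n + m) / (n * m))


def combine_history(current, history):
    """Append every historical window that passes the KS test against the current one."""
    combined = list(current)
    for past in history:
        if passes_ks_test(current, past):
            combined.extend(past)
    return combined


current = [0.1 * i for i in range(1, 51)]
similar = [0.1 * i + 0.05 for i in range(1, 51)]    # nearly the same sample values
different = [10 + 0.1 * i for i in range(1, 51)]    # clearly shifted values
combined = combine_history(current, [similar, different])
```

In this toy input only the first historical window is merged, so the combined set holds 100 samples instead of 50.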

3.5.2 Real-Time Monitoring Performance

As illustrated in Sections 3.3 and 3.4, our monitoring framework has a very stringent real-time requirement: the channel assignments must be completed before a slot ends, i.e., within 20 ms in our setting. In this section, we measure the running time of SpecMonitor experimentally and propose to relax the stringent requirement without compromising the monitoring performance. We implement the SpecMonitor framework, including the channel access model and the near-optimal channel assignment algorithm, using MATLAB R2011b on a Windows machine with a 3.2 GHz Intel Xeon W3565 CPU and 18 GB of memory. In our original design, whenever there is a new observation of the active slot interarrival time, SCAP values are updated and channels are reassigned.


We carry out experiments to measure the running time of the monitoring framework and break it down into its components to identify the bottleneck, as shown in Table 3.3. The recorded running time is the average over 1000 slots. The overall running time is 91.2 ms, with 97.5% of the time spent on the KDE operation and the optimization algorithm. Looking into these two components, we find that the bottleneck of the KDE operation is the fixed point iteration algorithm [95], which consumes 95% of the time of each KDE invocation, while the bottleneck of the optimization algorithm lies in the linprog MATLAB function (52%) and the LP rounding algorithm with the sniffer fixing strategy (40%). In our experiment, the sniffer fixing strategy runs the LP rounding algorithm twice on average before a valid channel assignment is generated. Consequently, the overall running time far exceeds a slot duration. One way to address this issue is to port the code to a more computationally efficient programming language such as C.

An alternative and more viable approach is to relax the stringent real-time requirement. Instead of updating SCAP and assigning channels every slot, we relax the per-slot updating requirement into a T-slot updating requirement, which allows SCAP to be renewed every T slots. To satisfy the relaxed requirement, the sniffer center only needs to check for new observations and incorporate them into the channel usage model every T slots. In other words, SCAP is updated and channels are reassigned only if there is at least one new observation during the T slots. In our implementation, we set T = 5, so that the model update and channel assignment complete before the next channel assignment is due, i.e., within 100 ms or 5 slots (as 91.2 ms < 100 ms). We show in the following sections that 5-slot updating sacrifices little monitoring performance compared with per-slot updating; it thus satisfies the real-time processing requirement while retaining excellent performance. Note that we neglect the wired-side communication costs between the inspection sniffers and the sniffer center, which are on the order of microseconds.
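The choice of T follows directly from the measured running time and the slot length. A minimal sketch of that arithmetic and of the relaxed update rule, with illustrative function names:

```python
import math


def min_update_period(runtime_ms, slot_ms):
    """Smallest number of slots T in which one model update plus assignment fits."""
    return math.ceil(runtime_ms / slot_ms)


def should_update(slot_index, period, has_new_observation):
    """Update only every `period` slots, and only if something new was observed."""
    return slot_index % period == 0 and has_new_observation


# 91.2 ms of processing against 20 ms slots.
T = min_update_period(91.2, 20.0)
```

With these numbers the smallest admissible period is five slots, matching the setting used in the evaluation.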

Table 3.3: Average running time with 20 channels and 10 sniffer antennas (in ms)

KDE Operation   KS Test Operation   SCAP Computation   Optimization Algorithm   Total Time
58.1            0.6                 0.6                30.8                     91.2

3.5.3 Frame Capturing Performance

In this section, we evaluate the frame capturing performance of different channel assignment algorithms. First, we generate synthetic traces to evaluate the frame capture rate and channel switching cost. Frame capture rate is defined as the ratio of the number of captured frames to the overall number of frames passing through all the channels up to the current time, while channel switching cost is represented by the negative part of Eq. (3.7). Then, we collect real-world traces using AirPcap Nx [100] with Wireshark. The traffic traces are captured from multiple channels of operating WLANs (802.11g mode) to emulate the scenarios in WhiteFi networks. We evaluate SpecMonitor by comparing its frame capturing performance with that of the other algorithms. For all the following evaluations, we measure the performance of the algorithms running through 1000 slots for 100 rounds.

Synthetic Traces

First, we generate synthetic time series traces to represent frame interarrival times, using a Gaussian distribution with an exponential correlation function. Each trace corresponds to the traffic generated in one channel, with different mean values simulating different traffic loads. We evaluate the capturing performance w.r.t. different switching cost weights α. Generally speaking, one channel switch incurs a penalty of losing α slots, α ∈ [0, 1].
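One simple way to obtain a Gaussian series with an exponentially decaying correlation function is a first-order autoregressive process. The sketch below is an assumption about how such traces could be generated, not the generator used in the experiments; the correlation length is a hypothetical parameter, and negative values are clipped to zero because interarrival times cannot be negative. The mean and standard deviation are taken from the ranges in Table 3.2.

```python
import math
import random


def gaussian_exp_corr_trace(n, mean, std, corr_length, rng):
    """Gaussian series whose autocorrelation at lag k is exp(-k / corr_length)."""
    rho = math.exp(-1.0 / corr_length)
    x = rng.gauss(0.0, 1.0)                  # start in the stationary distribution
    trace = []
    for _ in range(n):
        trace.append(max(0.0, mean + std * x))  # clip: interarrival times are non-negative
        # AR(1) step that keeps unit variance.
        x = rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    return trace


trace = gaussian_exp_corr_trace(1000, mean=42.0, std=2.0, corr_length=5.0,
                                rng=random.Random(7))
```

For small mean values the clipping distorts the Gaussian shape slightly, so a truncated distribution may be a better model there.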

We assume the training process of the SVR scheme has already been completed, which takes about 35 training samples [89]. Fig. 3.11(a) shows the frame capture rates of the different methods. As α increases, the frame capture rates of all three compared schemes fall steadily because of the increasing penalty for channel switching. With a large α, however, SpecMonitor forces the sniffers to switch channels only when the reward from channel switching is higher than the penalty; otherwise, it keeps the sniffers in their current channels. In this way, SpecMonitor retains excellent frame capture performance. The SVR scheme performs best when the channel switching costs are negligible (α = 0 or 0.1), which means SVR achieves accurate estimates of the frame interarrival time. However, when α grows larger than 0.2, the frame capture rate of the SVR scheme drops steadily because of the growing switching penalty caused by frame loss. Note that the capturing performance of SpecMonitor also drops slightly due to higher switching costs, but then recovers and surpasses that of all the other schemes.

Figure 3.11: Performance with different methods using synthetic time series data (5 channels, 3 sniffer antennas). (a) Frame Capture Rates (x-axis: Alpha; y-axis: Frame Capture Rate; curves: SpecMonitor, SVR, Random, Greedy). (b) Channel Switching Costs (x-axis: Alpha; y-axis: Channel Switching Cost).

Fig. 3.11(b) shows the channel switching costs w.r.t. α, from which we can see that the SVR method incurs the highest switching costs among all methods because of its unslotted and heuristic switching strategy. In contrast, the switching cost of SpecMonitor remains the lowest.

Finally, Fig. 3.12(a) shows the capturing capabilities w.r.t. the number of sniffer antennas. The frame capturing performance of all the methods keeps growing with the number of antennas, and SpecMonitor achieves the highest frame capture rate. We also compare channel switching costs in Fig. 3.12(b). With more sniffer antennas, the channel switching costs of the SVR method decrease significantly, because the increased traffic capturing capability restrains the SVR method from aggressive channel switching. Meanwhile, the other methods have much lower and more stable channel switching costs.

Figure 3.12: Performance with varied number of sniffer antennas using different methods (α=0.25, 20 channels). (a) x-axis: Number of Sniffers; y-axis: Frame Capture Rate; curves: Our Scheme, SVR Scheme, Random Scheme, Greedy Scheme. (b) x-axis: Number of Sniffers; y-axis: Channel Switching Cost.

Real Traces

We collect the real traces from an 802.11g WLAN, captured by a sniffer listening on the channel established by one AP and client pair running various applications. The captured traces include both the uplink traffic to the AP and the downlink traffic from the AP. We consider five different types of trace data (FTP, BT, Web Browsing, Skype Voice and Skype Video) with one trace per channel. FTP and BT traces are obtained by running an automated script on the client to download/upload several files from/to a server continuously, and we write another automated script to browse several websites to collect the Web Browsing trace. The Skype voice and video traces are collected by connecting to a client using a Skype voice call or Skype video call. We evaluate the performance with seven channels of real-world traffic, where the additional two channels contain mixed traffic patterns: one carries the combined traffic of two clients using Skype Voice and BT, and the other carries the traffic of two clients using Skype Voice and Web Browsing. The performance is shown in Fig. 3.13(a) and Fig. 3.13(b), from which we can see that the SVR method performs even worse than the random scheme. The reason is that the SVR method takes a long time to retrain when the predicted value deviates significantly from the genuine one because of real-world traffic dynamics. Frequent retraining and channel switching significantly degrade the capturing capability of the SVR method. SpecMonitor, however, retains the best performance, except for small α, where the greedy method performs better because channel switching incurs only a small penalty. This comparison also indicates that our model can accurately capture the traffic statistics regardless of whether the traffic is interleaved or not.

Figure 3.13: Performance using real-world traffic (7 channels, 4 sniffer antennas). (a) x-axis: Alpha; y-axis: Frame Capture Rate. (b) x-axis: Alpha; y-axis: Channel Switching Cost.

Figure 3.14: Average frame capture rate comparison of multi-slot updating and per-slot updating with 20 channels and 10 sniffer antennas (x-axis: Alpha; y-axis: Frame Capture Rate; curves: Per-slot updating, 5-slot updating, 10-slot updating)

Comparison of multi-slot updating and per-slot updating

As mentioned in Section 3.5.2, multi-slot updating relaxes the real-time requirement. In this section, we compare the performance of multi-slot updating with that of per-slot updating. Fig. 3.14 presents the frame capturing performance of both, which shows a slight performance degradation with multi-slot updating. Interestingly, 5-slot updating achieves better frame capture performance when α = 0.7, because it incurs lower switching costs by switching at most once every 5 slots. In most cases, however, per-slot updating captures more frames, owing to its faster adaptation to the traffic dynamics. From this comparison, we conclude that 5-slot updating retains excellent frame capturing performance while fulfilling the real-time requirements presented in Section 3.5.2.

The intuition behind the fact that T-slot updating achieves similar results is that the data distribution in our experiments does not change rapidly over the course of T slots. However, this does not hold for all traffic scenarios. For example, when the traffic statistics change rapidly, T should be assigned a small value. The exact value of T can be picked using the above performance comparison method, by evaluating the performance of various T values and selecting a larger one with acceptable performance degradation.

3.5.4 User Capturing Performance

Finally, we evaluate the performance for maximizing UL-QoM using synthetic data. We

assume different channels contain different numbers of SUs, and the numbers are dynamically

changing within range [0, 10] (assume uniform distribution); also the frame interarrival time is

exponentially distributed with mean values residing in [1, 40], specifying the traffic pattern.

First, we compare the expected number of captured users per slot using three different

monitoring schemes in Fig. 3.15(a). The result indicates that SpecMonitor is able to capture

more users per slot, because the optimized monitoring strategy keeps the sniffers watching

the channels with more users.

Then, we define the Active User Capture Rate as the ratio of the number of active users captured to the overall number of active users that appeared in all the channels. The active user capture rate w.r.t. the number of sniffers is shown in Fig. 3.15(b), from which we notice that SpecMonitor can select the best sets of channels to maximize the number of active users captured during the monitoring period. The result implies that SpecMonitor significantly outperforms the two baseline schemes in terms of user capturing performance. In future work, we plan to use real-world traffic to examine the user capturing performance of SpecMonitor coupled with reliable user identification schemes such as radiometric signatures [101].

Figure 3.15: User capture performance with 20 channels. (a) User-Level QoM Objective (y-axis: Expected Number of Captured Users per Slot; schemes: SpecMonitor, Greedy, Random; bars: 5 Sniffers, 10 Sniffers, 15 Sniffers). (b) Active User Capture Rate (x-axis: Number of Sniffers; y-axis: Active User Capture Rate; curves: SpecMonitor, Random, Greedy).

3.6 Summary

In this chapter, we have introduced a systematic passive monitoring framework, SpecMonitor,

for Wi-Fi like CRNs to maximize two levels of QoMs incorporating switching costs. Both

the primary user and secondary user channel usage patterns are considered to optimize the

monitoring strategy. Specifically, we proposed an online non-parametric density estimation

scheme to learn and predict the time-evolving mixed traffic pattern from SUs. Based on

the predicted traffic pattern, the optimization problems of sniffer channel assignment are

formulated, for which we designed near-optimal monitoring algorithms. Our simulation and

experimental results both showed that SpecMonitor has superior capturing capability with

low channel switching overhead. One major limitation of SpecMonitor is that it requires a substantial amount of traffic of interest on a channel in order to produce a reasonable channel access model. If the amount of traffic over a channel is small, the produced model may be unreliable. In future research, we will consider the impact of the traffic amount on the channel access model. We plan to evaluate the modeling accuracy in real time: a model built from a small amount of traffic will be deemed unreliable and will not be used for future predictions and channel assignments. In addition, we will explore the capability of the proposed framework in identifying malicious or misbehaving network activities. We also plan to design customized quality-of-monitoring objectives for different security services, such as optimizing attack signature generation or facilitating link loss analysis.

Chapter 4

MIMO-based Jamming Resilient OFDM Communications in Wireless Networks

Jamming is a common but serious threat to wireless communication. In particular, reactive jamming, which has been implemented using software-defined radios, is considered the most powerful jamming attack, as it maximizes the attack efficiency while minimizing the risk of being detected. Currently, no effective anti-jamming solutions exist to enable reliable OFDM wireless communications in the presence of reactive jamming attacks. On the other hand, MIMO has emerged as a technology of great research interest in recent years, mostly due to its capacity gain. In this chapter, we explore the use of MIMO technology for jamming resilient OFDM communication, especially its capability to communicate in the presence of a powerful reactive jammer. We first investigate the jamming strategies and their impacts on MIMO-OFDM receivers. We then present a MIMO-based anti-jamming scheme that exploits the interference cancellation and transmit precoding capabilities of MIMO technology to turn a jammed non-connectivity scenario into an operational network. Our testbed evaluation shows the destructive power of reactive jamming attacks, and also validates the efficacy and efficiency of our defense mechanisms in the presence of numerous types of reactive jammers.

4.1 Related Work

Jamming Attack and Defense Mechanisms. The mainstream jamming defense mechanisms rely on FHSS and DSSS, either requiring the communicating parties to pre-share secret keys [102, 103] or letting them communicate without pre-shared keys [104, 105]. Recently, powerful reactive jamming has attracted considerable research interest. For instance, [34] demonstrates the feasibility of reactive jamming using software-defined radios. [25] proposes a detection mechanism to unveil reactive jammers in sensor networks. [106] investigates the impacts of reactive smart jamming attacks on IEEE 802.11 rate adaptation algorithms. Recent studies consider defending against more powerful wideband and high-power jamming attacks [107, 108]. However, both of them support only low data rate communications, and both work only for conventional wireless communications that are not OFDM-based. In [109], Vo-Huu et al. propose a mechanical beamforming scheme and a digital interference cancellation algorithm to cancel jamming signals. However, they can only deal with static adversaries and incur additional hardware costs, while our mechanism is purely digital and can deal with mobile attackers as long as the channel estimation is accurate. Further, they focus only on non-OFDM systems.

In the context of jamming resistant MIMO/OFDM communications, Rob Miller et al. [110] study various jamming attacks that disrupt MIMO communication by targeting its channel estimation procedure. Specifically, the adversary interferes with the preambles or pilots to make the sender and receiver perform false estimation. In a similar vein, some recent works investigate and attempt to alleviate the impacts of jamming attacks on OFDM systems. Han et al. [111] propose a jammed pilot detection and excision algorithm for OFDM systems to counteract a narrow-band jammer that jams the pilot tones. Clancy et al. [112] further introduce a pilot nulling attack, which minimizes the received pilot energy to be more destructive, and provide mitigation schemes that randomize the location and value of pilot tones. However, both focus specifically on adversaries that jam pilot tones, which requires knowledge of the pilot locations and very tight synchronization. Moreover, their defense mechanisms fail to recover signals when all the OFDM subcarriers, including the pilots, are jammed, as in the case of a reactive jamming attack. Note that it is extremely difficult for the adversary to synchronize his/her transmission with the legitimate sender during the short channel sounding period, whereas this chapter focuses on a more practical reactive jamming attack.

Interference Cancellation Mechanisms. Research efforts in the interference management area have developed novel interference cancellation techniques to improve the network throughput [36], medium access protocol [38] and robustness [37] of MIMO networks. [36] proposes a centralized solution that combines interference cancellation and alignment for decoding concurrent transmissions in MIMO networks, doubling the throughput of MIMO LANs. Lin et al. [38] extend the previous work by presenting a distributed random access protocol. Shen et al. [113] further develop a rate adaptation scheme through learning clients’ signal directions. However, all the above works consider interference caused by concurrent transmissions from legitimate senders in the same network. The most relevant work is [37], which enables MIMO communications under high-power cross-technology interferers.

Yet, our work differs in several significant ways: 1) we consider smart jammers, who can adapt their attack strategy to be more destructive, while interferers are unintentional; 2) their channel estimation methods require averaging over multiple OFDM symbols, which is not applicable for tracking the jammer’s channel due to the jammer’s fast adaptation, while our mechanism inserts pilots into known locations to jointly track the sender’s and jammer’s channels instantaneously.

Figure 4.1: Reactive jammer starts jamming after certain reaction time

4.2 Problem Formulation

In this section, we present the system model, define the attack model and lay out preliminary

knowledge of MIMO-OFDM communication.

4.2.1 System Model

We consider an adverse wireless environment with a jammer targeting the communication link established by a sender and a receiver. We assume that the jammer is a common single-antenna device that can adopt whichever attack strategy is most destructive.

The frames in OFDM wireless communications have signal structures as shown in Fig. 4.1.

A preamble is transmitted ahead of the data, which is used for signal acquisition, time

synchronization and initial channel estimation. We assume the sender transmits when the

jammer is not jamming, either by taking a random backoff between transmissions or by

sensing jamming activity [108]. We assume every sender and the intended receiver share a

secret key that is unknown to the jammer.

At receiver R, let $P_{SR}$ and $P_{JR}$ be the received signal powers from S and J, respectively. The signal-to-jamming ratio (SJR) at receiver R can be expressed as $P_{SR}/P_{JR}$, which determines the decoding performance. We do not consider the noise and interference, since they are negligible when compared to the jamming power.

4.2.2 Attack Model

There are three typical jamming attack models: 1) constant jammer continuously transmits

jamming signals to corrupt packet transmission. He/She has the capability of covering the

whole frame structure, whereas his/her energy consumption is extremely high, rendering

himself/herself easily discoverable; 2) random jammer is more energy-efficient, as he/she

emits jamming signals at random time for a random duration. However, his/her jamming

capability is limited due to the randomized jamming behavior; 3) reactive jammer is more

effective, energy-efficient and stealthier [25], which is the main focus of this chapter.

The key feature of reactive jammer is sensing-before-jamming. The jamming reaction time

denotes the time difference between the arrival of the original signal and the jamming signal

at the receiver. It takes a reactive jammer a minimum reaction time to perform channel

sensing and jamming initialization before sending out jamming signals, during which the

preamble of the frame could be transmitted without being jammed [34, 108], as shown in

Fig. 4.1.

In our experiment, a preamble takes only one OFDM symbol, which lasts 128µs with 1MHz

bandwidth. On the other hand, the jammer, who is agnostic to the implementation details

of the network (e.g., the transmission protocol and preamble symbols), can only carry out

energy detection [114], which requires more than 1ms to detect the signal for a 0.6 detection

probability and −110dBm signal strength, when implemented in a fully parallel pipelined

FPGA [92]. Even the advanced software radio based reactive jammer, who is aware of the

implementation details of the network, still incurs a considerable reaction delay including

software and hardware delays to process the incoming signal and to make a jamming decision,

during which the preamble of a frame is successfully delivered to the receiver without being

disturbed [34,35,115].

In addition, the jammer can transmit arbitrary signals with/without any signal structures.

The jammer is also capable of jamming the whole spectrum, invalidating the traditional

spread spectrum anti-jamming methods [107,108]. However, we assume the jammer cannot perform full-duplex communications, which prevents the jammer from sensing and jamming simultaneously.

4.2.3 MIMO Interference Cancellation and OFDM Basics

In a MIMO network, the spatial multiplexing gain can be represented by a concept called

Degrees-of-Freedom (DoF), which is defined as the dimension of received signal space over

which concurrent communications can take place [116]. DoF indicates the number of con-

currently transmitted streams that can be reliably distinguished at a MIMO receiver.

Consider a 1 × 2 MIMO communication between sender S and receiver R, as shown in Fig. 4.2. The signals $x_s$ and $x_j$ from the sender and jammer, respectively, are transmitted concurrently through the channel H, and the received signals can be written as:

$$\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} h_s \\ h'_s \end{pmatrix} x_s + \begin{pmatrix} h_j \\ h'_j \end{pmatrix} x_j, \tag{4.1}$$

which live in a two-dimensional vector space corresponding to two receive antennas.

In order to decode $x_s$, the IC technique is utilized to remove the interference from $x_j$ by projecting the received signals onto the subspace orthogonal to $x_j$ (see Fig. 4.2), i.e., $[h'_j, -h_j]$, yielding a projected signal as:

$$y_{proj} = h'_j y_1 - h_j y_2 = (h'_j h_s - h_j h'_s) x_s. \tag{4.2}$$

After that, the projected signal can be decoded using any standard decoder. This IC tech-

nique is also called Zero-Forcing (ZF).
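The projection of Eq. (4.2) can be illustrated with a minimal sketch. The function names and all numeric channel coefficients below are made up for illustration, and the channel coefficients are assumed to be known exactly, which the rest of the chapter does not require.

```python
def zf_project(y1, y2, hj, hj_p):
    """Eq. (4.2): project (y1, y2) onto [h'_j, -h_j], which nulls the jammer."""
    return hj_p * y1 - hj * y2


def zf_decode(y1, y2, hs, hs_p, hj, hj_p):
    """Divide the projection by (h'_j h_s - h_j h'_s) to recover x_s."""
    gain = hj_p * hs - hj * hs_p
    if abs(gain) < 1e-12:
        raise ValueError("sender and jammer directions coincide; x_s is not recoverable")
    return zf_project(y1, y2, hj, hj_p) / gain


# Made-up channel coefficients (h and h' = first and second receive antenna).
hs, hs_p = 0.8 + 0.3j, -0.2 + 0.9j
hj, hj_p = 1.5 - 0.4j, 0.7 + 1.1j
xs = 1 + 0j        # BPSK symbol from the sender
xj = 3.0 - 2.0j    # arbitrary, much stronger jamming sample

# Received signals according to Eq. (4.1).
y1 = hs * xs + hj * xj
y2 = hs_p * xs + hj_p * xj

xs_hat = zf_decode(y1, y2, hs, hs_p, hj, hj_p)
print(abs(xs_hat - xs) < 1e-9)
```

The jamming term cancels regardless of the value of `xj`, which is why no knowledge of the jamming waveform is needed for the projection itself.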

According to Eq. 4.2, the knowledge of channel coefficients seems indispensable in decoding

xs, the estimation of which is referred to as channel estimation, which can be done by

transmitting a known symbol from the sender. However, as the jammer’s signals may not

have any recognizable signal structure, it becomes impossible to learn his/her corresponding

channel coefficients. Fortunately, we claim that to learn the exact values of jammer’s channel

coefficients is unnecessary, since we are not interested in decoding jammer’s signals. Instead,

we show in Section 4.4 that it will be sufficient to know the direction of the received jamming

signal. Note that, estimating jammer’s signal direction1 is the core of ZF decoder. Also, a

loss of original signal amplitude after projection is observed from Fig. 4.2.

OFDM divides the spectrum into multiple narrow subbands called subcarriers. The receiver

operates on each subcarrier, and applies FFT to the received signal for demodulation. This

allows many narrowband signals to be multiplexed in the frequency domain, which greatly

simplifies the channel estimation and equalization. In this chapter, the sender and receiver

establish OFDM communications with the signals of interest as OFDM-modulated signals.

1 Signal direction is determined by the received signal vector induced on the receive antenna array by the transmitted signal [116], which is defined in the antenna-spatial domain and not the I-Q domain.

Figure 4.2: 1 × 2 MIMO-OFDM link attacked by a Jammer

Note that Eq. (4.1) assumes a narrowband channel, where h (such as $h_s$, $h_j$, etc.) appears simply as a complex number. However, for wideband channels, the signals at different frequencies will experience different channels, bringing so-called multipath effects. As a result, h will become a complex vector indexed by different frequency responses. Yet, Eq. (4.1) still holds for each OFDM subcarrier in the OFDM communications, such that MIMO IC is carried out over each subcarrier.
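As a rough illustration of per-subcarrier cancellation, the sketch below applies the projection of Eq. (4.2) independently to each of four subcarriers with different (frequency-selective) coefficients. The function name and every numeric value are hypothetical, and exact channel knowledge is assumed.

```python
def zf_decode_per_subcarrier(Y1, Y2, Hs, Hs_p, Hj, Hj_p):
    """Apply the projection of Eq. (4.2) independently on every subcarrier."""
    decoded = []
    for y1, y2, hs, hs_p, hj, hj_p in zip(Y1, Y2, Hs, Hs_p, Hj, Hj_p):
        gain = hj_p * hs - hj * hs_p            # coefficient of x_s after projection
        decoded.append((hj_p * y1 - hj * y2) / gain)
    return decoded


# Made-up frequency-domain channels: one complex coefficient per subcarrier.
Hs   = [0.9 + 0.1j, 0.7 - 0.4j, 0.2 + 0.8j, -0.5 + 0.6j]
Hs_p = [-0.3 + 0.8j, 0.4 + 0.5j, 0.9 - 0.1j, 0.6 + 0.2j]
Hj   = [1.2 - 0.5j, -0.8 + 1.1j, 0.3 + 1.4j, 1.0 + 1.0j]
Hj_p = [0.6 + 1.0j, 1.3 + 0.2j, -1.1 + 0.4j, -0.9 + 0.7j]

Xs = [1, -1, -1, 1]                                      # BPSK symbols per subcarrier
Xj = [2.5 - 1j, -3 + 0.5j, 1.5 + 2j, -2 - 2j]            # jamming portion per subcarrier

# Eq. (4.1) evaluated on each subcarrier.
Y1 = [hs * xs + hj * xj for hs, hj, xs, xj in zip(Hs, Hj, Xs, Xj)]
Y2 = [hs * xs + hj * xj for hs, hj, xs, xj in zip(Hs_p, Hj_p, Xs, Xj)]

Xs_hat = zf_decode_per_subcarrier(Y1, Y2, Hs, Hs_p, Hj, Hj_p)
```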

4.3 Impact of Reactive Jamming Attack to MIMO-OFDM Communications

In this section, we investigate the impact of reactive jammer to the MIMO-OFDM commu-

nications. Without loss of generality, we explain the jamming strategy in the context of a

two-antenna receiver decoding a single transmission from the sender in Fig. 4.2. The sender

and receiver form a 1× 2 MIMO link of two DoF with one DoF consumed by the jammer.

According to Eq. (4.1), the received frequency-domain signals for each OFDM subcarrier i

are shown below:

$$y_{1i} = h_{ji} x_{ji} + h_{si} x_{si}, \tag{4.3}$$

$$y_{2i} = h'_{ji} x_{ji} + h'_{si} x_{si}, \tag{4.4}$$

where $h_{ji}$, $h'_{ji}$, $h_{si}$ and $h'_{si}$ are the frequency-domain channel coefficients at subcarrier i, and $x_{ji}$ and $x_{si}$ are frequency-domain signals from the jammer and sender. Note that the jamming signals need not be OFDM signals, and $x_{ji}$ simply represents the narrowband portion of jamming signals on the i-th OFDM subband. As mentioned in Section 4.2.3, the MIMO IC technique is

carried out over each subcarrier to recover the legitimate signal, which is deemed as the key

to the data recovery process. Naturally, the MIMO IC technique becomes the target of the

jammer.

We reformulate Eqs. (4.3), (4.4) as follows (in the following, we omit the subscript notation

i for i-th subcarrier):

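$$\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = H \begin{pmatrix} 1 \\ 0 \end{pmatrix} x_j + H \begin{pmatrix} 0 \\ 1 \end{pmatrix} x_s, \tag{4.5}$$

where $H = \begin{pmatrix} h_j & h_s \\ h'_j & h'_s \end{pmatrix} = [\mathbf{h}_j, \mathbf{h}_s]$ is the 2 × 2 channel matrix. The received signals are the sum of two vectors $J_r = H[1\ 0]^T x_j$ and $S_r = H[0\ 1]^T x_s$, as shown in Fig. 4.2. We find that the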

angle2 between Jr and Sr, determined by hj and hs, can be exploited by the jammer to

launch effective attack.

Attacking MIMO Interference Cancellation. In order to understand the attack strat-

egy, we inspect three special scenarios in Fig. 4.3 with different received signal spaces. Un-

doubtedly, the most severe attack is depicted in Fig. 4.3(a), in which Jr overshadows Sr in

the received signal space, preventing Sr from being recovered. On the contrary, the least

powerful attack emits a jamming signal that is orthogonal to the legitimate signal as shown

in Fig. 4.3(b), in which the projected signal is equivalent to the original signal, yielding

the highest projected signal amplitude. Fig. 4.3(c) shows a case between the above two

extreme cases, where the angle between two received signals takes a small value. Therefore,

by manipulating the jamming signal direction, the jammer has the potential of affecting the

effectiveness of MIMO IC mechanism.

2 The angle between two received signal vectors is equal to the angle between two channel vectors, computed by $\cos\theta = \frac{|\mathbf{h}_j^H \cdot \mathbf{h}_s|}{\|\mathbf{h}_j\| \|\mathbf{h}_s\|}$. The angle’s range is $[0, \frac{\pi}{2}]$.

Correspondingly, the jammer’s attack strategy is to shrink the angle between the jamming signal and the intended signal by moving towards the vicinity of the sender. As a matter

of fact, the difference between hs and hj deviates according to the distance between S

and J [71]. More specifically, if the spacing between two antennas is narrower than a half

wavelength, the channels from these two antennas will become highly correlated [116], which

renders two received signal directions similar.
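To make the role of the angle concrete, the following sketch computes $\cos\theta$ from footnote 2 and the magnitude of the coefficient $(h'_j h_s - h_j h'_s)$ from Eq. (4.2). For two-antenna vectors this magnitude equals $\|\mathbf{h}_j\| \|\mathbf{h}_s\| \sin\theta$, so it vanishes as the angle shrinks. Function names and channel values are illustrative assumptions.

```python
import math


def norm(v):
    """Euclidean norm of a complex vector."""
    return math.sqrt(sum(abs(a) ** 2 for a in v))


def cos_angle(hj, hs):
    """cos(theta) = |h_j^H h_s| / (||h_j|| ||h_s||), as in footnote 2."""
    inner = sum(a.conjugate() * b for a, b in zip(hj, hs))
    return abs(inner) / (norm(hj) * norm(hs))


def zf_gain(hj, hs):
    """Magnitude of the coefficient (h'_j h_s - h_j h'_s) in Eq. (4.2)."""
    return abs(hj[1] * hs[0] - hj[0] * hs[1])


hs = [0.8 + 0.3j, -0.2 + 0.9j]           # made-up sender channel vector
hj_far = [1.5 - 0.4j, 0.7 + 1.1j]        # jammer far from the sender (uncorrelated)
hj_near = [0.82 + 0.28j, -0.18 + 0.93j]  # jammer close to the sender (correlated)

for name, hj in (("far", hj_far), ("near", hj_near)):
    c = min(1.0, cos_angle(hj, hs))
    theta_deg = math.degrees(math.acos(c))
    # Normalized projected amplitude: equals sin(theta) for two-antenna vectors.
    normalized_gain = zf_gain(hj, hs) / (norm(hj) * norm(hs))
    print(name, round(theta_deg, 2), round(normalized_gain, 4))
```

A jammer whose channel vector is nearly parallel to the sender’s leaves almost no projected amplitude, which matches the attack strategy described above.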

In order to demonstrate the effectiveness of such attack strategy, we perform an experiment

on a 1× 2 MIMO link of Fig. 4.2 by varying the distance between the jammer and sender’s

antennas. Fig. 4.4 shows the packet delivery rate (PDR) performance, in which sender’s

PDR drops to nearly zero when the antenna distance decreases below 6cm.

Antenna-Spatial Domain vs. I-Q Domain. The jammer is able to vary the phase of the jamming signal, which has the same effect as a frequency offset. A frequency offset causes the signal vectors to rotate in the I-Q plane. One may speculate that the jamming signal may therefore not keep a constant phase offset relative to the signal of interest as shown in Fig. 4.3, even if the channel matrix is fixed. This reasoning, however, is incorrect, since the received signal space of Fig. 4.3 is in the antenna-spatial domain and not in the I-Q domain. The frequency offset determines how the signal rotates in the I-Q domain, but it merely scales the signal vectors in the antenna-spatial domain by a complex number [36].

In other words, the jamming signal direction in the received signal space is unaffected by

signal rotation in the I-Q domain, but instead is determined by the channels between the

sender, jammer and the receiver. Therefore, the jamming signal direction keeps constant

during the channel coherence period.
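A small numeric check of this argument, under made-up channel values: rotating the jamming sample in the I-Q plane multiplies both antenna observations by the same complex factor, so their ratio (the jamming signal direction) does not change.

```python
import cmath


def jammer_only_rx(hj, hj_p, xj):
    """Signals induced on the two receive antennas by the jammer alone."""
    return hj * xj, hj_p * xj


hj, hj_p = 1.5 - 0.4j, 0.7 + 1.1j   # hypothetical jammer channel coefficients
xj = 2.0 + 1.0j                     # arbitrary jamming sample

ratios = []
for phase in (0.0, 0.7, 2.1, 3.0):
    # Phase rotation in the I-Q domain, e.g. caused by a frequency offset.
    y1, y2 = jammer_only_rx(hj, hj_p, xj * cmath.exp(1j * phase))
    ratios.append(y1 / y2)

# Every entry of `ratios` equals hj / hj_p up to rounding error.
print(all(abs(r - hj / hj_p) < 1e-9 for r in ratios))
```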

Figure 4.3: Different two-dimensional received signal spaces: (a) Overlap signals, (b) Orthogonal, (c) Small angle

Figure 4.4: Jamming attack performance by approaching the sender’s location; the plot shows Packet Delivery Rate (0 to 1) versus Distance Between Two Antennas (0 to 18 cm). In this experiment, the device works on 2.45GHz central frequency with a half wavelength $\frac{\lambda}{2} = \frac{c}{2f} \approx 6.12$cm.

4.4 Defense Mechanisms of Reactive Jamming Attack

In this section, we propose effective MIMO-based defense mechanisms to counteract reactive

jamming attack based on IC technique. We first develop an iterative channel tracking mech-

anism to cancel arbitrary jamming signals by keeping track of the jamming signal direction.

Then, we build an enhanced defense mechanism by incorporating sender signal enhancement

to enable a more robust OFDM communication.

As opposed to the attack strategy to shrink the angle between two arrival signals, the defense

mechanism attempts to expand the angle. We address two major issues in this section: 1)

how to decode the signals of interest in the presence of arbitrary jamming signals; 2) how to

strengthen the robustness of OFDM communications against adaptive and reactive jammer.

4.4.1 Defense Mechanism Overview

We offer an overview of proposed defense mechanisms in this section. The defense mech-

anism mainly includes angle expansion, signal decoding (Section 4.4.2), channel tracking

(Section 4.4.2) and jamming detection (Section 4.4.3) modules. Angle expansion module

aims at expanding the angle of arrival signals to make intended signals decodable. As long

as the jammer fails to approach the sender, the channels hs and hj will be uncorrelated,

resulting in a random angle between Sr and Jr, and thus a high decoding rate. Preventing the jammer from getting close is straightforward: the sender can move randomly inside the receiver’s reception range to avoid being approached. Alternatively, the spatial retreat [117]

technique can be utilized to strategically move away from the jammer. Then, signal decoding

is implemented using MIMO IC technique after channel estimation. Meanwhile, jamming

detection module intends to instantly identify the beginning and end of a jamming attack

to trigger the defending process.

The enhanced defense mechanism (Section 4.4.4) involves a sender signal enhancement (SSE) module, which rotates the transmitted signal to improve sender signal decodability. It also incorporates a feedback mechanism

to reliably guide the sender’s rotation process. A flow chart is illustrated in Fig. 4.5, which

shows both the defense mechanism and its enhanced version.

4.4.2 Decoding the Signal of Interest

According to Eqs. (4.2), (4.5), the estimation of the sender’s and jammer’s channels is

the most crucial task in jamming-resistant solution based on MIMO IC technique. Initial

estimation of sender’s channel hs can be derived via analyzing the undisturbed preamble.

However, since initial channel estimation is only valid within the channel coherence time,

updating the channel estimation over time becomes a necessity.

Figure 4.5: A Flow Chart of Proposed Defense Mechanisms (Solid Box: Modules of the Defense Mechanism, Dashed Box: Modules of Enhanced Defense Mechanism)

Inspired by ZigZag decoding technique [118], we devise an iterative channel tracking mech-

anism by jointly keeping track of both the sender and jammer’s channel conditions in a

timely manner. In the following, we first exhibit jammer channel estimation method, and

then present the iterative mechanism for updating both channels iteratively.

Jammer Channel Estimation. Without pre-known preambles in the jamming signals, it is difficult to carry out jammer channel estimation. Fortunately, the most recent advance [37] shows that the complete knowledge of $\mathbf{h}_j = [h_j, h'_j]^T$ is not necessary for decoding $x_s$. Due to the scale invariance property of signal direction, i.e., the direction of $[h_j, h'_j]^T$ is equivalent to that of $[\frac{h_j}{h'_j}, 1]^T$, the only information required about the jamming signal for IC to work is the signal direction, i.e., the jammer’s channel ratio $\frac{h_j}{h'_j}$.

Note that the received signal is a mixed signal $J_r + S_r$. If we can extract the jammer’s signal $J_r = \begin{pmatrix} h_j \\ h'_j \end{pmatrix} x_j$, we can derive the jammer’s channel ratio by computing the ratio of received jamming signals on two receiving antennas, as $\frac{h_j}{h'_j} = \frac{x_j \cdot h_j}{x_j \cdot h'_j}$. Based on this derivation, we propose the following method to enable the extraction of the jamming signal $J_r$ so that the channel ratio can be computed.

Figure 4.6: Extended frame structure

As shown in Fig. 4.6, the basic idea of extracting the received jamming signal Jr is to insert

known symbols (i.e. pilots) in the original data frame, and then subtract them from the

received mixed signal. The location of the inserted pilots should remain secret between the

sender and intended receiver, because if the jammer learns the locations of the pilots, he/she

can intentionally stop jamming during these pilot periods to avoid being tracked. Moreover,

the pilots should be inserted frequently to enable frequent updates of the channel estimation.

Note that, the extension of the frame structure introduces limited overheads, which will be

evaluated in Section 4.6.4.

The complete jammer channel estimation scheme proceeds as follows: 1) after detecting the

beginning of jamming (refer to Section 4.4.3), the intended receiver finds the next jammed

pilots; 2) the received pilots are reconstructed using the known pilot symbol transformed

by the estimated sender’s channel (sender channel estimation is presented below); 3) the

constructed received pilots are subtracted from the jammed pilots to restore the jamming

signal; 4) the extracted jamming signal is used to compute the jammer’s channel ratio

(jamming signal direction).
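Steps 2) to 4) of the scheme above can be sketched for a single subcarrier as follows. The function names, the `None` return for a missing jamming component, and all numeric values are assumptions made for illustration; the sender channel estimate is taken as exact here.

```python
def extract_jamming(y1, y2, pilot, hs_est, hs_p_est):
    """Steps 2-3: rebuild the received pilot and subtract it from the jammed pilot."""
    return y1 - pilot * hs_est, y2 - pilot * hs_p_est


def jammer_channel_ratio(y1, y2, pilot, hs_est, hs_p_est):
    """Step 4: ratio of the extracted jamming signals, i.e. h_j / h'_j."""
    j1, j2 = extract_jamming(y1, y2, pilot, hs_est, hs_p_est)
    if abs(j2) < 1e-12:
        return None  # no usable jamming energy on the second antenna
    return j1 / j2


# Made-up ground truth, used only to synthesize a jammed pilot.
hs, hs_p = 0.8 + 0.3j, -0.2 + 0.9j
hj, hj_p = 1.5 - 0.4j, 0.7 + 1.1j
pilot = 1 + 0j
xj = -2.0 + 1.5j   # unknown to the receiver

y1 = hs * pilot + hj * xj
y2 = hs_p * pilot + hj_p * xj

ratio = jammer_channel_ratio(y1, y2, pilot, hs, hs_p)
print(abs(ratio - hj / hj_p) < 1e-9)
```

Note that the estimate does not depend on the jamming sample `xj`, only on the jammer being present during the pilot.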

Iterative Channel Tracking Mechanism. For IC to work, we need the estimations of

both the sender channel and the jammer channel. When the channel is being jammed,

deriving an accurate estimation of sender channel is a difficult task. In addition, wireless

channels are time-varying due to multipath fading effects. Jammers are also motivated to

vary the channel in order to evade the defense mechanism. To keep the channel estimation

updated and accurate, we need to carry out the channel estimation frequently. However,

the estimation of both channels under the jamming situation is hard - we have two channel

responses to estimate and the received signal is a mixed signal with two unknown signal

components.

We propose the following alternating and iterative method to keep track of the sender and

jammer channels. The key idea of the proposed method is that, we will not be able to

calculate the two channel estimations given two unknown signals. However, we will be able

to estimate one channel if the other is known. We can make the initial sender channel

estimation after receiving the preamble. Assuming the preamble is not jammed, the initial sender channel response can be estimated as:

$$H_s(0) = \begin{pmatrix} h_s(0) \\ h'_s(0) \end{pmatrix} = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} / \tilde{x}_s, \tag{4.6}$$

where $\tilde{x}_s$ denotes the known pilots. We will then do the sender and jammer channel estimations alternately for every pilot received. Assume the pilots are numbered as i = 1, ..., n.

After receiving the first pilot (or odd numbered pilot), the receiver updates the jammer

channel ratio as:

$$\frac{h_j(i)}{h'_j(i)} = \frac{y_1 - \tilde{x}_s \cdot h_s(i-1)}{y_2 - \tilde{x}_s \cdot h'_s(i-1)}, \quad i = 1, 3, \ldots, \tag{4.7}$$

where we assume the sender channel did not change in the past time slot. Similarly, after receiving the second pilot (or an even numbered pilot), the receiver updates the sender channel estimation $H_s(i) = \begin{pmatrix} h_s(i) \\ h'_s(i) \end{pmatrix}$ according to:

$$h_s(i) - \frac{h_j(i-1)}{h'_j(i-1)} h'_s(i) = \left( y_1 - \frac{h_j(i-1)}{h'_j(i-1)} y_2 \right) / \tilde{x}_s, \quad i = 2, 4, \ldots, \tag{4.8}$$

where we assume the jammer channel did not change in the past time slot. Two unknown

sender channel components hs(i) and h′s(i) in Eq. (4.8) are updated alternately after receiving

an even numbered pilot. Specifically, hs(i) gets updated when i = 4, 8, ..., while h′s(i) gets

updated when i = 2, 6, ..., by assuming the other channel component did not change over

the past two time slots. This updating process continues in such a way that the sender and

jammer channels are updated alternately. Note that this mechanism requires very frequent channel updates, within the channel coherence time, which can be as short as tens of OFDM symbol times [119] in some application scenarios. On the other hand, these frequent channel updates help us keep close track of the jammer’s potential fast adaptation.

Sender Signal Decoding. Based on Eq. (4.2), the signal of interest $x_s^*$ can be written as:

$$x_s^* = \frac{y_1 - \frac{h_j}{h'_j} y_2}{h_s - \frac{h_j}{h'_j} h'_s}, \tag{4.9}$$

in which $\frac{h_j}{h'_j}$ is updated every odd numbered pilot in Eq. (4.7), and $(h_s - \frac{h_j}{h'_j} h'_s)$ is updated every even numbered pilot in Eq. (4.8). With precise and frequent updates of channel estimation, the signal of interest can be correctly recovered using any standard decoder.
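The alternating updates of Eqs. (4.6) to (4.8) and the decoder of Eq. (4.9) can be sketched for one subcarrier as below. This is a noise-free illustration under stated assumptions, not the implementation evaluated in this chapter: the class name `ChannelTracker`, the helper `receive`, and all numeric values are hypothetical, and no averaging over pilots is performed.

```python
class ChannelTracker:
    """Alternating sender/jammer channel tracking on one OFDM subcarrier."""

    def __init__(self, y1, y2, preamble):
        # Eq. (4.6): the preamble is assumed to arrive unjammed.
        self.hs = y1 / preamble
        self.hs_p = y2 / preamble
        self.ratio = None  # jammer channel ratio h_j / h'_j

    def update(self, i, y1, y2, pilot):
        """Process the i-th jammed pilot (i = 1, 2, ...)."""
        if i % 2 == 1:
            # Eq. (4.7): subtract the reconstructed pilot, then take the ratio.
            self.ratio = (y1 - pilot * self.hs) / (y2 - pilot * self.hs_p)
            return
        if self.ratio is None:
            raise RuntimeError("an odd-numbered pilot must be processed first")
        # Eq. (4.8): c = h_s(i) - ratio * h'_s(i)
        c = (y1 - self.ratio * y2) / pilot
        if i % 4 == 0:
            self.hs = c + self.ratio * self.hs_p       # refresh h_s, keep h'_s
        else:
            self.hs_p = (self.hs - c) / self.ratio     # refresh h'_s, keep h_s

    def decode(self, y1, y2):
        """Eq. (4.9): cancel the jammer and equalize the sender channel."""
        r = self.ratio
        return (y1 - r * y2) / (self.hs - r * self.hs_p)


def receive(hs, hs_p, hj, hj_p, xs, xj):
    """Received samples on the two antennas, following Eq. (4.1)."""
    return hs * xs + hj * xj, hs_p * xs + hj_p * xj


# Made-up channels; the jammer is silent (xj = 0) during the preamble.
hs, hs_p = 0.8 + 0.3j, -0.2 + 0.9j
hj, hj_p = 1.5 - 0.4j, 0.7 + 1.1j
pilot = 1 + 0j

tracker = ChannelTracker(*receive(hs, hs_p, hj, hj_p, pilot, 0j), pilot)
tracker.update(1, *receive(hs, hs_p, hj, hj_p, pilot, 2 - 1j), pilot)
hs_p = hs_p + 0.05j   # h'_s drifts slightly before the second pilot
tracker.update(2, *receive(hs, hs_p, hj, hj_p, pilot, -1.5 + 0.5j), pilot)
xs_hat = tracker.decode(*receive(hs, hs_p, hj, hj_p, -1 + 0j, 3 + 3j))
```

In this noise-free setting the second pilot recovers the drifted $h'_s$ exactly because the other quantities did not change, which mirrors the assumption stated after Eq. (4.8).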

Inter-Symbol Interference Issue. Another practical issue with the wideband jamming

signal is that it suffers from multipath effects, which leads to inter-symbol interference (ISI).

ISI of jamming signals will impose additional noise to Eq. (4.5). To counteract ISI, we

average our channel tracking results derived from multiple pilots within channel coherence

time to mitigate the negative effects of ISI on channel estimation. While ISI is then not a problem for accurate channel estimation, the additional noise still reduces the SNR of the intended signal and hence affects the throughput. To address the ISI issue, we must directly investigate the time-domain signal, since ISI is inherently a time-domain phenomenon. We apply the method in [37], i.e., we convolve the received time-domain signals with a filter constructed by taking the IFFT of the jammer’s channel ratio to cancel out the

ISI and jamming signal simultaneously. The signal of interest can then be decoded using a

standard decoder.

4.4.3 Detecting the Jamming Signal

As mentioned in previous section, the receiver needs to detect the beginning and end of

jamming to facilitate IC mechanism. The jamming detection problem has been studied

in [108], in which the constellation diagrams are employed to identify jammed symbols. We

follow the same principle. Soft error vector is utilized to build the detection metric, defined

as the distance vector between the received symbol vector and the nearest constellation

points in the I/Q diagram, as shown in Fig. 4.7(a). The soft error is further normalized by

the minimum distance of the constellation. Let $\|V_k\|$ denote the magnitude of the normalized soft error vector for the k-th received symbol. The jamming detection metric at the k-th symbol time is then defined as $\|V_k\|/\|V_{k-1}\|$, which we call the jumped value. A jamming attack is considered to start when $\|V_k\|/\|V_{k-1}\| > \gamma$, where γ is a pre-defined threshold for jamming detection, and to stop when the jumped value returns to normal. In our system design, we flag a potential jammer when the soft error more than doubles once the jamming attack begins, so that γ = 2. An example is shown in Fig. 4.7(b), where we can easily identify the beginning and end of the jamming attack.
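The start-of-jamming test can be sketched as follows for a BPSK constellation. The function names, the small `eps` guard against division by zero, and the sample symbols are assumptions added for illustration; only the detection of the beginning of jamming is shown.

```python
def soft_error(symbol, constellation):
    """Distance to the nearest constellation point, normalized by the minimum distance."""
    d_min = min(abs(a - b)
                for idx, a in enumerate(constellation)
                for b in constellation[idx + 1:])
    return min(abs(symbol - point) for point in constellation) / d_min


def detect_jamming_start(symbols, constellation, gamma=2.0, eps=1e-9):
    """Return the first index k with ||V_k|| / ||V_{k-1}|| > gamma, or None."""
    errors = [soft_error(s, constellation) for s in symbols]
    for k in range(1, len(errors)):
        # eps avoids dividing by zero when the previous symbol is error-free.
        if errors[k] / max(errors[k - 1], eps) > gamma:
            return k
    return None


bpsk = [-1 + 0j, 1 + 0j]
clean = [1.05 + 0.02j, -0.97 - 0.04j, 1.03 + 0.05j]   # mildly noisy symbols
jammed = clean + [0.2 + 0.9j, -0.1 - 0.8j]            # jamming begins at index 3

start = detect_jamming_start(jammed, bpsk)
```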

Figure 4.7: Soft error based jamming detection: (a) Soft error vector, (b) Detection example

4.4.4 Enhanced Defense Mechanism

The basic idea of IC is to project the received sender signal to the direction that is orthogonal

to the received jammer signal. As shown in Fig. 4.3, the signal after projection will have a

reduced signal amplitude, depending on the angle between the two signals. The IC method

is most effective when the sender signal and the jammer signal are orthogonal [37, 113].

Therefore, another approach we can explore here is to maximize the amplitude of projected

sender signal, i.e. to improve the sender signal decodability.

The key idea is to rotate the sender’s signal so that the received sender signal is orthogonal

to the jamming signal. This mechanism works for a multi-antenna sender. Using a 2 × 2 MIMO link as an example,

$$\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \mathbf{h}_j x_j + H_s \begin{pmatrix} 1 \\ 0 \end{pmatrix} x_s, \tag{4.10}$$

where $\mathbf{h}_j$ denotes a two-dimensional channel vector from J to R, and $H_s$ is the 2 × 2 channel matrix from S to R. We exploit the property of MIMO communications that the sender can control the vector along which its signal is received [36]. Instead of multiplying by the vector $[1\ 0]^T$, MIMO allows the sender to multiply by a different two-dimensional vector $\vec{r}$, which we call the rotation vector3. After that, the sender will transmit the two elements of $\vec{r} \cdot x_s$, one over each antenna, and the receiver will receive $H_s \cdot \vec{r} \cdot x_s$. In this way, the sender is able to control the received signal vector, and thus the received signal direction.

Constraints on Rotation Vector. After signal rotation, the received signal can be represented as:

$$\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \mathbf{h}_j x_j + H_s \vec{r} x_s,$$

with a 2 × 2 channel matrix between S, J and R as $H = [\mathbf{h}_j, H_s \vec{r}]$. In order to make $x_s$ decodable, H should remain a full rank matrix. Thus, one constraint on $\vec{r}$ is that it cannot reduce the rank of the channel matrix.

In addition, the received signal powers from the sender and jammer are $P_{SR} \propto P_s \|H_s \vec{r}\|^2$ and $P_{JR} \propto P_j \|\mathbf{h}_j\|^2$, where $P_s$ and $P_j$ are the sender’s and jammer’s transmission powers. From the above formulas, different $\vec{r}$ may induce different $P_{SR}$ and SJR, which will in turn affect the decoding performance. Therefore, we set $\vec{r}$ as a unit vector, i.e., $\|\vec{r}\| = 1$, such that $P_{SR}$ can be confined to a reasonable range.

Sender Signal Enhancement Mechanism. In a 2 × 2 MIMO link of Eq. (4.10), signal rotation can be achieved by simply multiplying the sender signal by the normalized vector

$$\vec{r} = \frac{H_s^{-1} \cdot \mathbf{h}_j^{\perp}}{\|H_s^{-1} \cdot \mathbf{h}_j^{\perp}\|} = \frac{H_s^{-1} \cdot [1, -\frac{h_j}{h'_j}]^T}{\|H_s^{-1} \cdot \mathbf{h}_j^{\perp}\|},$$

so that the received legitimate signal will be orthogonal to the jamming signal, where $\mathbf{h}_j^{\perp}$ stands for the orthogonal vector of $\mathbf{h}_j$. However, SSE is carried out over the sender signal, while the channel estimation is conducted at the receiver side. A feedback mechanism is necessary for sending the rotation vector $\vec{r}$ calculated at the receiver back to the sender.
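A possible way to compute the rotation vector from the sender channel matrix and the jammer’s channel ratio is sketched below. One point is an assumption about the intended meaning rather than something stated in the text: the sketch takes the orthogonal vector under the Hermitian inner product, i.e. proportional to $[1, -(h_j/h'_j)^*]^T$, which coincides with $[1, -\frac{h_j}{h'_j}]^T$ when the channel ratio is real-valued. The function name and numeric values are hypothetical.

```python
import math


def rotation_vector(Hs, rho):
    """Unit vector r such that Hs * r is orthogonal to the jamming direction.

    Hs  : 2 x 2 sender channel matrix as ((a, b), (c, d)).
    rho : jammer channel ratio h_j / h'_j.
    """
    (a, b), (c, d) = Hs
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("Hs is (numerically) singular")
    # Assumption: Hermitian-orthogonal direction to [rho, 1]^T.
    p0, p1 = 1.0, -rho.conjugate()
    # v = Hs^{-1} * p, using the closed-form inverse of a 2 x 2 matrix.
    v0 = (d * p0 - b * p1) / det
    v1 = (-c * p0 + a * p1) / det
    length = math.sqrt(abs(v0) ** 2 + abs(v1) ** 2)
    return v0 / length, v1 / length   # enforce ||r|| = 1


# Made-up channels for a 2 x 2 link.
Hs = ((0.8 + 0.3j, 0.1 - 0.5j), (-0.2 + 0.9j, 0.6 + 0.4j))
hj, hj_p = 1.5 - 0.4j, 0.7 + 1.1j

r0, r1 = rotation_vector(Hs, hj / hj_p)

# Received sender direction Hs * r and its inner product with the jammer direction.
rx0 = Hs[0][0] * r0 + Hs[0][1] * r1
rx1 = Hs[1][0] * r0 + Hs[1][1] * r1
inner = hj.conjugate() * rx0 + hj_p.conjugate() * rx1
```

Only the ratio $h_j/h'_j$ is needed, so the vector can be computed from the output of the channel tracking mechanism without knowing the jammer’s channel coefficients individually.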

We define a “burst of packets” as a consecutive sequence of packets during the communications, as shown in Fig. 4.8. During each burst, after identifying jamming threats, the sender continuously rotates the transmit signals of the subsequent frame using the rotation vector computed for the previous frame and carried by the feedback frame. To reliably feed back rotation vectors in the presence of a reactive jammer, we develop a feedback mechanism as follows.

Figure 4.8: Burst of packets

3 Note that the signal rotation is carried out in the antenna-spatial domain rather than in the I-Q domain.

Feedback Mechanism. The feedback frame can be formulated using the same frame struc-

ture in Fig. 4.1 because it is short. The same IC technique can be employed to decode the

feedback information at the sender, reversing the roles of the sender and receiver in the

forward channel. However, during the transmission of packet bursts, it is highly likely that

both the feedback packets and the subsequent forwarding packets will be completely jammed

by the reactive jammer. In such a scenario, we try to find an opportunity to compute the

jammer’s channel ratio when the jammer is alone on the medium.

There are various situations in which a jammer’s isolated transmission could be captured. In the case that the feedback packets are covered by the jamming signals, the jamming signal is transmitted ahead of the feedback signal, leaving the opportunity of capturing the jammer’s isolated transmission, from which the sender can compute the jammer’s channel ratio $\frac{h_{js}}{h'_{js}}$ by taking the ratio of two jamming signals received on his/her two antennas, $y_{s1} = h_{js} x_{js}$ and $y_{s2} = h'_{js} x_{js}$. The receiver could also delay the transmission of the feedback packet for a

random time period so that the sender could capture jammer’s isolated transmission right

after his/her own transmission finishes. In either case, the sender uses the jammer’s channel

ratio to eliminate the jamming signal from the received mixed signal Jr + Sr, and find the

preamble to estimate the feedback channel using Eq. (4.6), which can be used for signal

decoding as usual.

Similarly, the receiver can also use the same mechanism to recover the completely jammed

forwarding packets in a packet burst. Two points are worth noting: first, the sender needs

to detect the jamming signals to decide whether he/she will apply the rotation vectors to

the subsequent packet. In particular, if the sender detects jamming signals when decoding

the feedback packet, he/she will apply rotation vectors, assuming the jammer will be active

for the subsequent transmission. Second, the feedback information should be received in a

timely fashion, because if the channel estimation expires, the rotation vector will no longer

be effective. Thus, the sender will count the feedback time to determine whether to apply

rotation vectors or not.

4.4.5 Defending Against Reactive Jamming In a Multi-hop Network

In a multi-hop network with legitimate nodes and reactive jammers, every receiver in the

network performs IC-based defense mechanism when detecting jamming signals during the

packet reception period. With a reliable communication protocol such as TCP, the receiver

turns into a sender by sending back a feedback message to inform the original sender about

the reception status, after recovering the signals of interest. Because of the “quiet/stealthy”

nature of reactive jammer, the traditional media access control (MAC) protocol for wireless

networking can still be applied to avoid concurrent transmissions from multiple legitimate

senders.

Our defense mechanisms bring conspicuous opportunities to the multi-hop networking. In a

traditional multi-hop network with jammers, the link being jammed will be unable to transfer

information. However, by employing the proposed IC-based jamming resilient communica-

tion scheme, the jammed link is still capable of transmitting information, which introduces

new optimization problems associated with rate allocation, resource scheduling, and relay

selection in the presence of jammers, while under the protection of IC-based mechanisms.

We will investigate the cross-layer optimization of networking under jamming attacks in our

future work.

4.4.6 Dealing with Other Types of Jammers

We briefly discuss the impact of constant and random jammers on our defense mechanisms in this section. A constant jammer can cover all the packets including their

preambles, which will disable the initial channel estimation of our defense mechanisms. How-

ever, constant jamming is impractical due to its enormous energy consumption. Random

jammer randomly alternates between jamming and sleeping. We investigate the jammer’s

probability of covering preambles, and present the necessary modifications to the defense

mechanisms. First, let us assume both the jamming and sleeping periods are uniformly dis-

tributed within [0, 20]ms with an average of 10ms, thus the random jammer starts jamming

with a probability of 1/2. We further assume the preamble length is 0.1ms, and one burst

lasts for 100ms with a 400ms inter-burst idle interval. Then, the probability of covering the preamble of the first packet in the burst can be written as $\frac{10/0.1}{(500-10)/0.1} \cdot \frac{1}{2} \approx 1\%$. One can further reduce the probability by introducing a longer burst or burst interval, which

makes the preamble distortion a small probability event. Second, as the jamming detector

can identify the beginning and end of jamming attacks promptly, we can modify our de-

fense mechanisms to perform normal processing when the jammer is sleeping and conduct

IC within his/her jamming periods.
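The arithmetic behind the 1% estimate above can be reproduced with a few lines. The function and parameter names are illustrative; the default values are the ones assumed in this section.

```python
def preamble_hit_probability(jam_ms=10.0, preamble_ms=0.1,
                             burst_ms=100.0, idle_ms=400.0, p_jamming=0.5):
    """Probability that a random jammer covers the first preamble of a burst."""
    period_ms = burst_ms + idle_ms                         # 500 ms per burst cycle
    covering_slots = jam_ms / preamble_ms                  # 10 / 0.1
    candidate_slots = (period_ms - jam_ms) / preamble_ms   # (500 - 10) / 0.1
    return covering_slots / candidate_slots * p_jamming


p = preamble_hit_probability()
print(f"{p:.2%}")

# A longer inter-burst idle interval lowers the probability further.
p_longer_idle = preamble_hit_probability(idle_ms=900.0)
```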

4.4.7 Discussion

Our defense mechanisms enable a reliable OFDM communication in the presence of powerful

single-antenna reactive jammer. Extending to a network with multiple jammers, the defense

mechanism should succeed in canceling jamming signals as long as different jammers operate

on different spectrum bands or transmit at different time slots, since the cancellation is

carried out for each OFDM subband at one time. In addition, our defense mechanism defeats

the multi-antenna jammers transmitting the same jamming signals over all the antennas,

because they can be regarded as single-antenna jammers with aggregated channel state

information. However, multi-antenna or multiple jammers sending multiple jamming streams

simultaneously are more destructive to the MIMO-OFDM communications, since they can

deplete the DoF of MIMO links. Currently, there is no available solution in the literature

to provide jamming resilient communication under multi-antenna jammers sending multiple

concurrent jamming streams. How to deal with such jammers will be considered in our future

work.

4.5 Implementation

We build a prototype using five USRP-N200 radio platforms [6] and the GNURadio software package. Each USRP board is equipped with one XCVR2450 daughterboard operating in the 802.11 spectrum bands. The MIMO cable allows two USRP devices to share a reference clock and achieve time synchronization by letting the slave device acquire clock and time reference from the master device. By connecting two USRP boards with a MIMO cable to act as one MIMO node, we build a 2 × 2 MIMO system using four USRP boards. Each MIMO node runs an 802.11-like PHY layer protocol using OFDM technology with 64 OFDM subcarriers.

The MIMO system works with various modulation types, while we use BPSK for legitimate


communications in our experiments. We configure each USRP to span 1MHz bandwidth

by setting both the interpolation rate and decimation rate to 100. MIMO IC technique

is implemented at the receiver to recover the signals of interest. We also implement the

enhanced defense mechanism by incorporating SSE.

The reactive jammer is another USRP device equipped with an XCVR2450 daughterboard. To defend against a jamming attack, the receiver first estimates the sender’s channel and the jammer’s channel ratio, then uses the IC technique to eliminate the signals from the jammer. Meanwhile,

the receiver will compute the rotation vector and transmit it back to the sender for SSE.

After receiving the rotation vector, the sender checks whether it is still within the predefined

channel coherence time since its previous transmission. If it is, the sender will apply the

rotation vector to the newly generated symbols and send the rotated elements through two

antennas. We set the transmission power of both the sender and jammer as 100mW by

default.
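The receiver-side cancellation step described above can be illustrated with a minimal sketch for one subcarrier of a two-antenna receiver. It assumes a noise-free channel, perfect knowledge of the jammer's channel ratio and of the sender's channel, and made-up channel coefficients; the projection y1 − r·y2 is a standard two-antenna cancellation that we assume corresponds to the IC step, and SSE is not modeled.

```python
import cmath
import random

def cancel_jammer(y1, y2, jam_ratio):
    """Remove the jamming term: y1 - (hj1/hj2) * y2 contains no jammer signal."""
    return y1 - jam_ratio * y2

random.seed(0)
hs1, hs2 = 0.8 + 0.3j, -0.2 + 0.9j   # sender -> receive antennas 1, 2 (hypothetical values)
hj1, hj2 = 0.5 - 0.7j, 0.6 + 0.4j    # jammer -> receive antennas 1, 2 (hypothetical values)
jam_ratio = hj1 / hj2                # only this ratio is needed for cancellation
eff_gain = hs1 - jam_ratio * hs2     # sender's effective channel after cancellation

bits = [random.choice([0, 1]) for _ in range(16)]
decoded = []
for b in bits:
    x = 1.0 if b else -1.0                                   # BPSK symbol
    jam = 3.0 * cmath.exp(2j * cmath.pi * random.random())   # strong jamming sample
    y1 = hs1 * x + hj1 * jam                                 # signal at antenna 1
    y2 = hs2 * x + hj2 * jam                                 # signal at antenna 2
    z = cancel_jammer(y1, y2, jam_ratio) / eff_gain          # cancel, then equalize
    decoded.append(1 if z.real > 0 else 0)

print(decoded == bits)
```

In this idealized setting the jamming term vanishes regardless of its power; with estimated channels and noise, the residual depends on the estimation accuracy and on how small the effective gain becomes.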

Implementing an SDR-based reactive jammer is itself a non-trivial task [34, 115]. Here, we emulate the reactive jamming attack and the jammer’s carrier sensing process by letting the receiver broadcast a trigger signal. Both the jammer and sender record the timestamp of detecting the trigger, $t_{trig}$; the sender then sets its transmission start time as $t_{send} = t_{trig} + t_{\Delta 1}$, and the jammer sets its jamming start time as $t_{jam} = t_{trig} + t_{\Delta 2}$. The reactive jammer’s reaction time is then equivalent to $(t_{\Delta 2} - t_{\Delta 1})$.
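The trigger-based emulation can be summarized in a few lines. This is only a sketch of the timing relationship; the function name and the offset values are hypothetical and do not come from the prototype.

```python
def emulate_schedule(t_trig, delta_send, delta_jam):
    """Return (t_send, t_jam, reaction_time) relative to a shared trigger."""
    t_send = t_trig + delta_send            # sender's transmission start
    t_jam = t_trig + delta_jam              # jammer's jamming start
    reaction_time = delta_jam - delta_send  # emulated reaction time of the jammer
    return t_send, t_jam, reaction_time

# Hypothetical offsets in seconds: the jammer starts 100 microseconds after the sender.
t_send, t_jam, reaction = emulate_schedule(t_trig=2.0, delta_send=0.010, delta_jam=0.0101)
print(t_send, t_jam, reaction)
```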

4.6 Evaluation

In this section, we demonstrate the ability of the jammer to disable the MIMO IC mechanism, and we also evaluate the performance of our defense mechanisms in an indoor lab

environment. In our experiments, we first show how the received signal direction affects the


Parameter              Value
Carrier Frequency      2.4512GHz
Modulation Type        BPSK
Transmit Amplitude     1
Transmit Gain          30dB
Receive Gain           30dB
OFDM FFT Length        64
OFDM Occupied Tones    48
OFDM CP Length         64

Table 4.1: Default system setup

packet delivery performance. Then, we present our measured channel coherence time in the

indoor environment and discuss how it will affect the performance of our defense mecha-

nisms. The performance of jamming attack and defense mechanisms is evaluated using a

testbed under different bandwidth settings, different jamming powers and different types of

jamming signals. Finally, we provide an overhead analysis of our defense mechanisms. The

default communication system parameters are listed in Table 4.1.

4.6.1 Impact of Received Signal Direction

We argued in Section 4.3 that the angle between two received signal directions will affect the

decoding performance using IC. In this section, we will show the packet delivery performance

with respect to different angles. We set up two clients synchronized by a MIMO cable,

together with a two-antenna receiver. Then, two clients transmit different streams to the

receiver. The receiver applies IC technique to decode one of the streams by regarding the

other stream as interference from the jammer. We mentioned that the signal direction

is determined by the channels between the transmitter and the receiver in Section 4.2.3.

Although the channel evolves over time, we observe that the angle remains relatively stable

for the time being, given the fixed locations of clients and receiver. Then, we change the

locations of the clients and receiver to measure the packet delivery performance with different

angles between two received signals. We fix the distance between the clients and receiver, so


Figure 4.9: Packet delivery rate performance with different angles between two received signals (x-axis: Angle Between Two Clients in Degrees, 0 to 90; y-axis: Packet Delivery Rate, 0 to 1)

that the performance variation among different cases is mainly induced by different angles,

rather than different path losses.

We show the performance measurement in Fig. 4.9, from which we can see the angle between

two received signals indeed affects the packet delivery performance significantly. The major

observation is that PDR declines below 20% once the angle becomes smaller than 20°, while PDR rises above 90% once the angle exceeds 60°. This result confirms our

analysis.
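To make the notion of an angle between two received signals concrete, the sketch below computes the angle between two complex receive-channel vectors from their normalized inner product. This definition is an assumption made for illustration (the formal definition of signal direction is given in Section 4.2.3), and the example vectors are made up.

```python
import math

def angle_between_deg(h_a, h_b):
    """Angle in degrees between two complex receive-channel vectors."""
    inner = sum(a.conjugate() * b for a, b in zip(h_a, h_b))
    norm_a = math.sqrt(sum(abs(a) ** 2 for a in h_a))
    norm_b = math.sqrt(sum(abs(b) ** 2 for b in h_b))
    cos_theta = min(1.0, abs(inner) / (norm_a * norm_b))  # clamp rounding error
    return math.degrees(math.acos(cos_theta))

print(angle_between_deg([1, 0], [0, 1]))     # orthogonal directions
print(angle_between_deg([1, 1j], [2, 2j]))   # same direction, different scale
```

Under this definition, an angle near 0° means the two signals are almost aligned at the receiver and therefore hard to separate, which is consistent with the low PDR observed for small angles.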

4.6.2 Impact of Channel Coherence Time

The channel coherence time determines how often the channel estimation should be updated

and the validity period of the rotation vector. In this section, we measure the channel

coherence time in an indoor environment.

We let a sender transmit consecutive known OFDM symbols following a preamble to track

the channel variations. The receiver uses these known OFDM symbols to estimate the chan-

nel coefficients, and examines how long the channel from the sender to the receiver remains


correlated. Each channel coefficient is a complex number with amplitude and phase values.

We investigate multiple subcarriers over several rounds. Fig. 4.10 shows the autocorrelation

of channel phase over multiple subcarriers. The channel phase correlates over multiple OFDM symbols before it becomes uncorrelated (i.e., autocorrelation value becomes zero [119]).

The number of correlated OFDM symbols varies with subcarriers, with the average number

of 33. On the other hand, the channel amplitude stays more stable over multiple OFDM

symbols, whose autocorrelation value shows correlation over 500 OFDM symbols. Therefore,

the channel coherence time in our experimental environment is nearly 33 OFDM symbols

or 8.5ms, which indicates that the channel estimation should be updated at least every 30

OFDM symbols, nearly 200 bytes under 500KHz bandwidth, or nearly 400 bytes under

1MHz bandwidth. Therefore, the pilots should be inserted at least once every 100 (200)

bytes of data under 500KHz (1MHz) bandwidth, because the estimation of the sender’s

and jammer’s channels is updated alternately every other pilot as shown in Section 4.4.2.

This result also tells us the rotation vector is effective within the 33 OFDM symbol time, after which the rotation vector expires.
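The way a coherence time is read off an autocorrelation curve can be sketched as follows. The phase trace here is synthetic (a slow sinusoidal drift, one sample per OFDM symbol) rather than measured data, and the helper names are illustrative assumptions; the symbol duration assumes 64 FFT plus 64 CP samples at a 500KHz sample rate, as in Table 4.1.

```python
import math

def normalized_autocorr(x, max_lag):
    """Normalized autocorrelation of a real sequence for lags 0..max_lag."""
    mean = sum(x) / len(x)
    d = [v - mean for v in x]
    r0 = sum(v * v for v in d)
    return [sum(d[i] * d[i + k] for i in range(len(d) - k)) / r0
            for k in range(max_lag + 1)]

def coherence_lag(acf):
    """First lag at which the autocorrelation reaches zero, or None."""
    for lag, value in enumerate(acf):
        if value <= 0:
            return lag
    return None

# Synthetic, slowly drifting channel phase (not measured data).
phase = [math.sin(2 * math.pi * n / 132) for n in range(1320)]
acf = normalized_autocorr(phase, 40)
lag = coherence_lag(acf)
symbol_ms = (64 + 64) / 500e3 * 1e3   # one OFDM symbol in milliseconds
print(lag, round(lag * symbol_ms, 2))
```

The same procedure applied per subcarrier to the estimated channel phase gives one coherence lag per subcarrier, which can then be averaged.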

Note that during the jammer’s channel estimation in Section 4.4.2, we assume the jammer’s channel remains static during the channel coherence time. However, a mobile jammer can change his/her channel conditions in real time. Referring back to Fig. 4.4, we notice that a 10cm change in distance produces a dissimilar channel, i.e., if the jammer moves 10cm within the channel coherence time, not only will the jammer’s channel estimation be inaccurate, but the jammer can also vary his/her signal directions to nullify the channel tracking. However, in this case, the jammer should move at a speed of at least $\frac{10\,\mathrm{cm}}{8\,\mathrm{ms}} = 12.5$m/s, or equivalently 45km/h, making it extremely difficult to target a specific MIMO link. Reducing the pilot interval is a remedy to defeat a high-speed jammer. We will design experiments to

evaluate the IC performance under mobile jammers in our future work.


Figure 4.10: Autocorrelation of the channel phase in an indoor environment (tested using 500KHz bandwidth communications). x-axis: Number of OFDM Symbols (0 to 35); y-axis: Normalized Autocorrelation Value (0 to 1); curves: 1st, 15th and 35th subcarrier autocorrelation.

Case Number    Location Set (Sender, Jammer)
1              (1,2)
2              (3,7)
3 (default)    (4,5)
4              (6,8)
5              (8,9)
6              (5,9)
7              (4,8)

Table 4.2: Testbed setup

4.6.3 Jamming Attack and Defense Performance

In this section, we evaluate the performance of the jamming attack and defense mechanisms

in terms of packet delivery rate. We place the receiver at location A in Fig. 4.11. In each

run, we place the sender and jammer at the selected locations in Fig. 4.11. We run the

experiments in seven different cases, as shown in Table 4.2. We repeat each case more than 10 times, with each run transmitting 5000 packets. The jamming signals are randomly

generated OFDM-modulated signals with similar configurations as in Table 4.1, but with

512 OFDM FFT length, 200 occupied tones and 128 CP length.

First, we present the jamming attack performance by jamming the 1 × 2 MIMO link in

Fig. 4.12, from which we can see that the PDR drops to zero in almost all seven cases in

the presence of the reactive jammer. This result shows the reactive jammer succeeds in


Figure 4.11: Testbed. The receiver is placed at A, while the sender and jammer are placed at the selected locations 1 to 9.

throttling MIMO-OFDM communications completely.

Then, we run another set of experiments to jam a 2 × 2 MIMO link. Fig. 4.13 plots the

sender’s PDR performance under different bandwidth settings. This figure also shows the

reactive jammer is very effective in degrading packet delivery performance of the MIMO links,

as none of the packets is successfully delivered to the receiver using the traditional MIMO

decoding scheme. In contrast, using our defense mechanism with IC technique, the jamming

signals can be eliminated to some extent by estimating jammer channel ratio. Therefore, the

PDR under 500KHz bandwidth can stay higher than 30%, while the exact PDR value depends on the channel estimation accuracy and the relative angles between the received signals from the jammer and sender. We notice that the achieved performance shows great variations across different cases.

Finally, the PDR performance can be further improved using SSE. Both Fig. 4.13(a) and

Fig. 4.13(b) reveal that the packet delivery performance using enhanced defense mechanism

after applying SSE has been significantly improved and becomes more stable. In particular,

the jamming resilient communications achieve more than 60% PDR under 500KHz bandwidth and more than 40% PDR under 1MHz bandwidth. Thus, we conclude that SSE can help


Figure 4.12: Packet delivery rate with and without jammer in 1 × 2 link

sustain more robust OFDM communications. From Fig. 4.13(a) to Fig. 4.13(b), we note a

trend that the packet delivery performance becomes worse as the transmission bandwidth

expands. That is because higher data rate transmission is more sensitive to the burst of

interference and noise in the environment [120].

Different Jamming Signal Powers. Different jamming signal powers affect the jamming

attack and defense performance significantly. High power jamming signals will decrease SJR,

making it more difficult to cancel them out. We evaluate the PDR performance of 2 × 2

MIMO link under reactive jamming attacks with different jamming powers. We change the

jamming power by adjusting the jammer’s transmit amplitude from 0 to 1, corresponding

to the range of jamming power from 0 to 100mW. The sender’s transmit amplitude is set

as 0.5, and we place the sender and jammer according to case 3. Both Fig. 4.14(a) and

Fig. 4.14(b) show the PDR drops drastically with the increase of jamming power. Although

high power jamming signals drag down the PDR performance using IC and SSE techniques,

it is noticeable that the communication system using our defense mechanisms becomes more

robust against high power jamming attacks. Even with a jamming power that is nearly twice the sender’s power (i.e., with transmit amplitude of 1), the enhanced defense mechanism


Figure 4.13: Jamming attack and defense performance. (a) 500KHz Bandwidth; (b) 1MHz Bandwidth

with IC and SSE still achieves more than 50% PDR under 500KHz bandwidth (40% PDR

under 1MHz bandwidth). In the experiment, we find that our proposed defense mechanisms

are robust against different power levels of the jammers.

Different Types of Jamming Signals. We also evaluate the PDR performance using four

types of jamming signals: constant power signal, Gaussian noise, square signal, 100KHz sine

signal. These signals are configured to have 0.5 transmit amplitude, 30dB transmit gain,

2.4512GHz RF frequency. Fig. 4.15(a) shows the PDR performance using IC technique to

defend against different types of jamming signals. We vary the jammer’s transmit power,

and the results illustrate the effectiveness of our defense mechanism under various types

of jamming signals. Comparing different types of jamming signals, we find that Gaussian noise and sine signals lower the PDR performance of our defense mechanism. This is because the constant power signals and square signals are much easier to cancel out compared with Gaussian noise and sine signals. Fig. 4.15(b) plots the PDR performance using our enhanced defense mechanism with IC and SSE, which demonstrates the benefits brought by the SSE technique. Our enhanced mechanism achieves an improved PDR performance


Figure 4.14: PDR performance with different jamming powers. (a) Varying jammer’s transmit amplitude under 500KHz Bandwidth; (b) Varying jammer’s transmit amplitude under 1MHz Bandwidth. x-axis: Transmit Amplitude (0 to 1); y-axis: Packet Delivery Rate (0 to 1); curves: Without Defense, Defend Solely with IC, Defend with IC and SSE.

compared with Fig. 4.15(a) with IC technique under all four types of jamming signals. This result demonstrates the robustness and wide applicability of our defense mechanisms against various types of jamming signals.

Throughput Performance. Finally, we evaluate the throughput performance of the proposed jamming resilient communication mechanism under reactive jammers. Fig. 4.16(a) and Fig. 4.16(b) show that our enhanced defense mechanism achieves 140 Kbps under 500KHz bandwidth and 220 Kbps under 1MHz bandwidth. Without jammers, the maximal achievable throughput under 500KHz is 187.5 Kbps, while the achievable throughput under 1MHz is 375 Kbps. It is worth mentioning that the reactive jammer causes a non-connectivity scenario without defense mechanisms, as shown in Fig. 4.13(a) and Fig. 4.13(b). Therefore, considering the power and effectiveness of reactive jammers, the throughput achieved using our defense mechanisms is very promising. In conclusion, our defense mechanisms restore communications at an acceptable data rate under powerful reactive jamming

attacks.
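The jammer-free throughput figures quoted above follow from the parameters in Table 4.1. The sketch below is a worked calculation under two assumptions that reproduce those figures: the sample rate equals the configured bandwidth, and each of the 48 occupied tones carries one BPSK bit per OFDM symbol. The function name is illustrative.

```python
def max_throughput_kbps(bandwidth_hz, fft_len=64, cp_len=64,
                        tones=48, bits_per_tone=1):
    """Peak rate in Kbps for back-to-back OFDM symbols without pilots."""
    symbol_s = (fft_len + cp_len) / bandwidth_hz   # duration of one OFDM symbol
    return tones * bits_per_tone / symbol_s / 1e3

# Achieved throughput values (Kbps) are the ones reported in the text.
for bw, achieved in [(500e3, 140.0), (1e6, 220.0)]:
    peak = max_throughput_kbps(bw)
    print(f"{bw / 1e3:.0f} kHz: peak {peak:.1f} Kbps, achieved {achieved / peak:.0%} of peak")
```

Relative to these peaks, the enhanced defense mechanism retains roughly three quarters of the jammer-free rate at 500KHz and somewhat less at 1MHz.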


Figure 4.15: PDR performance with different types of jamming signals. (a) Using IC; (b) Using IC and SSE

Figure 4.16: Throughput performance using our defense mechanisms. (a) 500KHz Bandwidth; (b) 1MHz Bandwidth


4.6.4 Overhead Analysis

We analyze the overhead for both the pilots and feedback information. In Section 4.6.2,

we measured that the channel coherence time (w.r.t. channel phase) in our experimental

environment is nearly 33 OFDM symbols or 8.5ms using 500KHz bandwidth, which indicates

that the channel estimation should be updated at least every 30 OFDM symbols. Thus, one

pilot symbol should be inserted every 15 OFDM data symbols, which takes nearly 6% of the

whole packet. On the other hand, the feedback message includes 48 rotation vectors with one

for each subcarrier in our setting. In order to reduce the feedback size, instead of returning

all the 48 vectors, it is sufficient to return 12 vectors, since the channels for consecutive subcarriers are rather similar. In addition, as the direction of vector $[v_1, v_2]$ is equivalent to $[1, v_2/v_1]$, we can reduce the number of elements in a vector to one complex number. The

overall feedback overhead adds up to 24 bytes, or 4 OFDM symbols. Therefore, the feedback

information is also very short with only a few OFDM symbols.
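The overhead figures above can be reproduced with a short calculation. The sketch assumes 2 bytes per quantized complex number (inferred from 24 bytes for 12 vectors, not stated explicitly) and 48 BPSK-modulated data tones per OFDM symbol; the function names are illustrative.

```python
def pilot_overhead(data_symbols_per_pilot=15):
    """Fraction of OFDM symbols spent on pilots."""
    return 1 / (data_symbols_per_pilot + 1)

def feedback_size(vectors=12, bytes_per_vector=2, tones=48, bits_per_tone=1):
    """Return (feedback bytes, OFDM symbols needed to carry them)."""
    feedback_bytes = vectors * bytes_per_vector
    bytes_per_symbol = tones * bits_per_tone / 8
    return feedback_bytes, feedback_bytes / bytes_per_symbol

print(f"pilot overhead: {pilot_overhead():.2%}")
print("feedback (bytes, OFDM symbols):", feedback_size())
```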

4.7 Summary

OFDM is one of the most widely adopted wireless communication schemes. Despite its

popularity in the wireless field, it is vulnerable to advanced jamming attacks, especially

the powerful reactive jamming attack enabled by SDR technology. While no effective anti-

jamming solutions exist to secure OFDM communications, for the first time, we exploited

MIMO technologies to defend against such jamming attacks. We showed that such attacks

can severely disrupt MIMO-OFDM communications through controlling the jamming signal

vectors in the antenna-spatial domain. Accordingly, we proposed defense mechanisms based

on interference cancellation and transmit precoding techniques to maintain OFDM commu-

nications under reactive jamming. Our prototype experimental results demonstrated that,


while the MIMO-OFDM communication can be completely throttled by jamming attacks,

our defense mechanisms can effectively turn it into an operational scenario with considerable

throughput.

Chapter 5

Unveiling Peer-to-Peer Botnets through Dynamic Group Behavior Analysis

Advanced botnets adopt peer-to-peer (P2P) infrastructure for more resilient command and

control (C&C) without relying on a central server. Traditional signature, sandboxing, and

blacklist based detection techniques become less effective in identifying bots that communi-

cate via a P2P structure. In this chapter, we present PeerClean, a novel system that detects

P2P botnets in real time using only high-level features extracted from C&C network flow

traffic. PeerClean reliably distinguishes P2P bot-infected hosts from legitimate P2P hosts

by jointly considering flow-level traffic statistics and network connection patterns. Instead of working on individual connections or hosts, PeerClean clusters hosts with similar flow traffic statistics into groups. It then extracts the collective and dynamic connection pattern of each group by leveraging a novel dynamic group behavior analysis. Compared to the individual host-level connection patterns, the collective group patterns are more robust and

differentiable. Multi-class classification models are then used to identify different types of

bots based on the established patterns. To increase the detection probability, we further

propose to train the model with average group behavior, but to explore the extreme group


behavior for the detection. We develop a prototype system, and evaluate it on real-world

flow records from a campus network with IPv4 space of size /16. Our evaluation shows that

PeerClean is able to achieve high detection rates with few false positives in identifying P2P

botnets.

5.1 Related Work

The increasing popularity of P2P botnets has led to a vast amount of research that attempts to track and remove them. In those works, the detection mechanisms can be classified into

two categories: host-based approaches and network-based approaches. The second category

can be subdivided into network traffic-based approaches and communication graph-based

approaches.

Network traffic-based approaches: Some related work utilized attack traffic character-

istics to identify hosts with similar abnormal network behaviors, such as spamming, port

scanning, sharing the same packet contents [121], or, having common destinations, similar

payloads and common host platforms [122]. However, these approaches can be evaded by

manipulating attacking strategies, such as using social engineering as an infection vector

instead of scanning.

Several works focused on identifying C&C traffic from the botnets. Bilge et al. [53] proposed

to use NetFlow analysis to distinguish botnet C&C servers from benign servers by extract-

ing flow-level features from the data. Wurzinger et al. [123] identify C&C by automatically

extracting signatures from bot responses after receiving commands. However, this approach

is hindered by traffic encryption. Moreover, the above approaches, which use only flow-level

statistics, are not robust enough to produce accurate detection results. Instead, PeerClean

greatly enhances the detection capability by jointly considering the flow-level traffic statistics and network connection behaviors. Recently, Zhang et al. [124] proposed to pinpoint stealthy malicious activities using triggering relation discovery of network events. They used the proposed mechanism to detect DNS-based botnets. However, the performance of differentiating P2P C&C traffic and benign P2P traffic using triggering relation discovery remains

to be seen.

Communication graph-based approaches: In [125], Coskun et al. proposed to identify

the local members of P2P bots using mutual contacts graph. However, this method requires starting with a captured seed bot in the network, which may not be available. [126] attempted

to identify spamming bots using large scale graph analysis by looking for tightly connected

subgraph components. Jelasity et al. [127] argued that it is difficult to detect P2P bots

using traffic dispersion graph (TDG) especially with a limited view of the Internet traffic at a

single AS. Most recently, Li et al. [128] proposed to detect P2P community by identifying the

densely connected subgraphs. However, this approach only focused on a backbone network

which requires a very large communication graph. Also, solely relying on the connection

patterns, it may falsely include lots of benign hosts in the discovered P2P botnets.

P2P botnet resilience: Several related works studied the resilience of P2P botnets. [129] provided an overview of bot structures, which shows that P2P bots with random graph networks are highly resilient to both random and targeted defense mechanisms. Most recently, [54] investigated an important characteristic of botnets, assortativity, and showed its impact on the network resilience and recovery. [44] provided formal models to systematize the attacking

strategies against P2P botnets. However, these takedown attempts require reverse engi-

neering of the bot binaries to crawl or sinkhole the whole botnets, which prevents the wide

applicability of these approaches.


Figure 5.1: PeerClean system flow

5.2 Overview of PeerClean

Our primary goal is to design a detection system for the network administrators to identify

P2P bots in a monitored network. Toward that goal, we present our data-driven detection

framework, PeerClean, which exploits network flow data captured at the edge of the network.

In this section, we give a brief overview of the PeerClean system.

Figure 5.1 shows the system flow of PeerClean. The upper part of the figure describes

the training process, with inputs from two labeled data sets: one is a subset of monitored

NetFlow data that is from the labeled legitimate P2P hosts, and the other one contains

the data from the labeled P2P bots (we discuss the acquisition of training data in Section

5.5.1). For each type of legitimate P2P hosts and P2P bots, PeerClean then performs

DGBA training to extract a collection of group-level connection features aggregated from

all the hosts of this specific type, and trains an SVM classification model using the extracted

group-level features. The bottom part of the figure presents the detection process with input

of monitored NetFlow data. After identifying the P2P hosts in the network, PeerClean

carries out P2P host clustering using the statistical features of their traffic flows, and applies

DGBA detection to every cluster of interest with the goal of detecting clusters containing

bots. Finally, the refined bot identification picks out the bots from the clusters for further

processing. PeerClean can be regarded as a three-layer system, with the first-layer modules

processing every host, the second-layer modules operating on the host group/cluster, while

the third-layer modules further handling the identified bot clusters.


Input Data: The input data set consists of a training data set and a testing data set in NetFlow format. The testing NetFlow data can be one hour (or one day) of flow traffic traces

captured at the gateway router of a campus (or enterprise, ISP) network, while the training

data set is constructed by the traffic from identified P2P bots and legitimate P2P hosts.

Specifically, the training data of P2P bots can be imported from honeynets running in the

wild, and the training data of legitimate P2P hosts can be derived via identifying legitimate

P2P traffic in the captured payload-available traffic traces using a signature matching method

[130].

P2P Host Identification: The high-speed networks generate a huge amount of NetFlow

data, which would potentially overwhelm the processing capability of our detection system.

Thus, the first step of PeerClean is to reduce the traffic volume by filtering out the hosts that

are unlikely to be related to P2P communications. Our approach is based on the observation

that the hosts engaging in P2P communications exhibit high failed connection rates mainly

caused by the high peer churn rate [131]. Therefore, we compute the percentage of failed

connections inside each time epoch (e.g. 1 hour). The hosts with failed connection rate

higher than an empirical threshold are selected as candidate P2P hosts [46]. This selection

process allows us to retain hosts engaging in P2P communications, while eliminating a vast

majority of non-P2P hosts.
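The filtering step can be sketched as follows. The flow representation (a source host plus a success flag), the way a failed connection is determined, and the threshold value of 0.3 are all hypothetical choices for illustration; the actual threshold is set empirically.

```python
from collections import defaultdict

def candidate_p2p_hosts(flows, threshold=0.3):
    """Select hosts whose failed-connection rate in one epoch exceeds threshold.

    flows: iterable of (src_host, succeeded) pairs observed within the epoch.
    """
    total = defaultdict(int)
    failed = defaultdict(int)
    for src, succeeded in flows:
        total[src] += 1
        if not succeeded:
            failed[src] += 1
    return {h for h in total if failed[h] / total[h] > threshold}

flows = [("10.0.0.1", False), ("10.0.0.1", False), ("10.0.0.1", True),
         ("10.0.0.2", True), ("10.0.0.2", True), ("10.0.0.2", False),
         ("10.0.0.2", True)]
print(candidate_p2p_hosts(flows))
```

In this toy epoch, the first host fails two of three connections and is kept as a candidate, while the second fails one of four and is filtered out.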

Flow Statistical Feature Extraction: Having eliminated most of the traffic from non-

P2P hosts, we extract a collection of flow statistical features from the network flows of each

candidate P2P host. We propose two sets of features, including flow size statistical features

(e.g., the average number of bytes per flow transmitted from each host) and host access

pattern features (e.g., the outgoing/incoming flow interarrival time from each host). We will

explore the motivation behind these features in Section 5.3.1.
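As a minimal illustration of the two example features named above, the sketch below computes the average bytes per flow and the average flow interarrival time for one host and one direction. The record layout (start time in seconds, byte count) and the function name are assumptions; the full feature set is described in Section 5.3.1.

```python
from statistics import mean

def flow_features(flows):
    """Compute two example features from (start_time_s, n_bytes) records."""
    ordered = sorted(flows)                       # order flows by start time
    gaps = [b[0] - a[0] for a, b in zip(ordered, ordered[1:])]
    return {
        "avg_bytes_per_flow": mean(n for _, n in ordered),
        "avg_interarrival_s": mean(gaps) if gaps else None,
    }

features = flow_features([(0.0, 120), (2.0, 80), (6.0, 100)])
print(features)
```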

P2P Host Clustering: The objective of host clustering is to group together the hosts with


similar flow patterns. Since P2P bots from the same botnet share the same P2P network and

communication protocol, their flow features are likely to be clustered together in the feature

space. Clustering techniques aim at finding meaningful clusters that are both compact and

well separated from each other. PeerClean applies a clustering algorithm to generate the

clusters of hosts solely based on their flow statistical features. Ideally, we will gather the

bots belonging to different types of botnets into separated and compact clusters without

including any legitimate hosts, which is often not the case in reality.

Dynamic Group Behavior Analysis: DGBA investigates the aggregated connection be-

haviors of the entire cluster of hosts. The group-level connection features include cluster

connectivity feature (i.e., how the cluster connects to the outside world), shared neighbor

feature (i.e., the shared contacts of the hosts inside the cluster), significant connection feature

(i.e., the connections contributing a significant amount of network traffic), and temporal feature (i.e., how the connection behavior evolves over time), the details of which are presented

in Section 5.3.3. We divide DGBA into two phases for training and detection respectively.

DGBA training extracts average group-level connection features from the group of P2P bots

or legitimate P2P hosts of each type, while DGBA detection examines extreme group-level

connection features of every unlabeled cluster in order to identify clusters containing P2P

bots more accurately.
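To give a flavor of a group-level connection feature, the sketch below computes one possible form of the shared neighbor feature: the fraction of a cluster's external contacts that are reached by at least two member hosts. This particular definition is an assumption made for illustration; the features actually used are defined in Section 5.3.3.

```python
from collections import Counter

def shared_neighbor_ratio(contacts_by_host):
    """Fraction of the cluster's external contacts reached by 2+ member hosts."""
    counts = Counter(peer
                     for peers in contacts_by_host.values()
                     for peer in set(peers))
    if not counts:
        return 0.0
    shared = sum(1 for c in counts.values() if c >= 2)
    return shared / len(counts)

# Hypothetical cluster: three member hosts and the peers each one contacted.
cluster = {"h1": {"a", "b", "c"}, "h2": {"b", "c", "d"}, "h3": {"c", "e"}}
ratio = shared_neighbor_ratio(cluster)
print(ratio)
```

Bots of the same botnet tend to contact overlapping peer sets, so a feature of this kind would be expected to be higher for a cluster of bots than for a cluster of unrelated hosts.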

Training and Classification: As network traffic patterns are often distorted by noise, PeerClean trains a non-linear SVM classifier, chosen for its robustness against noisy data, using the group-level connection features from DGBA. After the classifier is constructed with the training data set, PeerClean applies it to classify the unlabeled clusters generated from the testing data set. Ideally, each cluster containing a certain type of P2P bots will

be assigned a label corresponding to that specific bot type.

Refined Bot Identification: Finally, for each cluster classified as a cluster with bots, we


further identify the P2P bots inside the cluster based on their individual connection behav-

iors. We leverage a simple threshold-based approach to discriminate P2P bots from falsely

included benign users in the cluster. The identified P2P bots are then labeled and confirmed

as “infected”, calling for subsequent bot cleanup actions from the network administrators.

Detection Period: Since bot memberships change dynamically, with some bots being cleaned up and others being newly infected, we propose to perform bot detection periodically. PeerClean supports various configurations of detection periods, as long as bots generate enough network flows with representative flow and connection features during that period. In this chapter, we select one hour as the detection period in response to agile bot infections. Specifically, PeerClean produces one SVM model for each hour of the day. Then, by examining each hour of testing traces collected from the edge routers, PeerClean identifies the specific types of bots existing in the network within that hour. In this manner, PeerClean enables real-time bot detection, which supports a fast response to bot infections (i.e., one

hour response time in this chapter).

Studied Botnets: We give a brief introduction to the three botnets studied in this dissertation: Sality, Kelihos and ZeroAccess. All three botnets use a P2P channel as their primary communication channel. The Sality botnet uses an unstructured P2P network, where peers regularly contact their neighbors to exchange new URLs for downloading malware. The Kelihos botnet is an unstructured P2P network with a tiered infrastructure, which includes an upper layer of centralized nodes providing commands to a middle layer of router nodes. Nodes at the router layer are responsible for relaying messages to a lower network layer consisting of regular P2P worker nodes. The worker nodes are hosts that do not have a direct connection to the Internet, and they are used for carrying out malicious activities such as sending spam, collecting email addresses, and sniffing user credentials. The ZeroAccess botnet is

another unstructured P2P botnet which is used to download new malware payloads and to


form a network mostly involved in Bitcoin mining and click fraud, while remaining hidden

in the infected hosts using rootkit techniques.

5.3 System Design

PeerClean systematically incorporates two categories of features including flow statistical

features and network connection features. The effectiveness of PeerClean largely hinges

upon the discriminative ability of the selected features to set apart various P2P bots and

legitimate P2P hosts. In this section, we explain the rationale behind the feature selection and examine the strengths and weaknesses of the selected features. We also describe the two machine learning techniques, clustering and classification, which are used to gather, identify and subsequently label the P2P bots.

5.3.1 Flow Statistical Features

The performance of host clustering relies on a set of carefully selected network flow features. A common criticism of early attempts to apply machine learning methods to network flow data is that the selected features were often not robust, resulting in a model overfitted to some specific features of the training set, such as a particular port or IP address used by a bot. Dedicated bots can simply change the ports and IP addresses they use to invalidate such flow analysis. To avoid the overfitting issue, we select flow features that are both robust and distinctive among the botnets, including flow size statistical features and host access pattern features. As some P2P botnets mostly use TCP for C&C (e.g., Kelihos), while others simply carry out UDP transmissions, we divide the traces into TCP and UDP flow segments and examine their flow statistical features separately. To yield credible statistical features, we only consider the hosts with 100+ outgoing TCP/UDP flows and 100+ incoming flows during the one-hour detection period E. Note that, at this stage, we only extract the flow features of candidate P2P hosts that survived the P2P host identification process.

Flow Size Statistical Features: Flow size statistical features capture the flow size distribution for both outgoing flows and incoming flows at a specific host. Let $F_i^{(out)} = \{f_j^{(out)}\}_{j=1..m}$ and $F_i^{(in)} = \{f_j^{(in)}\}_{j=1..n}$ denote the series of flows sent from or received by host i inside E. We consider the basic flow size related features: the bytes-per-flow (bpf) feature and the packets-per-flow (ppf) feature, as shown in Table 5.1. Note that each feature records the distribution of the flow sizes among all the outgoing (incoming) flows at the corresponding host. In particular, we extract the means $\mu_{F_i^{(out)}}$, $\mu_{F_i^{(in)}}$ and the standard deviations $\sigma_{F_i^{(out)}}$, $\sigma_{F_i^{(in)}}$ of bpf and ppf from the outgoing and incoming flows respectively.

This group of features characterizes the regularity of traffic flow size over time for each host. We select flow size features because botmasters prefer the flows carrying C&C information to be as short as possible, so that they remain stealthy under the surveillance of various network monitoring tools. Moreover, due to the limited types of C&C messages, only a small fluctuation in these features is expected. On the other hand, legitimate P2P applications usually generate flows with large, yet highly variable flow sizes, with a few exceptions such as Skype, which has a small flow size when used as an instant messenger or voice-over-IP client. Therefore, flow size statistical features are promising for differentiating bots from legitimate hosts, but they alone may not be enough to create dense bot clusters without including a large number of legitimate hosts such as Skype hosts.
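The flow size statistics described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the (bytes, packets) tuple layout and the choice of the population standard deviation are hypothetical and are not specified in the text.

```python
from statistics import mean, pstdev


def flow_size_features(flows):
    """Mean and standard deviation of bpf and ppf for one host and one direction.

    flows: list of (bytes, packets) tuples, one per flow (assumed layout).
    """
    bpf = [b for b, _ in flows]  # bytes-per-flow samples
    ppf = [p for _, p in flows]  # packets-per-flow samples
    return {
        "bpf_mean": mean(bpf),
        "bpf_std": pstdev(bpf),  # population std is an assumption
        "ppf_mean": mean(ppf),
        "ppf_std": pstdev(ppf),
    }


# Toy outgoing flows: short and regular, as expected for C&C traffic.
outgoing = [(120, 2), (130, 2), (110, 2), (120, 2)]
features = flow_size_features(outgoing)
```

In this toy input the flow sizes barely fluctuate, so both standard deviations are small relative to the means, which is the kind of regularity the feature group is meant to expose.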

Feature                    Description
Bytes-per-flow pattern     The distribution of the number of bytes per flow sent from (received by) a host
Packets-per-flow pattern   The distribution of the number of packets per flow sent from (received by) a host

Table 5.1: Flow size statistical features

Host Access Pattern Features: We introduce host access pattern features to capture the flow arrival patterns. Table 5.2 lists the adopted features, including flow interarrival pattern, flow density pattern and diurnal pattern. Assume $T_i^{(out)}$ ($T_i^{(in)}$) is a time series of the starting times of outgoing (incoming) flows from host i inside E, based on which we can compute a sequence of flow interarrival times $I_i^{(out)}$ ($I_i^{(in)}$) by taking the difference between the starting times of two consecutive flows. The flow interarrival pattern feature represents the statistical features of the flow interarrival time sequences, including the minimum, maximum, median and standard deviation.
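As a rough sketch of the flow interarrival pattern feature, the snippet below derives the interarrival sequence from a list of flow starting times and summarizes it. The function name and the use of the population standard deviation are assumptions made for illustration.

```python
from statistics import median, pstdev


def interarrival_features(start_times):
    """Summarize the gaps between consecutive flow starting times (seconds)."""
    ts = sorted(start_times)
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
    return {
        "min": min(gaps),
        "max": max(gaps),
        "median": median(gaps),
        "std": pstdev(gaps),  # population std is an assumption
    }


# Four flows starting at 0 s, 10 s, 20 s and 50 s give gaps of 10, 10 and 30 s.
feats = interarrival_features([0.0, 10.0, 20.0, 50.0])
```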

Different from all the aforementioned features, which are extracted inside each detection period E, the last two types of features, flow density pattern and diurnal pattern, are determined anew every day. We define a time unit as a three-hour period, so that one whole day is divided into 8 time units. Let $N_j$, $j = 1, 2, \dots, 8$, denote the number of flows of a given host in the j-th time unit of the day. The flow density pattern records the fraction of time units having at least x flows per day, i.e., $\frac{1}{8}\sum_{j=1}^{8} \sigma(N_j \geq x)$, where $\sigma(\cdot)$ is a step function yielding one when $N_j \geq x$ holds, and zero otherwise. In our prototype, x is empirically set to 1000. In addition, to assess whether the flow arrival displays a diurnal pattern, we take as diurnal pattern features the two fractions of the daily flows that fall in the peak period and in the dip period¹, i.e., $N_P / \sum_{j=1}^{8} N_j$ and $N_D / \sum_{j=1}^{8} N_j$, where $N_P$ and $N_D$ denote the flow numbers in the peak period and dip period respectively. These two types of features are inserted as additional features at the final hour of the day, further elevating the possibility of differentiating various P2P bots from legitimate P2P hosts.

¹ The peak time is expressed as $P = \arg\max_j N_j$ with flow amount $N_P = \max_j N_j$, and the dip time as $D = \arg\min_j N_j$ with flow amount $N_D = \min_j N_j$, $j = 1, \dots, 8$.

Feature                    Description
Flow interarrival pattern  The distribution of the incoming (outgoing) flow interarrival time at a host
Flow density pattern       The fraction of the time units with at least x flows at a host
Diurnal pattern            The percentage of the flow numbers in the peak (dip) period of the day at a host

Table 5.2: Host access pattern features
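The two daily features can be illustrated with a small worked example. The sketch below assumes the eight per-unit flow counts of a host are already available; the function name and the toy counts are hypothetical, while the threshold x = 1000 follows the prototype setting mentioned in the text.

```python
def daily_pattern_features(unit_counts, x=1000):
    """Flow density and diurnal features from eight three-hour flow counts."""
    if len(unit_counts) != 8:
        raise ValueError("expected one flow count per three-hour time unit")
    total = sum(unit_counts)  # assumed non-zero for the hosts considered
    density = sum(1 for n in unit_counts if n >= x) / 8
    peak_share = max(unit_counts) / total  # N_P / sum_j N_j
    dip_share = min(unit_counts) / total   # N_D / sum_j N_j
    return density, peak_share, dip_share


# Toy day: six of the eight units reach 1000 flows; 8000 flows in total.
counts = [1200, 1100, 900, 1000, 1500, 300, 1000, 1000]
density, peak_share, dip_share = daily_pattern_features(counts)
```

For this toy day the density is 6/8 = 0.75, the peak share is 1500/8000 and the dip share is 300/8000.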

5.3.2 P2P Host Clustering

Host clustering rests on the following observation: bots that belong to the same botnet run the same P2P communication protocol and share the same C&C messages. In the literature, there is a wide variety of clustering methods, but a well-suited algorithm for PeerClean should be selected carefully, because: first, the clustering algorithm for gathering hosts with similar flow patterns not only determines the subsequent bot detection, but also affects the system efficiency; second, P2P host clustering in a network with bots and benign hosts is a challenging task, since the percentage of bots in the network is generally small compared with the benign hosts. Thus, the clustering objective is to separate the small number of P2P bots from a large number of benign P2P hosts. In this respect, partition-based clustering methods suit our problem well [132].

Affinity Propagation (AP) is a recently proposed partition-based clustering method by Frey and Dueck [133]. Compared with K-means, one of the most popular clustering methods [134], the performance of AP does not rely on an initial selection of exemplars² or cluster centers. Rather than requiring the number of clusters to be specified, AP determines it automatically, based solely on the data. Whereas K-means clustering follows a greedy heuristic to find the optimum of a combinatorial optimization problem, which is prone to local minima, AP considers all data points as potential exemplars and tackles the optimization problem in a distributed manner by exchanging messages between pairs of points until the clusters gradually emerge. Thus, AP provides a guarantee of quasi-global optimization [133].

² An exemplar is the cluster center that best accounts for the data in the cluster [133].

The similarity s(i, k) of AP indicates how well data point $x_k$ is suited to be the exemplar of data point $x_i$. With the goal of minimizing squared error, we use the negative squared Euclidean distance as the similarity measure, i.e., $s(i, k) = -\|x_i - x_k\|^2$ [133] (the objective turns into maximizing s(i, k)). Since unsupervised learning is a notoriously difficult task, a perfect clustering result can hardly be expected. Consequently, besides several clearly separated bot clusters (i.e., clusters of bots) and benign clusters (i.e., clusters of benign hosts), we expect some clusters to include both benign hosts and bots, which we call mixed clusters. For ease of exposition, the bot clusters and mixed clusters are collectively called bot-included clusters. In the following section, we will show how we use supervised learning to identify and further examine bot-included clusters, as well as the method of spotting bots inside them.
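The similarity measure fed to AP can be sketched as follows. The snippet only builds the pairwise similarity matrix from feature vectors; the message-passing iterations of AP are not reproduced here, and the function name and toy points are assumptions. In AP the diagonal entries are normally replaced by a "preference" value, a step that the text does not describe and that is therefore omitted.

```python
def similarity_matrix(points):
    """s(i, k) = -||x_i - x_k||^2 for every pair of feature vectors."""
    return [
        [-sum((a - b) ** 2 for a, b in zip(xi, xk)) for xk in points]
        for xi in points
    ]


# Three toy two-dimensional feature vectors.
pts = [(0.0, 0.0), (3.0, 4.0), (0.0, 1.0)]
S = similarity_matrix(pts)
```

Here the first point is far more similar to the third point than to the second, so the third point is the better exemplar candidate for it.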

5.3.3 Dynamic Group Behavior Analysis

In this section, we introduce Dynamic Group Behavior Analysis (DGBA), whose objective is to identify bot-included clusters. DGBA is based on our intuition that bot-included clusters have cluster-level aggregated characteristics that are distinguishable from those of benign clusters. Whereas the connection activity of an individual host is highly dynamic and unidentifiable, we believe the group connection behavior will help us identify bots’ communications.


Figure 5.2: Cluster connectivity feature

To enhance the detection capability, we propose two modules: DGBA training and DGBA

detection, to extract features from the training set and testing set respectively. The purpose

of DGBA training is to extract the representative group behavior from a collection of labeled

P2P hosts to build SVM classifiers, whereas DGBA detection searches for the abnormal

behaviors from every unlabeled cluster to spot P2P bots. Thus, we propose to use different

statistical aspects of the collected features from a group to represent group-level training

and detection features, respectively. Namely, the training features capture the average group

behavior, while the detection features capture the extreme group behavior (i.e. the maximum

or the minimum). Note that all the features below are extracted from the collection of traces

inside each detection period if not otherwise stated.

Cluster Connectivity Feature

The cluster connectivity feature captures the aggregated connectivity of the peers inside each cluster. A connection between two hosts can be successful or failed. We define a good connection as a successfully established connection between two hosts, one inside the cluster and one outside it. We consider a TCP connection good if it completes the SYN, SYN/ACK, ACK handshake, and a UDP connection good if a request packet is followed by at least one response packet. We denote the good connection set of host $h_i$ as $C_i$, which includes all the successful connections of host $h_i$.

Training feature: The cluster connectivity feature for DGBA training is defined as the average number of good connections among all the P2P hosts of each type, i.e., $\sum_{i=1}^{M} |C_i| / M$, assuming M hosts of one specific type exist in the training set.

To see the discriminatory strength of this feature, we run an experiment using 24-hour training data (refer to Section 5.5.1 for the data sets used in the experiment) to show the cluster connectivity features of different P2P bots and legitimate hosts running various P2P applications. The box-plot results (measured over 24 hours) are shown in Fig. 5.2, from which we notice that different types of P2P hosts indeed exhibit varied cluster connectivity features. In particular, among all six types of P2P hosts, ZeroAccess bots stand out with many more good connections than any other bots or benign hosts. We attribute the difference to several factors, including: (1) the botnet network size; (2) the botnet peer discovery mechanism. For instance, bots in a populous network with a more aggressive peer discovery mechanism are expected to have more network connections, which suggests that ZeroAccess has a larger network size and more aggressive peer discovery [44].

Detection feature: The cluster connectivity feature for DGBA detection is defined as the maximal number of good connections among all the hosts in the unlabeled cluster, i.e., $\max_{i=1,\dots,M'} |C_i|$, assuming $M'$ hosts in the cluster. Fig. 5.2 shows a notable gap between ZeroAccess bots and other types of hosts, which means ZeroAccess bots can be detected solely based on the cluster connectivity feature. Specifically, since the detection feature corresponds to the maximum connection number among the $M'$ hosts, the connectivity feature of a cluster containing ZeroAccess bots will be exceptionally high, which immediately reveals the presence of ZeroAccess bots.
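The difference between the training and detection variants of the cluster connectivity feature can be sketched as below, assuming each host is represented by its set of good connections. Function names and the toy peer identifiers are hypothetical.

```python
def connectivity_training_feature(conn_sets):
    """Average number of good connections over labeled hosts of one type."""
    return sum(len(c) for c in conn_sets) / len(conn_sets)


def connectivity_detection_feature(conn_sets):
    """Maximum number of good connections over the hosts of one cluster."""
    return max(len(c) for c in conn_sets)


# Labeled hosts of one type (training) and an unlabeled cluster (detection).
labeled = [{"p1", "p2"}, {"p1", "p2", "p3", "p4"}]
cluster = [{"p1"}, {"p1", "p2", "p3"}, {"p2"}]
train_value = connectivity_training_feature(labeled)
detect_value = connectivity_detection_feature(cluster)
```

Taking the maximum at detection time means a single highly connected host is enough to lift the feature value of the whole cluster.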

Figure 5.3: (a) The shared neighbor ratio of one eMule host pair compared with that of one ZeroAccess host pair; (b) Group shared neighbor ratio

Shared Neighbor Feature

The shared neighbor feature captures the amount of shared connections between every pair of hosts in each cluster. The set of shared neighbors of hosts $h_i$ and $h_j$ can be written as $C_i \cap C_j$. We further define the pairwise shared neighbor ratio of a host pair as the ratio between the number of shared neighbors and the number of total neighbors, i.e., $s_{ij} = |C_i \cap C_j| / |C_i \cup C_j|$ for a host pair $(h_i, h_j)$.

Training feature: Given the above definitions, the shared neighbor feature for DGBA training is represented by the cluster shared neighbor ratio (also referred to as the group shared neighbor ratio), simply defined as the average pairwise shared neighbor ratio among all the host pairs of one type, i.e., $\sum_{1 \leq i < j \leq M} s_{ij} \big/ \frac{M(M-1)}{2}$, where the sum runs over the M(M−1)/2 unordered host pairs. Previous work has adopted the pairwise shared neighbor ratio $s_{ij}$ [47] to distinguish between bots and benign hosts. However, according to our experiment, the pairwise shared neighbor ratio seems ineffective in identifying certain pairs of P2P bots. In Fig. 5.3(a), we compare the pairwise shared neighbor ratio of an eMule host pair (downloading the same file) with that of a ZeroAccess bot pair. We find it almost impossible to make a distinction between these two pairs, which leads to false positives or false negatives. In contrast, the group shared neighbor ratio clearly differentiates ZeroAccess bots from the eMule hosts with a large gap between them, via feature aggregation from multiple hosts.

In addition, Fig. 5.3(b) shows that different types of P2P bots and P2P hosts exhibit distinguishable shared neighbor features measured over 24 hours, where we observe that P2P bots have much higher group shared neighbor ratios than legitimate P2P hosts. The reason is straightforward: the bots from the same botnet search for the same commands published by the botmaster [47], which makes their contacted peers more likely to be shared by other companions. Furthermore, although P2P botnets have a decentralized C&C architecture, botmasters still strive to make their P2P network robust against peer churn and to provide end-to-end communication with minimum delay. This inherent C&C objective translates into a convergence of the peers contacted by a group of bots to ensure the reliable delivery of C&C messages. On the other hand, different legitimate P2P hosts generally search for different contents from their peers, which yields a more dispersed peer list.

Detection feature: Correspondingly, the shared neighbor feature for DGBA detection is defined as the maximal pairwise shared neighbor ratio among all the host pairs in each cluster, i.e., $\max_{i,j \in [1,M'],\, i \neq j} s_{ij}$. The shared neighbor feature of every bot-included cluster will again be dominated by the bots, since bots have significantly higher shared neighbor ratios, which helps differentiate between the bot-included clusters and benign clusters.
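A minimal sketch of the shared neighbor feature follows, assuming each host is described by the set of peers it contacted. The pairwise ratio is the intersection size over the union size; the training variant averages it over all unordered host pairs and the detection variant takes the maximum. All names and the toy sets are illustrative assumptions.

```python
from itertools import combinations


def pairwise_ratio(ci, cj):
    """Shared neighbors divided by total neighbors of a host pair."""
    union = ci | cj
    return len(ci & cj) / len(union) if union else 0.0


def shared_neighbor_features(neighbor_sets):
    """Return (average, maximum) pairwise ratio over all unordered host pairs."""
    ratios = [pairwise_ratio(a, b) for a, b in combinations(neighbor_sets, 2)]
    return sum(ratios) / len(ratios), max(ratios)


# Hosts 0 and 1 share two of four neighbors; host 2 shares none.
hosts = [{1, 2, 3}, {2, 3, 4}, {5, 6}]
avg_ratio, max_ratio = shared_neighbor_features(hosts)
```

With three hosts there are three unordered pairs, whose ratios are 0.5, 0 and 0, so the average is 1/6 while the maximum stays at 0.5.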

Significant Connection Feature

The significant connection (SC) feature captures the number of hot links in the network, i.e., the connections that contribute significantly larger amounts of network flows than the other connections. The SCs extracted from Internet traffic data have been used to diagnose network operation and quickly identify anomalous events [135]. Similarly, we try to identify the SCs of bot groups to better understand the bots’ behaviors and accurately identify the bots’ presence.

Background of Entropy-based Approach: In information theory, entropy quantifies the amount of uncertainty contained in the data. Given a random variable R that may take $n_R$ discrete values, suppose we sample R n times and observe that R takes the value $r_i$ $n_i$ times, which produces a probability distribution for R, i.e., $P(r_i) = n_i/n$, $r_i \in R$. The entropy of R is defined as:

$$H(R) := -\sum_{r_i \in R} P(r_i) \log P(r_i),$$

where H(R) is a function of the support size $n_R$ and the sample size n. To eliminate the dependency on support and sample sizes, standardized entropy is proposed to provide a robust quantifier for data variety or uniformity [135]:

$$H_s(R) := \frac{H(R)}{\log \min\{n_R, n\}},$$

where $H_s(R) = 0$ means that all the observations of R have the same value, thus no data variety exists, while $H_s(R) = 1$ means all the observed values of R are different. Note that in this chapter, $n \ll n_R$, such that $H_s(R) = H(R)/\log n$.

Let C be the set of all the observed values of R; then $H_s(R) = 1$ holds if and only if $|C| = n$ and $P(r_i) = 1/n$ for $r_i \in C$, which means the observed values are uniformly distributed over C. Thus, $H_s(R)$ provides a measure of the data uniformity in the observed value set C. Similar to conditional entropy, the conditional standardized entropy $H_s(R|C)$ is derived by conditioning R on C. As $H(R|C) = H(R)$, we have $H_s(R|C) = H(R)/\log|C|$. Hence, as $H_s(R|C)$ gets close to 1, the observed values are more uniformly distributed over the set C, or less distinguishable from one another. On the other hand, $H_s(R|C) \ll 1$ indicates that some values are observed more frequently. This measure is used to define significant connections as follows.
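To make the conditional standardized entropy concrete, the sketch below computes H(R)/log|C| from a list of observed values. The function name is hypothetical, and returning 0 when fewer than two distinct values are observed is an assumption made to avoid dividing by log 1 = 0.

```python
import math
from collections import Counter


def standardized_entropy(observations):
    """Conditional standardized entropy H(R) / log|C| of the observed values."""
    n = len(observations)
    counts = Counter(observations)
    if len(counts) < 2:
        return 0.0  # assumption: a single observed value carries no variety
    h = -sum((c / n) * math.log(c / n) for c in counts.values())
    return h / math.log(len(counts))


uniform = standardized_entropy(["a", "b", "c", "d"])         # evenly spread values
skewed = standardized_entropy(["a"] * 97 + ["b", "c", "d"])  # one dominant value
```

The evenly spread sample yields a value of 1 (up to rounding), whereas the sample dominated by one value yields a value far below 1, matching the interpretation given above.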

Significant Connection Feature Extraction: The significant connection feature is quantified by the conditional standardized entropy, and is extracted from the traffic flows of every host. Suppose a random variable R denotes the connections made by a host in the cluster. Given a detection period E, let N be the total number of flows observed from/towards the host, and $C = \{c_1, c_2, \dots, c_m\}$, $m \geq 2$, be the set of distinct connections in R that the observed flows take. Then the induced probability distribution $P_C$ on R is given by $P_C(c_i) = N_i/N$, where $N_i$ is the number of flows from the connection $c_i$. Based on $P_C$, the conditional standardized entropy, $H_s(P_C) := H_s(R|C)$, measures the degree of uniformity of the observed connections in C. In particular, as $H_s(P_C)$ gets close to 1, the observed values in C are close to being uniformly distributed, or indistinguishable from one another. Otherwise, some values in C “jump out” from the rest.

To extract significant connections, we employ a dynamic threshold algorithm [135] to separate C into two sets: the significant connection set $C_s$ and the indistinguishable connection set $C_i$ (here the subscript i stands for "indistinguishable" and is unrelated to the host index). In this algorithm, $C_s$ becomes the most significant connection set under two conditions: (1) $C_s$ is the smallest subset of C such that the probability of any value in $C_s$ is larger than that of the remaining values; (2) the values in $C_i$ are nearly uniformly distributed, i.e., $H_s(P_{C_i}) > \lambda$, where $\lambda$ is the significance threshold that is close to one (e.g., 0.9).

We first sort the connections in C in decreasing order of their probabilities $P_C$ as $C = \{c_1, c_2, \dots, c_m\}$. Then, the dynamic threshold algorithm finds $C_s = \{c_1, c_2, \dots, c_k\}$ and $C_i = \{c_{k+1}, \dots, c_m\}$, where k is the smallest integer such that $H_s(P_{C_i}) > \lambda$. Thus, $c^* = c_k$ is called the cut-off threshold, beyond which the probability distribution of the remaining connections becomes almost uniform. Interested readers may refer to [135] for the complete algorithm design.

Figure 5.4: (a) Significant connection feature; (b) Significant connection feature of Kelihos and ZeroAccess bots in 24 hours

Finally, the significant connection feature is simply the number of connections in $C_s$, i.e., $|C_s|$.
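The extraction of significant connections can be sketched as a simplified rendering of the dynamic threshold idea; it is not the complete algorithm of [135]. It assumes per-connection flow counts are available as a dictionary, allows k = 0 (an empty significant set when all connections are already nearly uniform), and treats a remainder of fewer than two connections as uniform. These choices and all names are assumptions.

```python
import math


def entropy_of_counts(counts):
    """Conditional standardized entropy of a list of per-connection flow counts."""
    total = sum(counts)
    if len(counts) < 2 or total == 0:
        return 1.0  # assumption: too few connections left to be distinguishable
    h = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return h / math.log(len(counts))


def significant_connections(flow_counts, lam=0.9):
    """Return the smallest prefix of the sorted connections whose remainder is near-uniform."""
    ordered = sorted(flow_counts.items(), key=lambda kv: kv[1], reverse=True)
    for k in range(len(ordered) + 1):
        rest = [n for _, n in ordered[k:]]
        if entropy_of_counts(rest) > lam:
            return [conn for conn, _ in ordered[:k]]
    return [conn for conn, _ in ordered]  # only reached if lam >= 1


# Two hot links carry most flows; the remaining four are nearly uniform.
flows = {"p1": 500, "p2": 400, "p3": 10, "p4": 11, "p5": 9, "p6": 10}
sc_set = significant_connections(flows)
sc_feature = len(sc_set)  # the significant connection feature of this host
```

In this toy input the standardized entropy of the remainder is roughly 0.50 with no connection removed, about 0.27 after removing p1, and close to 1 after removing p1 and p2, so the loop stops at k = 2.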

Training feature: We define the SC feature for DGBA training as the average number of SCs over all the hosts of one type. Fig. 5.4(a) shows the SC features of Sality bots and three other types of P2P hosts measured over 24 hours. Compared with Sality bots, these legitimate P2P hosts produce a larger number of SCs.

We believe this distinctive phenomenon is attributable to the following fact: the SCs in a botnet indicate the existence of some active bots that are pivotal to the P2P infrastructure. These active bots may be well connected with a high-bandwidth connection, or may be close to the botmaster. A small number of distinctive connections helps the botnets remain stealthy under the radar of numerous intrusion detection systems. Conversely, benign P2P hosts yield a much higher number of SCs due to their unorganized nature.

Interestingly, the traffic flows from ZeroAccess and Kelihos bots reveal unique SC patterns, as shown in Fig. 5.4(b). ZeroAccess bots simply have no SCs, while Kelihos bots suddenly generate a large number of SCs during a “hot” period between 7pm and 1am. This period can be interpreted as a peak period of C&C message exchange, with many suddenly emerging hot links. The study of this abrupt phenomenon and the exact origin of the SCs of botnets are out of the scope of this chapter, but may become research topics in their own right.

Figure 5.5: Significant connection volatility

Detection feature: Among all the hosts in the cluster, the SC feature for DGBA detection is defined as the minimal number of SCs, or the maximal number if it exceeds an empirical threshold α. Thus in most cases, the SC feature of a bot-included cluster will be dominated by the bots with fewer SCs. However, the number of SCs of Kelihos bots skyrockets during the “hot” period and far exceeds that of normal hosts. Hence, by setting an appropriate threshold α (e.g., 200), we ensure that the group-level feature of a cluster containing Kelihos bots will be ruled by the bots’ behavior, which reveals the existence of Kelihos bots.
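The min-or-max rule for the SC detection feature can be written compactly. The sketch assumes the per-host SC counts of a cluster are given as a list; the function name and toy counts are hypothetical, and α = 200 follows the example value in the text.

```python
def sc_detection_feature(sc_counts, alpha=200):
    """Minimum SC count in the cluster, unless the maximum exceeds alpha."""
    largest = max(sc_counts)
    return largest if largest > alpha else min(sc_counts)


quiet_cluster = sc_detection_feature([3, 40, 55])   # a host with few SCs dominates
hot_cluster = sc_detection_feature([35, 450, 60])   # a host in a "hot" period dominates
```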

Temporal Feature

Lastly, the temporal feature captures the dynamic evolution of the SC sets. Instead of performing feature extraction per one-hour detection period, temporal features are computed at the end of each day to combat noise and disturbance. They are represented by the significant connection volatility, which measures whether the cluster has the same set of SCs over time. Let $U_i$ denote the number of distinct SCs of host $h_i$ over the whole day, and $S_{ik}$ the number of SCs during the k-th hour, $i \in \{1, \dots, M\}$, $k \in \{1, \dots, 24\}$. The SC volatility of host i is defined as $\Phi_i = U_i / \sum_{k=1}^{24} S_{ik}$. Obviously, if the SC sets of the 24 hours are all different (disjoint), we have $\Phi_i = 1$. On the contrary, when the same set of SCs appears every hour, we have $\Phi_i = 1/24$. In general, the less volatile the set of SCs is, the smaller $\Phi_i$ is, down to its minimum of 1/24.
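The two boundary cases of the SC volatility can be checked with a short sketch. It assumes the hourly SC sets of a host are available as 24 Python sets; the function name is hypothetical, and returning None for a host without any SCs is an assumption, since the text does not cover that case.

```python
def sc_volatility(hourly_sc_sets):
    """Distinct SCs over the day divided by the sum of the hourly SC counts."""
    total = sum(len(s) for s in hourly_sc_sets)
    if total == 0:
        return None  # assumption: volatility is undefined without any SCs
    distinct = set().union(*hourly_sc_sets)
    return len(distinct) / total


stable = sc_volatility([{"a", "b"}] * 24)               # same SC set every hour
volatile = sc_volatility([{hour} for hour in range(24)])  # a new SC every hour
```

The stable host reaches the lower bound 2/48 = 1/24, while the host whose SC set changes completely every hour reaches 1.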

Training feature: The temporal feature for DGBA training is represented by the average SC volatility of all the hosts of the same type, expressed as $\frac{1}{M}\sum_{i=1}^{M} \Phi_i$. Fig. 5.5 shows the different temporal features of various P2P bots and legitimate P2P hosts. We notice that Sality and ZeroAccess bots have small volatility features, while eMule hosts and Kelihos bots have a moderate SC volatility. SC volatility is related to a number of factors, including the total number of SCs, the size of the P2P network and how dynamic the network connections are. Generally speaking, the SCs of bots are less volatile than those of benign hosts, because the pivotal points of the botnets are required to be extremely robust and reliable to support network-wide C&C communication, and are thus less likely to change over time.

Detection feature: The temporal feature for DGBA detection captures the minimal SC volatility of all the hosts in the cluster, i.e., $\min_i \Phi_i$. Therefore, the temporal feature of a bot-included cluster will be determined by the bots, whose SCs appear less volatile. In the end, a smaller value of the temporal feature reveals the presence of bots (e.g., Sality or ZeroAccess), while a larger value represents the temporal feature of benign hosts (e.g., Skype or BitTorrent).


5.3.4 Training and Classification

Data Preprocessing: Data preprocessing tries to cope with the issue that the collective

features extracted from the network flow data have different data ranges. To make sure every

feature in the feature sets is given equal importance, we perform feature-wise normalization

to shift and re-scale each feature value so that they lie within the range of [0, 1].
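Feature-wise normalization of this kind is commonly implemented as min-max scaling, which the sketch below assumes; the text does not name the exact formula. The function name, the handling of constant columns and the toy values are illustrative assumptions.

```python
def min_max_normalize(rows):
    """Rescale every feature column of a list of feature vectors to [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [
            (value - lo) / (hi - lo) if hi > lo else 0.0  # constant column -> 0
            for value, lo, hi in zip(row, lows, highs)
        ]
        for row in rows
    ]


# Two features with very different ranges end up on the same scale.
raw = [[100, 2], [300, 10], [500, 6]]
scaled = min_max_normalize(raw)
```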

Multi-class Classification: A support vector machine (SVM) is adopted as our main classification method due to its robustness, efficiency and excellent non-linear classification performance. In particular, we use multi-class SVM classification to assign each cluster one label corresponding to a specific type of botnet or a non-bot host. We denote the multiple labels as $\{B_1, B_2, \dots, B_k\}$, assuming k − 1 classes of botnets with the last class representing the non-bot label. The basic component of the SVM method is a binary classification mechanism, which classifies an unlabeled cluster based on the distance of its feature vector to the decision hyperplane with normal vector $\mathbf{w}$ and constant b:

$$f(\mathbf{x}) = \mathbf{w}^T \mathbf{x} + b = \sum_{\forall i} y_i \alpha_i K(\mathbf{x}_i, \mathbf{x}) + b, \qquad (5.1)$$

where $\mathbf{x}_i$ is the feature vector of host i from the training set, $y_i \in \{-1, 1\}$ denotes the label of the training data, and the parameter $\alpha_i$ determines whether host i is a support vector ($\alpha_i > 0$) or not ($\alpha_i = 0$). The feature vector $\mathbf{x}_i$ is transformed into a higher dimensional space by a non-linear kernel function $K(\mathbf{x}_i, \mathbf{x})$.

Two-class SVM determines $\mathbf{w}$ and b by searching for the optimal hyperplane that separates the feature space into two parts. This is also termed a maximum margin approach, since the objective is to maximize the distance between the training data and the decision hyperplane. The multi-class SVM model is built by combining multiple two-class SVM models. For a K-class SVM model (K > 2), we use the “one-versus-one” approach [136], in which K(K − 1)/2 classifiers are trained on all possible pairs of classes, and then a voting strategy is used to classify each cluster according to which class receives the highest number of votes. Note that the training of the classifiers employs DGBA training features, while the voting-based classification relies on DGBA detection features. Finally, the clusters labeled as specific types of botnets are deemed bot-included clusters, demanding further inspection.
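The one-versus-one voting step can be illustrated independently of any SVM library. In the sketch below, each pairwise classifier is a hypothetical stand-in (a simple threshold rule on a single normalized feature) for a trained two-class SVM; the class names are taken from the text, but the thresholds are invented purely for illustration.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_predict(x, classes, pair_classifiers):
    """Let every pairwise classifier vote and return the class with most votes."""
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[pair_classifiers[(a, b)](x)] += 1
    return votes.most_common(1)[0][0]


classes = ["Sality", "ZeroAccess", "benign"]
# Hypothetical stand-ins for trained two-class SVMs (thresholds are made up).
pair_classifiers = {
    ("Sality", "ZeroAccess"): lambda x: "ZeroAccess" if x > 0.5 else "Sality",
    ("Sality", "benign"): lambda x: "Sality" if x > 0.2 else "benign",
    ("ZeroAccess", "benign"): lambda x: "ZeroAccess" if x > 0.5 else "benign",
}
label_high = one_vs_one_predict(0.9, classes, pair_classifiers)
label_low = one_vs_one_predict(0.1, classes, pair_classifiers)
```

With K = 3 classes there are K(K − 1)/2 = 3 pairwise classifiers, and each cluster receives the label that wins the most pairwise contests.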

5.3.5 Refined Bot Identification

After labeling bot-included clusters, the final step is to extract bots from each cluster based on their individual connection features. Taking advantage of the distinct behaviors of bots and benign hosts shown in Section 5.3.3, we devise a feature test to separate bots from benign hosts that happen to be in the same cluster. The feature test, shown in Algorithm 3, exploits the differences in various connection features between bots and benign hosts. A number of threshold values are defined to empirically set apart bots and benign hosts (e.g., λ1 = 8000, λ2 = 0.2, λ3 = 10, λ4 = 200, λ5 = 0.2). As long as one of the features satisfies its condition, the host is identified as a bot.

Algorithm 3 Feature Test

for each bot-included cluster do
    for each host in the cluster do
        host ← benign host label
        if number of connections > λ1 or
           shared neighbor ratio with any peer > λ2 or
           number of significant connections < λ3 or > λ4 or
           significant connection volatility < λ5 then
            host ← bot label
        end if
    end for
end for
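Algorithm 3 translates almost directly into code. The following sketch assumes each host is described by a dictionary of its individual connection features; the field names and the toy hosts are hypothetical, and the default thresholds are the example values given in the text.

```python
def is_bot(host, lam1=8000, lam2=0.2, lam3=10, lam4=200, lam5=0.2):
    """Apply the feature test of Algorithm 3 to one host (field names assumed)."""
    return (
        host["connections"] > lam1
        or host["max_shared_neighbor_ratio"] > lam2
        or host["significant_connections"] < lam3
        or host["significant_connections"] > lam4
        or host["sc_volatility"] < lam5
    )


def feature_test(cluster):
    """Label every host of a bot-included cluster as 'bot' or 'benign'."""
    return {name: ("bot" if is_bot(h) else "benign") for name, h in cluster.items()}


cluster = {
    "host_a": {"connections": 15000, "max_shared_neighbor_ratio": 0.35,
               "significant_connections": 0, "sc_volatility": 0.05},
    "host_b": {"connections": 900, "max_shared_neighbor_ratio": 0.02,
               "significant_connections": 40, "sc_volatility": 0.6},
}
labels = feature_test(cluster)
```

Because the conditions are combined with "or", a host is flagged as soon as any single feature crosses its threshold.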

Group-level feature extraction   Training feature extracted from labeled hosts   Detection feature extracted from a cluster
Cluster connectivity feature     average                                         maximum
Shared neighbor feature          average                                         maximum
Significant connection feature   average                                         maximum and minimum
Temporal feature                 average                                         minimum

Table 5.3: Group-level feature summary

5.3.6 Putting Them All Together

A summary of PeerClean detection system is presented in this section. For the labeled

hosts in the training data set, PeerClean extracts their training features corresponding to

the average of each group-level feature using DGBA training, then builds a multi-class SVM

model. For the hosts in the testing data set, PeerClean first clusters the hosts based on their

flow statistical features. Each cluster undergoes DGBA detection to extract the detection

features corresponding to the maximum or minimum of each group-level feature. Then, each

cluster is designated a label by the SVM model to indicate whether it contains a specific type

of bots. Finally, PeerClean employs a refined bot identification algorithm to pick out the

bots from each bot-included cluster. A summary of group-level features is listed in Table 5.3.

5.3.7 Evasion Mechanisms and Limitations

PeerClean detects botnets without relying on packet payloads, which already raises the bar

for botnet authors. In the following, we discuss the potential evasion mechanisms that botnet

authors might use to thwart PeerClean.

The bots may disrupt the clustering mechanism by not following the same transmission protocol. However, that will increase the complexity of bot implementations and will also affect the efficiency of C&C message exchange. Evading DGBA is even harder to achieve. The possible attempts to evade DGBA detection include lowering the connection number, lowering the shared neighbor ratio, raising the significant connection number and raising the significant connection volatility. For example, ZeroAccess bots would need to reduce their number of good connections from 15000 to 2000 per hour, losing nearly 86.7% of their connections. Meanwhile, the shared neighbor ratios should drop from at least 30% to nearly 0, which basically means constructing a C&C delivery network with few shared neighbors between every two peers. The SC number should increase to between 20 and 60 per hour, and the SC volatility should rise above 30%, which may disrupt the botnet operations, especially if we consider the stealthiness and reliability requirements of P2P botnets. Generally, P2P bots endeavor to meet these requirements by avoiding a large number of SCs to maintain a low profile, and by keeping their SC sets stable over time.

Therefore, changing one or more connection features will greatly affect the C&C delivery operation, and may compromise the stealthiness of the botnets. Furthermore, the collective features enlarge the gaps between the bots and benign hosts. Making the collective features indistinguishable from those of benign hosts would require substantial work on designing a complex botnet. We leave the design of such botnets as future work.

Since PeerClean identifies the bots based on traffic flow statistics, one limitation of PeerClean lies in identifying bot-infected hosts that simultaneously and persistently run benign P2P applications. In this case, the bot traffic might be obscured by the traffic from the benign P2P applications. Since PeerClean performs detection every hour, smart bots would have to run benign P2P applications all the time to avoid being discovered. However, most benign P2P nodes have a fast peer churn rate with short communication sessions [131]. Thus, it is unlikely that P2P hosts run benign P2P applications alongside the P2P bot protocol persistently, so the bots’ presence will eventually be revealed at some point in time. On the other hand, future bots might intentionally run the bot protocol together with benign P2P applications. Nevertheless, this will affect the communication efficiency of the bots, which might lead to a high peer churn rate or a complete disruption of C&C communications.

5.4 Discussion

In this section, we discuss two important issues that PeerClean faces. Note that PeerClean identifies a host by its IP address captured in the traffic flow data. However, in practice, the IP address is not always a reliable means of identifying a particular host, due to the dynamic address assignment enabled by DHCP. Thus, several IP addresses may belong to the same host, which might confuse the PeerClean detection. Fortunately, hosts tend to finish running applications before they go down and later return with a different IP address. Also, DHCP does not reassign IP addresses very frequently. Therefore, during a one-hour epoch, few IP addresses, if any, associated with the same host will be clustered together. This small number of duplicate hosts will have only a negligible impact on the PeerClean system. Thus, we simply regard different IP addresses as different hosts.

Another issue is that some hosts may be behind NATs, such that they appear to have the same IP address to the outside. We do not consider this type of NATed bot in our system design, since these bots are typically not recruited into the P2P infrastructure of a botnet, as they cannot be contacted by the other peers in the Internet. In fact, NATed bots will not become part of the P2P overlay of a botnet. On the other hand, legitimate hosts may also live behind NATs, such that the traffic flow from a NATed IP is the aggregated traffic of multiple hosts. As long as such IPs are not assigned to the same cluster as bots, they will not cause serious issues. Fortunately, the probability that these NATed IPs are clustered together with bots is negligible, because their traffic flow patterns are distinct from those of bots. Therefore, we simply regard each NATed IP as one single special host.

5.5 Evaluation

In this section, we evaluate the bot identification performance of the PeerClean system. We first describe the collected data sets (Section 5.5.1). Then, we show that PeerClean can well separate different types of P2P bots into different clusters, but may falsely incorporate some benign P2P hosts that have bot-like traffic patterns (Section 5.5.2). After generating

host clusters, DGBA is carried out to extract group-level connection features from each

cluster. By separating the data set into training and testing sets, a multi-class SVM model

is trained using the labeled training set. We then evaluate the classification performance and

refined bot identification performance in Section 5.5.3 and Section 5.5.4, respectively. The

performance comparison with two existing approaches based on network flow patterns and

pairwise shared neighbor ratio is detailed in Section 5.5.5. Finally, we evaluate the system scalability in a large network (Section 5.5.6).

5.5.1 Data Collection

We use the traffic trace captured from the edge routers of a large campus network, which has two /16 subnets. The traffic rate is about 5000 flows per second, and the trace was captured over one whole day in April 2013. We focus on the TCP and UDP traffic in this traffic trace.

However, as the network flow trace does not include traffic payload, we cannot unveil the

ground truth about whether or not the active hosts are running legitimate P2P applications.

To provide the ground truth data from legitimate P2P hosts, we run three of the most popular P2P applications on our lab machines: emule, bittorrent and skype, and collect their network flow traces. To make the traffic traces more representative, we interact with the P2P hosts using an AutoIt script [137] to randomly select contents to be downloaded/uploaded (for the emule and bittorrent applications), or randomly generate texts to be transmitted (for the skype application) at random time periods. In total, we collected one-day traces from 100 bittorrent clients, 100 skype clients and 100 emule clients in April 2013.

We also collected the network traces for three recent P2P botnets: Sality, Kelihos and

ZeroAccess in April 2013. These network traces were gathered by purposefully running

Sality, Kelihos and ZeroAccess samples in a controlled environment, in which we carefully

block spamming, scanning, and Denial of Service attack activities. They contain 24-hour

traces for 6 Sality bots, 4 Kelihos bots and 4 ZeroAccess bots. Since the major malicious

activities were blocked during the collection of network traces, the collected traces mainly

include C&C traffic, e.g., for peer discovery, command exchanging, etc. Note that these

traces are collected when the three botnets are fully active. The traffic summary is listed in

Table 5.4. The traffic data from 300 legitimate P2P clients and 14 P2P bots constitute our

ground truth data set.

To make the evaluation more realistic, we overlay the traffic traces from 300 legitimate P2P

hosts and 14 P2P bots onto the campus traffic trace by assigning them to randomly selected

hosts of the campus network. In order to reduce the traffic volume, we eliminate flows from

well-known and extremely busy servers such as DNS servers, email servers, and popular website servers (e.g., Google, Facebook and YouTube). After that, P2P host identification searches for the hosts with a high percentage of failed connections (with a threshold of 5%). In total,

we find 1097 hosts involved in P2P applications during the day, including all the 314 P2P

hosts serving as ground truths and additional 783 hosts in the campus network as shown in

Table 5.4. Therefore, we validate the effectiveness of P2P host identification, since all the 314 ground-truth P2P hosts are correctly identified.

Trace           Size    Dur   Pkts     TCP/UDP Flows   Clients
Campus          20.7G   24h   21.5G    401,661,350     34743
Bittorrent      6.7G    24h   854M     62,674,080      100
Skype           1.1G    24h   376M     12,615,840      100
Emule           1.6G    24h   406M     18,978,800      100
Sality          40M     24h   10.8M    565,490         6
Kelihos         224M    24h   23.5M    3,249,931       4
ZeroAccess      4.6G    24h   166.9M   69,896,829      4
P2P in campus   487M    24h   608M     7,127,054       783

Table 5.4: Traffic summary (‘P2P in campus’ denotes the traffic flows of the campus network after P2P host identification)
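The failed-connection filter described above can be sketched as follows. This is only a minimal illustration and not the PeerClean implementation: the flow record layout `(src, dst, failed)` and the helper name `identify_p2p_hosts` are assumptions made for this example, and only the 5% threshold is taken from the text.

```python
from collections import defaultdict

FAILED_RATIO_THRESHOLD = 0.05  # 5% threshold stated in the text


def identify_p2p_hosts(flows, excluded_hosts=frozenset()):
    """Return the hosts whose share of failed connections exceeds the threshold.

    flows: iterable of (src, dst, failed) tuples (hypothetical record layout).
    excluded_hosts: busy servers (DNS, email, popular websites) whose flows
    are dropped before the ratio is computed.
    """
    total = defaultdict(int)
    failed = defaultdict(int)
    for src, dst, connection_failed in flows:
        if src in excluded_hosts or dst in excluded_hosts:
            continue
        total[src] += 1
        if connection_failed:
            failed[src] += 1
    return {h for h in total if failed[h] / total[h] > FAILED_RATIO_THRESHOLD}
```

A host with 10 failed connections out of 100 would be kept as a P2P candidate under this sketch, while a host with 1 failed connection out of 100 would be dropped.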

5.5.2 Clustering P2P Hosts

In this section, we evaluate the P2P host clustering performance of the PeerClean system. Based on the flow features extracted in Section 5.3.1, we perform AP clustering to group together P2P bots of the same type. During the flow feature extraction, we find that almost all of the traffic flows from Kelihos bots use the TCP protocol, while Sality and ZeroAccess bots

mostly generate UDP traffic. Hence, we use both the TCP and UDP traffic patterns for host

clustering.

The data set contains 24-hour flow traces from 1097 P2P hosts, which is divided into 24

sections with one hour per section. For each data section, we extract the flow statistical features of every host that has 100+ outgoing TCP flows and 100+ incoming TCP flows

for the purpose of building representative flow patterns. Then, host clustering is carried out

using AP clustering method based on the extracted flow statistical features. Note that, since

the last two features in Table 5.2 are refreshed at the end of the day, they will only be used

for clustering at the last hour.
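The hourly sectioning and the flow-count filter can be sketched as below. The tuple layout `(timestamp, src, dst, proto)` and the function name are hypothetical; the one-hour sections and the 100-flow minimum in each direction follow the text. The clustering step itself is not shown.

```python
from collections import defaultdict

SECTION_SECONDS = 3600  # one-hour sections, as in the text
MIN_FLOWS = 100         # 100+ outgoing and 100+ incoming TCP flows


def eligible_hosts_per_hour(flows):
    """Map hour index -> hosts with enough TCP flows in both directions.

    flows: iterable of (timestamp_seconds, src, dst, proto) tuples
    (hypothetical layout), with timestamps counted from the trace start.
    """
    outgoing = defaultdict(lambda: defaultdict(int))
    incoming = defaultdict(lambda: defaultdict(int))
    for ts, src, dst, proto in flows:
        if proto != "TCP":
            continue
        hour = int(ts // SECTION_SECONDS)
        outgoing[hour][src] += 1
        incoming[hour][dst] += 1
    return {
        hour: {
            host
            for host, count in outgoing[hour].items()
            if count >= MIN_FLOWS and incoming[hour].get(host, 0) >= MIN_FLOWS
        }
        for hour in outgoing
    }
```

Only the hosts returned for a given hour would have their flow statistical features extracted and passed to the clustering step for that hour.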

Hour                       2    4    6    8    10   12      14   16   18   20   22
Cluster Num.               29   25   24   27   30   29      32   25   29   30   28
Sality Cluster Index       28   22   22   26   28   27,28   31   22   22   29   26
ZeroAccess Cluster Index   29   25   24   27   30   29      32   25   29   30   28
BSR1                       1    1    1    1    1    1       1    1    1    1    1
BSR2                       1    1    1    1    1    1       1    1    1    1    1

Table 5.5: Clustering result using UDP traffic (BSR1 and BSR2 denote the BSRs of Sality and ZeroAccess bots, respectively)

Hour                    2    4    6    8    10   12   14   16   18   20   22
Cluster Num.            21   24   25   15   22   25   26   21   24   20   18
Kelihos Cluster Index   13   24   23   14   19   6    5    20   6    14   17
BSR3                    1    1    1    1    1    1    1    1    1    1    1

Table 5.6: Clustering result using TCP traffic (BSR3 denotes the BSR of Kelihos bots)

We evaluate the clustering performance in terms of the ability to produce well separated and compact bot clusters, for which we propose two performance criteria. We define mix-clustered bots as the bots mistakenly residing in the cluster of other types of bots, and the complement of which are called separate-clustered bots. The two performance criteria for evaluating the separation and compactness of bot clustering are: (1) Bot Separation Ratio (BSR), which is defined for each type of bot as the ratio of the number of separate-clustered bots to the overall number of bots of this type; (2) Bot Compactness Ratio (BCR), which is defined as the ratio of the number of separate-clustered bots to the overall number of hosts (whether benign or not) assigned to the same cluster.
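The two criteria can be computed as in the following sketch. Two points are assumptions of this example rather than definitions from the text: a bot is treated as mix-clustered when another bot type outnumbers its own type in its cluster, and when the separate-clustered bots of one type span several clusters, the BCR denominator sums the hosts of all those clusters. The dictionary layout and the function name are hypothetical.

```python
from collections import Counter, defaultdict


def bot_cluster_ratios(cluster_of, label_of, bot_type):
    """Return (BSR, BCR) for one bot type.

    cluster_of: host -> cluster id.
    label_of:   host -> bot type name, or "benign".
    """
    members = defaultdict(list)
    for host, cluster_id in cluster_of.items():
        members[cluster_id].append(host)

    bots = [h for h in cluster_of if label_of[h] == bot_type]
    if not bots:
        raise ValueError("no bots of the requested type")

    separate = []
    for bot in bots:
        bot_counts = Counter(
            label_of[m] for m in members[cluster_of[bot]] if label_of[m] != "benign"
        )
        # Assumption: mix-clustered if another bot type outnumbers this one.
        if bot_counts[bot_type] == max(bot_counts.values()):
            separate.append(bot)

    bsr = len(separate) / len(bots)
    home_clusters = {cluster_of[b] for b in separate}
    hosts_in_home = sum(len(members[c]) for c in home_clusters)
    bcr = len(separate) / hosts_in_home if hosts_in_home else 0.0
    return bsr, bcr
```

For instance, a cluster holding 6 bots of one type and 3 benign hosts yields a BSR of 1 and a BCR of 6/9 for that type under these assumptions.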

Bot Separation Performance: As observed from the per-hour clustering results in Tables 5.5 and 5.6 (where we only count the clusters containing more than one node), Sality, Kelihos and ZeroAccess bots are assigned to different clusters with the cluster indices shown in the tables, i.e., all three types of bots are well separated from each other, which gives each of them a BSR of one. Moreover, almost all the bots of one type are grouped into the same cluster, with the exception of the Sality bots, which are divided into two clusters at the 12th hour. Nevertheless, none of these clusters contains more than one type of bot, which demonstrates the perfect separation of different types of bots.

Figure 5.6: Box plot of Bot Compactness Ratio

Bot Compactness Performance: The excellent bot separation performance indicates that each bot-included cluster completely excludes other types of bots. However, every bot-included cluster may still incorporate some benign P2P hosts that happen to display a similar traffic pattern during the detection period. BCR quantifies the ability of the clustering to keep benign P2P hosts out of the bot-included cluster. BCR achieves one if the bot-included cluster contains no benign P2P hosts. Accumulating the 24-hour BCR results, we show a box plot of the BCR for the three types of bots in Fig. 5.6. On average, Sality and ZeroAccess bot clusters each falsely include 3 benign hosts, while Kelihos bot clusters falsely include 12 benign hosts, which indicates that the clustering mechanism is not effective in generating compact bot clusters. Further inspection of the falsely included benign hosts shows that they have traffic profiles that are highly similar to the bots’ traffic. Based on the experimental results, we claim that network flow features are not sufficient to discriminate P2P bots from benign P2P hosts.

5.5.3 Identifying Bot-included Clusters via Classification

In this section, we evaluate bot cluster identification using multi-class SVM classification with a Gaussian kernel. Since we only have a limited number of labeled bots in each hour, we enlarge the training space by incorporating half a day of labeled bots and benign hosts into the training set. Consequently, the training set contains 36 clusters (3 clusters per hour) of labeled bots (with labels ‘Sality’, ‘Kelihos’, ‘ZeroAccess’) and 36 clusters of labeled legitimate P2P hosts running bittorrent, emule and skype (with label ‘Non-Bot’). We extract the training features from all the 72 labeled clusters to build the SVM classifier. Then, we use the next half day of bots and benign hosts as the testing set, which includes a total of 37 bot-included clusters and 545 benign clusters.

After host clustering, the DGBA detection process extracts four different types of group-level

connection features from every cluster in the testing set. Then, the SVM model predicts the

labels of clusters. Since the classification module relies on four different types of features,

we train the classifier on each individual group behavior feature in order to understand their

relative importance.

Table 5.7 shows the classification performance using different types of features for training. The classification based on either the shared neighbor feature or the significant connection feature has high accuracy and recall, but only achieves moderate precision. Looking into the classification results, we find that the classification produces few false negatives but many false positives, i.e., bot-included clusters are unlikely to be regarded as benign clusters with these two features, but many benign clusters are falsely considered as bot-included clusters. On the other hand, the cluster connectivity feature seems unable to discriminate bots from benign hosts, as it brings substantial false positives and false negatives. We observe that the correctly classified bots mainly belong to the ZeroAccess botnet, which agrees with our analysis in Section 5.3.3. Finally, the temporal feature is designed to be updated at the end of the day, and thus is only used for bot classification in the final hour. Again, many false positives arise due to the inseparability of bots’ features and benign hosts’ features. However, the combination of all features provides the best result for detecting bot-included clusters. Overall, we only find two false positives and no false negatives.

Group Behavior Feature           Accuracy   Precision   Recall
Cluster Connectivity Feature     51.8%      7.9%        34.3%
Shared Neighbor Feature          92.7%      68.8%       91.7%
Significant Connection Feature   91.8%      66.7%       90%
Temporal Feature                 71.3%      3.1%        66.7%
All Features                     98.8%      94.6%       100%

Table 5.7: Classification accuracy when trained on one type of feature. Shared neighbor feature and significant connection feature present the best classification accuracy. The classifier achieves the best performance when combining all the features. Accuracy = (TP+TN)/all; Precision = TP/(TP+FP); Recall = TP/(TP+FN).
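The three metrics defined in the caption of Table 5.7 can be computed from cluster-level confusion counts as in the sketch below. The function name is hypothetical, and the counts in the example call are illustrative only; they are not taken from the evaluation.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, precision, recall) from confusion counts.

    Accuracy = (TP+TN)/all, Precision = TP/(TP+FP), Recall = TP/(TP+FN).
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Illustrative counts only (not from the experiments above).
accuracy, precision, recall = classification_metrics(tp=9, fp=1, tn=88, fn=2)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```

The pattern reported for the shared neighbor and significant connection features (high recall, moderate precision) corresponds to a small `fn` together with a comparatively large `fp`.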

It is worth noting that the training set constitutes 50% of the whole data set in the previous

evaluation. Here, we also evaluate the classification performance by varying the percentage

of training data, since it is always difficult to collect traffic traces from labeled bots. As

shown in Fig. 5.7, PeerClean can still retain more than 70% classification accuracy when the

training sets contain traces from merely 10% of labeled hosts. This suggests that PeerClean is robust to small training sets, which gives it wide applicability across different network sizes.

Figure 5.7: Classification performance with different percentages of training data

Bot Type     Bot Num.   Benign Host Num.   Correctly Identified   Falsely Identified
Sality       72         36                 69 (95.8%)             0 (0%)
Kelihos      48         123                47 (97.9%)             5 (4.1%)
ZeroAccess   48         42                 48 (100%)              2 (4.8%)

Table 5.8: Refined bot identification performance (the percentages in parentheses denote the bot detection rate and the false alarm rate, respectively)

5.5.4 Refined Bot Identification Performance

Bot-included clusters contain a considerable number of benign hosts as shown in Section 5.5.2, thus we use refined bot identification to extract the bots inside each bot-included cluster. The feature test in Algorithm 3 is utilized to perform refined bot identification. We run the feature test on all the 39 bot-included clusters identified through SVM classification, including 13 Sality clusters, 12 Kelihos clusters, 12 ZeroAccess clusters and 2 false positives. The bot identification performance is listed in Table 5.8, which shows the bot number and benign host number in the bot-included clusters. In summary, the refined bot identification correctly identifies at least 95.8% of the bots of each type, with a false alarm rate of at most 4.8%.
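As a quick arithmetic check, the percentages in Table 5.8 follow from dividing the correctly identified bots by the bot number and the falsely identified hosts by the benign host number. The sketch below recomputes them from the table counts; the variable and function names are chosen for this example.

```python
TABLE_5_8 = {
    # bot type: (bots, benign hosts, correctly identified, falsely identified)
    "Sality": (72, 36, 69, 0),
    "Kelihos": (48, 123, 47, 5),
    "ZeroAccess": (48, 42, 48, 2),
}


def rates(bots, benign, correct, false_alarms):
    """Return (detection rate, false alarm rate) in percent."""
    return 100.0 * correct / bots, 100.0 * false_alarms / benign


for name, row in TABLE_5_8.items():
    detection, false_alarm = rates(*row)
    print(f"{name}: detection {detection:.1f}%, false alarm {false_alarm:.1f}%")
```

The recomputed values agree with the percentages in the table: the lowest detection rate is 95.8% (Sality) and the highest false alarm rate is 4.8% (ZeroAccess).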

5.5.5 Comparisons with Other Detection Approaches

In this section, we compare the performance of the PeerClean system with two other detection approaches: method A and method B. Method A relies on network flow statistical features to detect bot communications [53]. As shown in Section 5.5.2, many benign hosts appear to have flow traffic patterns similar to those of bots. We perform host clustering based on flow statistical features to separate different patterns of flow traffic. Then, the clusters containing bots are regarded as bot clusters, and all the benign hosts inside the bot clusters are falsely labeled as bots. The number of false positives using method A is listed in Table 5.9, which is unacceptably high.

Method B builds upon PeerClean system, but utilizes pairwise shared neighbor ratios to

identify bots [47] instead of a set of group-level connection features. For each cluster derived

from host clustering, we compute the pairwise shared neighbor ratio for every pair of hosts

in the cluster. If the pairwise shared neighbor ratio is higher than an empirical threshold, we

identify this pair of hosts as bots. From the results in Table 5.9, we find that method B not only has low bot identification accuracy, but is also more likely to produce considerable false alarms. Its inadequate bot identification is mainly due to traffic dynamics, which cause method B to miss many bots while falsely including benign hosts. On the other hand, PeerClean is able to significantly improve the identification accuracy by extracting and modeling group-level features instead of individual or pairwise features, as is evident from comparing Table 5.9 with Table 5.8.
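The pairwise test used by method B can be sketched as follows. The exact definition of the shared neighbor ratio is not restated in this section, so the sketch assumes a Jaccard-style ratio (shared contacts divided by all contacts of the pair); the function names and the input layout are likewise hypothetical. The default threshold of 0.2 is the one reported with Table 5.9.

```python
from itertools import combinations


def shared_neighbor_ratio(neighbors_a, neighbors_b):
    """Assumed definition: |A & B| / |A | B| for two sets of contacted peers."""
    union = neighbors_a | neighbors_b
    return len(neighbors_a & neighbors_b) / len(union) if union else 0.0


def flag_pairs(cluster_neighbors, threshold=0.2):
    """Flag every host that appears in a pair whose ratio exceeds the threshold.

    cluster_neighbors: host -> set of contacted peers, for one host cluster.
    """
    flagged = set()
    for a, b in combinations(sorted(cluster_neighbors), 2):
        ratio = shared_neighbor_ratio(cluster_neighbors[a], cluster_neighbors[b])
        if ratio > threshold:
            flagged.update((a, b))
    return flagged
```

Because each decision depends on a single pair of hosts in a single epoch, a bot whose contacts happen to overlap little with any other bot in that hour is missed, which is consistent with the sensitivity to traffic dynamics noted above.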

Bot Type     Bot Num.   Falsely Identified: Method A   Correctly Identified: Method B   Falsely Identified: Method B
Sality       72         36                             51 (70.8%)                       14 (38.9%)
Kelihos      48         123                            38 (79.2%)                       21 (17.1%)
ZeroAccess   48         42                             33 (68.8%)                       10 (23.8%)

Table 5.9: Performance comparison with method A and method B (threshold: 0.2).

5.5.6 System Scalability

The running time of the PeerClean system depends on the clustering method, the group feature extraction and the SVM classification method. The SVM classification method only needs to classify a small number of meaningful clusters, so its running time is negligible. On the other hand, group feature extraction collects the group behavior features over the whole cluster. If one cluster contains a large number of P2P hosts, the group behavior feature extraction will consume a considerable amount of time. In the case of large clusters, instead of extracting all the group features, we can extract only a single feature to trade off complexity against performance. Based on the results in Table 5.7, we can extract only the shared neighbor feature or the significant connection feature for large clusters without degrading the performance too much, which will greatly reduce the computational complexity.

Figure 5.8: Running time of AP clustering

Finally, we evaluate the time consumption of the AP clustering algorithm. We use a commodity computer system with an AMD FX(tm)-8120 processor and 16GB of memory. The running times corresponding to different numbers of P2P hosts are shown in Fig. 5.8. The time consumption for a large number of P2P hosts (up to 8000) still stays within a one-hour period. Since PeerClean works within a one-hour period, the time consumption is acceptable. Alternatively, the leveraged affinity propagation method [133] can be used to efficiently deal with large data sets, and we leave it for future work.

5.6 Summary

P2P C&C infrastructure, which is extremely resilient to even sophisticated takedown attempts, has become a popular choice for botnets. The ability to identify botnets inside a network is particularly important to network administrators. To this end, we present PeerClean, a new network flow-based system that identifies and classifies botnets with high accuracy. PeerClean leverages a novel dynamic group behavior analysis

to extract and model a set of robust and reliable connection features at the group level. In

our experimental evaluation on real-world network traces, PeerClean shows excellent detec-

tion accuracy on various types of botnets. An interesting future direction is to apply the

group behavior analysis to other types of applications to help identify the network behaviors

which would be otherwise unnoticeable. We can also further merge and correlate the group

behaviors of different applications to detect anomalous communication patterns and identify

newly emerging attacks.

Chapter 6

Conclusion and Future Research

In this dissertation, we have identified and explored key points to apply cognitive technolo-

gies to enhance the security of the current cognitive communication systems including CRNs

and core networks. We designed effective and efficient countermeasure mechanisms to detect

and defend against the sophisticated and adaptive attackers that may exploit the security

vulnerabilities of CRNs. We also investigated IC-based jamming resilient wireless commu-

nications under SDR based reactive jammers. To protect communications in core networks,

we proposed P2P botnet identification mechanisms to pinpoint P2P bots.

This dissertation contributes to advancing the state-of-the-art security research in modern

cognitive network design. The security loopholes discovered in existing networking systems are astounding, and they urgently call for a system redesign with a series of thorough

security tests or a comprehensive set of effective defending mechanisms in place. The pro-

posed defense mechanisms in this dissertation are designed correspondingly with real-world

implementation issues in mind, and are evaluated with prototype and testbed designs. We

believe these defense mechanisms make a compelling effort to guard and fortify the modern

cognitive networks. In this chapter, we summarize the research work and discuss future

research directions.


6.1 Research Summary

Recent advances in information technology have led to the proliferation of mobile devices

and intelligent IoT devices, which require more and more spectrum resources to support the

growing demand of world-wide and ubiquitous Internet connections. A shortage of spectrum

resources has become a serious issue that will hinder the development of mobile technologies and their global markets. Cognitive radio has been envisioned as a new paradigm to

improve the spectrum utilization and to provide opportunistic access to the licensed bands.

However, the communication paradigm is untethered and ad hoc, therefore the communica-

tion channel is adversarial and the communication hosts are not trustworthy. Furthermore,

the CR technologies allow more powerful adaptive and reactive adversaries, who may in-

tentionally adjust their attack strategies through sensing and learning capabilities. Finally,

the communications in core networks are further disrupted by sophisticated botnets who

are organized to commit cyber crimes. The untrusted networking and cognitive empowered

adversarial environments create new challenges in communication and network security.

In the dissertation, we have explored security research in the area of cognitive communica-

tions in cognitive networks. We identified security vulnerabilities of distributed spectrum

sensing mechanisms in CR ad hoc networks. We then developed a monitoring framework

to identify the malicious network activities in CRNs. We established jamming resilient

communication mechanisms to mitigate SDR-based reactive jamming attacks by employing

MIMO IC technology. Finally, we put forward a detection system to uncover P2P botnets

by leveraging group behavior analysis and machine learning technology.

Our main findings and their implications can be summarized as follows.

• We focused on the protection of distributed spectrum sensing in CR ad hoc networks.

We first identified various attacks that can disrupt the consensus algorithm or stealthily

subvert the sensing results, especially the covert adaptive attacks with learning capa-

bility. We then developed a robust distributed outlier detection scheme with adaptive

local threshold to counter covert adaptive attacks by exploiting the state convergence

property. In addition, a hash-based computation verification scheme is presented to

effectively defend against colluding attackers. To the best of our knowledge, this is the first work to address the security issues in distributed spectrum sensing in CR networks. Our simulation results demonstrated the severe vulnerabilities of distributed spectrum sensing, and also showed that our protection mechanisms are effective

and efficient in enforcing a trustworthy reporting of sensing results. In conclusion,

we should pay attention to the adaptive adversaries with cognitive learning capability

while designing communication protocols in cognitive networks.

• We proposed a monitoring framework, SpecMonitor, which utilizes a non-parametric

density estimation method to model SUs’ channel usage pattern. This method makes

no assumptions on the unknown distribution of channel access pattern, thus offers ac-

curate and flexible models which can be updated in an online fashion with acceptable

complexity. Moreover, we designed a sliding window method to perform online learn-

ing of data dynamics, and an accumulative combination method to further improve

modeling accuracy. We considered two levels of monitoring objectives: frame-level and

user-level, to diagnose different network issues. Based on the predicted traffic pattern, we cast the monitoring optimization problem as a sniffer channel assignment problem with the objective of maximizing the corresponding QoMs, for which we designed

near-optimal monitoring algorithms. Our simulation and experimental results both

showed that SpecMonitor has superior capturing capability with low channel switch-

ing overhead.

• We proposed a novel defense mechanism for jamming resilient OFDM communication

based on MIMO IC technique, which tracks the jamming signal’s direction in real-time

before canceling it out. We devised an iterative channel tracking mechanism using

multiple pilots to estimate the sender’s and jammers’ channels alternately and iteratively

in a timely fashion. More importantly, we introduced an enhanced defense mechanism

leveraging sender signal enhancement (SSE) and message feedback techniques, which

strategically enhances the projected sender signal strength via signal rotation, resulting

in an improved anti-jamming performance. A tactical IC scheme is designed not only

to protect the forwarding frame transmission, but also to guard the feedback messages

against jamming. Our prototype experimental results demonstrated that the proposed

defense mechanisms can effectively sustain operational OFDM communications under

reactive jamming attack with considerable throughput.

• We proposed a dynamic group behavior analysis method to investigate the group-level

connection behaviors inside botnets. We applied DGBA to every host cluster so as to

extract the aggregated connection features. The proposed detection system, PeerClean,

then trains a support vector machine (SVM) classifier using the group-level training

features, and labels each cluster using the SVM classifier subsequently. To improve

the detection performance, we trained the classifier using average group behavior, but explored the extreme group behavior for detection. While almost all the existing

network-based detection approaches identify the bots without specifying their corre-

sponding botnet types, we consider the botnet type as a useful piece of information for

the network administrators to evaluate the damaging impacts and decide correspond-

ing bot-specific countermeasures. Furthermore, the PeerClean system is tailored to

support real-time bot detection, and to enable a quick response to the bot infections.

Real world experiments with traffic captured in a campus environment showed that

PeerClean is able to identify and classify botnets with a high accuracy. Throughout

the experiments, we find that different botnets appear to have significantly different

network behaviors including traffic patterns and connection patterns. Therefore, in-

herent, robust and distinguishable features are still required to effectively detect the

botnets.
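To make the sliding-window, non-parametric modeling idea from the SpecMonitor item above more concrete, the following is a minimal sketch of a kernel density estimate maintained over the most recent observations. It is not the SpecMonitor implementation: the Gaussian kernel, the fixed bandwidth, the window size and the class name are all assumptions chosen for illustration.

```python
import math
from collections import deque


class SlidingWindowKDE:
    """Gaussian kernel density estimate over the most recent observations."""

    def __init__(self, window_size=200, bandwidth=1.0):
        # deque(maxlen=...) silently drops the oldest sample when full.
        self.samples = deque(maxlen=window_size)
        self.bandwidth = bandwidth

    def update(self, x):
        """Add one new observation (e.g. a channel usage measurement)."""
        self.samples.append(x)

    def density(self, x):
        """Estimated probability density at x from the current window."""
        if not self.samples:
            return 0.0
        h = self.bandwidth
        norm = 1.0 / (len(self.samples) * h * math.sqrt(2.0 * math.pi))
        return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in self.samples)
```

Because old samples fall out of the window, the estimate follows changes in the usage pattern without assuming any particular distribution.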

6.2 Future Research Directions

We can further investigate the following open problems for robust and reliable spectrum

sensing based on distributed outlier detection.

• We chose the distance-based approach due to the nature of localized/distributed detec-

tion. In future research, we can investigate and evaluate other types of outlier detection

techniques. For instance, a deviation-based approach identifies outliers that deviate

from the general characteristics of the set [138], i.e., the sensing reports minimizing the

local set variance after being removed will be flagged as falsified reports. Moreover,

density-based approach compares the density around a node with that around its lo-

cal neighbors [139,140], assuming a significant distinction between the density around

an outlier and that around normal neighbors. We can design protocols that enable

detection while minimizing the message exchanges and delay induced by the protocol

running in a distributed fashion. The performance of these different outlier detection

protocols can be evaluated in terms of detection probability, false alarm rate, and com-

putation and communication costs. Theoretical analysis of performance guarantees on

bounded miss detection and false alarm probabilities can also be investigated.

• Although the robustness and fault tolerance of various outlier detection schemes have

been reported before [61, 134, 141, 142], their impacts when applied to distributed

consensus-based protocols are not known. The effectiveness of outlier detection ap-

proaches relies on an assumption that a vast majority of nodes in the network are

honest. Therefore, the number of malicious nodes has a direct impact on our outlier

detection based consensus protocol. We can initiate a thorough theoretical and ex-

perimental study on the relationship between the number of malicious nodes and the

detection performance, the result of which will provide an upper bound of malicious

nodes tolerable within the detection zone of our outlier detection approach.

• The malicious behavior modeling is an open problem for robust and reliable spectrum

sensing. In our research, we consider persistent attackers who attack all the time to

manipulate the sensing results. However, in general, such persistent attack can be more

easily detected than sophisticated attack behaviors such as random, opportunistic and

insidious attack behaviors considered in [143]. When attackers perform such smart and colluding attacks, the detection rate may be greatly reduced, which calls for more

advanced countermeasures to improve the detection rate and restrict the damaging

impacts.

• We can further extend the proposed detection approach to counteract data falsification

attacks in a mobile ad hoc network (MANET), where both the SUs and attackers are

mobile. In this case, the neighborhood of every node is constantly changing, which

adds another layer of challenges to the detection approaches, since nodes may execute

computations involving different neighbors during every iteration of consensus. We can

theoretically and experimentally evaluate the performance of the consensus algorithm

in a mobile network, and propose mobility-aware protection mechanisms to enforce

robust spectrum sensing. In addition, the proposed outlier detection approach can be

generalized to incorporate different channel propagation models for both indoor and outdoor environments. In other words, we can investigate the configuration of the detection threshold in different networking environments with different channel propagation models.
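The deviation-based idea mentioned in the first item above (flag the report whose removal minimizes the variance of the remaining local set) can be illustrated with the short sketch below. It is a centralized toy version under assumed inputs, not a distributed protocol: the function name and the example readings are hypothetical, and only a single outlier is flagged.

```python
from statistics import pvariance


def deviation_based_outlier(reports):
    """Return the index of the report whose removal minimizes the variance.

    reports: list of local sensing values (e.g. received power in dBm).
    """
    if len(reports) < 3:
        raise ValueError("need at least three reports")
    best_index, best_variance = None, float("inf")
    for i in range(len(reports)):
        rest = reports[:i] + reports[i + 1:]
        variance = pvariance(rest)
        if variance < best_variance:
            best_index, best_variance = i, variance
    return best_index


# Hypothetical readings: the fourth report deviates from its neighbors.
suspect = deviation_based_outlier([-80.1, -79.8, -80.3, -60.0, -80.0])
```

A distributed variant would additionally have to limit the message exchanges and the delay, which are the costs the first item proposes to evaluate.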

We can extend our research effort in the following directions for the non-parametric prediction-based SpecMonitor framework.

• Our monitoring framework can be applied to model and then capture different types of traffic for different monitoring objectives, such as detecting ICMP flooding attacks, TCP/UDP spoofing attacks [144], etc. The performance of monitoring various types of attack traffic can be evaluated by simulation as well as in a controlled experimental environment to ensure the wide applicability of our monitoring framework.

• The ultimate objective of the monitoring service is to provide a deep understanding of benign traffic patterns and to identify abnormal network behaviors such as secondary user spectrum abuse [28] and network intrusions. A further extension is to apply non-parametric machine learning techniques, such as non-parametric Bayesian inference models [145,146], to quickly classify benign traffic and to identify dishonest user behaviors and intrusions by learning and classifying traffic patterns in CRNs.
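
As a rough illustration of non-parametric modeling of benign traffic, the sketch below fits a Gaussian kernel density estimate to benign observations and flags new observations that fall in low-density regions. It is a simplified stand-in and not the non-parametric Bayesian inference of [145,146]. The traffic feature (for example, packets per second on a channel), the normal-reference bandwidth rule and the 5% cutoff quantile are all assumptions made for this example.

```python
import math
import statistics


def gaussian_kde(samples, bandwidth=None):
    """Return a density function estimated from 1-D samples with Gaussian kernels."""
    n = len(samples)
    if bandwidth is None:
        # Normal-reference rule of thumb; an assumption, not a tuned value.
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def flag_anomalies(benign, observations, quantile=0.05):
    """Flag observations whose density is below a low quantile of benign densities."""
    density = gaussian_kde(benign)
    benign_scores = sorted(density(x) for x in benign)
    cutoff = benign_scores[int(quantile * len(benign_scores))]
    return [x for x in observations if density(x) < cutoff]
```

Because no distribution family is assumed for the benign traffic, the same code applies unchanged to multi-modal traffic patterns. A practical monitor would need multi-dimensional features and an online update of the estimate.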

We can also explore the following open research problems for the MIMO IC based jamming resilient communication mechanism.

• In a traditional multi-hop network with jammers, a jammed link is unable to transfer information. With the proposed IC-based jamming resilient communication scheme, however, the jammed link can still transmit information, which introduces new optimization problems in rate allocation, resource scheduling, and relay selection in the presence of jammers under the protection of IC-based mechanisms. We can broadly study these newly emerging problems to enable cross-layer network optimization under jamming attacks.

• We can extend our jamming cancellation mechanism to cancel jamming signals from mobile jammers. The channel between a mobile jammer and the intended receiver changes constantly, which makes it difficult to estimate the instantaneous jamming channel. We can carry out experiments to evaluate jamming signal cancellation schemes in the presence of reactive jammers with different mobility patterns.
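
Why a stale jamming channel estimate matters can be seen in a minimal, noise-free two-antenna sketch. The receiver projects the received vector onto the direction orthogonal to its estimate of the jamming channel, a zero-forcing style projection used here purely for illustration and not necessarily the mechanism developed in this dissertation. All channel coefficients are assumed to be known flat-fading complex gains, and the function names are hypothetical.

```python
def receive(x, jam, h_signal, h_jam):
    """Two-antenna received vector: desired symbol x plus jamming symbol jam."""
    return tuple(hs * x + hj * jam for hs, hj in zip(h_signal, h_jam))


def null_jammer(y, h_signal, h_jam_est):
    """Recover x by projecting y orthogonally to the estimated jamming channel."""
    # w is orthogonal to h_jam_est: w[0]*h_jam_est[0] + w[1]*h_jam_est[1] == 0
    w = (h_jam_est[1], -h_jam_est[0])
    projected = w[0] * y[0] + w[1] * y[1]
    gain = w[0] * h_signal[0] + w[1] * h_signal[1]
    return projected / gain


def residual_error(x, jam, h_signal, h_jam_true, h_jam_est):
    """Magnitude of the symbol error left after cancellation."""
    y = receive(x, jam, h_signal, h_jam_true)
    return abs(null_jammer(y, h_signal, h_jam_est) - x)
```

When `h_jam_est` equals the true jamming channel, the jamming term vanishes after projection. When the jammer has moved and the estimate is outdated, a residual proportional to the jamming amplitude remains, which is the difficulty the mobile-jammer experiments would need to quantify.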

We can explore the following open research problems for the group behavior analysis based PeerClean system.

• We can apply group behavior analysis to identify newly emerging botnets with novel botnet behaviors. We may use extracted group-level features to define the “normal” behavior of benign host groups; behaviors that deviate to a certain extent from this normal behavior are then regarded as anomalous. First, we need to gather the bots from such unseen botnets to investigate their group-level behaviors. Then, we can compare their group-level behaviors with the normal behavior to gauge each group’s abnormality, which can be used to uncover previously unidentified botnets.

• We can apply the PeerClean system to detect more types of P2P botnets. In future research, PeerClean can utilize cognitive technology to learn bot behaviors and adapt its strategy to uncover the presence of bots. In the current PeerClean system, we discovered several features based on real traffic data analysis. Future research may focus on developing advanced machine learning mechanisms that automatically extract reliable and robust features to tell bots and benign hosts apart.

• In the future, we will see how botmasters adapt bot behaviors to evade new detection systems. PeerClean can be designed to incorporate strategies that counteract the botmasters’ adaptation. The arms race between botmasters and detectors is an open problem worth studying. Finally, we can evaluate the real-time traffic analysis performance of the PeerClean system in real production networks.
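
The idea of gauging a group's abnormality against a "normal" profile can be sketched as follows. This is a hedged illustration only: the two group-level features used in the example (fraction of failed connections and mean number of peers contacted per hour) are hypothetical placeholders rather than the features used by PeerClean, and the score is a simple root-mean-square z-score rather than any classifier described in this dissertation.

```python
import math
import statistics


def fit_normal_profile(benign_groups):
    """Per-feature mean and standard deviation over benign host groups."""
    columns = list(zip(*benign_groups))
    means = [statistics.mean(col) for col in columns]
    # Fall back to 1.0 when a feature is constant, to avoid division by zero.
    stds = [statistics.stdev(col) or 1.0 for col in columns]
    return means, stds


def abnormality(group, profile):
    """Root-mean-square z-score of a group's features against the profile."""
    means, stds = profile
    z = [(g - m) / s for g, m, s in zip(group, means, stds)]
    return math.sqrt(sum(v * v for v in z) / len(z))
```

A group would be reported as anomalous when its score exceeds a threshold chosen on held-out benign data. How to set that threshold, and how to keep the profile current as benign P2P applications evolve, are left open here.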

Bibliography

[1] D. D. Clark, C. Partridge, J. C. Ramming, and J. T. Wroclawski, “A knowledge plane

for the internet,” in Proceedings of the 2003 Conference on Applications, Technologies,

Architectures, and Protocols for Computer Communications, ser. SIGCOMM ’03, 2003,

pp. 3–10.

[2] R. Thomas, D. Friend, L. DaSilva, and A. MacKenzie, “Cognitive networks: adap-

tation and learning to achieve end-to-end performance objectives,” Communications

Magazine, IEEE, vol. 44, no. 12, pp. 51–57, Dec 2006.

[3] C. Fortuna and M. Mohorcic, “Trends in the development of communication networks: Cognitive networks,” Computer Networks, pp. 1354–1376, January 2009.

[4] “Cognitive.” [Online]. Available: http://www.merriam-webster.com/dictionary/cognitive

[5] J. Mitola and G. Q. Maguire, “Cognitive radio: making software radios more personal,”

IEEE Personal Communications, vol. 6, no. 4, pp. 13–18, August 1999.

[6] “Ettus research llc.” [Online]. Available: http://www.ettus.com

[7] “Warp: Wireless open access research platform.” [Online]. Available: http://warpproject.org/trac

[8] “Information security.” [Online]. Available: http://en.wikipedia.org/wiki/Information_security

[9] A. Simmonds, P. Sandilands, and L. Van Ekert, “An ontology for network security

attacks,” Applied Computing, vol. 3285, pp. 317–323, 2004.

[10] S. Haykin, “Cognitive radio: brain-empowered wireless communications,” Selected Areas in Communications, IEEE Journal on, vol. 23, no. 2, pp. 201–220, 2005.

[11] I. F. Akyildiz, W.-Y. Lee, M. C. Vuran, and S. Mohanty, “Next generation/dynamic

spectrum access/cognitive radio wireless networks: a survey,” Computer Networks,

vol. 50, no. 13, pp. 2127–2159, Sep. 2006.

[12] S. Haykin, D. J. Thomson, and J. H. Reed, “Spectrum sensing for cognitive radio,”

Proceedings of the IEEE, vol. 97, no. 5, pp. 849–877, May 2009.

[13] I. F. Akyildiz, B. F. Lo, and R. Balakrishnan, “Cooperative spectrum sensing in cog-

nitive radio networks: a survey,” Physical Communication, vol. 4, pp. 40–62, 2011.

[14] I. F. Akyildiz, W.-Y. Lee, and K. R. Chowdhury, “CRAHN: cognitive radio ad hoc

networks,” Ad Hoc Networks, vol. 7, no. 5, pp. 810–836, July 2009.

[15] R. Chen, J.-M. Park, and J. H. Reed, “Defense against primary user emulation attacks

in cognitive radio networks,” Selected Areas in Communications, IEEE Journal on,

vol. 26, no. 1, pp. 25–37, 2008.

[16] R. Chen, J.-M. Park, and K. Bian, “Robust distributed spectrum sensing in cognitive radio networks,” in INFOCOM 2008, IEEE, April 2008, pp. 1876–1884.

[17] O. Fatemieh, R. Chandra, and C. A. Gunter, “Secure collaborative sensing for crowd-

sourcing spectrum data in white space networks,” in New Frontiers in Dynamic Spec-

trum, 2010. (DySPAN ’2010) IEEE Symposium on, April 2010, pp. 1–12.

[18] A. W. Min, K.-H. Kim, and K. G. Shin, “Robust cooperative sensing via state esti-

mation in cognitive radio networks,” in New Frontiers in Dynamic Spectrum, 2011.

(DySPAN ’2011) IEEE Symposium on, 2011.

[19] Z. Li, F. R. Yu, and M. Huang, “A distributed consensus-based cooperative spectrum-

sensing scheme in cognitive radios,” Vehicular Technology, IEEE Transactions on,

vol. 59, no. 1, pp. 383–393, 2010.

[20] F. R. Yu, M. Huang, and H. Tang, “Biologically inspired consensus-based spectrum

sensing in mobile ad hoc networks with cognitive radios,” Network, IEEE, vol. 24,

no. 3, pp. 26–30, May 2010.

[21] J. L. Burbank, “Security in cognitive radio networks: the required evolution in ap-

proaches to wireless network security,” in Cognitive Radio Oriented Wireless Networks

and Communications, 2008. CrownCom 2008. 3rd International Conference on, May

2008, pp. 1–7.

[22] Y. Liu, P. Ning, and M. K. Reiter, “False data injection attacks against state estimation

in electric power grids,” in Proceedings of the 16th ACM Conference on Computer and

Communications Security (CCS), 2009, pp. 21–32.

[23] W. Xu, W. Trappe, Y. Zhang, and T. Wood, “The feasibility of launching and detecting

jamming attacks in wireless networks,” in Proceedings of the 6th ACM International

Symposium on Mobile Ad Hoc Networking and Computing, ser. MobiHoc ’05, 2005,

pp. 46–57.

[24] M. Li, I. Koutsopoulos, and R. Poovendran, “Optimal jamming attacks and network

defense policies in wireless sensor networks,” in Proc. of IEEE INFOCOM, May 2007.

[25] M. Strasser, B. Danev, and S. Capkun, “Detection of reactive jamming in sensor networks,” ACM Transactions on Sensor Networks (TOSN), vol. 7, no. 16, pp. 1–29, 2010.

[26] T. X. Brown and A. Sethi, “Potential cognitive radio denial-of-service vulnerabilities

and protection countermeasures: a multi-dimensional analysis and assessment,” Jour-

nal of Mobile Networks and Applications, vol. 13, no. 5, pp. 516–532, Oct. 2008.

[27] S. Liu, Y. Chen, W. Trappe, and L. J. Greenstein, “Aldo: An anomaly detection

framework for dynamic spectrum access networks,” in INFOCOM 2009, IEEE, April

2009, pp. 675–683.

[28] L. Yang, Z. Zhang, B. Y. Zhao, C. Kruegel, and H. Zheng, “Enforcing dynamic spec-

trum access with spectrum permits,” in Proceedings of the thirteenth ACM interna-

tional symposium on Mobile Ad Hoc Networking and Computing, ser. MobiHoc ’12,

2012, pp. 195–204.

[29] J. Yeo, M. Youssef, and A. Agrawala, “A framework for wireless LAN monitoring and its applications,” in WiSe 2004, October 2004, pp. 70–79.

[30] Y.-C. Cheng, J. Bellardo, P. Benko, A. C. Snoeren, G. M. Voelker, and S. Savage,

“Jigsaw: Solving the puzzle of enterprise 802.11 analysis,” in SIGCOMM ’06, Sep.

2006, pp. 39–50.

[31] Y.-C. Cheng, M. Afanasyev, P. Verkaik, P. Benko, J. Chiang, A. C. Snoeren, S. Savage, and G. M. Voelker, “Automating cross-layer diagnosis of enterprise wireless networks,” in SIGCOMM ’07, Aug. 2007, pp. 25–36.

[32] A. Wood and J. Stankovic, “Denial of service in sensor networks,” Computer, vol. 35,

no. 10, pp. 54–62, 2002.

[33] K. Pelechrinis, M. Iliofotou, and S. Krishnamurthy, “Denial of service attacks in

wireless networks: The case of jammers,” Communications Surveys Tutorials, IEEE,

vol. 13, no. 2, pp. 245–257, 2011.

[34] M. Wilhelm, I. Martinovic, J. B. Schmitt, and V. Lenders, “Reactive jamming in

wireless networks - how realistic is the threat?” in Proc. of WiSec, June 2011.

[35] A. Cassola, W. Robertson, E. Kirda, and G. Noubir, “A practical, targeted, and

stealthy attack against wpa enterprise authentication,” in Proceedings of the 20th An-

nual Network and Distributed System Security Symposium (NDSS ’13), February 2013.

[36] S. Gollakota, S. D. Perli, and D. Katabi, “Interference alignment and cancellation,” in

Proc. of SIGCOMM, August 2009.

[37] S. Gollakota, F. Adib, D. Katabi, and S. Seshan, “Clearing the RF smog: Making

802.11 robust to cross-technology interference,” in Proc. of SIGCOMM, August 2011.

[38] K. C.-J. Lin, S. Gollakota, and D. Katabi, “Random access heterogeneous MIMO

networks,” in Proc. of SIGCOMM, August 2011.

[39] Damballa, “Peer-to-peer: A growing tactic used for threat command-and-control,” https://www.damballa.com/downloads/r_pubs/wp_peer-to-peer_a_growing_tactic.pdf, 2013.

[40] “Latest kelihos botnet shut down live at rsa conference 2013,” http://threatpost.com/,

2013.

[41] N. Falliere, “Sality: Story of a peer-to-peer viral network,” http://www.symantec.com/connect/downloads/whitepaper-sality-story-peer-peer-viral-network, July 2011.

[42] J. Wyke, “The zeroaccess botnet - mining and fraud for massive financial gain,” Sep.

2012.

[43] T. Werner, “Kelihos.c: Same code, new botnet, 2012,” http://www.crowdstrike.com/blog/kelihosc-same-code-new-botnet/index.html, Mar. 2012.

[44] C. Rossow, D. Andriesse, T. Werner, B. Stone-Gross, D. Plohmann, C. J. Dietrich, and

H. Bos, “P2PWNED: Modeling and evaluating the resilience of peer-to-peer botnets,”

in IEEE Symposium on Security & Privacy, May 2013.

[45] S. Nagaraja, P. Mittal, C.-Y. Hong, M. Caesar, and N. Borisov, “Botgrep: Finding

p2p bots with structured graph analysis,” in Proc. of USENIX Security’10, 2010.

[46] T.-F. Yen and M. K. Reiter, “Are your hosts trading or plotting? telling p2p file-

sharing and bots apart,” in Proc. of ICDCS, June 2010.

[47] J. Zhang, R. Perdisci, W. Lee, U. Sarfraz, and X. Luo, “Detecting stealthy p2p botnets

using statistical traffic fingerprints,” in Dependable Systems Networks (DSN), 2011

IEEE/IFIP 41st International Conference on, June 2011.

[48] Z. Xu, L. Chen, G. Gu, and C. Kruegel, “Peerpress: Utilizing enemies’ p2p strength

against them,” in Proc. of ACM CCS’12, October 2012.

[49] C. Kolbitsch, P. M. Comparetti, C. Kruegel, E. Kirda, X. Zhou, and X. Wang, “Effec-

tive and efficient malware detection at the end host,” in Proc. of USENIX Security’09,

August 2009.

[50] G. Gu, P. Porras, V. Yegneswaran, and M. Fong, “Bothunter: Detecting malware

infection through IDS-driven dialog correlation,” in Proc. of USENIX Security’07,

August 2007.

[51] G. Gu, R. Perdisci, J. Zhang, and W. Lee, “Botminer: Clustering analysis of network

traffic for protocol- and structure-independent botnet detection,” in Proc. of USENIX

Security’08, 2008.

[52] J. Zhang, X. Luo, R. Perdisci, G. Gu, W. Lee, and N. Feamster, “Boosting the scala-

bility of botnet detection using adaptive traffic sampling,” in Proc. of AsiaCCS, March

2011.

[53] L. Bilge, D. Balzarotti, W. Robertson, E. Kirda, and C. Kruegel, “DISCLOSURE:

Detecting botnet command and control servers through large-scale netflow analysis,”

in Proc. of ACSAC, Dec. 2012.

[54] T.-F. Yen and M. K. Reiter, “Revisiting botnet models and their implications for

takedown strategies,” in Proceedings of the First international conference on Principles

of Security and Trust, 2012.

[55] Q. Zhao and B. Sadler, “A survey of dynamic spectrum access,” Signal Processing

Magazine, IEEE, vol. 24, no. 3, pp. 79–89, 2007.

[56] Q. Yan, M. Li, T. Jiang, W. Lou, and T. Hou, “Vulnerability and protection for dis-

tributed consensus-based spectrum sensing in cognitive radio networks,” in INFOCOM

2012, IEEE, March 2012, pp. 900–908.

[57] Q. Yan, M. Li, F. Chen, T. Jiang, W. Lou, T. Hou, and C.-T. Lu, “Non-parametric

passive traffic monitoring in cognitive radio networks,” in INFOCOM 2013, IEEE,

April 2013, pp. 1264–1272.

[58] Q. Yan, H. Zeng, T. Jiang, M. Li, W. Lou, and T. Hou, “Mimo-based jamming resilient

communication in wireless networks,” in INFOCOM 2014, IEEE, April 2014.

[59] H. Li and Z. Han, “Catch me if you can: an abnormality detection approach for collaborative spectrum sensing in cognitive radio networks,” Wireless Communications, IEEE Transactions on, vol. 9, no. 11, pp. 3554–3565, 2010.

[60] O. Fatemieh, A. Farhadi, R. Chandra, and C. A. Gunter, “Using classification to pro-

tect the integrity of spectrum measurements in white space networks,” in Proceedings

of the 18th Annual Network and Distributed System Security Symposium (NDSS ’11),

February 2011.

[61] A. W. Min, K. G. Shin, and X. Hu, “Attack-tolerant distributed sensing for dynamic spectrum access networks,” in Network Protocols, 2009. (ICNP ’2009) 17th International Conference on, October 2009, pp. 294–303.

[62] O. Fatemieh, M. LeMay, and C. A. Gunter, “Reliable telemetry in white spaces using

remote attestation,” in Proceedings of the 27th Annual Computer Security Applications

Conference, 2011, pp. 323–332.

[63] K. Zeng, P. Pawełczak, and D. Cabric, “Reputation-based cooperative spectrum sensing with trusted nodes assistance,” Communications Letters, IEEE, vol. 14, no. 3, pp. 226–228, 2010.

[64] P. Kaligineedi, M. Khabbazian, and V. Bhargava, “Secure cooperative sensing tech-

niques for cognitive radio systems,” in IEEE International Conference on Communi-

cations, ICC ’08, 2008, pp. 3406–3410.

[65] P. Kaligineedi, M. Khabbazian, and V. K. Bhargava, “Malicious user detection in a

cognitive radio cooperative sensing system,” Wireless Communications, IEEE Trans-

actions on, vol. 9, no. 8, pp. 2488–2497, 2010.

[66] F. R. Yu, H. Tang, M. Huang, Z. Li, and P. C. Mason, “Defense against spectrum

sensing data falsification attacks in mobile ad hoc networks with cognitive radios,” in

Military Communications Conference, 2009. MILCOM 2009. IEEE, October 2009, pp.

1–7.

[67] R. Zhang, J. Zhang, Y. Zhang, and C. Zhang, “Secure crowdsourcing-based cooperative

spectrum sensing,” in INFOCOM 2013, IEEE, April 2013.

[68] S. Li, H. Zhu, Z. Gao, X. Guan, and K. Xing, “Yousense: Mitigating entropy selfishness

in distributed collaborative spectrum sensing,” in INFOCOM 2013, IEEE, April 2013.

[69] C. Wang, X.-Y. Li, C. Jiang, S. Tang, Y. Liu, and J. Zhao, “Scaling laws on multicast capacity of large scale wireless networks,” in INFOCOM 2009, IEEE, April 2009, pp. 1863–1871.

[70] W. Wang, H. Li, Y. Sun, and Z. Han, “Securing collaborative spectrum sensing against

untrustworthy secondary users in cognitive radio networks,” EURASIP Journal on

Advances in Signal Processing, vol. 2010, Jan. 2010.

[71] Y. Liu, P. Ning, and H. Dai, “Authenticating primary users’ signals in cognitive radio

networks via integrated cryptographic and wireless link signatures,” in Security and

Privacy (SP), 2010 IEEE Symposium on, 2010, pp. 286–301.

[72] X. Tan, K. Borle, W. Du, and B. Chen, “Cryptographic link signatures for spectrum

usage authentication in cognitive radio,” in Proceedings of the fourth ACM conference

on Wireless network security, 2011, pp. 79–90.

[73] Z. Jin, S. Anand, and K. P. Subbalakshmi, “Mitigating primary user emulation attacks

in dynamic spectrum access networks using hypothesis testing,” SIGMOBILE Mobile

Computing and Communications Review, vol. 13, no. 2, pp. 74–85, Sep. 2009.

[74] S. Anand, Z. Jin, and K. P. Subbalakshmi, “An analytical model for primary user

emulation attacks in cognitive radio networks,” in New Frontiers in Dynamic Spectrum

Access Networks, 2008. DySPAN 2008. 3rd IEEE Symposium on, 2008, pp. 1–6.

[75] H. Chan, A. Perrig, and D. Song, “Secure hierarchical in-network aggregation in sensor

networks,” in CCS ’06 Proceedings of the 13th ACM conference on Computer and

communications security, October 2006.

[76] A. Goldsmith, Wireless Communications. Cambridge University Press, 2005.

[77] R. Olfati-Saber, J. A. Fax, and R. M. Murray, “Consensus and cooperation in net-

worked multi-agent systems,” Proceedings of the IEEE, vol. 95, no. 1, pp. 215–233,

2007.

[78] R. Tandra, S. M. Mishra, and A. Sahai, “What is a spectrum hole and what does it

take to recognize one?” Proceedings of the IEEE, vol. 97, no. 5, pp. 824–848, May

2009.

[79] R. A. Horn and C. R. Johnson, Matrix analysis. Cambridge University Press, 2005.

[80] W. Ren, R. W. Beard, and E. M. Atkins, “A survey of consensus problems in multi-

agent coordination,” in American Control Conference, 2005. Proceedings of the 2005,

vol. 3, June 2005, pp. 1859–1864.

[81] V. Yadav and M. V. Salapaka, “Distributed protocol for determining when averaging

consensus is reached,” in Proceedings of 2007 Allerton Conference on Communication,

Control, and Computing, 2007.

[82] F. Mosteller and J. W. Tukey, Data analysis and regression: A second course in statis-

tics. Addison-Wesley Publishing Company, 1977.

[83] A. Perrig, R. Szewczyk, J. Tygar, V. Wen, and D. E. Culler, “SPINS: Security protocols

for sensor networks,” Wireless Networks, vol. 8, pp. 521–534, 2002.

[84] I. Khalil, S. Bagchi, and N. B. Shroff, “LITEWORP: a lightweight countermeasure

for the wormhole attack in multihop wireless networks,” in DSN 2005, Dependable

Systems and Networks, 2005.

[85] A. Balachandran, G. M. Voelker, P. Bahl, and P. V. Rangan, “Characterizing user behavior and network performance in a public wireless LAN,” in SIGMETRICS 2002, ACM, June 2002, pp. 195–205.

[86] D.-H. Shin and S. Bagchi, “Optimal monitoring in multi-channel multi-radio wireless

mesh networks,” in MobiHoc’09, May 2010.

[87] A. Chhetri, H. Nguyen, G. Scalosub, and R. Zheng, “On quality of monitoring for

multi-channel wireless infrastructure networks,” in MobiHoc’10, Sep. 2010.

[88] P. Arora, C. Szepesvari, and R. Zheng, “Sequential learning for optimal monitoring of multi-channel wireless networks,” in INFOCOM 2011, IEEE, 2011.

[89] S. Chen, Z. Kai, and P. Mohapatra, “Efficient data capturing for network forensics

in cognitive radio networks,” in Network Protocols, 2011. (ICNP ’2011) 19th Interna-

tional Conference on, 2011, pp. 176–185.

[90] S. Yi, K. Zeng, and J. Xu, “Secondary user monitoring in unslotted cognitive radio

networks with unknown models,” in Wireless Algorithms, Systems, and Applications,

ser. Lecture Notes in Computer Science, vol. 7405, 2012, pp. 648–659.

[91] P. Bahl, R. Chandra, T. Moscibroda, R. Murty, and M. Welsh, “White space network-

ing with wi-fi like connectivity,” in SIGCOMM ’09, August 2009, pp. 27–38.

[92] D. Cabric, A. Tkachenko, and R. W. Brodersen, “Experimental study of spectrum sensing based on energy detection and network cooperation,” in Proceedings of the first international workshop on Technology and policy for accessing spectrum, ser. TAPAS ’06, 2006.

[93] D. Murray, M. Dixon, and T. Koziniec, “Scanning delays in 802.11 networks,” in The

2007 International Conference on Next Generation Mobile Applications, Services and

Technologies, Sept. 2007, pp. 255–260.

[94] S. Narlanka, R. Chandra, P. Bahl, and J. I. Ferrell, “A hardware platform for utilizing

tv bands with a wi-fi radio,” in IEEE LANMAN, June 2007.

[95] Z. I. Botev, J. F. Grotowski, and D. P. Kroese, “Kernel density estimation via diffu-

sion,” Annals of Statistics, vol. 38, no. 5, pp. 2916–2957, Nov. 2010.

[96] C. Heinz and B. Seeger, “Towards kernel density estimation over streaming data,” in

International Conference on Management of Data (COMAD), Dec. 2006.

[97] G. W. Corder and D. I. Foreman, Nonparametric Statistics for Non-Statisticians: A

Step-by-Step Approach. New York, USA: Wiley, 2009.

[98] A. Srinivasan, “Distributions on level-sets with applications to approximation algo-

rithms,” in FOCS, 2001.

[99] F. Zhang, W. He, X. Liu, and P. G. Bridges, “Inferring users’ online activities through

traffic analysis,” in WiSec 2011, June 2011, pp. 59–69.

[100] “Airpcap Adapter,” http://www.riverbed.com/products-solutions/products/network-performance-management/wireshark-enhancement-products/, 2013.

[101] V. Brik, S. Banerjee, M. Gruteser, and S. Oh, “Wireless device identification with

radiometric signatures,” in Proceedings of the 14th ACM international conference on

Mobile computing and networking, ser. MobiCom ’08, 2008, pp. 116–127.

[102] S. Liu, L. Lazos, and M. Krunz, “Thwarting inside jamming attacks on wireless broad-

cast communications,” in Proc. of Wisec, June 2011.

[103] R. Zhang, Y. Zhang, and X. Huang, “JR-SND: jamming-resilient secure neighbor dis-

covery in mobile ad hoc networks,” in Proc. of ICDCS, June 2011, pp. 529–538.

[104] M. Strasser, C. Pöpper, S. Capkun, and M. Cagalj, “Jamming-resistant key establishment using uncoordinated frequency hopping,” in Proc. of IEEE S&P, May 2008.

[105] Y. Liu, P. Ning, H. Dai, and A. Liu, “Randomized differential DSSS: jamming-resistant

wireless broadcast communication,” in Proc. of IEEE INFOCOM, March 2010.

[106] G. Noubir, R. Rajaraman, B. Sheng, and B. Thapa, “On the robustness of IEEE 802.11 rate adaptation algorithms against smart jamming,” in Proc. of WiSec, June 2011.

[107] W. Xu, W. Trappe, and Y. Zhang, “Anti-jamming timing channels for wireless net-

works,” in Proc. of WiSec, 2008.

[108] Y. Liu and P. Ning, “Bittrickle: Defending against broadband and high-power reactive

jamming attacks,” in Proc. of IEEE INFOCOM, 2012.

[109] T. D. Vo-Huu, E.-O. Blass, and G. Noubir, “Counter-jamming using mixed mechanical

and software interference cancellation,” in Proc. of WiSec, April 2013.

[110] R. Miller and W. Trappe, “Subverting MIMO wireless systems by jamming the channel

estimation procedure,” in Proc. of WiSec, March 2010.

[111] M. Han, T. Yu, J. Kim, K. Kwak, S. Lee, S. Han, and D. Hong, “OFDM channel esti-

mation with jammed pilot detector under narrow-band jamming,” IEEE Transactions

on Vehicular Technology, vol. 57, no. 3, pp. 1934–1939, 2008.

[112] T. Clancy, “Efficient OFDM denial: Pilot jamming and pilot nulling,” in Proc. of ICC,

2011.

[113] W.-L. Shen, Y.-C. Tung, K.-C. Lee, K. C.-J. Lin, S. Gollakota, D. Katabi, and M.-S.

Chen, “Rate adaptation for 802.11 multiuser MIMO networks,” in Proc. of MobiCom,

August 2012.

[114] H. Kim and K. G. Shin, “In-band spectrum sensing in cognitive radio networks: energy

detection or feature detection?” in MobiCom 2008, September 2008, pp. 14–25.

[115] D. Giustiniano, V. Lenders, J. B. Schmitt, M. Spuhler, and M. Wilhelm, “Detection

of reactive jamming in dsss-based wireless networks,” in Proceedings of the Sixth ACM

Conference on Security and Privacy in Wireless and Mobile Networks, ser. WiSec ’13,

2013, pp. 43–48.

[116] D. Tse and P. Viswanath, Fundamentals of Wireless Communication. Cambridge

University Press, 2005.

[117] W. Xu, W. Trappe, and Y. Zhang, “Channel surfing and spatial retreats: Defenses

against wireless denial of service,” in Proc. of WiSe, 2004.

[118] S. Gollakota and D. Katabi, “ZigZag decoding: Combating hidden terminals in wireless

networks,” in Proc. of SIGCOMM, August 2008, pp. 159–170.

[119] K. Miller, A. Sanne, K. Srinivasan, and S. Vishwanath, “Enabling real-time inter-

ference alignment: promises and challenges,” in Proceedings of the thirteenth ACM

international symposium on Mobile Ad Hoc Networking and Computing, 2012, pp.

55–64.

[120] W. Stallings, Data and Computer Communications (9th Edition). Prentice Hall, 2010.

[121] G. Gu, J. Zhang, and W. Lee, “Botsniffer: Detecting botnet command and control

channels in network traffic,” in Proc. of NDSS’08, 2008.

[122] T.-F. Yen and M. K. Reiter, “Traffic aggregation for malware detection,” in Proc. of

DIMVA ’08, 2008.

[123] P. Wurzinger, L. Bilge, T. Holz, J. Goebel, C. Kruegel, and E. Kirda, “Automatically

generating models for botnet detection,” in Proc. of ESORICS’09, 2009, pp. 232–249.

[124] H. Zhang, D. Yao, and N. Ramakrishnan, “Detection of stealthy malware activities with traffic causality and scalable triggering relation discovery,” in Proceedings of the 9th ACM Symposium on Information, Computer and Communications Security (ASIACCS ’14), 2014, pp. 39–50.

[125] B. Coskun, S. Dietrich, and N. Memon, “Friends of an enemy: Identifying local members of peer-to-peer botnets using mutual contacts,” in Proc. of ACSAC, 2010.

[126] Y. Zhao, Y. Xie, F. Yu, Q. Ke, Y. Yu, Y. Chen, and E. Gillum, “Botgraph: large scale

spamming botnet detection,” in Proc. of NSDI’09, 2009.

[127] M. Jelasity and V. Bilicki, “Towards automated detection of peer-to-peer botnets: on

the limits of local approaches,” in Proc. of LEET’09, 2009.

[128] L. Li, S. Mathur, and B. Coskun, “Gangs of the internet: Towards automatic discovery

of peer-to-peer communities in the internet,” in Proc. of CNS, 2013, pp. 167–175.

[129] D. Dagon, G. Gu, C. Lee, and W. Lee, “A taxonomy of botnet structures,” in Proc.

of ACSAC’07, 2007.

[130] S. Sen, O. Spatscheck, and D. Wang, “Accurate, scalable in-network identification of

p2p traffic using application signatures,” in Proc. of WWW’04, 2004, pp. 512–521.

[131] D. Stutzbach and R. Rejaie, “Understanding churn in peer-to-peer networks,” in Pro-

ceedings of the 6th ACM SIGCOMM conference on Internet measurement, October

2006.

[132] O. Maimon and L. Rokach, Data Mining and Knowledge Discovery Handbook.

Springer-Verlag New York, Inc., 2005.

[133] B. J. Frey and D. Dueck, “Clustering by passing messages between data points,”

Science, vol. 315, no. 5814, pp. 972–976, 2007.

[134] P.-N. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining. Addison-

Wesley, 2005.

[135] K. Xu, Z.-L. Zhang, and S. Bhattacharyya, “Profiling internet backbone traffic: Be-

havior models and applications,” in Proc. of SIGCOMM, August 2005.

[136] C. M. Bishop, Pattern Recognition and Machine Learning. Springer, 2006.

[137] “Autoit script,” http://www.autoitscript.com/site/autoit/, 2013.

[138] A. Arning, R. Agrawal, and P. Raghavan, “A linear method for deviation detection

in large databases,” in KDD 1996. 2nd ACM International Conference on Knowledge

Discovery and Data Mining, 1996, pp. 164–169.

[139] M. Ester, H.-p. Kriegel, J. Sander, and X. Xu, “A density-based algorithm for discov-

ering clusters in large spatial databases with noise,” in KDD 1996. 2nd ACM Interna-

tional Conference on Knowledge Discovery and Data Mining, 1996, pp. 226–231.

[140] M. Breunig, H.-P. Kriegel, R. T. Ng, and J. Sander, “LOF: Identifying density-based

local outliers,” in The 2000 ACM SIGMOD International Conference on Management

of Data, Proceedings of, 2000, pp. 93–104.

[141] V. Hodge and J. Austin, “A survey of outlier detection methodologies,” Artificial

Intelligence Review, vol. 22, no. 2, pp. 85–126, Oct. 2004.

[142] M. Ding, D. Chen, K. Xing, and X. Cheng, “Localized fault-tolerant event boundary

detection in sensor networks,” in INFOCOM 2005. 24th Annual Joint Conference of

the IEEE Computer and Communications Societies. Proceedings IEEE, 2005.

[143] R. Mitchell and I. Chen, “Effect of intrusion detection and response on reliability of

cyber physical systems,” Reliability, IEEE Transactions on, vol. 62, no. 1, pp. 199–210,

March 2013.

[144] J. Mirkovic, G. Prier, and P. Reiher, “Attacking DDoS at the source,” in Network

Protocols, 2002. Proceedings. 10th IEEE International Conference on, 2002, pp. 312–

321.

[145] A. W. Moore and D. Zuev, “Internet traffic classification using bayesian analysis tech-

niques,” in Proceedings of the 2005 ACM SIGMETRICS international conference on

Measurement and modeling of computer systems, ser. SIGMETRICS ’05, 2005, pp.

50–60.

[146] N. L. Hjort, C. Holmes, P. Muller, and S. G. Walker, Bayesian Nonparametrics. Cam-

bridge University Press, April 2010.
