Exploiting Temporal Persistence to Detect Covert Botnet Channels
Authors: Frederic Giroire, Jaideep Chandrashekar, Nina Taft…
RAID 2009
Reporter: Jing Chiu
Email: [email protected]

112/04/18 Data Mining & Machine Learning Lab

Outline

• Introduction
• Methodology
• Dataset Description
• Evaluation
• Conclusions


Introduction

• How do bots evade existing defenses?
  ▫ Polymorphic engines
  ▫ Packing engines
  ▫ AV vendors report 3,000 distinct samples daily
• Anomaly detection methods for botnets
  ▫ Use traffic feature distributions for analysis
  ▫ Detect bots only once they are activated to generate attacks
  ▫ Latency exists between infection and activation
• Covert channel between bots and the C&C server
  ▫ Lasts for an extended period
  ▫ Lightweight and spaced out over irregular time periods

Methodology

• Assumptions
  ▫ Communication between a zombie and its C&C server is not limited to a few connections
  ▫ A zombie is not programmed to use a completely new C&C server at each new attempt
• Persistence and destination atoms
  ▫ Destination atoms for building whitelists
  ▫ Persistence for capturing lightweight repetition

Methodology (cont.)

• Why use whitelists?
  ▫ The hosts a user communicates with regularly form a small, stable set
      Examples:
        Work-related, news, and entertainment websites
        Mail servers, update servers, patch servers, RSS feeds
      Advantages:
        Fast to search
        Easy to manage
  ▫ These hosts require infrequent updating
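The whitelist idea above can be illustrated with a minimal sketch. It assumes the whitelist is the set of destination atoms whose persistence during a training period exceeds a threshold p* (0.6 is the value quoted on the evaluation slides); the function names and the sample persistence values are hypothetical and serve only as illustration.

```python
def build_whitelist(training_persistence, p_star=0.6):
    """Whitelist every atom whose training-period persistence exceeds p_star.

    training_persistence maps an atom (dstService, dstPort, proto) to a
    persistence value in [0, 1]. Assumption: this is how the list is built.
    """
    return frozenset(
        atom for atom, p in training_persistence.items() if p > p_star
    )


def filter_unknown(atoms, whitelist):
    """Keep only atoms not on the whitelist (set lookup is O(1) on average)."""
    return [atom for atom in atoms if atom not in whitelist]


# Made-up training values, for illustration only.
training = {
    ("mail.intel.com", 25, "tcp"): 0.9,
    ("yahoo.com", 80, "tcp"): 0.7,
    ("example.org", 80, "tcp"): 0.1,
}
whitelist = build_whitelist(training)
suspects = filter_unknown(
    [("yahoo.com", 80, "tcp"), ("198.51.100.7", 6667, "tcp")], whitelist
)
print(suspects)
```

Storing the whitelist as a set reflects the "fast to search" advantage: only atoms that miss the lookup need further tracking.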

Methodology (cont.)

• Destination atoms
  ▫ (dstService, dstPort, proto)
  ▫ Destinations in other domains: second-level domain name
      yahoo.com, cisco.com
  ▫ Destinations in the same domain: third-level domain name
      mail.intel.com, print.intel.com
  ▫ Multiple ports are allowed
      (ftp.service.com, 21:>1024, tcp)
  ▫ When an address cannot be mapped to a name, use the IP address as the service name
  ▫ Examples
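The naming rules for destination atoms can be sketched as follows. This is a simplified illustration, not the authors' implementation: `LOCAL_DOMAIN`, `dst_service`, and `destination_atom` are hypothetical names, the local domain is assumed to be a two-label name such as intel.com, label counting is naive (it ignores public suffixes such as co.uk), and the merging of multiple ports into one atom is not modelled.

```python
import ipaddress

LOCAL_DOMAIN = "intel.com"  # assumption: the enterprise's own two-label domain


def dst_service(host, local_domain=LOCAL_DOMAIN):
    """Map a host name (or a raw IP address) to a service name."""
    try:
        ipaddress.ip_address(host)
        return host  # no name available: the IP address is the service name
    except ValueError:
        pass
    name = host.lower().rstrip(".")
    labels = name.split(".")
    # Hosts in the local domain keep three labels, all others keep two.
    is_local = name == local_domain or name.endswith("." + local_domain)
    keep = 3 if is_local else 2
    return ".".join(labels[-keep:])


def destination_atom(host, port, proto, local_domain=LOCAL_DOMAIN):
    """Build the (dstService, dstPort, proto) triple for one connection."""
    return (dst_service(host, local_domain), port, proto.lower())


print(destination_atom("www.mail.yahoo.com", 80, "tcp"))
print(destination_atom("smtp.mail.intel.com", 25, "TCP"))
print(destination_atom("10.1.2.3", 6667, "tcp"))
```

Collapsing host names to a service name means that connections to different front-end machines of the same service count toward the same atom.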

Methodology (cont.)

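• Persistence metric
  ▫ d: destination atom
  ▫ W: observation window, $W = [s_1, s_2, \ldots, s_n]$
  ▫ $s_i$: measurement window
  ▫ Timescale: (W, s)
  ▫ Persistence of d over W: $p(d, W) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}_{d, s_i}$, where $\mathbf{1}_{d, s_i}$ is 1 if d is contacted during $s_i$ and 0 otherwise
  ▫ For each timescale $(W_j, s_j)$, $1 \le j \le k$, compute $p^{(j)}(d) = p(d, W_j)$; d is persistent if $\max_j p^{(j)}(d) > p^*$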
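The persistence metric can be illustrated with a short sketch. It assumes connection times are available as plain numbers (hours here) and that each timescale is given as a pair (number of measurement windows, window length); the function names and the sample timestamps are hypothetical.

```python
def contacted_slots(timestamps, slot_len, n, start=0.0):
    """Indices of the measurement windows (each slot_len long) that contain
    at least one connection, within an observation window of n slots."""
    end = start + n * slot_len
    return {
        min(int((t - start) // slot_len), n - 1)  # clamp against float edge cases
        for t in timestamps
        if start <= t < end
    }


def persistence(timestamps, slot_len, n, start=0.0):
    """p(d, W): fraction of the n measurement windows in which d was contacted."""
    if n <= 0 or slot_len <= 0:
        raise ValueError("n and slot_len must be positive")
    return len(contacted_slots(timestamps, slot_len, n, start)) / n


def max_persistence(timestamps, timescales, start=0.0):
    """Maximum of p(d, W_j) over the timescales, given as (n_j, s_j) pairs."""
    return max(persistence(timestamps, s, n, start) for n, s in timescales)


def is_persistent(timestamps, timescales, p_star=0.6, start=0.0):
    """True if the maximum persistence exceeds the threshold p_star."""
    return max_persistence(timestamps, timescales, start) > p_star


# Hypothetical connection times (hours) for one atom, and two timescales.
times = [0.5, 1.2, 5.0, 9.9, 13.0, 20.0]
scales = [(10, 1.0), (10, 4.0)]  # (number of slots, slot length in hours)
print(max_persistence(times, scales))
```

Taking the maximum over several timescales lets sparse but regular traffic stand out at a coarse window length even when it looks sporadic at a fine one.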

Methodology (cont.)

• C&C detection implementation
  ▫ Use a long bitmap to track connections at each timescale
  ▫ Procedure
      Update the bitmap and recompute the persistence
      If the updated persistence crosses the threshold p*, raise an alarm
      If, after enough samples, the persistence is still below the threshold, free the bitmap
• Bitmap example
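The procedure above can be sketched for a single timescale as follows. This is a hedged illustration rather than the authors' code: `AtomTracker` and `close_window` are hypothetical names, "enough samples" is assumed to mean n measurement windows, and an alarm is reported only the first time an atom crosses p*.

```python
class AtomTracker:
    """Bitmap over the last n measurement windows for one destination atom."""

    def __init__(self, n):
        self.n = n
        self.bits = 0         # bit 0 is the current measurement window
        self.seen = 0         # measurement windows observed so far
        self.alarmed = False  # whether an alarm was already raised

    def mark(self):
        """Record a connection in the current measurement window."""
        self.bits |= 1

    def shift(self):
        """Open a new measurement window, dropping the oldest one."""
        self.bits = (self.bits << 1) & ((1 << self.n) - 1)

    def persistence(self):
        return bin(self.bits).count("1") / self.n


def close_window(trackers, contacted, n, p_star=0.6):
    """Process one finished measurement window; return newly alarmed atoms."""
    for atom in contacted:
        if atom not in trackers:
            trackers[atom] = AtomTracker(n)
        trackers[atom].mark()
    alarms = []
    for atom, tracker in list(trackers.items()):
        tracker.seen += 1
        if tracker.persistence() > p_star:
            if not tracker.alarmed:
                tracker.alarmed = True
                alarms.append(atom)
        elif tracker.seen >= n:
            del trackers[atom]  # enough samples, still below p*: free the bitmap
            continue
        tracker.shift()
    return alarms


# Atom "A" is contacted in four consecutive windows, atom "B" only once.
trackers = {}
history = [{"A", "B"}, {"A"}, {"A"}, {"A"}, set()]
results = [close_window(trackers, window, n=5) for window in history]
print(results)
```

A full implementation would keep one such bitmap per atom and per timescale and skip whitelisted atoms before tracking.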

Dataset Description

• End-host traffic traces
  ▫ Collected on the hosts of 350 enterprise users
  ▫ Over 5 weeks
  ▫ 157 of the 350 traces are used, covering a common 4-week period
• Botnet traffic traces
  ▫ Collected 55 known botnet binaries
  ▫ Executed inside a Windows XP SP2 VM and run for as long as a week
  ▫ Experience
      Many binaries simply crashed the VM
      C&C servers deactivated
      Only 27 binaries yielded traffic
      12 of the 27 binaries yielded traffic that lasted more than a day
  ▫ List of sampled botnet binaries

Evaluation

• System properties
  ▫ Figure: CDF of p(d) across all the atoms seen in training data
  ▫ Figure: distribution of per-host whitelist sizes (p* = 0.6)

Evaluation

• C&C detection
• Other results
  ▫ Figure: ROC curve
  ▫ Figure: false positives across users (p* = 0.6)

Evaluation

•Improvement in detection rate after filtering


Conclusions

• Introduces “persistence” as a temporal measure of the regularity of connections to “destination atoms”
• Persistence can help detect malware without relying on
  ▫ protocol semantics
  ▫ payloads
• Proposes a method for detecting C&C servers that produced no false negatives in the experiments
• Both centralized and P2P infrastructures can be uncovered by this method
• Low overhead and low user annoyance factor

Destination atoms


Bitmap Example


List of Botnet binaries


C&C detection result
