Exploiting Temporal Persistence to Detect Covert Botnet Channels
Authors: Frederic Giroire, Jaideep Chandrashekar, Nina Taft…
RAID 2009
Reporter: Jing Chiu
Email: [email protected]
112/04/18, Data Mining & Machine Learning Lab
Outline
•Introduction
•Methodology
•Dataset Description
•Evaluation
•Conclusions
Introduction
•How do bots evade existing defenses?
  ▫Polymorphic engines
  ▫Packing engines
  ▫An AV vendor reports 3,000 distinct samples daily
•Anomaly detection methods for botnets
  ▫Use traffic feature distributions for analysis
  ▫Detect bots once they are activated to generate attacks
  ▫A latency exists between infection and activation
•Covert channel between bots and the C&C server
  ▫Lasts for an extended period
  ▫Lightweight and spaced out over irregular time periods
Methodology
•Assumptions
  ▫Communication between a zombie and its C&C server is not limited to a few connections
  ▫A zombie is not programmed to use a completely new C&C server at each new attempt
•Persistence and destination atoms
  ▫Destination atoms for building white lists
  ▫Persistence for capturing lightweight, repeated communication
Methodology (cont.)
•Why use white lists?
  ▫The hosts a machine communicates with regularly form a small, stable set
    Examples:
      Work-related, news, and entertainment websites
      Mail servers, update servers, patch servers, RSS feeds
    Advantages:
      Fast to search
      Easy to manage
  ▫The list of these hosts requires infrequent updating
Methodology (cont.)
•Destination atoms
  ▫An atom is a triple (dstService, dstPort, proto)
  ▫Destinations in different domains are named by second-level domain name
    yahoo.com, cisco.com
  ▫Destinations in the same domain are named by third-level domain name
    mail.intel.com, print.intel.com
  ▫Multiple ports are allowed in one atom
    (ftp.service.com, 21:>1024, tcp)
  ▫When an address cannot be mapped to a name, the IP address is used as the service name
  ▫Examples (shown on the slide)
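The atom rules above can be sketched in Python. This is a minimal illustration, not the authors' implementation: `service_name`, `destination_atom`, and the `local_domain` parameter are hypothetical names, the label counting is naive (it ignores public suffixes such as co.uk), and port ranges like 21:>1024 are not merged.

```python
import ipaddress


def service_name(host: str, local_domain: str = "intel.com") -> str:
    """Reduce a hostname to the service name used in a destination atom.

    `local_domain` is an assumed parameter naming the enterprise's own domain.
    """
    host = host.lower().rstrip(".")
    try:
        ipaddress.ip_address(host)
        return host  # no name available: the IP address is the service name
    except ValueError:
        pass
    labels = host.split(".")
    # Keep three labels inside the local domain, two labels elsewhere.
    inside = host == local_domain or host.endswith("." + local_domain)
    keep = 3 if inside else 2
    return ".".join(labels[-keep:])


def destination_atom(host: str, port: int, proto: str,
                     local_domain: str = "intel.com") -> tuple:
    """Build the (dstService, dstPort, proto) triple for one connection."""
    return (service_name(host, local_domain), port, proto.lower())
```

For instance, `destination_atom("www.yahoo.com", 80, "TCP")` collapses to the second-level name, while `destination_atom("mail.intel.com", 25, "tcp")` keeps the third-level name because it falls inside the assumed local domain.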
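Methodology (cont.)
•Persistence metric
  ▫d: destination atom
  ▫W: observation window, W = [s1, s2, …, sn]
  ▫si: measurement window
  ▫Timescale: (W, s)
  ▫Persistence of d over W, where $\mathbf{1}_{d, s_i}$ is 1 if the host contacted d during $s_i$ and 0 otherwise:
    $$p(d, W) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}_{d, s_i}$$
  ▫For each timescale (Wj, sj), 1 ≤ j ≤ k, compute $p^{(j)}(d)$; d is persistent if
    $$\max_{j} p^{(j)}(d) > p^{*}$$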
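A small Python sketch may help make the persistence metric concrete. It assumes connection timestamps in seconds for a single destination atom and an observation window that starts at `start` and spans `n` measurement windows of length `slot`; the function names and the default `n = 10` are illustrative choices, not values taken from the slides.

```python
from typing import Iterable, Sequence


def persistence(timestamps: Iterable[float], start: float,
                slot: float, n: int) -> float:
    """p(d, W): fraction of the n measurement windows with >= 1 connection."""
    hit = set()
    for t in timestamps:
        i = int((t - start) // slot)  # index of the measurement window
        if 0 <= i < n:                # ignore connections outside W
            hit.add(i)
    return len(hit) / n


def max_persistence(timestamps: Iterable[float], start: float,
                    slots: Sequence[float], n: int = 10) -> float:
    """Maximum of p over several timescales, one per slot length."""
    ts = list(timestamps)
    return max(persistence(ts, start, s, n) for s in slots)


# One connection per hour for ten hours, checked at hourly and daily slots.
hourly = [h * 3600 for h in range(10)]
p = max_persistence(hourly, start=0, slots=[3600, 86400], n=10)
is_persistent = p > 0.6  # 0.6 is the p* value quoted in the evaluation slides
```

The hourly traffic fills every one-hour window but only a single one-day window, which is why the maximum is taken over timescales: an atom only has to look regular at one of them to be flagged.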
Methodology (cont.)
•C&C Detection Implementation
  ▫Use a long bitmap to track connections at each timescale
  ▫Procedure
    Update the bitmap and recount the persistence
    If the updated persistence crosses the threshold p*, raise an alarm
    If, after enough samples, the persistence is still below the threshold, free the bitmap
•Bitmap example (figure on the slide)
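The procedure above can be sketched as a per-atom tracker. This is a hedged illustration for one timescale only: the class name `AtomTracker`, the return strings, and the choice of "enough samples" as `n` closed windows are assumptions, and whitelist lookups are left out.

```python
class AtomTracker:
    """Tracks one destination atom at one timescale with an n-bit bitmap."""

    def __init__(self, n: int, p_star: float):
        self.n = n            # measurement windows per observation window
        self.p_star = p_star  # persistence threshold p*
        self.bits = 0         # lowest bit = most recent measurement window
        self.seen = 0         # measurement windows closed so far

    def end_window(self, connected: bool) -> str:
        """Close one measurement window; return 'alarm', 'free', or 'keep'."""
        mask = (1 << self.n) - 1
        # Shift in the newest window and drop the oldest one.
        self.bits = ((self.bits << 1) | int(connected)) & mask
        self.seen += 1
        p = bin(self.bits).count("1") / self.n
        if p > self.p_star:
            return "alarm"
        if self.seen >= self.n:
            return "free"     # assumed meaning of "enough samples"
        return "keep"


tracker = AtomTracker(n=10, p_star=0.6)
states = [tracker.end_window(True) for _ in range(7)]
```

With ten windows and p* = 0.6, six active windows give a persistence of exactly 0.6, which does not cross the threshold; the seventh raises the alarm. An atom that stays quiet is released after `n` windows so its bitmap can be reclaimed.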
Dataset Description
•End-host traffic traces
  ▫Collected on the hosts of 350 enterprise users
  ▫Over 5 weeks
  ▫157 of the 350 traces are used, covering a common 4-week period
•Botnet traffic traces
  ▫Collected 55 known botnet binaries
  ▫Each was executed inside a Windows XP SP2 VM and run for as long as a week
  ▫Experience
    A lot of binaries simply crashed the VM
    C&C deactivated
    Only 27 binaries yielded traffic
    12 of the 27 binaries yielded traffic that lasted more than a day
  ▫List of sampled botnet binaries (table on the slide)
Evaluation
•System Properties
  ▫Figure: CDF of p(d) across all the atoms seen in the training data
  ▫Figure: distribution of per-host whitelist sizes (p* = 0.6)
Evaluation
•C&C Detection
  ▫Figure: ROC curve
•Other results
  ▫Figure: false positives across users (p* = 0.6)
Evaluation
•Improvement in detection rate after filtering
Conclusions
•Introduces “persistence” as a temporal measure of the regularity of connections to “destination atoms”
•Persistence can help detect malware without
  ▫protocol semantics
  ▫payloads
•Proposes a method for detecting C&C servers, with no false negatives in the experiments
•Both centralized and P2P infrastructures can be uncovered by this method
•Low overhead and low user annoyance factor