Presentation by Kathleen Stoeckle
All Your iFRAMEs Point to Us
17th USENIX Security Symposium (Security'08), San Jose, CA, 2008
Google Technical Report
Niels Provos, Panayiotis Mavrommatis, Moheeb Abu Rajab, Fabian Monrose
Slide 2
Outline
  Purpose
  Background Information
  Data Collection
  Results
  Post-Infection Impact
  Related Work
  Conclusions
  Strengths and Weaknesses
Slide 3
Purpose
  Analysis of malware using malicious URLs collected over a ten-month period.
  Identify malware trends.
  Raise questions about the security practices employed by site administrators.
Slide 4
Background Information
Slide 5
Techniques for Delivering Web-Malware
  1. Attackers use websites to entice visitors into downloading and running malware.
  2. Drive-by downloads: attackers target browser vulnerabilities to automatically download and run a malicious binary when the user visits the website, without the user's knowledge.
Slide 6
Definitions
  Landing pages and malicious URLs: URLs that initiate drive-by downloads when users visit them.
  Landing sites: landing pages grouped by top-level domain name.
  Distribution site: a remote site that hosts malicious payloads.
  iFRAME: an HTML element that makes it possible to embed an HTML document inside another HTML document.
Slide 7
Existing Malware Installation Strategies
  Remote exploitation of vulnerable network services
  Connection to malicious servers
  Inject malicious content into benign websites
  Exploit scripting applications
Slide 8
Malicious Binary Injection Techniques
  Lure web users to connect to malicious servers that deliver exploits (targeting vulnerabilities of web browsers or plugins).
  Inject content into benign websites:
    Exploit vulnerable scripting applications (p.4).
      Generally a link that redirects to a malicious website hosting the script that exploits the browser.
      Invisible HTML components (0-pixel iFRAMEs) hide the injected content.
    Use websites that allow users to contribute content.
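The 0-pixel iFRAME technique above can be illustrated with a small detector. This is a minimal sketch, not the authors' tooling: the class and function names and the example URLs are hypothetical, and it only checks the width and height attributes (an iFRAME hidden through CSS would not be caught).

```python
from html.parser import HTMLParser


class HiddenIframeFinder(HTMLParser):
    """Collects the src of every iframe whose width or height is zero."""

    def __init__(self):
        super().__init__()
        self.hidden_sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "iframe":
            return
        attr = dict(attrs)
        # Attribute values may be None (valueless attribute), hence "or ''".
        dims = [(attr.get(name) or "").strip().lower()
                for name in ("width", "height")]
        if any(d in ("0", "0px") for d in dims):
            self.hidden_sources.append(attr.get("src"))


def find_hidden_iframes(html):
    """Return the src values of zero-sized iframes found in an HTML string."""
    finder = HiddenIframeFinder()
    finder.feed(html)
    return finder.hidden_sources


# Hypothetical page: one invisible iframe, one ordinary iframe.
page = (
    '<p>Hello</p>'
    '<iframe src="http://distribution.example/x" width="0" height="0"></iframe>'
    '<iframe src="http://ok.example/" width="300" height="200"></iframe>'
)
print(find_hidden_iframes(page))
```

Only the first iFRAME is reported, because the second one has visible dimensions.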
Slide 9
Drive-by Download (p.5)
Slide 10
Data Collection
  Infrastructure and Methodology
  Pre-Processing
  Verification
Slide 11
Inspect URLs in the Google repository and determine which ones trigger drive-by downloads.
Slide 12
Pre-Processing Phase
  Uses the MapReduce framework to process billions of websites.
  Uses certain features to identify sites that are likely malicious:
    Out-of-place iFRAMEs
    Obfuscated JavaScript
    iFRAMEs to known distribution sites
  One billion sites analyzed daily; 1 million pass on to the verification phase.
Slide 13
Verification Phase
  Determines whether a URL from the pre-processing phase is malicious.
  Web honeynet:
    Execution-based heuristics
    Anti-virus engines
  Criteria for marking a URL malicious:
    The URL must meet the threshold.
    At least one HTTP response must be marked malicious by the anti-virus scanner.
  A URL that has met the threshold but has no incoming payload is marked as suspicious.
  One million URLs scanned and 25,000 marked malicious per day.
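The labelling criteria on this slide can be written as a small decision function. This is a sketch under stated assumptions: the function name, the numeric score, and the "below threshold" label are hypothetical, and treating "payload present but not flagged" the same as "no payload" is an assumption, since the slide only describes the no-payload case.

```python
def label_url(score, threshold, av_flags):
    """Label a URL from its heuristic score and per-response anti-virus results.

    score      -- execution-based heuristic score (hypothetical numeric scale)
    threshold  -- minimum score required for further labelling
    av_flags   -- one boolean per HTTP response, True if the scanner flagged it;
                  an empty list means no incoming payload was seen
    """
    if score < threshold:
        # Assumption: the slide does not name this outcome.
        return "below threshold"
    if any(av_flags):
        # Threshold met and at least one response flagged by the scanner.
        return "malicious"
    # Threshold met, but nothing flagged (or no payload at all).
    return "suspicious"


print(label_url(0.9, 0.5, [False, True]))
print(label_url(0.9, 0.5, []))
```

The first call meets both criteria; the second meets the threshold but has no payload, so it falls into the suspicious bucket.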
Slide 14
Constructing Malware Distribution Networks
  Analysis of recorded network traces.
  Combine malware delivery trees.
  Live for 1 year.
  Focus on drive-by downloads.
Slide 15
Results
Slide 16
Data Collection Summary
  10-month period.
  3 million malicious URLs found on 180,000 landing sites.
  Over 9,000 distribution sites.

  Data collection period               January - October 2007
  Total URLs checked in-depth          66,534,330
  Total suspicious landing URLs        3,385,889
  Total malicious landing URLs         3,427,590
  Total malicious landing sites        181,699
  Total distribution sites             9,340
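A few ratios follow directly from the totals on this slide. They are not stated in the presentation; the sketch below derives them from the slide's figures as written, and the variable names are illustrative.

```python
# Totals as given on the Data Collection Summary slide.
urls_checked_in_depth = 66_534_330
malicious_landing_urls = 3_427_590
malicious_landing_sites = 181_699
distribution_sites = 9_340

# Share of in-depth checked URLs that were labelled malicious.
malicious_share = malicious_landing_urls / urls_checked_in_depth

# Average number of malicious URLs per malicious landing site.
urls_per_landing_site = malicious_landing_urls / malicious_landing_sites

# Average number of landing sites per distribution site.
landing_per_distribution = malicious_landing_sites / distribution_sites

print(f"malicious share of checked URLs: {malicious_share:.2%}")
print(f"malicious URLs per landing site: {urls_per_landing_site:.1f}")
print(f"landing sites per distribution site: {landing_per_distribution:.1f}")
```

These are plain averages; they say nothing about how unevenly URLs and landing sites are spread across the networks.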
Slide 17
Impact on Users
  Approximately 1.3% of overall search queries return at least one malicious URL in the results.
  The most popular landing page has a rank of 1,588.
  Of the top 1 million URLs, 6,000 were verified malicious during inspection.
Slide 18
Malware Hosting Site Distribution by Country
Slide 19
Malware Landing Site Distribution by Country
Slide 20
Random URL Sample
Slide 21
Malicious URLs by Subject (percentage of landing sites)
Slide 22
Malicious Content Injection
  Web malware is not tied to browsing habits.
  Drive-by downloads can be triggered in benign websites through:
    Compromised web server
    Third-party contributed content
Slide 23
Webserver Software
  Outdated software with known vulnerabilities.
  Increased risk of content control by server exploitation.
Ads
  2% of landing sites.
  12% of overall search content returned landing pages with malicious content.
  Short-lived compared to other malicious content-injecting techniques.
  75% have long delivery chains (50% with over six steps).
Slide 24
Properties of Malware Distribution Infrastructure
  Size
    Networks that use only 1 landing site.
    Networks that have multiple landing sites.
  IP Space Locality
    Concentrated on a limited number of /8 prefixes.
    70% of malware distribution sites fall within 58.*--62.* and 209.*--221.*.
    Similar concentration for scam hosting infrastructure.
    50% of landing sites fall in these ranges.
  Distribution of Malware Binaries Across Domains
    Hosting: 90% single IP address, 10% multiple IP addresses.
    Sub-folders of a DNS name:
      512j.com/akgy
      512j.com/alavin
      512j.com/anti
      mihanblog.com/abadan2 or mihanblog.com/askbox
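The IP space locality measurement amounts to checking which /8 prefix (first octet) an address falls in. The sketch below uses the ranges from this slide; the function names and the sample addresses are hypothetical and are not data from the study.

```python
import ipaddress

# First-octet (/8) ranges named on the slide, inclusive.
CONCENTRATED_RANGES = [(58, 62), (209, 221)]


def in_concentrated_range(ip):
    """True if the IPv4 address's first octet lies in one of the slide's ranges."""
    first_octet = ipaddress.IPv4Address(ip).packed[0]
    return any(low <= first_octet <= high for low, high in CONCENTRATED_RANGES)


def share_in_ranges(ips):
    """Fraction of the given IPv4 addresses that fall in the concentrated ranges."""
    if not ips:
        return 0.0
    return sum(in_concentrated_range(ip) for ip in ips) / len(ips)


# Hypothetical sample: two addresses inside the ranges, two outside.
sample = ["60.1.2.3", "210.5.6.7", "10.0.0.1", "192.0.2.1"]
print(share_in_ranges(sample))
```

Applied to the real distribution-site addresses, a function like `share_in_ranges` would produce the kind of percentage quoted on the slide.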
Slide 25
Properties of Malware Distribution Infrastructure
  Examination of overlapping landing sites.
  80% of distribution networks share at least 1 landing page.
    Multiple iFRAMEs linking to different malware distribution sites.
  25% of malware distribution sites share at least one binary.
  Binaries are shared less frequently between distribution sites than landing sites are.
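The overlap figures on this slide come down to set intersections between networks. The following is a minimal sketch of that computation, assuming each network is represented as a set of landing pages (or binary hashes); the function name and the toy data are hypothetical.

```python
from itertools import combinations


def share_with_common_element(networks):
    """Fraction of networks that share at least one element with another network.

    networks -- dict mapping a network name to a set of landing pages
                (or, equally, a set of binary hashes)
    """
    if not networks:
        return 0.0
    sharing = set()
    for (name_a, items_a), (name_b, items_b) in combinations(networks.items(), 2):
        if items_a & items_b:  # non-empty intersection means overlap
            sharing.update((name_a, name_b))
    return len(sharing) / len(networks)


# Toy data: net_a and net_b share page "p2"; net_c shares nothing.
toy_networks = {
    "net_a": {"p1", "p2"},
    "net_b": {"p2", "p3"},
    "net_c": {"p4"},
}
print(share_with_common_element(toy_networks))
```

The pairwise comparison is quadratic in the number of networks, which is fine for a sketch but would call for an inverted index (element to networks) at the scale of the study.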
Anti-Virus Engine Detection Rates
  Pull-based delivery system.
  Evaluate detection rates of well-known anti-virus engines against suspected malware samples.
  Average of 70% for the best engine (even the best anti-virus engine with the latest definitions fails to cover a significant percentage of web malware).
  False positives: 6%.
Slide 30
Related Work
  Honeypots
    Moshchuk et al.: decrease in links to spyware-labeled executables over time.
    Provos et al. and Seifert et al.: raised awareness of threats posed by drive-by downloads.
    Wang et al.: exploits in Internet Explorer on Windows XP; 200 of 17,000 URLs dangerous.
  Malware Detection by Dynamic Tainting Analysis
    Insight into the mechanisms by which malware installs itself and operates.
Slide 31
Conclusions
  1.3% of incoming search queries on Google return at least one link to a malicious site.
  Users are lured into malware distribution networks by content in online ads.
  Avoiding the dark corners of the Internet does not limit exposure to malware.
  Anti-virus engines are lacking.
Slide 32
Strengths and Weaknesses
  Useful survey about malware installation.
  Broad data range.
  Only examines the Google database.
  For the most part, evaluation was automated, and because of the broad scope a lot is missing from the analysis.
  Did not explain acronyms.
Slide 33
References
  Niels Provos, Panayiotis Mavrommatis, Moheeb Abu Rajab, and Fabian Monrose. All Your iFRAMEs Point to Us. 17th USENIX Security Symposium (Security'08), San Jose, CA, 2008.