The STC Generation & The “Casino Business Model” · 2018-01-04 · •Corporate proxy IP...
Online Advertising: The Good, The Bad, and The Ugly
Yi-Min Wang, Group Manager & Principal Researcher
Cybersecurity & Systems Management Group
Microsoft Research, Redmond
The Traffic-to-Money Converter &
The STC Generation
• The Traffic-to-Money Converter
[Diagram: Traffic → Traffic-to-Money Converter → Money]
• STC = Search, Type, and Click
– The STC generation collectively generates a
lot of web traffic
• Traffic-to-money converter for the web
– Mass-market ads syndication programs
– Mass-market exploit affiliate programs
The STC Traffic
[Diagram: Search, Type, and Click traffic reaching target web pages of merchants and non-merchants. Labels: Search Engine; Search Ads; Mass-Market Advertisement Syndication Program #1; Ads by G/Y/M; Advertisement Syndication or Exploit Affiliate Programs; Spam Ads-Portal Page; Typo Domain Ads-Portal Page; Spyware Vendors; Hacked Ads. Regions: The Good, The Bad, The Ugly.]
Web Analytics & Advertising Syndication
[Diagram: target web pages of merchants and non-merchants. Labels: Mass-Market Advertisement Syndication Program #1; Ads by G/Y/M. Region: The Good.]
<img alt="" border="0" name="DCSIMG" width="1" height="1"
src="http://statse.webtrendslive.com/DCSArO55rNH8I36lrbe6wexE5_5B8I/njs.gif?
dcsuri=/nojavascript&WT.js=No"/>
Where’s The Bug?
1x1 transparent-gif web bug magnified
Web Bug
http://WhatIsMyIP.com/
http://ip-address.domaintools.com/
Web Analytics: Example #1
• Primary URL on primary domain
• Secondary URL on third-party domain: statse.webtrendslive.com/.../dcs.gif
• Shows nothing
Web Analytics: Example #2
Primary URL: http://www.aidsmeds.com/
• Secondary URLs on third-party domain:
– google-analytics.com/__utm.gif
– ssl.google-analytics.com/urchin.js
Advertising Syndication
Primary URL: http://www.aids.org/factSheets/
• Secondary URL on third-party domain: pagead2.googlesyndication.com/pagead/show_ads.js
• Before ads are displayed, and even without clicking any ads
• Shows small ads
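The three examples above share one pattern: the primary page pulls a secondary URL from a third-party domain. As a rough sketch of how such secondary URLs could be listed from static HTML, the Python below collects `<img>` and `<script>` sources whose host differs from the primary host and flags 1x1 images as possible web bugs. The class and function names are hypothetical, the exact-host comparison is a simplification, and static parsing misses anything a script loads at run time.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class SecondaryURLFinder(HTMLParser):
    """Collect <img>/<script> sources that point to a different host."""

    def __init__(self, primary_url):
        super().__init__()
        self.primary_url = primary_url
        self.primary_host = (urlsplit(primary_url).hostname or "").lower()
        self.found = []  # (tag, absolute URL, looks like a 1x1 web bug)

    def handle_starttag(self, tag, attrs):
        if tag not in ("img", "script"):
            return
        attributes = dict(attrs)
        src = attributes.get("src")
        if not src:
            return
        url = urljoin(self.primary_url, src)  # resolve relative sources
        host = (urlsplit(url).hostname or "").lower()
        if host and host != self.primary_host:
            is_bug = (tag == "img"
                      and attributes.get("width") == "1"
                      and attributes.get("height") == "1")
            self.found.append((tag, url, is_bug))


def find_secondary_urls(primary_url, html):
    """Return third-party resources referenced by the page's static HTML."""
    finder = SecondaryURLFinder(primary_url)
    finder.feed(html)
    return finder.found
```

Each result is a `(tag, url, is_web_bug)` tuple. A stricter version would compare registered domains rather than full hostnames.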
Potential Security and Privacy Concerns
• Scripts executed without user permission
– Redirection to third-party domains happened without user knowledge
• Not all URLs get recorded in browser history
– Don’t know what, when, and why
• Many consumer machines have fixed IP addresses
– Like car license plate number for information highway
• Corporate proxy IP addresses necessarily identify the company
– Like company logo on company vans
• Redirected-to third-party domains can set cookies
– Can ID <IP address, account> pair
Top Syndication Servers
[Bar chart. Y-axis: coverage of top one million URLs (0% to 14%). X-axis servers: Googlesyndication.com, Doubleclick.net, Atdmt.com, Fastclick.net, Amazon.com, Advertising.com, Casalemedia.com, Overture.com. Bar values: 13.0%, 3.7%, 3.0%, 1.8%, 1.4%, 1.1%, 1.0%, 1.0%.]
Traffic Cameras for the Information Highway
One camera in every 8th street corner
Domain Parking & Typo-squatting
[Diagram: Type traffic reaching target web pages of merchants and non-merchants. Labels: Advertisement Syndication; Typo Domain Ads-Portal Page. Region: The Ugly.]
Domain Parking
Primary URL: http://VictorasSecret.com/
• Secondary URL on third-party domain: apps5.oingo.com/apps/domainpark/domainpark.cgi
• Zero content; shows full-page ads
It used to be much uglier… (oingo.com)
DomainSponsor.com
Internet Real Estate Business
• Rule of thumb: every unique visitor is worth 5 cents on average
– $7.00 / 365 / $0.05 = 0.38 unique visitors/day
• How to attract traffic:
– Generic name domains
• Sex.com ($12 million), Diamond.com ($7.5 million), Business.com ($7.5 million in 1999), Sweatpants.com ($8,500)
– Typo-squatting domains
• http://VictorasSecret.com/
– Trademark domains
• http://www.MicrosoftPowerpoint.com/
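The rule-of-thumb arithmetic above can be reproduced in a few lines. This is a minimal sketch that assumes the $7.00 figure is a yearly cost of holding one domain (the slide does not say what it covers); the function name is hypothetical.

```python
def break_even_visitors_per_day(annual_cost_usd, value_per_visitor_usd,
                                days_per_year=365):
    """Unique visitors per day needed for visitor value to cover the yearly cost."""
    return annual_cost_usd / days_per_year / value_per_visitor_usd


# Slide figures: $7.00 per year, 5 cents per unique visitor.
print(round(break_even_visitors_per_day(7.00, 0.05), 2))  # 0.38
```

Under that reading, a domain pays for itself with well under one unique visitor per day.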
When typo of slashdot got slashdotted…
Strider Typo-Patrol
• Typo generation algorithm
– Missing-dot typos
• wwwSouthwest.com
– Character-omission typos
• MarthStewart.com
– Character-permutation typos
• NYTiems.com
– Character-replacement typos
• DidneyWorld.com
– Character-insertion typos
• WashingtonPoost.com
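The five typo classes above can be sketched as a small generator. This is an illustration, not the Strider Typo-Patrol implementation: the function names are hypothetical, replacement uses a simplified same-row QWERTY neighbor map (an assumption), and insertion only doubles an existing character.

```python
# Simplified keyboard model: only left/right neighbors on the same row.
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]


def neighbors(ch):
    """Same-row keyboard neighbors of ch (empty for digits, hyphens, etc.)."""
    for row in QWERTY_ROWS:
        i = row.find(ch)
        if i != -1:
            return [row[j] for j in (i - 1, i + 1) if 0 <= j < len(row)]
    return []


def generate_typos(name, tld="com"):
    """Map each typo class to a sorted list of candidate typo domains.

    name is the label without 'www.' or the TLD, e.g. 'southwest'.
    """
    name = name.lower()
    n = len(name)
    candidates = {
        # 'www.name' typed without the dot
        "missing_dot": {"www" + name},
        # drop one character
        "omission": {name[:i] + name[i + 1:] for i in range(n)},
        # swap two adjacent characters
        "permutation": {name[:i] + name[i + 1] + name[i] + name[i + 2:]
                        for i in range(n - 1)},
        # replace one character with a keyboard neighbor
        "replacement": {name[:i] + k + name[i + 1:]
                        for i, c in enumerate(name) for k in neighbors(c)},
        # type one character twice
        "insertion": {name[:i] + c + name[i:] for i, c in enumerate(name)},
    }
    return {kind: sorted(t + "." + tld for t in typos if t != name)
            for kind, typos in candidates.items()}
```

For example, `generate_typos("nytimes")["permutation"]` includes `nytiems.com`, matching the slide's NYTiems.com example up to letter case.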
Top Typo-squatting Domain Parking Servers
[Bar chart: Top Domain Parking Servers. Y-axis: % of 2,233 active typos (0% to 20%). X-axis servers: Oingo.com, Domainsponsor.com, Sedoparking.com, Qsrch.com, Hitfarm.com, Netster.com. Bar values: 19%, 14%, 3.30%, 3.30%, 3.10%, 2.20%.]
Strider URL Tracer with Typo-Patrol
http://research.microsoft.com/URLTracer
One in every six active typo domains was owned by Unasi/Domaincar
Overall, one in every four active typo domains was parked with oingo.com
For More Information
• “The Web's Million-Dollar Typos”
– http://www.washingtonpost.com/wp-dyn/content/article/2006/04/29/AR2006042900279_pf.html
• “Strider Typo-Patrol”
– http://research.microsoft.com/URLTracer
• “Strider Typo-Patrol SRUTI”
– Usenix SRUTI’06 workshop
Search Spam
[Diagram: Search traffic through the Search Engine reaching target web pages of merchants and non-merchants. Labels: Advertisement Syndication; Spam Ads-Portal Page. Region: The Ugly.]
Google search “coach handbag”
Spam Doorway: http://coach-handbag-top.blogspot.com/
[Screenshot labels: topsearch10.com; Content; Links]
Redirection Spam
Primary URL: http://coach-handbag-top.blogspot.com/
• Secondary URL on third-party domain: http://www.topsearch10.com/search.php?aid=56979...
• Redirect to full-page ads; cloaking
Spam Detection
• Content-based approach
– Information retrieval-based ranking
– Applied to too many fake pages that are never shown to any users (i.e., cloaking)
• Behavior-based approach
– Strider SearchMonkeys: mimicking human browsing in full fidelity
– Comment-spam hunting, cloaking detection, tracking redirection to known-spammer domain, etc.
– Turn search spam problem into system security problem
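One way to picture the behavior-based approach is as a triage over redirection reports. The sketch below assumes each report maps a doorway URL to the URLs a real browser was redirected to. It flags doorways that reach a known-spammer domain and groups the rest by third-party redirection domain, largest group first. All names and the report format are hypothetical, not Strider's actual interface.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def host_of(url):
    """Lower-cased hostname of a URL, or '' if it has none."""
    return (urlsplit(url).hostname or "").lower()


def in_domain(host, domain):
    """True if host equals domain or is a subdomain of it."""
    return host == domain or host.endswith("." + domain)


def triage(report, known_bad):
    """Split doorway URLs into known-bad hits and ranked unclassified groups.

    report: {doorway_url: [URLs the browser was redirected to]}
    known_bad: set of redirection domains already confirmed as spam
    """
    flagged = {}
    groups = defaultdict(set)
    for doorway, redirects in report.items():
        # Keep only hosts other than the doorway's own host.
        third_party = {host_of(r) for r in redirects} - {host_of(doorway), ""}
        bad = sorted(h for h in third_party
                     if any(in_domain(h, d) for d in known_bad))
        if bad:
            flagged[doorway] = bad
        else:
            for h in third_party:
                groups[h].add(doorway)
    # Domains that many doorways redirect to are the strongest suspects.
    ranked = sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return flagged, [(h, sorted(urls)) for h, urls in ranked]
```

The ranked groups are what a human or automated verifier would review next. A redirection domain shared by many unrelated doorways is a candidate new signature.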
Strider Search Ranger System
[System diagram with steps numbered 1 to 4. Components: Spam Hunters; Search Monkeys running actual browsers; Search engines; Spammed forums; Primary-URL page; Third-party domain pages; Strider URL Tracer; Redirection report; Known-bad URLs; Unclassified URLs grouped & ranked by redirection domains; Spam Verifier; Redirection Spam Analyzer; Known-bad signatures; Spam suspect URLs; Confirmed spam URLs & redirection domains.]
Spammer-Targeted Categories
[Bar chart. Y-axis: per-category spam percentage (0% to 35%). X-axis categories: Drugs, Adult, Gambling, Ringtone, Money, Accessories, Travel, Cars, Furniture, Music, Average. Bar values in extraction order: 30.8%, 14.2%, 8.9%, 27.5%, 2.7%, 7.6%, 7.8%, 3.3%, 3.9%, 9.7%, 11.6%. Legend: Density; DCG/Max.]
Top Spam Doorway Domains
[Bar chart. Y-axis: # of spam appearances (0 to 600). X-axis domains: blogspot.com, netscape.com, hometown.aol.com, hometown.aol.de, oas.org, xoomer.alice.it, home.aol.com, freewebs.com, blogstudio.com, maxpages.com, usaid.gov, blogsharing.com, sitegtr.com, forospace.com, blog.hlx.com. Bar values: 3,882, 493, 396, 296, 242, 225, 218, 207, 178, 172, 150, 131, 124, 123, 110.]
Spam Percentages
[Bar chart. Y-axis: % URLs detected as spam (0% to 100%). X-axis domains: blogspot.com, netscape.com, hometown.aol.com, hometown.aol.de, oas.org, xoomer.alice.it, home.aol.com, freewebs.com, blogstudio.com, maxpages.com, usaid.gov, blogsharing.com, sitegtr.com, forospace.com, blog.hlx.com. Bar values in extraction order: 77%, 74%, 84%, 91%, 78%, 77%, 95%, 52%, 99%, 81%, 85%, 93%, 100%, 95%, 100%.]
At least 3 out of 4 were spam!
Top .gov/.edu Doorway Domains
[Bar chart. Y-axis: # of spam appearances (0 to 160). X-axis domains: usaid.gov, mit.edu, gatech.edu, ucsd.edu, tsmu.edu, cudenver.edu, uconn.edu, evansville.edu, harvard.edu, virginia.edu, apu.edu, neu.edu, dot.gov, uchicago.edu, washington.edu. Bar values: 150, 63, 54, 35, 34, 32, 27, 25, 24, 22, 18, 17, 16, 15, 13.]
Universal Redirectors
• www.google.com/url?q=http://evetamthes.the.forthnet.gr/login.htm
• rds.yahoo.com/_ylt=/*http://frft.networkforbestever.org/ps/
• store.adobe.com/cgi-bin/redirect/n=14630?http://rme19-funny-ringtones.blogspot.com
• www.usaid.gov/cgi-bin/goodbye?http://xanax.anothervision.info
• www.ihs.gov/PublicInfo/Publications/Kids/safety/IHS_DisclaimerKids_prod.cfm?link_out=http://waypossible.com/dr/casino
• www.fmcsa.dot.gov/redirect.asp?page=http://maxpages.com/troctrocbas
• serifos.eecs.harvard.edu/proxy/http://pharmacy-goods.com/r/tramadol
• www.library.drexel.edu/cgi-bin/r.cgi?url=http://replica-watches.20six.co.uk
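Every redirector URL listed above carries its real destination inside the path or query string. A minimal sketch of recovering that embedded destination follows; the function name is hypothetical, and the heuristic (look for a second `http://` or `https://` after percent-decoding) is an assumption that will miss redirectors using opaque IDs.

```python
import re
from urllib.parse import unquote

_SCHEME = re.compile(r"https?://", re.IGNORECASE)


def embedded_target(url):
    """Return the URL embedded inside a redirector URL, or None if absent."""
    decoded = unquote(url)  # handles targets written as http%3A%2F%2F...
    # Start at index 1 so the redirector's own leading scheme is skipped.
    match = _SCHEME.search(decoded, 1)
    return decoded[match.start():] if match else None
```

Applied to the first bullet above, `embedded_target` returns `http://evetamthes.the.forthnet.gr/login.htm`, which is the page a search engine or reader would actually land on.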
Top Redirection Domains
[Bar chart. Y-axis: # of spam appearances (0 to 1400). X-axis domains: paysefeed.net, topsearch10.com, topmeds10.com, themp3direct.com, searchadv.com, sixxx.info, rightfinder.net, vip-online-search.info, a3b4.info, topmobile10.com, yourfastfind.org, arearate.com, find-more.biz, yourfreevids.com, webresourses.info. Bar values: 1156, 1022, 879, 649, 543, 398, 356, 334, 326, 309, 308, 289, 266, 260, 258.]
Malicious Websites
[Diagram: Search and Type traffic reaching target web pages of merchants and non-merchants. Labels: Search Engine; Exploit Affiliate Programs; Spam Ads-Portal Page; Typo Domain Ads-Portal Page; Spyware Vendors; Hacked Ads. Region: The Bad.]
Google search “pain killer”
Malicious Spam
Primary URL: http://www.blogigo.de/pain_killer
• Secondary URL on third-party domain: http://biopharmasite.info/directory.php
• Vulnerability exploits; sometimes window closed
• Exploit failed… Click here to install…
TheRegister.com Malicious Banner
MySpace.com Malicious Banner
Where Did This Come From?
IDS
Firewall
Honeypot versus HoneyMonkey
[Diagram. Honeypot: a server process with a server-side vulnerability receives malicious network packets from a malicious or hacked client. HoneyMonkey: a browser with a client-side vulnerability sends an HTTP request to a malicious or hacked web server and receives a malicious HTTP response. Other labels: Blacklist; HoneyMonkey Browser = Spider Crawler; Takedown; FDR; Sandbox; Virtual Machine; URL Tracer; Other Content Provider.]
HoneyMonkey Blackbox Exploit Detection
[Diagram labels: Browser; Content Provider; Obfuscated JavaScripts; Exploit Provider; Third-Party URLs; Malware Installation; Malicious Scripts.]
Density of Malicious Websites
                                    Suspicious List    Popular List
# URLs scanned                      16,190             1,000,000
# Exploit URLs                      207 (1.28%)        710 (0.071%)
# Exploit URLs after redirection    752 (263%)         1,036 (46%)
  (expansion factor)
# Exploit sites                     288                470
SP2-to-SP1 ratio                    204/688 = 0.30     131/980 = 0.13
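The percentages in the table above can be rechecked from the raw counts. This short sketch reads "expansion factor" as the relative increase in exploit URLs after following redirections, which is an interpretation that happens to reproduce the slide's figures. The dictionary keys and function names are hypothetical.

```python
def percent(part, whole):
    """part expressed as a percentage of whole."""
    return 100.0 * part / whole


# Counts copied from the table above.
LISTS = {
    "Suspicious List": {"scanned": 16_190, "exploit": 207, "expanded": 752,
                        "sp2": 204, "sp1": 688},
    "Popular List": {"scanned": 1_000_000, "exploit": 710, "expanded": 1_036,
                     "sp2": 131, "sp1": 980},
}


def summarize(row):
    """Derive density, expansion factor, and SP2-to-SP1 ratio for one list."""
    return {
        "density_pct": percent(row["exploit"], row["scanned"]),
        "expansion_pct": percent(row["expanded"] - row["exploit"], row["exploit"]),
        "sp2_to_sp1": row["sp2"] / row["sp1"],
    }


for name, row in LISTS.items():
    s = summarize(row)
    print(f"{name}: density {s['density_pct']:.3g}%, "
          f"expansion {s['expansion_pct']:.0f}%, SP2/SP1 {s['sp2_to_sp1']:.2f}")
```

The density gap is the notable result: the suspicious list is roughly 18 times denser in exploit URLs than the popular list (1.28% versus 0.071%).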
Infection Rate Heavily Depends on Patch Level (May~June 2005)
                                         # Exploit URLs    # Exploit Sites
Total                                    752               288
WinXP SP1-UP (UP = UnPatched)            688               268
WinXP SP2-UP                             204               115
WinXP SP2-PP (PP = Partially Patched)    17                10
WinXP SP2-FP (FP = Fully Patched)        0                 0
Site Ranking by Number of Hosted Exploit URLs
[Chart. X-axis: site ranking based on the number of hosted exploit URLs (0 to 250). Y-axis: number of hosted exploit URLs (0 to 30). Annotations: toolbarpartner.com; .edu: hacked course bulletin board.]
Exploit Pages Organized by Account Names
http://ToolbarPartner.com
[Directory tree. Account directories: /adverts, /romas, /west, /0MhNSYFE, /x-web. Pages listed under the accounts: /index.html, /index2.html, /page1.htm.]
Pretend to be an Advertisement Syndicator
Zero-Day Exploit Detection
Vulnerabilities exploited before patch was released
• Used to be an ad-hoc & manual process that relied heavily on external finders
• HoneyMonkey turned it into a systematic & automatic process that allows Microsoft to lead the battle
– HoneyMonkey running on fully patched WinXP SP2 VM constantly scanning the 752 exploit URLs
• The Javaprxy.dll zero day
– Early July, 2005: detected the first zero-day exploit URL within 2.5 hours of scanning; confirmed to be the first in-the-wild exploit URL reported to MSRC
– 26 of the 752 URLs “upgraded” to the javaprxy exploit
• 25 of them generated third-party URLs to an unknown exploit provider site: hxxp://82.179.166.2/[8 random chars]/test2/iejp.htm
– Takedown notices sent; most, but not all, of the 25 URLs stopped exploiting javaprxy
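The scanning idea above, where any URL that still exploits a fully patched machine is a zero-day candidate, can be sketched as a classification over patch levels. The `is_exploited` callback is a hypothetical stand-in for "visit the URL in a VM at that patch level and check whether the machine was compromised". It is not a real HoneyMonkey API, and the level names are taken from the patch-level table.

```python
# Patch levels from least to most patched, as named in the deck's table.
PATCH_LEVELS = ["SP1-UP", "SP2-UP", "SP2-PP", "SP2-FP"]


def classify(urls, is_exploited):
    """Map each URL to the most-patched level it still exploits, or None.

    is_exploited(url, level) -> bool is supplied by the caller
    (hypothetical oracle; a real system would drive a browser in a VM).
    """
    result = {}
    for url in urls:
        highest = None
        for level in PATCH_LEVELS:
            # Test every level independently: a URL may target SP2 only.
            if is_exploited(url, level):
                highest = level
        result[url] = highest
    return result


def zero_day_candidates(urls, is_exploited):
    """URLs that succeed even against the fully patched configuration."""
    levels = classify(urls, is_exploited)
    return [u for u in urls if levels[u] == "SP2-FP"]
```

Rescanning a fixed list of known exploit URLs with this check is what would let an "upgrade" to a new exploit surface quickly.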
Observations
• Monitoring easy-to-find exploit URLs is effective
– Zero-day exploits need to connect to popular pages
• Monitoring content providers with well-known
URLs is effective (because they cannot move)
– Exploit providers can move and randomize URLs
• Monitoring highly ranked and advanced exploit
URLs is effective
– First detected zero-day exploit URL belongs to the #9
site
– 7 of the top 10 sites upgraded (by connection counts)
– Nearly half of the SP2-PP exploit URLs upgraded
HoneyMonkey Anti-Exploit Process
[Pipeline diagram:
• Search engine crawler: O(10^10) pages on the Web
• Top O(10^8) URLs (ranked by click-through counts, etc.)
• Additional inputs: spam URLs; SPIM URLs; spam URLs from Strider Search Ranger
• HoneyMonkey network of O(10^2) PCs running unpatched VMs: O(10^4) exploit URLs
• HoneyMonkey network of O(10^1) PCs running partially- or fully-patched VMs: O(10^1) zero-day exploit URLs
• Outputs: Anti-Spyware & Anti-Virus; Security Response Center; Legal Takedown; Corporate Proxy Blocking; ISP Blocking; Browser Blocking; Search Result Blocking]
Summary
• A common redirection-based framework for analyzing:
– Web Bugs
– Advertising Syndication
– Typo-squatting
– Redirection Spam
– Malicious websites
• Automated web patrol with Strider monkeys
– Analyzing individual web pages with known-bad signatures
– Analyzing groups of web pages to discover new signatures