Measuring the Impact of Sharing Abuse Data with Web ... · -0.10-0.05 0.00 0.05 0.10 0.15 Median...

Post on 13-Sep-2020

3 views 0 download

Transcript of Measuring the Impact of Sharing Abuse Data with Web ... · -0.10-0.05 0.00 0.05 0.10 0.15 Median...

Measuring the Impact of Sharing Abuse Data with WebHosting Providers

Marie Vasek, Matthew Weeden, and Tyler Moore

University of Tulsa

WISCS24 October 2016

1 of 27

StopBadware

• Founded in 2006 by Harvard’s Berkman Klein Center for Internetand Society

• Now housed at the University of Tulsa

• Provides independent reviews of websites appearing on 3 malwareblacklists

3 of 27

Review Requests for Individual URLs

4 of 27

Review Requests for Bulk URLs

5 of 27

Research Questions

Does sending bulk reports help?

• Short term:◦ Do reported URLs get cleaned up?◦ Which URLs are more likely to get cleaned up?

• Long term:◦ Do ASes get better at cleaning URLs after receiving bulk reports?

6 of 27

Overview

• Brief overview of study

• Define metrics

• Direct impact of sharing abuse data

• Indirect impact of sharing abuse data

• Conclusions

7 of 27

Bulk Requests over Time

2010 2011 2012 2013 2014 2015

15

5050

050

00

Date shared

# U

RLs

sha

red

8 of 27

Summary Statistics

• Google Safebrowsing Data used exclusively

• 6 year time frame (2010 - 2015)

• 69 stakeholders requested reports

• 41 web hosting providers in our study◦ Responsible for entire AS◦ Sent Google Safebrowsing Data◦ Had at least a month of data before/after

• 28 548 URLs reported

9 of 27

Malware Cleanup Metrics

• Clean◦ Off the blacklist◦ Stays off for 3 weeks

• Recompromise◦ A previously blacklisted URL is clean and then is reblacklisted

10 of 27

Measuring Direct and Indirect Impact of Reporting

• Direct Impact◦ Are the URLs we shared cleaned up?

• Indirect Impact◦ Are networks “better” after receiving a bulk review from

StopBadware?

• Do they clean malware URLs faster?• Do they clean malware URLs more effectively?

11 of 27

Measurement Timeline

blacklisted reported clean

blacklist to report

report to clean

blacklist to clean

12 of 27

Cleanup of URLs Shared with ASes

1 5 50 500

0.0

0.2

0.4

0.6

0.8

1.0

URLS shared with ASes

Report to Clean (days)

Pr(

repo

rt to

cle

an d

ays

>=

X)

13 of 27

Measurement Timeline

blacklisted reported clean

blacklist to report

report to clean

blacklist to clean

14 of 27

Long Lived Malware Takes Longer to Clean

0−10%

10−20%

20−30%

30−40%

40−50%

50−60%

60−70%

70−80%

80−90%

90−100%

010

020

030

040

050

0

●●

●●

020

040

060

080

010

00

Decile for Blacklist to Report (Days)

Med

ian

Rep

ort t

o C

lean

(D

ays)

[Bar

]

Bla

cklis

t to

Rep

ort (

Day

s) [L

ine]

15 of 27

Pre- vs. Post-Contact Cleanup

1 2 5 10 50 200

0.0

0.2

0.4

0.6

0.8

1.0

Survival probability before and after contact

Blacklist to Clean (days)

Pr(

blac

klis

t to

clea

n da

ys >

=X

) pre−contactpost−contact

16 of 27

Pre- vs. Post-Contact Cleanup: Improved AS

17 of 27

Pre- vs. Post-Contact Cleanup: Worsened AS

18 of 27

Pre- vs. Post-Contact Cleanup: Unclear effect AS

19 of 27

Change in Metrics Pre- and Post- Sharing

# ∆ days to clean ∆ recomp. rate

Improved 13 58 0.010Worsened 3 -176 0.085Unclear 17 13 0.008

20 of 27

Comparing Change in Metrics by AS

−300 −200 −100 0 100

−0.

10−

0.05

0.00

0.05

0.10

0.15

Median blacklist to clean pre−sharing − post−sharing

Med

ian

reco

mpr

omis

e ra

te p

re−

shar

ing

− p

ost−

shar

ing

● Top Quartile Report to Clean2nd Quartile Report to Clean3rd Quartile Report to CleanBottom Quartile Report to Clean

21 of 27

Matched Pair Analysis

• What would happen if StopBadware had not sent out reviews?

• Matched pairs between reported-to ASes and similar ASes

• Similar?◦ Same country◦ Similar level of badness

• Key Assumption: All else equal, ASes would exhibit similar patterns

22 of 27

Measurement Timeline

blacklisted reported clean

blacklist to report

report to clean

blacklist to clean

23 of 27

Matched Pair: Cleanup of URLs Shared with ASes

1 5 50 500

0.0

0.2

0.4

0.6

0.8

1.0

URLS shared with ASes

Report to Clean (days)

Pr(

repo

rt to

cle

an d

ays

>=

X)

reported ASesmatched pairs

24 of 27

Matched Pair: Pre- vs. Post-Contact Cleanup

1 2 5 10 50 200

0.0

0.2

0.4

0.6

0.8

1.0

Survival probability before and after contact

Blacklist to Clean (days)

Pr(

blac

klis

t to

clea

n da

ys >

=X

) pre−contactpost−contactpre−contact (mp)post−contact (mp)

25 of 27

Responsive ASes Improve Long Term after Report

−300 −200 −100 0 100

−0.

10−

0.05

0.00

0.05

0.10

0.15

Median blacklist to clean pre−sharing − post−sharing

Med

ian

reco

mpr

omis

e ra

te p

re−

shar

ing

− p

ost−

shar

ing

● Top Quartile Report to Clean2nd Quartile Report to Clean3rd Quartile Report to CleanBottom Quartile Report to Clean

26 of 27

Conclusions

• Directly sharing URLs helps clean up those URLs◦ Consistent with prior work on individual reports◦ This work finds it to be true for bulk reporting

• No evidence for long term change overall◦ Improvements on individual providers

• Long lived malware a scourge◦ Lots of efforts concentrating on newly infected websites◦ Lurking infections continue to harm, perhaps compounding◦ Current efforts not sufficient for stopping this “immortal” malware

27 of 27