Build your own NSA - DEF CON CON 25/DEF CON 25... · “US Senate voted to eliminate broadband...

Dark Data Svea Eckert Andreas Dewes



Dark Data

Svea Eckert – Andreas Dewes


Who we are

Svea Eckert, journalist, NDR/ARD

@sveckert @japh44

Andreas Dewes, (data) scientist


Why we are here


“US Senate voted to eliminate broadband privacy rules that would have required ISPs to get consumers' explicit consent before selling or sharing Web browsing data (...)”

3/23/2017, https://arstechnica.com


What does that mean


You can see everything – S*#t!

ARD, Panorama, 03.11.2016


ARD, Panorama, 03.11.2016


ARD, Panorama, 03.11.2016

I don't know why I was searching for “Tebonin” at that time.

It is really bad to see something like this, especially when it is connected with my own name.


More members of parliament and their employees


Employee of Helge Braun, CDU – Assistant Secretary of the German Chancellor


How we did it – the “hacking” part


Social engineering


What we have discovered


14 days (live) access

3 million (German) user IDs

Browsing data for one month

cat *.csv | grep "%40polizei.de"
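The grep above works because `%40` is the percent-encoding of `@`, so it surfaces URLs that carry an e-mail address at that domain. A minimal Python sketch of the same idea follows. The sample rows, their column layout, and the address in them are made up for illustration; the talk does not describe the actual CSV format.

```python
import re
from urllib.parse import unquote

# After percent-decoding, "%40" becomes "@", so a plain e-mail pattern suffices.
EMAIL_RE = re.compile(r"[A-Za-z0-9._+-]+@polizei\.de")

def find_addresses(lines):
    """Return the set of matching e-mail addresses found in URL-bearing lines."""
    found = set()
    for line in lines:
        decoded = unquote(line)  # turns %40 into @
        found.update(EMAIL_RE.findall(decoded))
    return found

# Hypothetical rows: user ID, timestamp, URL (layout is an assumption).
sample_rows = [
    "user123,2016-08-01 10:05:31,https://example.org/login?mail=jane.doe%40polizei.de",
    "user456,2016-08-01 10:06:02,https://example.org/search?q=weather",
]

addresses = find_addresses(sample_rows)
print(addresses)
```

Decoding before matching also catches addresses that appear unencoded, which the literal `%40` search would miss.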


Autoscout website


Sehr geehrte Damen und Herren,

im Rahmen eines hier bearbeiteten Ermittlungsverfahrens wegen Computerbetrug (Aktenzeichen) benötige ich gem. § 113 TKG i.V.m. § 100j StPO eine Auskunft zu Bestandsdaten zu folgender IP-Adresse: xxx.xxx.xxx.xxx Zeitstempel: xx.xx.2016, 10:05:31 MESZ

Die Daten werden für die Ermittlung des Täters benötigt. Bitte übersenden Sie Ihre Antwort per Email an die [email protected] oder per Telefax.

Vorname Nachname
Kriminalhauptkommissar
Kriminalpolizeidirektion, Ort
CyberCrime
Telefonnummer


Ladies and Gentlemen,

as part of an investigation into computer fraud (file number) that is being handled here, I require, pursuant to § 113 TKG in conjunction with § 100j StPO, subscriber data for the following IP address: xxx.xxx.xxx.xxx Time stamp: xx.xx.2016, 10:05:31 CEST

The data is needed to identify the offender. Please send your answer by e-mail to [email protected] or by fax.

First name Last name
Detective Chief Inspector
Criminal Police Directorate, place
Cybercrime
Phone number


Where do I find tilde on my keyboard

What is IP 127.0.0.1


Who did this


Browser Plugins


Test in virtual machine

Test

Uninstalled add-ons

Suspected WOT (Web of Trust)


[DATE] 11:15:04 http://what.kuketz.de/
[...]
[DATE] 15:49:27 https://www.ebay-kleinanzeigen.de/p-anzeige-bearbeiten.html?adId=xxx
[DATE] 13:06:23 http://what.kuketz.de/
[...]
[DATE] 11:22:18 http://what.kuketz.de/
[DATE] 14:59:30 http://blog.fefe.de/
[...]
[DATE] 14:59:36 http://what.kuketz.de/
[DATE] 14:59:44 https://www.mywot.com/en/scorecard/what.kuketz.de?utm_source=addon&utm_content=rw-viewsc
[...]
[DATE] 13:48:24 http://what.kuketz.de/
[...]

Test by Mike Kuketz / www.kuketz-blog.de


How does deanonymization work?

[Diagram: anonymized user data (User 1, User 2, ..., User N) is matched against public / external personal data that carries an identifier (e.g. name)]


"Instant" deanonymization via unique URL


Combinatorial deanonymization

https://www.cs.cornell.edu/~shmat/shmat_oak08netflix.pdf


Netflix Data vs. IMDB Data

Netflix: provided anonymized ratings

IMDB: ratings associated with user name / real name


Our data set

3.000.000.000 URLs (insufficiently anonymized)

9.000.000 domains

3.000.000 users


Frequency analysis of domains

▪ We remove everything but the domain and user ID

▪ “Did this user visit this domain?” (yes / no)

▪ We investigate how easy it is to reidentify a user given his/her domain data

▪ We only look at users that have visited at least ten domains

[Figure: number of URLs in domain vs. domain popularity rank (experimental data)]
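The reduction described on this slide (keep only user ID and domain, record a yes/no visit, drop users with fewer than ten domains) can be sketched in a few lines of Python. The record layout and the `*.example` host names are assumptions for illustration, not the authors' actual pipeline.

```python
from collections import defaultdict
from urllib.parse import urlsplit

def domains_per_user(records, min_domains=10):
    """Reduce (user_id, url) records to a set of visited domains per user,
    keeping only users with at least `min_domains` distinct domains."""
    visited = defaultdict(set)
    for user_id, url in records:
        host = urlsplit(url).hostname  # drop path, query and fragment
        if host:
            visited[user_id].add(host)  # a set encodes "visited: yes / no"
    return {u: d for u, d in visited.items() if len(d) >= min_domains}

# Toy data: user "a" visits 12 domains, user "b" only one.
records = [("a", f"https://site{i}.example/page?x={i}") for i in range(12)]
records += [("b", "https://site1.example/"), ("b", "https://site1.example/other")]

profiles = domains_per_user(records)
print(sorted(profiles))
```

Storing each profile as a set discards visit counts and timestamps, which matches the binary yes/no question on the slide.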


Let's categorize our users

[Figure: matrix of users against domains]

=> sparsely populated matrix with 9.000.000 x 1.000.000 entries


Algorithm

▪ Generate user/domain matrix M

▪ Generate vector v with information about visited domains

▪ Multiply M·v

▪ Look for best match

M = (...)

w = M·v

i = argmax (w)
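The steps above can be sketched without any numerical library by storing each row of the sparse 0/1 matrix M as a set of domains. For binary data, the entry of M·v for a user is then simply the number of domains that the user's row and v have in common. The user IDs and the contents of M and v below are toy values, and the set-based representation is a choice made for this sketch, not necessarily what the authors used.

```python
def best_match(M, v):
    """M: dict user_id -> set of visited domains (sparse rows of a 0/1 matrix).
    v: set of domains known from public information (non-zero entries of v).
    Returns (w, i) with w = M·v and i = argmax(w)."""
    # For 0/1 entries, the dot product of a row with v is the overlap size.
    w = {user: len(domains & v) for user, domains in M.items()}
    i = max(w, key=w.get)
    return w, i

# Toy matrix with three pseudonymous users.
M = {
    "user_1": {"github.com", "paper.li", "www.change.org"},
    "user_2": {"github.com", "handelsblatt.com"},
    "user_3": {"www.gog.com"},
}
# Domains collected from some public source about one person.
v = {"github.com", "paper.li", "www.change.org", "fxexperience.com"}

w, i = best_match(M, v)
print(w, i)
```

At the scale quoted in the talk, a sparse matrix library would be the more practical choice; the logic stays the same.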


How unique am I?

15.561

1.114.408

www.gog.com kundencenter.telekom.de banking.sparda.de

113671

handelsblatt.com
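As far as the slide can be read from this transcript, the point is that every additional known domain shrinks the set of users who could be you. A small sketch of that narrowing, using set intersection, is shown below. The domain names and visitor sets are invented and do not reproduce the numbers on the slide.

```python
def narrow_candidates(visitors_by_domain, known_domains):
    """Intersect the visitor sets of the known domains, one domain at a time,
    and record how many candidate users remain after each step."""
    candidates = None
    sizes = []
    for domain in known_domains:
        visitors = visitors_by_domain.get(domain, set())
        candidates = set(visitors) if candidates is None else candidates & visitors
        sizes.append((domain, len(candidates)))
    return (candidates or set()), sizes

# Hypothetical visitor sets (user IDs) for three hypothetical domains.
visitors_by_domain = {
    "shop.example": {1, 2, 3, 4, 5, 6},
    "isp-portal.example": {2, 3, 4, 9},
    "bank.example": {3, 10},
}
known = ["shop.example", "isp-portal.example", "bank.example"]

candidates, sizes = narrow_candidates(visitors_by_domain, known)
print(sizes)
```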


How well does this work?

Top-200 domains are already sufficient to identify a large fraction of our users

[Figure: median of rank of correct user vs. number of top domains used for analysis]


But how can public information be extracted? Three examples


Twitter

• We use the Twitter API to download tweets from the relevant time period (one month)

• We extract URLs from the tweets and generate the associated domain by following the links

• We feed the domain information into our algorithm
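The URL-to-domain step in the list above might look roughly like the following sketch. It only parses links that appear in already-downloaded tweet texts; the Twitter API download and the following of shortened links, which the slide mentions, are left out because they need network access. The tweet texts are invented.

```python
import re
from urllib.parse import urlsplit

URL_RE = re.compile(r"https?://\S+")

def domains_from_tweets(tweets):
    """Collect the host names of all links found in a list of tweet texts."""
    domains = set()
    for text in tweets:
        for url in URL_RE.findall(text):
            # Strip trailing punctuation that belongs to the sentence, not the URL.
            host = urlsplit(url.rstrip(".,;:!?)")).hostname
            if host:
                domains.add(host)
    return domains

# Made-up tweet texts for illustration.
tweets = [
    "Nice JavaFX write-up: http://fxexperience.com/2016/some-post.",
    "Please sign https://www.change.org/p/example and star https://github.com/example/repo",
]

v = domains_from_tweets(tweets)
print(sorted(v))
```

The resulting set plays the role of the vector v of visited domains in the matching step.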


Examples

Visited websites

github.com (2.584.681)
www.change.org (124.152)
fxexperience.com (394)
community.oracle.com (5161)
paper.li (2689)
javarevisited.blogspot.de (525)
www.adam-bien.com (365)
rterp.wordpress.com (129)

Gotcha!

[Figure: number of matching domains per user (users arbitrarily sorted)]


Seemingly harmless identifiers can betray you

https://www.youtube.com/watch?v=DLzxrzFCyOs


YouTube

▪ We download public playlists from users (often linked via Google+)

▪ We extract the video IDs using the YouTube API

▪ We feed the resulting (full) URLs into our algorithm (this time with full URL info)
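A minimal sketch of the matching side of this approach: pull the video ID out of watch URLs in a browsing history and intersect it with the IDs from a public playlist. It assumes the common `https://www.youtube.com/watch?v=<id>` URL form and uses hand-written lists in place of the API download, which is not shown.

```python
from urllib.parse import urlsplit, parse_qs

YOUTUBE_HOSTS = ("www.youtube.com", "youtube.com", "m.youtube.com")

def video_id(url):
    """Return the `v` parameter of a YouTube watch URL, or None."""
    parts = urlsplit(url)
    if parts.hostname not in YOUTUBE_HOSTS or parts.path != "/watch":
        return None
    values = parse_qs(parts.query).get("v")
    return values[0] if values else None

def matching_videos(history_urls, playlist_ids):
    """Video IDs that occur both in the browsing history and in the playlist."""
    watched = {vid for vid in map(video_id, history_urls) if vid}
    return watched & set(playlist_ids)

# Illustrative history of one pseudonymous user and one public playlist.
history_urls = [
    "https://www.youtube.com/watch?v=DLzxrzFCyOs",
    "https://www.youtube.com/watch?v=02Zm-Ayv-PA&t=10s",
    "https://example.org/watch?v=notyoutube",
]
playlist_ids = ["02Zm-Ayv-PA", "18rBn4heThI"]

matches = matching_videos(history_urls, playlist_ids)
print(matches)
```

Counting such matches per user gives the quantity that would be compared across users to find the best candidate.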


Example

Video-IDs:

02Zm-Ayv-PA
18rBn4heThI
2ips2mM7Zqw
2wUvlTUi8kQ
34Na4j8AVgA
3VVuMIB2hC0
4fXvJHrbUTA
4ulaGjwiIbo
5BzkbSq7pww
5RDSkR8_AQ0
680R1Gq2YYU
6IHq9yv_qis
8d5QEWdHchk
...

Gotcha!

[Figure: number of matching videos in profile per user (users arbitrarily sorted)]


Geo-based identification

▪ We extract geo-data from Google Maps URLs (i.e. what coordinate was the user looking at)
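A short sketch of such an extraction is shown below. It assumes map URLs of the form `https://www.google.com/maps/@<lat>,<lng>,<zoom>z`, where the viewport centre follows the `@`; other URL variants exist and are not handled here. The sample URLs are illustrative.

```python
import re

# Latitude and longitude directly after the "@" in a /maps/ path.
COORD_RE = re.compile(r"/maps/[^@]*@(-?\d+(?:\.\d+)?),(-?\d+(?:\.\d+)?)")

def viewed_coordinate(url):
    """Return (lat, lng) of the map viewport encoded in the URL, or None."""
    match = COORD_RE.search(url)
    if not match:
        return None
    return float(match.group(1)), float(match.group(2))

urls = [
    "https://www.google.com/maps/@52.5200,13.4050,15z",
    "https://www.google.com/maps/place/Somewhere/@48.1372,11.5756,17z",
    "https://www.google.com/search?q=maps",
]

coords = [c for c in map(viewed_coordinate, urls) if c]
print(coords)
```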


Google Maps

▪ Ratings and photos are often publicly available (thanks again, Google+)

▪ Locations of interest could also be extracted from social media accounts

▪ A few data points are already enough to identify you


Can I hide in my data by generating noise? (e.g. via random page visits)

Usually not ¯\_(ツ)_/¯

argmax (M·v) is robust against isolated (additive) perturbations
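A toy illustration of why additive noise does not help with this kind of matching: random extra visits only add entries to the target's own row, so the overlap with the known domains v stays the same and the best match does not change. The data below is invented, and the sketch uses the plain overlap count as the score; a normalized score could behave differently.

```python
def scores(M, v):
    """Overlap count per user, i.e. M·v for 0/1 data stored as sets."""
    return {user: len(domains & v) for user, domains in M.items()}

M = {
    "target": {"github.com", "paper.li", "www.change.org"},
    "other_1": {"github.com", "news.example"},
    "other_2": {"shop.example", "mail.example"},
}
v = {"github.com", "paper.li", "www.change.org"}  # publicly known domains

before = scores(M, v)

# The target tries to hide by adding random page visits to their own history.
noise = {"shop.example", "mail.example", "news.example", "random.example"}
M_noisy = dict(M, target=M["target"] | noise)
after = scores(M_noisy, v)

best_before = max(before, key=before.get)
best_after = max(after, key=after.get)
print(best_before, best_after)
```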


Why use extensions for tracking?

tracking server


Analysis of data points per extension

95 % of the data comes from only 10 extensions.

Many more are spying on their users, but have a small installation base.

Up to 10.000 extension versions affected (upper bound analysis via extension ID)

[Figure: number of data points from extension vs. rank of extension]


Behavior analysis of Chrome extensions (via Selenium WebDriver + Docker)

[Figure: number of requests made by extension vs. number of extension (arbitrarily sorted); annotation: plugin that behaves suspiciously]


(How) can I protect myself?

Rotating proxy servers (n >> 1), e.g. Tor or a VPN with rotating exit nodes

Client-side blocking of trackers


Takeaways

Often, only a few external data points (<10) are sufficient to uniquely identify a person.

The increase in publicly available information on many people makes de-anonymization via linkage attacks easier than ever before.

High-dimensional, user-related data is really hard to robustly anonymize (even if you really try to do so).


Special thanks to

Kian Badrnejad, NDR
Jasmin Klofta, NDR
Jan Lukas Strozyk, NDR

Martin Fuchs @wahlbeobachter
Stefanie Helbig
Mike Kuketz, kuketz-blog.de

Many anonymous sources and contributors
TV shows ARD Panorama, Panorama3 and ZAPP

http://daserste.ndr.de/panorama/archiv/2016/Nackt-im-Netz-Intime-Details-von-Politikern-im-Handel,nacktimnetz110.html