Malware Halting Part I: Method Development
Kjell Jørgen Hole, Simula@UiB
Last updated 16.05.17
Overview
1. Malware
2. Software diversity
3. Computer “immunization”
4. Epidemiological model
5. Malware halting analysis
6. Malware halting method
Malware defined
Malware—malicious software used to
• disrupt computer operations
• gather sensitive information, or
• gain access to private systems
Worms · Viruses · Rootkits · Spyware · Keyloggers · Adware · Trojan horses · Backdoors · Dialers · Bots · Ransomware
Infectious malware
We’ll concentrate on infectious malware:
• Viruses—need user intervention to spread
• Worms—spread automatically
Spreading mechanisms (1)
Random scanning selects target IP addresses at random (all nodes are neighbors)
• used by Code Red and Slammer worms
Localized scanning selects most hosts in the “local” address space
• used by Code Red II and Nimda worms
Spreading mechanisms (2)
Topological-scanning relies on information contained in infected hosts to locate new targets
• the information may include (BGP) routing tables, email addresses, a list of peers, and Uniform Resource Locations (URLs)
• used by the Morris worm
Spreading mechanisms (3)
Hitlist consists of potentially vulnerable machines that are gathered beforehand and targeted first when the worm is released
• the flash worm gathered all vulnerable machines into its hitlist
Software diversity
We consider systems of networked computing devices, such as computers, smartphones, and tablets
Each device downloads software from application stores utilizing compilers with “diversity engines”
Software monoculture (today’s situation)
Figure 1: Software monoculture aids attackers. An identical binary for all users leaves every user susceptible to the identical exploit.
Figure 2: Software diversity lowers effectiveness of attack. With different variants for different users, a single exploit no longer affects all users identically, and the cost to the attacker rises dramatically.
millions. This makes it easy for an attacker (Figure 1), because the same attack vector is likely to succeed on a large number of targets [8, 18].
But what if these millions of computers were all running different versions of the software? That is, what if we could ensure that every computer runs a unique but functionally identical binary (Figure 2), so that a different attack vector is needed for different targets. All the different versions would behave in exactly the same way from the perspective of the end-user, but they would implement their functionality in subtly different ways. As a result, any specific attack would succeed only on a small fraction of systems and would no longer sweep through the whole internet. An attacker would require a large number of different attacks and would have no way of knowing a priori which specific attack should be directed at what specific target. Hence, the cost to the attacker would be raised dramatically.
The idea of using software diversity as a defense mechanism is not new, but it has never been realized in practice at any significant scale. Until quite recently, it would have been prohibitively expensive to create a unique version of every program for every client.
In this paper, we make a passionate argument that such massive-scale software diversity is now actually technically possible. We observe that this is enabled by four simultaneous paradigm shifts that are occurring just now. We present our blueprint for an architecture that provides such
Diversity engine
Figure 3: Diversification mechanism can be hidden entirely within an online software delivery system ("App Store") so that it becomes transparent to both code consumers and software developers. The software developer creates software and delivers it to the App Store; a diversity engine within the App Store creates variants, and subsequent downloaders receive functionally identical but internally different versions of the same software.
massive-scale software diversity. We elaborate on the problem of patching software that has been diversified. We present a list of interesting open research problems that appear in the context of massive-scale software diversity. We claim that massive-scale software diversity is a new security paradigm in itself. Finally, following a section on related work, we conclude the paper.
2 Vulnerabilities, Exploits and a Solution
A software vulnerability by itself is merely a hazard. In order to turn such a hazard into a successful attack, an attacker needs to find a successful exploit strategy. For example, the attacker may know of a vulnerability that enables a write beyond the end of a certain buffer on the stack. But in order to exploit this known vulnerability, the attacker needs to overwrite very specific locations on the stack with very specific values.
Operating system vendors now add elements of randomness to their systems, with the aim of making it more difficult for attackers to design a successful exploit. For example, the latest version of the Windows operating system now randomizes the starting address of the stack. Unfortunately, this has not stopped attackers from devising workarounds.
Designing a successful exploit for a known vulnerability is not trivial, but a dedicated attacker with ample resources is likely to succeed in eventually creating an attack. In today’s world, the effort invested into designing such an exploit can be amortized by its wide applicability—since millions of users are running the identical vulnerable binary, just one successful exploit can affect all of them simultaneously.
Software polyculture (the future?)
Figure 2: Software diversity lowers effectiveness of attack. With different variants for different users, a single exploit no longer affects all users identically, and the cost to the attacker rises dramatically.
Immunization (1)
Software hardening, or immunization, consists of
• removal of non-essential software programs
• secure configuration of remaining programs
• constant patching, and
• use of intrusion-detection systems, firewalls, intrusion-prevention systems, anti-malware programs, and spyware blockers
Immunization (2)
• In extreme cases, trained personnel have to take a device off-line to wipe its memory before installing new software
Pragmatic approach
Despite the protection provided by computer “immunization,” it is nearly impossible to keep every device free of malware at all times
A more realistic goal is to provide a form of “community immunity,” where most devices are protected against malware because there is little opportunity for new outbreaks to spread
Combine diversity and immunization
While community immunity usually entails immunization of nearly all entities in a monoculture, we’ll combine software diversity with the immunization of a small fraction of the computers to halt malware spreading
Epidemiological model
We model viruses and worms as infectious diseases spreading over networks with varying software diversity
Infected monoculture
Single sick node infects all other nodes
Node size proportional to the number of adjacent edges
Fragile
Hub defined
Hub—network node with many more adjacent edges than the average number of edges per node
• see large nodes in previous figure
• the number of adjacent edges is often referred to as the ‘degree’
www.kjhole.com
Diversity
Nodes of L types have different colors
The (software) diversity equals the number of colors L
Immunization
A white immunized node never gets infected or transmits an infection
Immunized polyculture
The malware types only spread to three nodes
L = 2 node types, L = 2 malware types, eight immunized hubs
Robust
Network model
A simple connected graph defines the malware spreading pattern
• N nodes
• L≥1 node types
• one malware type per node type
• discrete time t = 0,1,2, ...
• S infected seeds per node type at time t = 0
Journal of Computer Networks and Communications
Figure 1: Diverse (a) BA network and (b) WS network seeded with viruses at time step t = 0. Both networks have L = 3 different colored node types. Circular nodes are susceptible and star-shaped nodes are infected. There is S = 1 seed for each node type. Only the L · S = 3 seeds are infected since the viruses have not started to spread. The viruses infecting the seeds control all adjacent edges (shown in red). The BA network has four isolated nodes in addition to the three infected seeds. Only the seeds are isolated in the WS network.
4.2. Influence of Reinfected Hubs. We now study the stochastic model to determine the hubs’ influence on the fraction of isolated nodes in diverse inhomogeneous networks with reinfections of nodes.
When there are Nl = N/L nodes per type, an arbitrary node is a seed with probability S/Nl = (SL)/N, where S is the number of seeds per node type. Since a node of degree D has roughly D · Nl/N = D/L neighbors of the same type, a node’s number of neighboring seeds of the same type is estimated by
(SL)/N · D/L = (SD)/N. (1)
The right-hand side of (1) is independent of the number of node types L. The number of seeds S per node type can be large in practice because botnets are used to seed viruses. Hence, a hub with very large degree D is likely to be infected by a seed during the first time steps of a model run, even if the diversity L is large.
A hub of type l is infected with probability pl · (SD)/N during the model’s first time step. Infection will almost surely occur when pl · (SD)/N ≈ 1. During the following time steps, the hub will infect many of its D/L neighbors with the same type, where L ≪ D for current networks. Even more neighbors will be isolated. In particular, all degree-one neighbors of any type l′ ≠ l will be isolated but not infected. When the hub recovers with probability ql during a time step, it will be quickly reinfected by one of the D/L neighbors. Since the neighbors ensure that the hub is infected nearly all the time, a nonzero fraction of isolated nodes is maintained over time even when L is large.
Many simulations using the stochastic model confirm the hubs’ important role in making the fraction of isolated nodes much larger than the fraction of infected nodes. As seen from Figure 3, if the largest hub on a DH or BA network is immunized, that is, made permanently resistant to virus attacks, then the instantaneous fraction of isolated nodes drops significantly. There is no easily detectable reduction in the instantaneous fraction of infected nodes, confirming that the largest hub isolates many susceptible (i.e., not infected) nodes. The large fluctuations in the instantaneous fraction of isolated nodes in Figure 3(a) are due to temporary recovery of hubs.
The instantaneous fraction of isolated nodes will eventually go to zero because there is a non-zero probability that all nodes become susceptible in any finite-size network. However, the nonzero averaged fraction of isolated nodes was stable for very many time steps during the simulations. Hence, when hubs are reinfected, multiple virus outbreaks cause substantial long-term node isolation even for high node diversity L.
5. Halting Technique
Our goal is to halt multiple simultaneous virus outbreaks on any inhomogeneous network without changing its topology. The halting technique should drive the fraction of isolated nodes to zero in the stochastic model. For the deterministic
L = 3 node types, S = 1 seed per node type
Seeds
Malware spreading
A sick node infects all its neighbors during a single time step t
Types of spreading patterns
Homogeneous network—all nodes have degrees k approximately equal to the average degree ⟨k⟩
Inhomogeneous network—a small fraction of nodes, the hubs, have degrees k much larger than the average degree ⟨k⟩
Malware halting analysis
To halt malware on networks with several million nodes, we first determine
(A) desired distribution of node types,
(B) a lower bound on the needed diversity, and
(C) the trade-off between diversity and immunization
(A) Node type distribution
Let rl be the probability that an arbitrary node is of type l = 1,2, …, L
The entropy −∑ rl log rl measures the uncertainty of a node’s assigned type
It has maximum value log L when all rl = 1/L
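As an illustrative sketch (Python, not part of the original slides), the entropy of a type distribution and its maximum at the uniform distribution can be checked directly:

```python
import math

def type_entropy(r):
    """Shannon entropy -sum(r_l * log(r_l)) of a node-type distribution."""
    return -sum(p * math.log(p) for p in r if p > 0)

L = 4
print(type_entropy([1 / L] * L))           # maximum: log L, about 1.386
print(type_entropy([0.7, 0.1, 0.1, 0.1]))  # skewed distribution: smaller
```

A skewed distribution always yields a strictly smaller entropy than the uniform one, matching the slide's claim that the uncertainty about a node's type peaks at rl = 1/L.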
Maximize entropy (1)
When the entropy is maximized, the best spreading strategy for each malware type is to select new nodes at random
The probability that a spreading mechanism chooses a node of the wrong type is 1 − 1/L
As L increases, this probability increases and the speed of the malware spreading decreases
Maximize entropy (2)
If there is less uncertainty about the distribution of vulnerable nodes, e.g., a few node types occur more often than the other node types in a network, then the entropy is smaller and malware writers can create very efficient topology-aware spreading mechanisms
Observation 1
Skewed distributions of node types should be avoided because they facilitate rapid malware spreading
(B) Needed diversity
Example: MMS malware exploits a smartphone’s address book to spread to new phones with the same OS
MMS malware spreading
Phones on email list
Phone not on email list
Infected phone
Wang’s network model
Based on location and calling data from 6.2 million mobile subscribers
Market share determines whether devices with the same OS form a giant component in the calling graph
What is a component?
A single-type component is a subset of nodes with the same type such that there
• is a path between any pair of nodes in the set, and
• it is not possible to add another node of the same type to the set while preserving this property
Giant component
A giant component of same-type nodes has size proportional to N
If a giant component contains a seed, then nearly all nodes in the network will be infected
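Whether a network contains a same-type giant component can be checked directly. A minimal sketch (Python, my illustration; it assumes a plain edge list and a node-to-type mapping, not a representation from the slides):

```python
from collections import defaultdict, deque

def largest_same_type_component(edges, node_type):
    """Size of the largest connected component among nodes of one type.

    edges: iterable of (u, v) pairs; node_type: dict node -> type label.
    Only edges whose endpoints share a type are kept, since malware of
    one type spreads only along same-type links.
    """
    adj = defaultdict(list)
    for u, v in edges:
        if node_type[u] == node_type[v]:
            adj[u].append(v)
            adj[v].append(u)
    seen, best = set(), 0
    for start in node_type:          # BFS over each unvisited node
        if start in seen:
            continue
        seen.add(start)
        size, queue = 0, deque([start])
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best
```

On a path 0–1–2–3 with types A, A, A, B the largest same-type component has size 3; a component whose size grows in proportion to N would signal the epidemic risk described above.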
The spread of mobile viruses is aided by two dominant communication protocols. First, a Bluetooth (BT) virus can infect all BT-activated phones within a distance from 10 to 30 m, resulting in a spatially localized spreading pattern similar to the one observed in the case of influenza (3, 6, 7), severe acute respiratory syndrome (SARS) (8, 9), and other contact-based diseases (Fig. 1A) (10). Second, a multimedia messaging system (MMS) virus can send a copy of itself to all mobile phones whose numbers are found in the infected phone’s address book, a long-range spreading pattern previously exploited only by computer viruses (11, 12). Thus, in order to quantitatively study the spreading dynamics of mobile viruses we need to simultaneously track the location (13), the mobility (14–17), and the communication patterns (18–21) of mobile phone users. We achieved this by studying the anonymized billing record of a mobile phone provider and recording the calling patterns and the coordinates of the closest mobile phone tower each time a group of 6.2 million mobile subscribers used their phone. Thus, we do not know the users’ precise locations within the tower’s reception area, and no information is available about the users between calls.
The methods we used to simulate the spreading of a potential BT and MMS virus are described in (22). Briefly, once a phone becomes infected with an MMS virus, within 2 min it sends a copy of itself to each mobile phone number found in the handset’s phone book, approximated with the list of numbers with which the handset’s user communicated during a month-long observational period. A BT virus can infect only mobile phones within a distance r = 10 m. To simulate this process, we assigned to each user an hourly location that was consistent with its travel patterns (13) and followed the infection dynamics within each mobile tower area using the susceptible infected (SI) model (23). That is, we consider that an infected user (I) infects a susceptible user (S), so that the number of infected users evolves in time (t) as dI/dt = βSI/N, where the effective infection rate is β = μ⟨k⟩ with μ = 1, N is the number of users in the tower area, and the average number of contacts is ⟨k⟩ = ρA = NA/A_tower, where A = πr² represents the BT communication area and ρ = N/A_tower is the population density inside a tower’s service area. Once an infected user moves into the vicinity of a new tower, it will serve as a source of a BT infection in its new location.
A cell phone virus can infect only the phones with the operating system (OS) for which it was designed (2, 3), making the market share m of an OS an important free parameter in our study. The current market share of various smart phone OSs vary widely, from as little as 2.6% for Palm OS to 64.3% for Symbian. Given that smart phones together represent less than 5% of all phones, the overall market share of these OSs among all mobile phones is in the range of m = 0.0013 for Palm OS and m = 0.032 for Symbian, numbers that are expected to dramatically increase as smart phones replace traditional phones. To maintain the generality of our results, we treat m as a free parameter, finding that the spreading of both BT and
Fig. 1. The spreading mechanisms of mobile viruses. (A) A BT virus can infect all phones found within BT range from the infected phone, its spread being determined by the owner’s mobility patterns. An MMS virus can infect all susceptible phones whose number is found in the infected phone’s phonebook, resulting in a long-range spreading pattern that is independent of the infected phone’s physical location. (B) A small neighborhood of the call graph constructed starting from a randomly chosen user and including all mobile phone contacts up to four degrees from it. The color of the node represents the handset’s OS, which in this example are randomly assigned so that 75% of the nodes represent OS1, and the red are the remaining handsets with OS2 (25%). (C) The clusters in the call graph on which an MMS virus affecting a given OS can spread, illustrating that an MMS virus can reach at most the number of users that are part of the giant component of the appropriate handset. As the example for the OS shows, the size of the giant component highly depends on the handset’s market share (see also Fig. 2C).
[Figure 1 panel labels: (A) Bluetooth (BT) contagion, Bluetooth range ~10 m, vs. multimedia message (MMS) contagion; (B) call graph with OS1 at 75% and OS2 at 25% market share; (C) giant component of 80% for OS1 vs. 6% for OS2, plus small connected components and single nodes.]
Science, vol. 324, 22 May 2009, p. 1072 (Reports)
Wang’s network model
Fig. 2. The spreading patterns of BT and MMS viruses. (A) The changes in the ratio of infected and susceptible handsets (I/N) with time in the case of a BT virus affecting handsets with different m. (B) Same as in panel (A) but for MMS viruses. The saturation in I/N indicates that an MMS virus can reach only a finite fraction of all susceptible phones. (C) The size of the giant component Gm as a function of m. The blue triangles correspond to the saturation values measured in Fig. 2B, whereas the red line is the theoretical prediction according to percolation theory (the deviations are mainly attributed to finite size effects and degree correlations because the calculation assumed an infinite call graph). (D) The latency time needed to infect q = 0.65 or q = 0.15 fraction of susceptible handsets via a BT virus, approximated with T(q = 0.65, m) ~ m^(−0.63 ± 0.05) and T(q = 0.15, m) ~ m^(−0.60 ± 0.04) (continuous lines). (E) The latency time for an MMS virus for q = 0.05, 0.15, and 0.30. The continuous lines correspond to T(q, m) ~ (m − m_q*)^(−α(q)), where the best fits indicate a systematic q-dependence: α(0.05) = 0.20 ± 0.02, α(0.15) = 0.17 ± 0.01, and α(0.30) = 0.14 ± 0.01. (F) Log-log plot showing L_ave and L_max for the largest cluster. The fits correspond to L_max ~ (m − m*)^(−0.20 ± 0.02) and L_ave ~ (m − m*)^(−0.19 ± 0.02). The curves in (A), (B), and (D) are obtained from 10 independent simulations, and (E) and (F) represent an average over 100 runs. For more statistical analysis of the fits in (D) to (F), see the detailed discussion in (22).
Fig. 3. Spatial patterns in the spread of BT and MMS viruses. (A) The virus starts from the same user located at the tower marked by the red arrows (left). The three panels show the percentage of infected users in the vicinity of each mobile phone tower (denoted by the Voronoi cell that approximates each tower’s service area). In the right panel, we show the corresponding time-dependent infection curves, marking the moments when the spatial distribution was recorded. (B) Average distance between the tower where the infection was originally started and the most currently infected phone as a function of N, the number of towers with at least one infected user, used as a proxy of time (three red and blue curves correspond to m = 0.1, m = 0.5, and m = 1). The green line is obtained from a null model that assumes that the virus can only spread from one tower’s service area to its neighbor towers’ service areas. The curves in (B) are obtained from 100 independent simulations.
Wang et al.
Roughly 45% of all phones in the US were smartphones in March 2011
Android market share
Android’s share of the total mobile phone market was 0.45 × 0.35 ≈ 0.16 (16%)
About 62% of the users utilized Android Gingerbread 2.3.x
• market share was 0.16 × 0.62 ≈ 0.10 (10%)
[Repeated: Fig. 2 from Wang et al.; panel (C) shows the giant-component size Gm as a function of market share m.]
Wang et al.
Gingerbread was here
Observation 2
A malware epidemic can only occur when a network contains a giant component of nodes with the same type
Diversity needed to avoid giant component
degree k — a node’s number of adjacent edges
average degree ⟨k⟩ = (1/N)·∑ ki
average square degree ⟨k²⟩ = (1/N)·∑ (ki)²
L ≥ ⌈⟨k²⟩ ∕ (2·⟨k⟩)⌉
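The bound can be evaluated directly from a degree sequence. A small sketch (Python, my illustration; it assumes a plain list of node degrees):

```python
import math

def diversity_lower_bound(degrees):
    """Lower bound L >= ceil(<k^2> / (2<k>)) computed from node degrees."""
    n = len(degrees)
    mean_k = sum(degrees) / n
    mean_k_sq = sum(k * k for k in degrees) / n
    return math.ceil(mean_k_sq / (2 * mean_k))

print(diversity_lower_bound([5] * 100))          # homogeneous network: 3
print(diversity_lower_bound([5] * 99 + [1000]))  # one huge hub: 336
```

The second call shows why hubs matter: a single node of degree 1000 among 99 degree-5 nodes inflates ⟨k²⟩, and with it the required diversity, by two orders of magnitude.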
(C) Diversity vs immunization
The right-hand side of the bound is large when a network contains nodes with squared degrees (ki)² much bigger than the corresponding degrees ki
If the node degrees are known, then we can reduce the lower bound by immunizing the nodes with the largest degrees ki
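As a rough sketch of this trade-off (Python, my illustration; it assumes a plain edge list, and nodes left isolated after hub removal are simply dropped from the averages):

```python
import math
from collections import defaultdict

def bound_after_immunizing(edges, top):
    """Diversity bound ceil(<k^2>/(2<k>)) after removing the `top`
    highest-degree nodes (immunized) together with their adjacent edges."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    hubs = set(sorted(deg, key=deg.get, reverse=True)[:top])
    kept = [(u, v) for u, v in edges if u not in hubs and v not in hubs]
    rdeg = defaultdict(int)
    for u, v in kept:
        rdeg[u] += 1
        rdeg[v] += 1
    ks = list(rdeg.values())
    if not ks:                # no edges left: one node type suffices
        return 1
    mean_k = sum(ks) / len(ks)
    mean_k_sq = sum(k * k for k in ks) / len(ks)
    return math.ceil(mean_k_sq / (2 * mean_k))
```

For a star graph, immunizing the single hub collapses the bound to 1, mirroring the Enron example later in the talk where immunizing 612 hubs reduces the bound from 71 to 9.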
Observation 3
We can immunize a small fraction of all nodes (the hubs) in an inhomogeneous network to reduce the need for diversity
• because it is expensive to immunize computers, immunization is of limited use in large networks with many hubs
Halting method (based on observations)
The method must handle spreading patterns with
• unknown and changing topology
• at least one million nodes
• unreliable node communication
Approach
Since the topology is unknown and communication is unreliable, it is difficult to modify the network structure or ask nodes to change their types based on the types of the neighboring nodes
It is more promising to use a simple method that
• is robust to varying topologies
• scales to very large networks
Malware halting method
(first version)
1. If practicable, immunize enough large-degree nodes in a network to create a homogeneous subnet when the immunized nodes and their adjacent edges are removed
2. Ensure that the node diversity of the homogeneous subnet is large enough to halt multiple simultaneous malware outbreaks
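The two steps can be sketched as a toy simulation (Python, my illustration, not the NetLogo model used in Part II; types are assigned uniformly at random, and a sick node infects all susceptible same-type neighbors each time step):

```python
import random
from collections import defaultdict

def simulate_halting(edges, L, seeds_per_type, immunize_top, steps=50, rng=None):
    """Step 1: immunize the `immunize_top` highest-degree nodes.
    Step 2: assign L node types uniformly at random, seed each type,
    and spread malware over same-type, non-immunized neighbors.
    Returns the final fraction of infected nodes."""
    rng = rng or random.Random(1)
    adj, deg = defaultdict(list), defaultdict(int)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
        deg[u] += 1
        deg[v] += 1
    nodes = list(deg)
    immune = set(sorted(deg, key=deg.get, reverse=True)[:immunize_top])
    ntype = {u: rng.randrange(L) for u in nodes}
    infected = set()
    for l in range(L):
        candidates = [u for u in nodes if ntype[u] == l and u not in immune]
        infected.update(rng.sample(candidates, min(seeds_per_type, len(candidates))))
    for _ in range(steps):
        new = {v for u in infected for v in adj[u]
               if v not in infected and v not in immune and ntype[v] == ntype[u]}
        if not new:
            break
        infected |= new
    return len(infected) / len(nodes)
```

With L = 1 and no immunization every node of a connected graph ends up infected; raising L and immunizing hubs should shrink the infected fraction, which is what the simulations in Part II quantify.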
Video
Malware Halting Part II: Simulations and Analyses
Kjell Jørgen Hole, Simula@UiB
Overview
1. Malware halting in proximity networks
2. Halting in the Enron email network
3. Halting in Barabási and Albert networks
4. Halting in dense IP networks
5. How to immunize unknown hubs
6. Slowing down ‘advanced persistent threats’
Proximity networks
Phones within Bluetooth range (~10 m)
Phone out of Bluetooth range
Infected phone
Proximity network model
1. We generate a proximity network with average node degree ⟨k⟩ by first placing N nodes uniformly at random on a square
2. An edge is then added between a randomly chosen node and its closest neighbor in Euclidean distance
Network generation
3. More edges are added the same way until the network has the desired average degree ⟨k⟩
• self-loops and multiple edges between nodes are not allowed
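Steps 1–3 above can be sketched as follows (Python, my illustration; once a node's nearest neighbor is already adjacent, the edge goes to its nearest not-yet-adjacent neighbor, which enforces the no-multi-edge rule):

```python
import math
import random

def proximity_network(n, target_avg_degree, rng=None):
    """Place n nodes uniformly at random on the unit square, then
    repeatedly connect a random node to its nearest not-yet-adjacent
    neighbor until the average degree 2*E/N reaches the target.
    Self-loops and multi-edges are excluded by construction."""
    rng = rng or random.Random(0)
    pos = [(rng.random(), rng.random()) for _ in range(n)]
    edges = set()
    neighbors = [set() for _ in range(n)]
    while 2 * len(edges) / n < target_avg_degree:
        u = rng.randrange(n)
        candidates = [v for v in range(n) if v != u and v not in neighbors[u]]
        if not candidates:
            continue
        v = min(candidates, key=lambda w: math.dist(pos[u], pos[w]))
        edges.add((min(u, v), max(u, v)))
        neighbors[u].add(v)
        neighbors[v].add(u)
    return pos, edges
```

Because edges go to nearest neighbors, the resulting graph is spatially localized, like the Bluetooth proximity networks it models.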
N = 300, ⟨k⟩ = 5, L = 4, S = 10
Proximity network simulations
Average node degree ⟨k⟩: 5 | 6 | 7 | 8
Minimum needed node types (diversity) L: 3 | 4 | 4 | 5
Fraction of infected nodes: 3.4% | 3.6% | 4.6% | 4.8%
N = 5000; each fraction averaged over 500 networks with random distribution of node types and seeds, S = 10
NetLogo proximity model
Demo
Enron email network
Sparse and inhomogeneous email network
Nodes represent email addresses belonging to former Enron employees
N=36 692, ⟨k⟩=10.02 (183 831 edges)
The largest hub has degree 1 383
Diversity vs immunization
The lower bound on the required diversity is L ≥ 71 before hub immunization
After immunizing the 612 nodes with degrees larger than 0.05% of the total number of edges, the lower bound reduces to L ≥ 9
L ≥ ⌈⟨k²⟩ ∕ (2·⟨k⟩)⌉
Enron network simulations
Diversity L: 9 | 10 | 11 | 12
Fraction of infected nodes: 5.2% | 3.5% | 2.4% | 1.8%
N = 36 692; each fraction averaged over 500 networks with random distribution of node types and seeds, S = 10
BA networks
The Barabási and Albert (BA) model grows an inhomogeneous scale-free network
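A minimal preferential-attachment sketch (Python, my illustration, not the exact generator used in the simulations):

```python
import random

def barabasi_albert(n, m, rng=None):
    """Grow a scale-free network: each new node attaches to m distinct
    existing nodes chosen with probability proportional to their degree."""
    rng = rng or random.Random(0)
    edges = []
    attachment = []               # node id repeated once per adjacent edge
    targets = list(range(m))      # first new node links to all m initial nodes
    for new in range(m, n):
        for t in targets:
            edges.append((new, t))
            attachment += [new, t]
        targets = set()           # degree-proportional targets for next node
        while len(targets) < m:
            targets.add(rng.choice(attachment))
        targets = list(targets)
    return edges
```

Sampling targets from a list in which each node appears once per adjacent edge is what makes the attachment degree-proportional; early nodes accumulate edges and become the hubs the halting method must immunize.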
Example BA network
[Repeated: Figure 1(a), the diverse BA network with L = 3 colored node types and S = 1 seed per node type; star-shaped nodes are infected seeds.]
4.2. Influence of Reinfected Hubs. We now study the stochasticmodel to determine the hubs’ influence on the fractionof isolated nodes in diverse inhomogeneous networks withreinfections of nodes.
When there are Nl = N/L nodes per type, an arbitrary node is a seed with probability S/Nl = (SL)/N, where S is the number of seeds per node type. Since a node of degree D has roughly D · Nl/N = D/L neighbors of the same type, a node's number of neighboring seeds of the same type is estimated by

(SL/N) · (D/L) = (SD)/N. (1)
The right-hand side of (1) is independent of the number of node types L. The number of seeds S per node type can be large in practice because botnets are used to seed viruses. Hence, a hub with very large degree D is likely to be infected by a seed during the first time steps of a model run, even if the diversity L is large.
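The estimate (SD)/N can be sanity-checked with a small Monte Carlo sketch; all parameter values below are illustrative assumptions, not values from the paper:

```python
import random

def avg_neighboring_seeds(N=100_000, L=50, S=500, D=2_000, trials=200):
    """Monte Carlo check of (SL/N) * (D/L) = SD/N, the expected number of
    a hub's neighbors that are seeds of the hub's own type."""
    total = 0
    for _ in range(trials):
        hub_type = random.randrange(L)
        hits = 0
        for _ in range(D):
            # A neighbor has the hub's type w.p. 1/L and is a seed w.p. SL/N.
            if random.randrange(L) == hub_type and random.random() < S * L / N:
                hits += 1
        total += hits
    return total / trials

print(avg_neighboring_seeds())   # close to the analytic value
print(500 * 2_000 / 100_000)     # SD/N = 10.0
```

The simulated average matches SD/N regardless of the chosen L, illustrating why hub infection is insensitive to diversity.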
A hub of type l is infected with probability pl · (SD)/N during the model's first time step. Infection will almost surely occur when pl · (SD)/N ≈ 1. During the following time steps, the hub will infect many of its D/L neighbors with the same type, where L ≪ D for current networks. Even more neighbors will be isolated. In particular, all degree-one neighbors of any type l′ ≠ l will be isolated but not infected. When the hub recovers with probability ql during a time step, it will be quickly reinfected by one of the D/L neighbors. Since the neighbors ensure that the hub is infected nearly all the time, a nonzero fraction of isolated nodes is maintained over time even when L is large.
Many simulations using the stochastic model confirm the hubs' important role in making the fraction of isolated nodes much larger than the fraction of infected nodes. As seen from Figure 3, if the largest hub on a DH or BA network is immunized, that is, made permanently resistant to virus attacks, then the instantaneous fraction of isolated nodes drops significantly. There is no easily detectable reduction in the instantaneous fraction of infected nodes, confirming that the largest hub isolates many susceptible (i.e., not infected) nodes. The large fluctuations in the instantaneous fraction of isolated nodes in Figure 3(a) are due to temporary recovery of hubs.
The instantaneous fraction of isolated nodes will eventually go to zero because there is a nonzero probability that all nodes become susceptible in any finite-size network. However, the nonzero averaged fraction of isolated nodes was stable for very many time steps during the simulations. Hence, when hubs are reinfected, multiple virus outbreaks cause substantial long-term node isolation even for high node diversity L.
5. Halting Technique
Our goal is to halt multiple simultaneous virus outbreaks on any inhomogeneous network without changing its topology. The halting technique should drive the fraction of isolated nodes to zero in the stochastic model. For the deterministic
N=40, L=3, S=1
NetLogo BA 3D model
63
Demo
Dense IP networks
Assume that L types of random scanning malware spread over a complete network with N nodes of degree k = N − 1
• there are L node types and N/L uniformly distributed nodes per type
64
Dense network analysis
Let there be one seed (S=1) per node type
Each seed has edges to the other N/L − 1 nodes of the same type
Together, the N/L single-type nodes form a star graph with the seed in the center
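The star-graph picture above can be sketched as a small random-scanning simulation. This is a minimal illustration (the parameter values are assumptions), not the NetLogo model itself:

```python
import random

def dense_outbreak(N=1000, L=10, steps=2000):
    """Random-scanning malware on a complete network: every infected node
    probes one node chosen uniformly at random per step and infects it if
    the types match. A single seed eventually reaches all N/L nodes of its
    own type, as in the star graph with the seed in the center."""
    node_type = [i % L for i in range(N)]
    infected = {0}                     # the single seed has type 0
    for _ in range(steps):
        for _ in range(len(infected)):
            target = random.randrange(N)
            if node_type[target] == 0:
                infected.add(target)
    return len(infected)

print(dense_outbreak())                # approaches N/L = 100
```

Increasing L shrinks the per-type star, but the seed still infects its whole type, which is why diversity alone cannot halt the outbreak here.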
65
Star graph
66
Seed in the center will infect all other nodes
Need more node types than malware types
Since the seed will always infect all the peripheral nodes in the star graph, it does not help to increase the number of node types L as long as there is one seed per node type
The only way diversity can halt multi-malware outbreaks in dense networks is to use many more node types than there are malware types
67
Needed diversity in dense IP network (1)
If there are M malware types, then M · N/L nodes will be infected
Hence, the diversity L needs to be proportional to N and the number of malware types M must be much smaller than N to prevent a large infection
68
Needed diversity in dense IP networks (2)
The previous observation is consistent with the diversity bound, which reduces to L ≥ (N − 1)/2 when k = N − 1
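The reduction can be checked directly from the general bound; a two-line sketch:

```python
import math

# For a complete network every node has degree k = N - 1, so
# <k> = N - 1 and <k^2> = (N - 1)^2, and the general bound
# L >= ceil(<k^2> / (2<k>)) reduces to L >= ceil((N - 1) / 2).
def bound_complete(N):
    k = N - 1
    return math.ceil(k * k / (2 * k))

print(bound_complete(5000))   # 2500: L must grow linearly with N
```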
69
Advantage of diversity in dense networks (1)
Software diversity cannot stop malware spreading in dense random scanning networks, but it can slow the spreading
• the likelihood of selecting a node of the wrong type, 1 − 1/L, goes to 1 as the diversity L increases
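The slowdown is geometric: each random scan hits the right type with probability 1/L, so a quick simulation (illustrative only) shows the expected number of scans per successful infection growing linearly in L:

```python
import random

def scans_until_same_type(L, trials=10_000):
    """Average number of random scans before a same-type node is hit.
    Each scan succeeds with probability 1/L, so the mean is geometric: L."""
    total = 0
    for _ in range(trials):
        scans = 1
        while random.random() >= 1 / L:
            scans += 1
        total += scans
    return total / trials

for L in (2, 10, 50):
    print(L, round(scans_until_same_type(L), 1))   # mean grows linearly in L
```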
70
Advantage of diversity in dense networks (2)
If the malware spreading is slow, then it is possible for a (cloud-based) anti-malware solution to tailor the defense to a particular outbreak
71
Unknown hubs
Acquaintance immunization immunizes unknown hubs on a monoculture
72
Acquaintance immunization (monoculture version)
Choose a set of nodes uniformly at random and immunize one arbitrary neighbor per node
• while the original set of nodes is unlikely to contain the relatively few hubs in an inhomogeneous network, the randomly selected neighbors are much more likely to be hubs, since very many edges are adjacent to high-degree nodes
73
74
Acquaintance immunization (polyculture version)
1. Assume Nl = N/L nodes per type
2. For some fraction 0 < f < 1, choose a set of f · Nl nodes of type l uniformly at random such that each node has at least one neighbor of the same type, l = 1, 2, ..., L
3. Immunize one randomly selected neighbor of type l per node in the set of f · Nl nodes
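The three steps above can be sketched as follows. The slide leaves the sampling details open, so restricting step 2 to nodes that actually have a same-type neighbor is our reading, and the Erdős–Rényi demo graph is only illustrative:

```python
import random

def acquaintance_immunize(adj, node_type, L, f=0.1):
    """Polyculture acquaintance immunization (steps 1-3 above): per type l,
    choose f*N_l random nodes with at least one same-type neighbor, then
    immunize one randomly selected same-type neighbor of each."""
    immune = set()
    for l in range(L):
        candidates = [u for u in adj if node_type[u] == l
                      and any(node_type[w] == l for w in adj[u])]
        size = min(max(1, int(f * len(candidates))), len(candidates))
        for u in random.sample(candidates, size):
            same = [w for w in adj[u] if node_type[w] == l]
            immune.add(random.choice(same))
    return immune

# Demo on a small Erdos-Renyi graph with uniformly random node types.
random.seed(1)
L, n = 3, 600
adj = {i: set() for i in range(n)}
for i in range(n):
    for j in range(i + 1, n):
        if random.random() < 0.02:
            adj[i].add(j)
            adj[j].add(i)
node_type = {v: random.randrange(L) for v in adj}
immune = acquaintance_immunize(adj, node_type, L)
print(len(immune))
```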
75
Observation
When the set of all immunized neighbors, of size f·N = Σl f·Nl, is large enough, it will contain most of the hubs
76
8 Journal of Computer Networks and Communications
[Plots omitted: fraction of nodes versus time for panels (a) and (b); in (b) the isolated and infected fractions are nearly identical]
Figure 5: Fractions of isolated and infected nodes caused by multiple simultaneous virus outbreaks on a DH network. (a) The largest 216 hubs are immunized at time step one thousand. (b) The hubs have been correctly identified and immunized before the virus outbreaks.
spread. The 7 · 20 = 140 seeds generated 158 isolated nodes, or less than 1% of all nodes. Let pl = 0.06 and ql = 0.04. When acquaintance immunization is performed in advance, the fractions of isolated and infected nodes went to zero after only 154 time steps. The plot of the isolated and infected fractions (not shown) is very similar to Figure 5(b).
To verify the usefulness of the halting technique for inhomogeneous networks with unknown hubs, we generated additional model runs for different DH networks, including runs where the infection and recovery probabilities pl and ql varied with l. After first determining a suitable fraction f of immunized nodes and number of node types L, the seeds caused little spreading and all infected nodes recovered. The speed at which the virus outbreaks die out depends on the fraction f, diversity L, selection of L · S seeds, and spreading rates pl/ql.
Figure 6: Acquaintance immunization of a DH network. (a) Immunized dark-pink nodes and remaining susceptible multicolored hubs. (b) The few red and white isolated nodes after the viruses tried to spread.
7. Final Discussion
The Internet is best viewed as a large collection of networks. Because each network has different default settings, software patch levels, firewall rules, browser settings, antivirus signature sets, configuration management practices, and diagnostic capabilities, they are not all vulnerable to the same viruses [8]. However, we have seen many examples of large networks with too little software diversity to prevent virus epidemics.
Since the virus writers control the spreading mechanisms of viruses, a practical halting technique must handle viruses with widely different spreading patterns. The reported results indicate that robust halting of viruses is obtainable when application stores with "diversity engines" ensure adequate software diversity on the OS and application layers of a network and vulnerable hubs are immunized. (Appendix C discusses the halting technique's fragility to clustering of platform types.)
77
Immunization example
Immunized nodes and remaining susceptible hubs
APT defined
Advanced Persistent Threat (APT)—a targeted effort to obtain or change information by means that are difficult to
• discover,
• remove, and
• attribute
78
APT examples
Examples of APTs are state-sponsored attacks on foreign commercial and governmental enterprises to steal industrial and military secrets
The attacks are often initiated by well-timed, socially engineered (spear-phishing) emails delivering trojans to individuals with access to sensitive information
79
Why malicious email?
Malicious email is leveraged because most enterprises allow email to enter their networks
Persistent attackers frequently exploit OS or application vulnerabilities in the targeted systems
80
Attack description
An attacker first develops a payload to exploit one or more vulnerabilities
Next, a delivery vehicle such as a malicious PDF or Microsoft Office document carries the payload to a few users of a system
The payload installs a backdoor or provides remote system access, allowing the attacker to establish a presence inside the trusted system boundary
81
Diversity slows down attacks
If a user and an attacker download the same program from an application store that generates diverse executable files, then the two downloaded files share a common vulnerability with probability 1/L
82
Diversity slows down attacks
When the diversity L is large, the probability of a common vulnerability is small and attackers can no longer reliably analyze their own downloaded program files to exploit vulnerabilities in users’ program files
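The 1/L probability is easy to verify with a tiny simulation, assuming the store assigns each download one of L variants uniformly and independently (an idealization of the diversity engine):

```python
import random

def shared_variant_rate(L, trials=100_000):
    """Probability that an attacker's and a user's downloads of the same
    program are the same one of L variants (uniform, independent choice)."""
    hits = sum(random.randrange(L) == random.randrange(L)
               for _ in range(trials))
    return hits / trials

for L in (10, 100, 1000):
    print(L, shared_variant_rate(L))   # close to 1/L
```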
83
Reverse engineering in a monoculture is easy
84
[Figure: a security patch or replacement software makes patched users safe, but an attacker can compute an exploit by comparing unpatched and patched software versions; users who haven't yet applied the patch are put at risk by release of the patch]
Figure 4: Existing situation: release of a software update exposes vulnerabilities.
It is this fundamental problem that massive-scale software diversity addresses. We advocate the introduction of automated code variance techniques that result in the binary code images delivered to subsequent code consumers being subtly different. This process can be embedded seamlessly into an online software delivery system, an "App Store" (Figure 3), and thereby be made entirely transparent to the code consumer (all programs derived from the same source in this manner have the identical functionality). The mechanism can even be set up so that adding software diversity poses no extra effort to the software programmer (a compiler automatically generates the different versions without any additional human intervention). But as a result of deploying massive-scale diversity, any specific exploit will work on only a relatively small number of targets.
3 Diversity Removes Reverse Engineering Vulnerabilities of Software Patches
Massive-scale software diversity removes another major problem of the current software monoculture: the fact that releasing a patch for a discovered vulnerability alerts adversaries about the existence of the vulnerability. It is current best practice to fix software vulnerabilities as soon as possible after they are discovered. In the desktop space, this is usually achieved by sending a "patch" to the compromised host. Such a patch contains the delta between the original (vulnerable) program and a corrected new version. Since such fixes usually apply to only a small fraction of a program, it is more efficient to send just the patch rather than sending a whole corrected program.
In the mobile space, user-installable programs ("Apps") are currently updated by sending complete replacement versions rather than applying an incremental patch, while the mobile operating system software itself is updated using patches. As Apps on mobile devices grow, it is probably only a matter of time until the App Store frameworks of the various mobile platforms will support replacing only part of an App (via a patch) rather than downloading a wholly new App each time.
Situation in polyculture
85
It is necessary to create software patches tailored to the different binary versions of the same program
An attacker cannot reverse engineer software patches by comparing a particular patch to the corresponding code on a user's computer because the patch and code are unknown to the attacker
Reverse engineering in a polyculture is hard
86
[Figure: the vulnerability cannot be extracted simply by comparing the replacement software to the original software]
Figure 5: Solution #1: Software is updated by sending a complete replacement version. Adversary cannot match an original version to its replacement and cannot extract a vulnerability.
[Figure: each device reports the ID of its variant and receives a custom patch for that version only; the custom patch is worthless unless you have the exact variant it relates to]
Figure 6: Solution #2: Software is updated via custom patch. Patch is meaningless to adversary without the specific original version that it relates to.
A bug fix (in the form of either a patch or a replacement program) gives a potential adversary information that can be used to precisely identify the vulnerability being fixed in the new version (Figure 4). A significant proportion of software exploits today are generated from reverse engineering of error fixes. As a consequence, it is imperative that updates are applied as soon as they are available. The average time lag between availability of an update and its installation on a vulnerable target is often a good predictor for overall vulnerability.
In this context, Apps for mobile devices are actually potentially even more vulnerable than desktop applications. This is because Apps for mobile devices tend to evolve much more quickly than traditional desktop software. For example, for many Apps in the Android Marketplace, release cycles are expressed in days rather than months. The rapid software evolution cycle is more likely to push out software that is not as mature as it should be, i.e., containing a higher proportion of residual errors than necessary. And fixing these errors in subsequent releases will give an adversary a steady stream of hints as to the location of exploitable vulnerabilities.
Massive-scale software diversity makes it much more difficult for an attacker to generate attack vectors by way of reverse engineering of security patches (Figures 5 and 6). An attacker requires two pieces of information to extract a vulnerability from a bug fix: the version of the software that is vulnerable and the specific patch that fixes the vulnerability. In an environment in which software is diversified to an extreme degree and every instance of every piece of software is unique, we can set things up so that the attacker never obtains a matching pair of vulnerable software and its corresponding bug fix that could be used to identify the vulnerability. In Section 7 below, we outline a concrete mechanism that achieves this goal.
Malware Halting Part III: From Fragility to Antifragility
Kjell Jørgen HoleSimula@UiB
Overview
1. Definition of fragility, robustness, and antifragility
2. Antifragility to malware spreading
3. Network model
4. NetLogo demo
88
89
A property of a complex adaptive system is
fragile if it is easily damaged by internal or external perturbations,
robust if it can withstand perturbations (up to a point), and
antifragile if it learns from incidents how to become increasingly robust over time
Fragile, robust, and antifragile systems
90
Antifragile
91
Please mishandle
Fragile Robust Antifragile
92
Fragile monoculture
93
A system is fragile to infectious malware when the malware spreads widely over the network
Initially infectednode
Robust polyculture
94
A system is robust to infectious malware when there is only limited spreading
Antifragility to malware
A system is antifragile when it “learns” from previous malware outbreaks how to reduce the spreading of future outbreaks
• the previously studied epidemiological model is not antifragile because it does not learn
95
+ =
Software diversity Imperfect malwaredetection
Antifragility to malware spreading
96
[Figure: a software developer creates software and delivers it to the App Store; a diversity engine within the App Store creates software variants; subsequent downloaders receive functionally identical but internally different versions of the same software]
Figure 3: Diversification mechanism can be hidden entirely within an online software delivery system ("App Store") so that it becomes transparent to both code consumers and software developers.
massive-scale software diversity. We elaborate on the problem of patching software that has been diversified. We present a list of interesting open research problems that appear in the context of massive-scale software diversity. We claim that massive-scale software diversity is a new security paradigm in itself. Finally, following a section on related work, we conclude the paper.
2 Vulnerabilities, Exploits and a Solution
A software vulnerability by itself is merely a hazard. In order to turn such a hazard into a successful attack, an attacker needs to find a successful exploit strategy. For example, the attacker may know of a vulnerability that enables a write beyond the end of a certain buffer on the stack. But in order to exploit this known vulnerability, the attacker needs to overwrite very specific locations on the stack with very specific values.
Operating system vendors now add elements of randomness to their systems, with the aim of making it more difficult for attackers to design a successful exploit. For example, the latest version of the Windows operating system now randomizes the starting address of the stack. Unfortunately, this has not stopped attackers from devising workarounds.
Designing a successful exploit for a known vulnerability is not trivial, but a dedicated attacker with ample resources is likely to succeed in eventually creating an attack. In today's world, the effort invested into designing such an exploit can be amortized by its wide applicability—since millions of users are running the identical vulnerable binary, just one successful exploit can affect all of them simultaneously.
Software diversity
Application stores, e.g., Google Play and the iOS App Store, can utilize compilers with "diversity engines"
97
Devices can repeatedly download executables to create time-varying
software diversity
Malware detection
Behavior-based methods increase the detection rate compared with signature-based methods, but the detection is not perfect
98
Malware detection allows a system to partially learn when new software should be downloaded to a device
New epidemiological model
• Simple connected graph with N nodes and at most L node types
• Discrete time t = 0, 1, ...
• D = D(t) node types, 1 ≤ D(t) ≤ L, at time step t
99
Software downloads
• All nodes change type with probability p at each time t to model automated software downloads
• each of the L possible node types is selected with probability 1/L
100
Malware outbreaks
• A single susceptible node is infected with probability q at each time t
• the node is selected uniformly at random
• A sick node will infect all its neighbors of the same type
101
Malware detection
• Infected nodes change type with probability r during a time step to model malware detection
• detection is followed by immediate download of new software, i.e., change of node type
• any infected node becomes susceptible when it changes type
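The model described on the last three slides can be sketched as a single time-step function. The ordering of the phases within a step (downloads, detection, seeding, spreading) and the ring demo graph are our assumptions; the slides do not fix them:

```python
import random

def step(adj, node_type, infected, L, p, q, r):
    """One time step of the new epidemiological model sketched above."""
    for v in adj:                            # automated software downloads
        if random.random() < p:
            node_type[v] = random.randrange(L)
            infected.discard(v)              # a type change clears infection
    for v in list(infected):                 # imperfect malware detection
        if random.random() < r:
            node_type[v] = random.randrange(L)
            infected.discard(v)
    if random.random() < q:                  # one outside seeding attempt
        infected.add(random.choice(list(adj)))
    for v in list(infected):                 # same-type spreading
        infected.update(w for w in adj[v] if node_type[w] == node_type[v])
    return infected

# Demo on a ring of N nodes.
N, L = 200, 8
adj = {i: {(i - 1) % N, (i + 1) % N} for i in range(N)}
node_type = {i: random.randrange(L) for i in range(N)}
infected = set()
for _ in range(500):
    step(adj, node_type, infected, L, p=0.05, q=0.3, r=0.2)
print(len(infected) / N)   # typically stays small: diversity plus detection
```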
102
Malware halting method
(second version)
103
1. From fragile monoculture to robust polyculture
• using time-varying software diversity
2. From robust to antifragile polyculture
• using imperfect malware detection
NetLogo model (1)
104
Spike due to monoculture
Malware detection on
Change in spreading mechanism
Immunization of hubs
NetLogo model (2)
105
Polyculture Polyculture Polyculture Polyculture Monoculture
Time-varying polyculture
NetLogo model (3)
106
Monoculture
Polyculture
Imperfect hub immunization
Demo
107
Fragility vs antifragility
108
Fragile monoculture | Antifragile polyculture
Large spreading of malware | Nearly no malware spreading
Needs continuous repair | Self-repairing (up to a point)
Must immunize nearly all devices | Immunization of hubs
No adaptation to changes in spreading | Adapts to changes
Further study
1. Watch Nassim Taleb and Daniel Kahneman discuss antifragility
• www.youtube.com/watch?v=MMBclvY_EMA
2. Watch Taleb give a talk at Stanford
• www.youtube.com/watch?v=c6sX5MSdLag
3. Find more talks on the web
109
References (1)
M. Franz, E Unibus Pluram: Massive-Scale Software Diversity as a Defense Mechanism, Proc. New Security Paradigms Workshop 2010 (NSPW 2010), Concord, Massachusetts, USA, Sept. 21–23, 2010, pp. 7–16
P. Wang, M. C. González, C. A. Hidalgo, and A.-L. Barabási, Understanding the Spreading Patterns of Mobile Phone Viruses, Science, vol. 324, 22 May 2009, pp. 1072–1076
110
References (2)
K. J. Hole, Diversity Reduces the Impact of Malware, IEEE Security & Privacy Magazine, vol. 13, no. 3, 2015, pp. 48–54
K. J. Hole, Towards Anti-fragility to Malware Spreading, IEEE Security & Privacy Magazine, vol. 13, no. 4, 2015, pp. 40–46
111
References (3)
N.N. Taleb, Antifragile: Things That Gain from Disorder, Random House, 2012
Daniel Bilar, Known Knowns, Known Unknowns and Unknown Unknowns: Anti-virus issues, malicious software and Internet attacks for non-technical audiences, journals.sas.ac.uk/deeslr/article/view/1880
112
References (4)
R. Cohen, S. Havlin, and D. Ben-Avraham, Efficient immunization strategies for computer networks and populations, Physical Review Letters, vol. 91, no. 24, Article ID 247901, 2003
113