Malware Halting Part I: Method Development
Kjell Jørgen Hole, Simula@UiB
Last updated 16.05.17
Overview
1. Malware
2. Software diversity
3. Computer “immunization”
4. Epidemiological model
5. Malware halting analysis
6. Malware halting method
Malware defined
Malware—malicious software used to
• disrupt computer operations
• gather sensitive information, or
• gain access to private systems
Worms · Viruses · Rootkits · Spyware · Keyloggers · Adware · Trojan horses · Backdoors · Dialers · Bots · Ransomware
Infectious malware
We’ll concentrate on infectious malware:
• Viruses—need user intervention to spread
• Worms—spread automatically
Spreading mechanisms (1)
Random scanning selects target IP addresses at random (all nodes are neighbors)
• used by Code Red and Slammer worms
Localized scanning selects most hosts in the “local” address space
• used by Code Red II and Nimda worms
Spreading mechanisms (2)
Topological-scanning relies on information contained in infected hosts to locate new targets
• the information may include (BGP) routing tables, email addresses, a list of peers, and Uniform Resource Locations (URLs)
• used by the Morris worm
Spreading mechanisms (3)
Hitlist consists of potentially vulnerable machines that are gathered beforehand and targeted first when the worm is released
• the flash worm gathered all vulnerable machines into its hitlist
Software diversity
We consider systems of networked computing devices, such as computers, smartphones, and tablets
Each device downloads software from application stores utilizing compilers with “diversity engines”
Software monoculture (today’s situation)
Figure 1: Software monoculture aids attackers. An identical binary for all users leaves every user susceptible to the identical exploit.
Figure 2: Software diversity lowers effectiveness of attack. With different variants for different users, a single exploit no longer affects all users identically, and the cost to the attacker rises dramatically.
millions. This makes it easy for an attacker (Figure 1), because the same attack vector is likely to succeed on a large number of targets [8, 18].
But what if these millions of computers were all running different versions of the software? That is, what if we could ensure that every computer runs a unique but functionally identical binary (Figure 2), so that a different attack vector is needed for different targets. All the different versions would behave in exactly the same way from the perspective of the end-user, but they would implement their functionality in subtly different ways. As a result, any specific attack would succeed only on a small fraction of systems and would no longer sweep through the whole internet. An attacker would require a large number of different attacks and would have no way of knowing a priori which specific attack should be directed at what specific target. Hence, the cost to the attacker would be raised dramatically.
The idea of using software diversity as a defense mechanism is not new, but it has never been realized in practice at any significant scale. Until quite recently, it would have been prohibitively expensive to create a unique version of every program for every client.
In this paper, we make a passionate argument that such massive-scale software diversity is now actually technically possible. We observe that this is enabled by four simultaneous paradigm shifts that are occurring just now. We present our blueprint for an architecture that provides such
Diversity engine
Figure 3: Diversification mechanism can be hidden entirely within an online software delivery system ("App Store") so that it becomes transparent to both code consumers and software developers. The software developer creates software and delivers it to the App Store; a diversity engine within the App Store creates variants, and subsequent downloaders receive functionally identical but internally different versions of the same software.
massive-scale software diversity. We elaborate on the problem of patching software that has been diversified. We present a list of interesting open research problems that appear in the context of massive-scale software diversity. We claim that massive-scale software diversity is a new security paradigm in itself. Finally, following a section on related work, we conclude the paper.
2 Vulnerabilities, Exploits and a Solution
A software vulnerability by itself is merely a hazard. In order to turn such a hazard into a successful attack, an attacker needs to find a successful exploit strategy. For example, the attacker may know of a vulnerability that enables a write beyond the end of a certain buffer on the stack. But in order to exploit this known vulnerability, the attacker needs to overwrite very specific locations on the stack with very specific values.
Operating system vendors now add elements of randomness to their systems, with the aim of making it more difficult for attackers to design a successful exploit. For example, the latest version of the Windows operating system now randomizes the starting address of the stack. Unfortunately, this has not stopped attackers from devising workarounds.
Designing a successful exploit for a known vulnerability is not trivial, but a dedicated attacker with ample resources is likely to succeed in eventually creating an attack. In today’s world, the effort invested into designing such an exploit can be amortized by its wide applicability—since millions of users are running the identical vulnerable binary, just one successful exploit can affect all of them simultaneously.
Software polyculture (the future?)
Figure 2: Software diversity lowers effectiveness of attack. With different variants for different users, a single exploit no longer affects all users identically, and the cost to the attacker rises dramatically.
Immunization (1)
Software hardening, or immunization, consists of
• removal of non-essential software programs
• secure configuration of remaining programs
• constant patching, and
• use of intrusion-detection systems, firewalls, intrusion-prevention systems, anti-malware programs, and spyware blockers
Immunization (2)
• In extreme cases, trained personnel have to take a device off-line to wipe its memory before installing new software
Pragmatic approach
Despite the protection provided by computer “immunization,” it is nearly impossible to keep every device free of malware at all times
A more realistic goal is to provide a form of “community immunity,” where most devices are protected against malware because there is little opportunity for new outbreaks to spread
Combine diversity and immunization
While community immunity usually entails immunization of nearly all entities in a monoculture, we’ll combine software diversity with the immunization of a small fraction of the computers to halt malware spreading
Epidemiological model
We model viruses and worms as infectious diseases spreading over networks with varying software diversity
Infected monoculture
Single sick node infects all other nodes
Node size proportional to the number of adjacent edges
Fragile
Hub defined
Hub—network node with many more adjacent edges than the average number of edges per node
• see large nodes in previous figure
• the number of adjacent edges is often referred to as the ‘degree’
www.kjhole.com
Diversity
Nodes of L types have different colors
The (software) diversity equals the number of colors L
Immunization
A white immunized node never gets infected or transmits an infection
Immunized polyculture
The malware types only spread to three nodes
L = 2 node types, L = 2 malware types, eight immunized hubs
Robust
Network model
A simple connected graph defines the malware spreading pattern
• N nodes
• L≥1 node types
• one malware type per node type
• discrete time t = 0,1,2, ...
• S infected seeds per node type at time t = 0
Journal of Computer Networks and Communications
Figure 1: Diverse (a) BA network and (b) WS network seeded with viruses at time step t = 0. Both networks have L = 3 different colored node types. Circular nodes are susceptible and star-shaped nodes are infected. There is S = 1 seed for each node type. Only the L · S = 3 seeds are infected since the viruses have not started to spread. The viruses infecting the seeds control all adjacent edges (shown in red). The BA network has four isolated nodes in addition to the three infected seeds. Only the seeds are isolated in the WS network.
4.2. Influence of Reinfected Hubs. We now study the stochastic model to determine the hubs’ influence on the fraction of isolated nodes in diverse inhomogeneous networks with reinfections of nodes.
When there are Nl = N/L nodes per type, an arbitrary node is a seed with probability S/Nl = (SL)/N, where S is the number of seeds per node type. Since a node of degree D has roughly D · Nl/N = D/L neighbors of the same type, a node’s number of neighboring seeds of the same type is estimated by
(SL)/N · D/L = (SD)/N. (1)
The right-hand side of (1) is independent of the number of node types L. The number of seeds S per node type can be large in practice because botnets are used to seed viruses. Hence, a hub with very large degree D is likely to be infected by a seed during the first time steps of a model run, even if the diversity L is large.
A hub of type l is infected with probability pl · (SD)/N during the model’s first time step. Infection will almost surely occur when pl · (SD)/N ≈ 1. During the following time steps, the hub will infect many of its D/L neighbors with the same type, where L ≪ D for current networks. Even more neighbors will be isolated. In particular, all degree-one neighbors of any type l′ ≠ l will be isolated but not infected. When the hub recovers with probability ql during a time step, it will be quickly reinfected by one of the D/L neighbors. Since the neighbors ensure that the hub is infected nearly all the time, a nonzero fraction of isolated nodes is maintained over time even when L is large.
Many simulations using the stochastic model confirm the hubs’ important role in making the fraction of isolated nodes much larger than the fraction of infected nodes. As seen from Figure 3, if the largest hub on a DH or BA network is immunized, that is, made permanently resistant to virus attacks, then the instantaneous fraction of isolated nodes drops significantly. There is no easily detectable reduction in the instantaneous fraction of infected nodes, confirming that the largest hub isolates many susceptible (i.e., not infected) nodes. The large fluctuations in the instantaneous fraction of isolated nodes in Figure 3(a) are due to temporary recovery of hubs.
The instantaneous fraction of isolated nodes will eventually go to zero because there is a non-zero probability that all nodes become susceptible in any finite-size network. However, the nonzero averaged fraction of isolated nodes was stable for very many time steps during the simulations. Hence, when hubs are reinfected, multiple virus outbreaks cause substantial long-term node isolation even for high node diversity L.
5. Halting Technique
Our goal is to halt multiple simultaneous virus outbreaks on any inhomogeneous network without changing its topology. The halting technique should drive the fraction of isolated nodes to zero in the stochastic model. For the deterministic
L = 3 node types, S = 1 seed per node type
Seeds
Malware spreading
A sick node infects all its neighbors during a single time step t
Types of spreading patterns
Homogeneous network—all nodes have degrees k approximately equal to the average degree ⟨k⟩
Inhomogeneous network—a small fraction of nodes, the hubs, have degrees k much larger than the average degree ⟨k⟩
Malware halting analysis
To halt malware on networks with several million nodes, we first determine
(A) desired distribution of node types,
(B) a lower bound on the needed diversity, and
(C) the trade-off between diversity and immunization
(A) Node type distribution
Let rl be the probability that an arbitrary node is of type l = 1,2, …, L
The entropy −∑ rl log rl measures the uncertainty of a node’s assigned type
It has maximum value log L when all rl = 1/L
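As an illustrative sketch (Python, not part of the original slides), the entropy of a type distribution and its maximum at the uniform distribution can be checked directly:

```python
import math

def type_entropy(r):
    """Shannon entropy -sum(r_l * log(r_l)) of a node-type distribution."""
    return -sum(p * math.log(p) for p in r if p > 0)

L = 4
print(type_entropy([1 / L] * L))           # maximum: log L, about 1.386
print(type_entropy([0.7, 0.1, 0.1, 0.1]))  # skewed distribution: smaller
```

A skewed distribution always yields a strictly smaller entropy than the uniform one, matching the slide's claim that the uncertainty about a node's type peaks at rl = 1/L.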
Maximize entropy (1)
When the entropy is maximized, the best spreading strategy for each malware type is to select new nodes at random
The probability that a spreading mechanism chooses a node of the wrong type is 1 − 1/L
As L increases, this probability increases and the speed of the malware spreading decreases
Maximize entropy (2)
If there is less uncertainty about the distribution of vulnerable nodes, e.g., a few node types occur more often than the other node types in a network, then the entropy is smaller and malware writers can create very efficient topology-aware spreading mechanisms
Observation 1
Skewed distributions of node types should be avoided because they facilitate rapid malware spreading
(B) Needed diversity
Example: MMS malware exploits a smartphone’s address book to spread to new phones with the same OS
MMS malware spreading
Phones on email list
Phone not on email list
Infected phone
Wang’s network model
Based on location and calling data from 6.2 million mobile subscribers
Market share determines whether devices with the same OS form a giant component in the calling graph
What is a component?
A single-type component is a subset of nodes with the same type such that there
• is a path between any pair of nodes in the set, and
• it is not possible to add another node of the same type to the set while preserving this property
Giant component
A giant component of same-type nodes has size proportional to N
If a giant component contains a seed, then nearly all nodes in the network will be infected
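Whether a network contains a same-type giant component can be checked directly. A minimal sketch (Python, my illustration; it assumes a plain edge list and a node-to-type mapping, not a representation from the slides):

```python
from collections import defaultdict, deque

def largest_same_type_component(edges, node_type):
    """Size of the largest connected component among nodes of one type.

    edges: iterable of (u, v) pairs; node_type: dict node -> type label.
    Only edges whose endpoints share a type are kept, since malware of
    one type spreads only along same-type links.
    """
    adj = defaultdict(list)
    for u, v in edges:
        if node_type[u] == node_type[v]:
            adj[u].append(v)
            adj[v].append(u)
    seen, best = set(), 0
    for start in node_type:          # BFS over each unvisited node
        if start in seen:
            continue
        seen.add(start)
        size, queue = 0, deque([start])
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best
```

On a path 0–1–2–3 with types A, A, A, B the largest same-type component has size 3; a component whose size grows in proportion to N would signal the epidemic risk described above.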
The spread of mobile viruses is aided by two dominant communication protocols. First, a Bluetooth (BT) virus can infect all BT-activated phones within a distance from 10 to 30 m, resulting in a spatially localized spreading pattern similar to the one observed in the case of influenza (3, 6, 7), severe acute respiratory syndrome (SARS) (8, 9), and other contact-based diseases (Fig. 1A) (10). Second, a multimedia messaging system (MMS) virus can send a copy of itself to all mobile phones whose numbers are found in the infected phone’s address book, a long-range spreading pattern previously exploited only by computer viruses (11, 12). Thus, in order to quantitatively study the spreading dynamics of mobile viruses we need to simultaneously track the location (13), the mobility (14–17), and the communication patterns (18–21) of mobile phone users. We achieved this by studying the anonymized billing record of a mobile phone provider and recording the calling patterns and the coordinates of the closest mobile phone tower each time a group of 6.2 million mobile subscribers used their phone. Thus, we do not know the users’ precise locations within the tower’s reception area, and no information is available about the users between calls.
The methods we used to simulate the spreading of a potential BT and MMS virus are described in (22). Briefly, once a phone becomes infected with an MMS virus, within 2 min it sends a copy of itself to each mobile phone number found in the handset’s phone book, approximated with the list of numbers with which the handset’s user communicated during a month-long observational period. A BT virus can infect only mobile phones within a distance r = 10 m. To simulate this process, we assigned to each user an hourly location that was consistent with its travel patterns (13) and followed the infection dynamics within each mobile tower area using the susceptible infected (SI) model (23). That is, we consider that an infected user (I) infects a susceptible user (S), so that the number of infected users evolves in time (t) as dI/dt = βSI/N, where the effective infection rate is β = μ⟨k⟩ with μ = 1, N is the number of users in the tower area, and the average number of contacts is ⟨k⟩ = ρA = NA/A_tower, where A = πr² represents the BT communication area and ρ = N/A_tower is the population density inside a tower’s service area. Once an infected user moves into the vicinity of a new tower, it will serve as a source of a BT infection in its new location.
A cell phone virus can infect only the phones with the operating system (OS) for which it was designed (2, 3), making the market share m of an OS an important free parameter in our study. The current market share of various smart phone OSs vary widely, from as little as 2.6% for Palm OS to 64.3% for Symbian. Given that smart phones together represent less than 5% of all phones, the overall market share of these OSs among all mobile phones is in the range of m = 0.0013 for Palm OS and m = 0.032 for Symbian, numbers that are expected to dramatically increase as smart phones replace traditional phones. To maintain the generality of our results, we treat m as a free parameter, finding that the spreading of both BT and
Fig. 1. The spreading mechanisms of mobile viruses. (A) A BT virus can infect all phones found within BT range from the infected phone, its spread being determined by the owner’s mobility patterns. An MMS virus can infect all susceptible phones whose number is found in the infected phone’s phonebook, resulting in a long-range spreading pattern that is independent of the infected phone’s physical location. (B) A small neighborhood of the call graph constructed starting from a randomly chosen user and including all mobile phone contacts up to four degrees from it. The color of the node represents the handset’s OS, which in this example are randomly assigned so that 75% of the nodes represent OS1, and the red are the remaining handsets with OS2 (25%). (C) The clusters in the call graph on which an MMS virus affecting a given OS can spread, illustrating that an MMS virus can reach at most the number of users that are part of the giant component of the appropriate handset. As the example for the OS shows, the size of the giant component highly depends on the handset’s market share (see also Fig. 2C).
[Figure 1 panel labels: (A) Bluetooth (BT) contagion, Bluetooth range ~10 m, vs. multimedia message (MMS) contagion; (B) call graph with OS1 at 75% and OS2 at 25% market share; (C) giant component of 80% for OS1 vs. 6% for OS2, plus small connected components and single nodes.]
Science, vol. 324, 22 May 2009, p. 1072 (Reports)
Wang’s network model
Fig. 2. The spreading patterns of BT and MMS viruses. (A) The changes in the ratio of infected and susceptible handsets (I/N) with time in the case of a BT virus affecting handsets with different m. (B) Same as in panel (A) but for MMS viruses. The saturation in I/N indicates that an MMS virus can reach only a finite fraction of all susceptible phones. (C) The size of the giant component Gm as a function of m. The blue triangles correspond to the saturation values measured in Fig. 2B, whereas the red line is the theoretical prediction according to percolation theory (the deviations are mainly attributed to finite size effects and degree correlations because the calculation assumed an infinite call graph). (D) The latency time needed to infect q = 0.65 or q = 0.15 fraction of susceptible handsets via a BT virus, approximated with T(q = 0.65, m) ~ m^(−0.63 ± 0.05) and T(q = 0.15, m) ~ m^(−0.60 ± 0.04) (continuous lines). (E) The latency time for an MMS virus for q = 0.05, 0.15, and 0.30. The continuous lines correspond to T(q, m) ~ (m − m_q*)^(−α(q)), where the best fits indicate a systematic q-dependence: α(0.05) = 0.20 ± 0.02, α(0.15) = 0.17 ± 0.01, and α(0.30) = 0.14 ± 0.01. (F) Log-log plot showing L_ave and L_max for the largest cluster. The fits correspond to L_max ~ (m − m*)^(−0.20 ± 0.02) and L_ave ~ (m − m*)^(−0.19 ± 0.02). The curves in (A), (B), and (D) are obtained from 10 independent simulations, and (E) and (F) represent an average over 100 runs. For more statistical analysis of the fits in (D) to (F), see the detailed discussion in (22).
Fig. 3. Spatial patterns in the spread of BT and MMS viruses. (A) The virus starts from the same user located at the tower marked by the red arrows (left). The three panels show the percentage of infected users in the vicinity of each mobile phone tower (denoted by the Voronoi cell that approximates each tower’s service area). In the right panel, we show the corresponding time-dependent infection curves, marking the moments when the spatial distribution was recorded. (B) Average distance between the tower where the infection was originally started and the most currently infected phone as a function of N, the number of towers with at least one infected user, used as a proxy of time (three red and blue curves correspond to m = 0.1, m = 0.5, and m = 1). The green line is obtained from a null model that assumes that the virus can only spread from one tower’s service area to its neighbor towers’ service areas. The curves in (B) are obtained from 100 independent simulations.
Wang et al.
Roughly 45% of all phones in the US were smartphones in March 2011
Android market share
Android’s share of the total mobile phone market was 0.45 × 0.35 ≈ 0.16 (16%)
About 62% of the users utilized Android Gingerbread 2.3.x
• market share was 0.16 × 0.62 ≈ 0.10 (10%)
[Repeated: Fig. 2 from Wang et al.; panel (C) shows the giant-component size Gm as a function of market share m.]
Wang et al.
Gingerbread was here
Observation 2
A malware epidemic can only occur when a network contains a giant component of nodes with the same type
Diversity needed to avoid giant component
degree k — a node’s number of adjacent edges
average degree ⟨k⟩ = (1/N)·∑ ki
average square degree ⟨k²⟩ = (1/N)·∑ (ki)²
L ≥ ⌈⟨k²⟩ ∕ (2·⟨k⟩)⌉
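The bound can be evaluated directly from a degree sequence. A small sketch (Python, my illustration; it assumes a plain list of node degrees):

```python
import math

def diversity_lower_bound(degrees):
    """Lower bound L >= ceil(<k^2> / (2<k>)) computed from node degrees."""
    n = len(degrees)
    mean_k = sum(degrees) / n
    mean_k_sq = sum(k * k for k in degrees) / n
    return math.ceil(mean_k_sq / (2 * mean_k))

print(diversity_lower_bound([5] * 100))          # homogeneous network: 3
print(diversity_lower_bound([5] * 99 + [1000]))  # one huge hub: 336
```

The second call shows why hubs matter: a single node of degree 1000 among 99 degree-5 nodes inflates ⟨k²⟩, and with it the required diversity, by two orders of magnitude.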
(C) Diversity vs immunization
The right-hand side of the bound is large when a network contains nodes with squared degrees (ki)² much bigger than the corresponding degrees ki
If the node degrees are known, then we can reduce the lower bound by immunizing the nodes with the largest degrees ki
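As a rough sketch of this trade-off (Python, my illustration; it assumes a plain edge list, and nodes left isolated after hub removal are simply dropped from the averages):

```python
import math
from collections import defaultdict

def bound_after_immunizing(edges, top):
    """Diversity bound ceil(<k^2>/(2<k>)) after removing the `top`
    highest-degree nodes (immunized) together with their adjacent edges."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    hubs = set(sorted(deg, key=deg.get, reverse=True)[:top])
    kept = [(u, v) for u, v in edges if u not in hubs and v not in hubs]
    rdeg = defaultdict(int)
    for u, v in kept:
        rdeg[u] += 1
        rdeg[v] += 1
    ks = list(rdeg.values())
    if not ks:                # no edges left: one node type suffices
        return 1
    mean_k = sum(ks) / len(ks)
    mean_k_sq = sum(k * k for k in ks) / len(ks)
    return math.ceil(mean_k_sq / (2 * mean_k))
```

For a star graph, immunizing the single hub collapses the bound to 1, mirroring the Enron example later in the talk where immunizing 612 hubs reduces the bound from 71 to 9.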
Observation 3
We can immunize a small fraction of all nodes (the hubs) in an inhomogeneous network to reduce the need for diversity
• because it is expensive to immunize computers, immunization is of limited use in large networks with many hubs
Halting method (based on observations)
The method must handle spreading patterns with
• unknown and changing topology
• at least one million nodes
• unreliable node communication
Approach
Since the topology is unknown and communication is unreliable, it is difficult to modify the network structure or ask nodes to change their types based on the types of the neighboring nodes
It is more promising to use a simple method that
• is robust to varying topologies
• scales to very large networks
Malware halting method
(first version)
1. If practicable, immunize enough large-degree nodes in a network to create a homogeneous subnet when the immunized nodes and their adjacent edges are removed
2. Ensure that the node diversity of the homogeneous subnet is large enough to halt multiple simultaneous malware outbreaks
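The two steps can be sketched as a toy simulation (Python, my illustration, not the NetLogo model used in Part II; types are assigned uniformly at random, and a sick node infects all susceptible same-type neighbors each time step):

```python
import random
from collections import defaultdict

def simulate_halting(edges, L, seeds_per_type, immunize_top, steps=50, rng=None):
    """Step 1: immunize the `immunize_top` highest-degree nodes.
    Step 2: assign L node types uniformly at random, seed each type,
    and spread malware over same-type, non-immunized neighbors.
    Returns the final fraction of infected nodes."""
    rng = rng or random.Random(1)
    adj, deg = defaultdict(list), defaultdict(int)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
        deg[u] += 1
        deg[v] += 1
    nodes = list(deg)
    immune = set(sorted(deg, key=deg.get, reverse=True)[:immunize_top])
    ntype = {u: rng.randrange(L) for u in nodes}
    infected = set()
    for l in range(L):
        candidates = [u for u in nodes if ntype[u] == l and u not in immune]
        infected.update(rng.sample(candidates, min(seeds_per_type, len(candidates))))
    for _ in range(steps):
        new = {v for u in infected for v in adj[u]
               if v not in infected and v not in immune and ntype[v] == ntype[u]}
        if not new:
            break
        infected |= new
    return len(infected) / len(nodes)
```

With L = 1 and no immunization every node of a connected graph ends up infected; raising L and immunizing hubs should shrink the infected fraction, which is what the simulations in Part II quantify.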
Video
Malware Halting Part II: Simulations and Analyses
Kjell Jørgen Hole, Simula@UiB
Overview
1. Malware halting in proximity networks
2. Halting in the Enron email network
3. Halting in Barabási and Albert networks
4. Halting in dense IP networks
5. How to immunize unknown hubs
6. Slowing down ‘advanced persistent threats’
Proximity networks
Phones within Bluetooth range (~10 m)
Phone out of Bluetooth range
Infected phone
Proximity network model
1. We generate a proximity network with average node degree ⟨k⟩ by first placing N nodes uniformly at random on a square
2. An edge is then added between a randomly chosen node and its closest neighbor in Euclidean distance
Network generation
3. More edges are added the same way until the network has the desired average degree ⟨k⟩
• self-loops and multiple edges between nodes are not allowed
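Steps 1–3 above can be sketched as follows (Python, my illustration; once a node's nearest neighbor is already adjacent, the edge goes to its nearest not-yet-adjacent neighbor, which enforces the no-multi-edge rule):

```python
import math
import random

def proximity_network(n, target_avg_degree, rng=None):
    """Place n nodes uniformly at random on the unit square, then
    repeatedly connect a random node to its nearest not-yet-adjacent
    neighbor until the average degree 2*E/N reaches the target.
    Self-loops and multi-edges are excluded by construction."""
    rng = rng or random.Random(0)
    pos = [(rng.random(), rng.random()) for _ in range(n)]
    edges = set()
    neighbors = [set() for _ in range(n)]
    while 2 * len(edges) / n < target_avg_degree:
        u = rng.randrange(n)
        candidates = [v for v in range(n) if v != u and v not in neighbors[u]]
        if not candidates:
            continue
        v = min(candidates, key=lambda w: math.dist(pos[u], pos[w]))
        edges.add((min(u, v), max(u, v)))
        neighbors[u].add(v)
        neighbors[v].add(u)
    return pos, edges
```

Because edges go to nearest neighbors, the resulting graph is spatially localized, like the Bluetooth proximity networks it models.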
N = 300, ⟨k⟩ = 5, L = 4, S = 10
Proximity network simulations
Average node degree ⟨k⟩: 5 | 6 | 7 | 8
Minimum needed node types (diversity) L: 3 | 4 | 4 | 5
Fraction of infected nodes: 3.4% | 3.6% | 4.6% | 4.8%
N = 5000; each fraction averaged over 500 networks with random distribution of node types and seeds, S = 10
NetLogo proximity model
Demo
Enron email network
Sparse and inhomogeneous email network
Nodes represent email addresses belonging to former Enron employees
N=36 692, ⟨k⟩=10.02 (183 831 edges)
The largest hub has degree 1 383
Diversity vs immunization
The lower bound on the required diversity is L ≥ 71 before hub immunization
After immunizing the 612 nodes with degrees larger than 0.05% of the total number of edges, the lower bound reduces to L ≥ 9
L ≥ ⌈⟨k²⟩ ∕ (2·⟨k⟩)⌉
Enron network simulations
Diversity L: 9 | 10 | 11 | 12
Fraction of infected nodes: 5.2% | 3.5% | 2.4% | 1.8%
N = 36 692; each fraction averaged over 500 networks with random distribution of node types and seeds, S = 10
BA networks
The Barabási and Albert (BA) model grows an inhomogeneous scale-free network
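A minimal preferential-attachment sketch (Python, my illustration, not the exact generator used in the simulations):

```python
import random

def barabasi_albert(n, m, rng=None):
    """Grow a scale-free network: each new node attaches to m distinct
    existing nodes chosen with probability proportional to their degree."""
    rng = rng or random.Random(0)
    edges = []
    attachment = []               # node id repeated once per adjacent edge
    targets = list(range(m))      # first new node links to all m initial nodes
    for new in range(m, n):
        for t in targets:
            edges.append((new, t))
            attachment += [new, t]
        targets = set()           # degree-proportional targets for next node
        while len(targets) < m:
            targets.add(rng.choice(attachment))
        targets = list(targets)
    return edges
```

Sampling targets from a list in which each node appears once per adjacent edge is what makes the attachment degree-proportional; early nodes accumulate edges and become the hubs the halting method must immunize.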
Example BA network
[Repeated: Figure 1(a), the diverse BA network with L = 3 colored node types and S = 1 seed per node type; star-shaped nodes are infected seeds.]
4.2. Influence of Reinfected Hubs. We now study the stochasticmodel to determine the hubs’ influence on the fractionof isolated nodes in diverse inhomogeneous networks withreinfections of nodes.
When there are Nl = N/L nodes per type, an arbitrary node is a seed with probability S/Nl = (SL)/N, where S is the number of seeds per node type. Since a node of degree D has roughly D · Nl/N = D/L neighbors of the same type, a node's number of neighboring seeds of the same type is estimated by

(SL/N) · (D/L) = (SD)/N. (1)
The right-hand side of (1) is independent of the number of node types L. The number of seeds S per node type can be large in practice because botnets are used to seed viruses. Hence, a hub with very large degree D is likely to be infected by a seed during the first time steps of a model run, even if the diversity L is large.
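The estimate (SD)/N can be sanity-checked with a small Monte Carlo sketch; all parameter values below are illustrative assumptions, not values from the paper:

```python
import random

def avg_neighboring_seeds(N=100_000, L=50, S=500, D=2_000, trials=200):
    """Monte Carlo check of (SL/N) * (D/L) = SD/N, the expected number of
    a hub's neighbors that are seeds of the hub's own type."""
    total = 0
    for _ in range(trials):
        hub_type = random.randrange(L)
        hits = 0
        for _ in range(D):
            # A neighbor has the hub's type w.p. 1/L and is a seed w.p. SL/N.
            if random.randrange(L) == hub_type and random.random() < S * L / N:
                hits += 1
        total += hits
    return total / trials

print(avg_neighboring_seeds())   # close to the analytic value
print(500 * 2_000 / 100_000)     # SD/N = 10.0
```

The simulated average matches SD/N regardless of the chosen L, illustrating why hub infection is insensitive to diversity.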
A hub of type l is infected with probability pl · (SD)/N during the model's first time step. Infection will almost surely occur when pl · (SD)/N ≈ 1. During the following time steps, the hub will infect many of its D/L neighbors with the same type, where L ≪ D for current networks. Even more neighbors will be isolated. In particular, all degree-one neighbors of any type l′ ≠ l will be isolated but not infected. When the hub recovers with probability ql during a time step, it will be quickly reinfected by one of the D/L neighbors. Since the neighbors ensure that the hub is infected nearly all the time, a nonzero fraction of isolated nodes is maintained over time even when L is large.
Many simulations using the stochastic model confirm the hubs' important role in making the fraction of isolated nodes much larger than the fraction of infected nodes. As seen from Figure 3, if the largest hub on a DH or BA network is immunized, that is, made permanently resistant to virus attacks, then the instantaneous fraction of isolated nodes drops significantly. There is no easily detectable reduction in the instantaneous fraction of infected nodes, confirming that the largest hub isolates many susceptible (i.e., not infected) nodes. The large fluctuations in the instantaneous fraction of isolated nodes in Figure 3(a) are due to temporary recovery of hubs.
The instantaneous fraction of isolated nodes will eventually go to zero because there is a nonzero probability that all nodes become susceptible in any finite-size network. However, the nonzero averaged fraction of isolated nodes was stable for very many time steps during the simulations. Hence, when hubs are reinfected, multiple virus outbreaks cause substantial long-term node isolation even for high node diversity L.
5. Halting Technique
Our goal is to halt multiple simultaneous virus outbreaks on any inhomogeneous network without changing its topology. The halting technique should drive the fraction of isolated nodes to zero in the stochastic model. For the deterministic
N=40, L=3, S=1
NetLogo BA 3D model
63
Demo
Dense IP networks
Assume that L types of random scanning malware spread over a complete network with N nodes of degree k = N − 1
• there are L node types and N/L uniformly distributed nodes per type
64
Dense network analysis
Let there be one seed (S=1) per node type
Each seed has edges to the other N/L − 1 nodes of the same type
Together, the N/L single-type nodes form a star graph with the seed in the center
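The star-graph picture above can be sketched as a small random-scanning simulation. This is a minimal illustration (the parameter values are assumptions), not the NetLogo model itself:

```python
import random

def dense_outbreak(N=1000, L=10, steps=2000):
    """Random-scanning malware on a complete network: every infected node
    probes one node chosen uniformly at random per step and infects it if
    the types match. A single seed eventually reaches all N/L nodes of its
    own type, as in the star graph with the seed in the center."""
    node_type = [i % L for i in range(N)]
    infected = {0}                     # the single seed has type 0
    for _ in range(steps):
        for _ in range(len(infected)):
            target = random.randrange(N)
            if node_type[target] == 0:
                infected.add(target)
    return len(infected)

print(dense_outbreak())                # approaches N/L = 100
```

Increasing L shrinks the per-type star, but the seed still infects its whole type, which is why diversity alone cannot halt the outbreak here.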
65
Star graph
66
Seed in the center will infect all other nodes
Need more node types than malware types
Since the seed will always infect all the peripheral nodes in the star graph, it does not help to increase the number of node types L as long as there is one seed per node type
The only way diversity can halt multi-malware outbreaks in dense networks is to use many more node types than there are malware types
67
Needed diversity in dense IP network (1)
If there are M malware types, then M · N/L nodes will be infected
Hence, the diversity L needs to be proportional to N and the number of malware types M must be much smaller than N to prevent a large infection
68
Needed diversity in dense IP networks (2)
The previous observation is consistent with the diversity bound, which reduces to L ≥ (N − 1)/2 when k = N − 1
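The reduction can be checked directly from the general bound; a two-line sketch:

```python
import math

# For a complete network every node has degree k = N - 1, so
# <k> = N - 1 and <k^2> = (N - 1)^2, and the general bound
# L >= ceil(<k^2> / (2<k>)) reduces to L >= ceil((N - 1) / 2).
def bound_complete(N):
    k = N - 1
    return math.ceil(k * k / (2 * k))

print(bound_complete(5000))   # 2500: L must grow linearly with N
```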
69
Advantage of diversity in dense networks (1)
Software diversity cannot stop malware spreading in dense random scanning networks, but it can slow the spreading
• the likelihood of selecting a node of the wrong type, 1 − 1/L, goes to 1 as the diversity L increases
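The slowdown is geometric: each random scan hits the right type with probability 1/L, so a quick simulation (illustrative only) shows the expected number of scans per successful infection growing linearly in L:

```python
import random

def scans_until_same_type(L, trials=10_000):
    """Average number of random scans before a same-type node is hit.
    Each scan succeeds with probability 1/L, so the mean is geometric: L."""
    total = 0
    for _ in range(trials):
        scans = 1
        while random.random() >= 1 / L:
            scans += 1
        total += scans
    return total / trials

for L in (2, 10, 50):
    print(L, round(scans_until_same_type(L), 1))   # mean grows linearly in L
```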
70
Advantage of diversity in dense networks (2)
If the malware spreading is slow, then it is possible for a (cloud-based) anti-malware solution to tailor the defense to a particular outbreak
71
Unknown hubs
Acquaintance immunization immunizes unknown hubs on a monoculture
72
Acquaintance immunization (monoculture version)
Choose a set of nodes uniformly at random and immunize one arbitrary neighbor per node
• while the original set of nodes is unlikely to contain the relatively few hubs in an inhomogeneous network, the randomly selected neighbors are much more likely to be hubs, since very many edges are adjacent to high-degree nodes
73
74
Acquaintance immunization (polyculture version)
1. Assume Nl = N/L nodes per type
2. For some fraction 0 < f < 1, choose a set of f · Nl nodes of type l uniformly at random such that each node has at least one neighbor of the same type, l = 1, 2, ..., L
3. Immunize one randomly selected neighbor of type l per node in the set of f · Nl nodes
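The three steps above can be sketched as follows. The slide leaves the sampling details open, so restricting step 2 to nodes that actually have a same-type neighbor is our reading, and the Erdős–Rényi demo graph is only illustrative:

```python
import random

def acquaintance_immunize(adj, node_type, L, f=0.1):
    """Polyculture acquaintance immunization (steps 1-3 above): per type l,
    choose f*N_l random nodes with at least one same-type neighbor, then
    immunize one randomly selected same-type neighbor of each."""
    immune = set()
    for l in range(L):
        candidates = [u for u in adj if node_type[u] == l
                      and any(node_type[w] == l for w in adj[u])]
        size = min(max(1, int(f * len(candidates))), len(candidates))
        for u in random.sample(candidates, size):
            same = [w for w in adj[u] if node_type[w] == l]
            immune.add(random.choice(same))
    return immune

# Demo on a small Erdos-Renyi graph with uniformly random node types.
random.seed(1)
L, n = 3, 600
adj = {i: set() for i in range(n)}
for i in range(n):
    for j in range(i + 1, n):
        if random.random() < 0.02:
            adj[i].add(j)
            adj[j].add(i)
node_type = {v: random.randrange(L) for v in adj}
immune = acquaintance_immunize(adj, node_type, L)
print(len(immune))
```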
75
Observation
When the set of all immunized neighbors, of size f·N = Σl f·Nl, is large enough, it will contain most of the hubs
76
8 Journal of Computer Networks and Communications
[Plots omitted: fraction of nodes versus time for panels (a) and (b); in (b) the isolated and infected fractions are nearly identical]
Figure 5: Fractions of isolated and infected nodes caused by multiple simultaneous virus outbreaks on a DH network. (a) The largest 216 hubs are immunized at time step one thousand. (b) The hubs have been correctly identified and immunized before the virus outbreaks.
spread. The 7 · 20 = 140 seeds generated 158 isolated nodes, or less than 1% of all nodes. Let pl = 0.06 and ql = 0.04. When acquaintance immunization is performed in advance, the fractions of isolated and infected nodes went to zero after only 154 time steps. The plot of the isolated and infected fractions (not shown) is very similar to Figure 5(b).
To verify the usefulness of the halting technique for inhomogeneous networks with unknown hubs, we generated additional model runs for different DH networks, including runs where the infection and recovery probabilities pl and ql varied with l. After first determining a suitable fraction f of immunized nodes and number of node types L, the seeds caused little spreading and all infected nodes recovered. The speed at which the virus outbreaks die out depends on the fraction f, diversity L, selection of L · S seeds, and spreading rates pl/ql.
Figure 6: Acquaintance immunization of a DH network. (a) Immunized dark-pink nodes and remaining susceptible multicolored hubs. (b) The few red and white isolated nodes after the viruses tried to spread.
7. Final Discussion
The Internet is best viewed as a large collection of networks. Because each network has different default settings, software patch levels, firewall rules, browser settings, antivirus signature sets, configuration management practices, and diagnostic capabilities, they are not all vulnerable to the same viruses [8]. However, we have seen many examples of large networks with too little software diversity to prevent virus epidemics.
Since the virus writers control the spreading mechanisms of viruses, a practical halting technique must handle viruses with widely different spreading patterns. The reported results indicate that robust halting of viruses is obtainable when application stores with "diversity engines" ensure adequate software diversity on the OS and application layers of a network and vulnerable hubs are immunized. (Appendix C discusses the halting technique's fragility to clustering of platform types.)
77
Immunization example
Immunized nodes and remaining susceptible hubs
APT defined
Advanced Persistent Threat (APT)—a targeted effort to obtain or change information by means that are difficult to
• discover,
• remove, and
• attribute
78
APT examples
Examples of APTs are state-sponsored attacks on foreign commercial and governmental enterprises to steal industrial and military secrets
The attacks are often initiated by well-timed, socially engineered (spear-phishing) emails delivering trojans to individuals with access to sensitive information
79
Why malicious email?
Malicious email is leveraged because most enterprises allow email to enter their networks
Persistent attackers frequently exploit OS or application vulnerabilities in the targeted systems
80
Attack description
An attacker first develops a payload to exploit one or more vulnerabilities
Next, a delivery vehicle such as a malicious PDF or Microsoft Office document carries the payload to a few users of a system
The payload installs a backdoor or provides remote system access, allowing the attacker to establish a presence inside the trusted system boundary
81
Diversity slows down attacks
If a user and an attacker download the same program from an application store that generates diverse executable files, then the two downloaded files share a common vulnerability with probability 1/L
82
Diversity slows down attacks
When the diversity L is large, the probability of a common vulnerability is small and attackers can no longer reliably analyze their own downloaded program files to exploit vulnerabilities in users’ program files
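The 1/L probability is easy to verify with a tiny simulation, assuming the store assigns each download one of L variants uniformly and independently (an idealization of the diversity engine):

```python
import random

def shared_variant_rate(L, trials=100_000):
    """Probability that an attacker's and a user's downloads of the same
    program are the same one of L variants (uniform, independent choice)."""
    hits = sum(random.randrange(L) == random.randrange(L)
               for _ in range(trials))
    return hits / trials

for L in (10, 100, 1000):
    print(L, shared_variant_rate(L))   # close to 1/L
```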
83
Reverse engineering in a monoculture is easy
84
[Figure: a security patch or replacement software makes patched users safe, but an attacker can compute an exploit by comparing unpatched and patched software versions; users who haven't yet applied the patch are put at risk by release of the patch]
Figure 4: Existing situation: release of a software update exposes vulnerabilities.
It is this fundamental problem that massive-scale software diversity addresses. We advocate the introduction of automated code variance techniques that result in the binary code images delivered to subsequent code consumers being subtly different. This process can be embedded seamlessly into an online software delivery system, an "App Store" (Figure 3), and thereby be made entirely transparent to the code consumer (all programs derived from the same source in this manner have the identical functionality). The mechanism can even be set up so that adding software diversity poses no extra effort to the software programmer (a compiler automatically generates the different versions without any additional human intervention). But as a result of deploying massive-scale diversity, any specific exploit will work on only a relatively small number of targets.
3 Diversity Removes Reverse Engineering Vulnerabilities of Software Patches
Massive-scale software diversity removes another major problem of the current software monoculture: the fact that releasing a patch for a discovered vulnerability alerts adversaries about the existence of the vulnerability. It is current best practice to fix software vulnerabilities as soon as possible after they are discovered. In the desktop space, this is usually achieved by sending a "patch" to the compromised host. Such a patch contains the delta between the original (vulnerable) program and a corrected new version. Since such fixes usually apply to only a small fraction of a program, it is more efficient to send just the patch rather than sending a whole corrected program.
In the mobile space, user-installable programs ("Apps") are currently updated by sending complete replacement versions rather than applying an incremental patch, while the mobile operating system software itself is updated using patches. As Apps on mobile devices grow, it is probably only a matter of time until the App Store frameworks of the various mobile platforms will support replacing only part of an App (via a patch) rather than downloading a wholly new App each time.
Situation in polyculture
85
It is necessary to create software patches tailored to the different binary versions of the same program
An attacker cannot reverse engineer software patches by comparing a particular patch to the corresponding code on a user's computer because the patch and code are unknown to the attacker
Reverse engineering in a polyculture is hard
86
[Figure: the vulnerability cannot be extracted simply by comparing the replacement software to the original software]
Figure 5: Solution #1: Software is updated by sending a complete replacement version. Adversary cannot match an original version to its replacement and cannot extract a vulnerability.
[Figure: each device reports the ID of its variant and receives a custom patch for that version only; the custom patch is worthless unless you have the exact variant it relates to]
Figure 6: Solution #2: Software is updated via custom patch. Patch is meaningless to adversary without the specific original version that it relates to.
A bug fix (in the form of either a patch or a replacement program) gives a potential adversary information that can be used to precisely identify the vulnerability being fixed in the new version (Figure 4). A significant proportion of software exploits today are generated from reverse engineering of error fixes. As a consequence, it is imperative that updates are applied as soon as they are available. The average time lag between availability of an update and its installation on a vulnerable target is often a good predictor for overall vulnerability.
In this context, Apps for mobile devices are actually potentially even more vulnerable than desktop applications. This is because Apps for mobile devices tend to evolve much more quickly than traditional desktop software. For example, for many Apps in the Android Marketplace, release cycles are expressed in days rather than months. The rapid software evolution cycle is more likely to push out software that is not as mature as it should be, i.e., containing a higher proportion of residual errors than necessary. And fixing these errors in subsequent releases will give an adversary a steady stream of hints as to the location of exploitable vulnerabilities.
Massive-scale software diversity makes it much more difficult for an attacker to generate attack vectors by way of reverse engineering of security patches (Figures 5 and 6). An attacker requires two pieces of information to extract a vulnerability from a bug fix: the version of the software that is vulnerable and the specific patch that fixes the vulnerability. In an environment in which software is diversified to an extreme degree and every instance of every piece of software is unique, we can set things up so that the attacker never obtains a matching pair of vulnerable software and its corresponding bug fix that could be used to identify the vulnerability. In Section 7 below, we outline a concrete mechanism that achieves this goal.
Malware Halting Part III: From Fragility to Antifragility
Kjell Jørgen HoleSimula@UiB
Overview
1. Definition of fragility, robustness, and antifragility
2. Antifragility to malware spreading
3. Network model
4. NetLogo demo
88
89
A property of a complex adaptive system is
fragile if it is easily damaged by internal or external perturbations,
robust if it can withstand perturbations (up to a point), and
antifragile if it learns from incidents how to become increasingly robust over time
Fragile, robust, and antifragile systems
90
Antifragile
91
Please mishandle
Fragile Robust Antifragile
92
Fragile monoculture
93
A system is fragile to infectious malware when the malware spreads widely over the network
Initially infectednode
Robust polyculture
94
A system is robust to infectious malware when there is only limited spreading
Antifragility to malware
A system is antifragile when it “learns” from previous malware outbreaks how to reduce the spreading of future outbreaks
• the previously studied epidemiological model is not antifragile because it does not learn
95
+ =
Software diversity Imperfect malwaredetection
Antifragility to malware spreading
96
[Figure: a software developer creates software and delivers it to the App Store; a diversity engine within the App Store creates software variants; subsequent downloaders receive functionally identical but internally different versions of the same software]
Figure 3: Diversification mechanism can be hidden entirely within an online software delivery system ("App Store") so that it becomes transparent to both code consumers and software developers.
massive-scale software diversity. We elaborate on the problem of patching software that has been diversified. We present a list of interesting open research problems that appear in the context of massive-scale software diversity. We claim that massive-scale software diversity is a new security paradigm in itself. Finally, following a section on related work, we conclude the paper.
2 Vulnerabilities, Exploits and a Solution
A software vulnerability by itself is merely a hazard. In order to turn such a hazard into a successful attack, an attacker needs to find a successful exploit strategy. For example, the attacker may know of a vulnerability that enables a write beyond the end of a certain buffer on the stack. But in order to exploit this known vulnerability, the attacker needs to overwrite very specific locations on the stack with very specific values.
Operating system vendors now add elements of randomness to their systems, with the aim of making it more difficult for attackers to design a successful exploit. For example, the latest version of the Windows operating system now randomizes the starting address of the stack. Unfortunately, this has not stopped attackers from devising workarounds.
Designing a successful exploit for a known vulnerability is not trivial, but a dedicated attacker with ample resources is likely to succeed in eventually creating an attack. In today's world, the effort invested into designing such an exploit can be amortized by its wide applicability—since millions of users are running the identical vulnerable binary, just one successful exploit can affect all of them simultaneously.
Software diversity
Application stores, e.g., Google Play and the iOS App Store, can utilize compilers with "diversity engines"
97
Devices can repeatedly download executables to create time-varying
software diversity
Malware detection
Behavior-based methods increase the detection rate compared with signature-based methods, but the detection is not perfect
98
Malware detection allows a system to partially learn when new software should be downloaded to a device
New epidemiological model
• Simple connected graph with N nodes and at most L node types
• Discrete time t = 0, 1, ...
• D = D(t) node types, 1 ≤ D(t) ≤ L, at time step t
99
Software downloads
• All nodes change type with probability p at each time t to model automated software downloads
• each of the L possible node types is selected with probability 1/L
100
Malware outbreaks
• A single susceptible node is infected with probability q at each time t
• the node is selected uniformly at random
• A sick node will infect all its neighbors of the same type
101
Malware detection
• Infected nodes change type with probability r during a time step to model malware detection
• detection is followed by immediate download of new software, i.e., change of node type
• any infected node becomes susceptible when it changes type
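The model described on the last three slides can be sketched as a single time-step function. The ordering of the phases within a step (downloads, detection, seeding, spreading) and the ring demo graph are our assumptions; the slides do not fix them:

```python
import random

def step(adj, node_type, infected, L, p, q, r):
    """One time step of the new epidemiological model sketched above."""
    for v in adj:                            # automated software downloads
        if random.random() < p:
            node_type[v] = random.randrange(L)
            infected.discard(v)              # a type change clears infection
    for v in list(infected):                 # imperfect malware detection
        if random.random() < r:
            node_type[v] = random.randrange(L)
            infected.discard(v)
    if random.random() < q:                  # one outside seeding attempt
        infected.add(random.choice(list(adj)))
    for v in list(infected):                 # same-type spreading
        infected.update(w for w in adj[v] if node_type[w] == node_type[v])
    return infected

# Demo on a ring of N nodes.
N, L = 200, 8
adj = {i: {(i - 1) % N, (i + 1) % N} for i in range(N)}
node_type = {i: random.randrange(L) for i in range(N)}
infected = set()
for _ in range(500):
    step(adj, node_type, infected, L, p=0.05, q=0.3, r=0.2)
print(len(infected) / N)   # typically stays small: diversity plus detection
```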
102
Malware halting method
(second version)
103
1. From fragile monoculture to robust polyculture
• using time-varying software diversity
2. From robust to antifragile polyculture
• using imperfect malware detection
NetLogo model (1)
104
Spike due to monoculture
Malware detection on
Change in spreading mechanism
Immunization of hubs
NetLogo model (2)
105
Polyculture Polyculture Polyculture Polyculture Monoculture
Time-varying polyculture
NetLogo model (3)
106
Monoculture
Polyculture
Imperfect hub immunization
Demo
107
Fragility vs antifragility
108
Fragile monoculture | Antifragile polyculture
Large spreading of malware | Nearly no malware spreading
Needs continuous repair | Self-repairing (up to a point)
Must immunize nearly all devices | Immunization of hubs
No adaptation to changes in spreading | Adapts to changes
Further study
1. Watch Nassim Taleb and Daniel Kahneman discuss antifragility
• www.youtube.com/watch?v=MMBclvY_EMA
2. Watch Taleb give a talk at Stanford
• www.youtube.com/watch?v=c6sX5MSdLag
3. Find more talks on the web
109
References (1)
M. Franz, E Unibus Pluram: Massive-Scale Software Diversity as a Defense Mechanism, Proc. New Security Paradigms Workshop 2010 (NSPW 2010), Concord, Massachusetts, USA, Sept. 21–23, 2010, pp. 7–16
P. Wang, M. C. González, C. A. Hidalgo, and A.-L. Barabási, Understanding the Spreading Patterns of Mobile Phone Viruses, Science, vol. 324, 22 May 2009, pp. 1072–1076
110
References (2)
K. J. Hole, Diversity Reduces the Impact of Malware, IEEE Security & Privacy Magazine, vol. 13, no. 3, 2015, pp. 48–54
K. J. Hole, Towards Anti-fragility to Malware Spreading, IEEE Security & Privacy Magazine, vol. 13, no. 4, 2015, pp. 40–46
111
References (3)
N.N. Taleb, Antifragile: Things That Gain from Disorder, Random House, 2012
Daniel Bilar, Known Knowns, Known Unknowns and Unknown Unknowns: Anti-virus issues, malicious software and Internet attacks for non-technical audiences, journals.sas.ac.uk/deeslr/article/view/1880
112
References (4)
R. Cohen, S. Havlin, and D. Ben-Avraham, Efficient immunization strategies for computer networks and populations, Physical Review Letters, vol. 91, no. 24, Article ID 247901, 2003
113