Principles of Computer System, for Graduated, 2012 Fall
Slides adapted from MIT 6.033, credit F. Kaashoek & N. Zeldovich

Security

Where are we?

• System Complexity
• Modularity & Naming
• Enforced Modularity
• Network
• Fault Tolerance
• Transaction
• Consistency
• Security
  - Authentication & Authorization
  - Mobile Code
  - Secure Channel

Mobile Code

• Goal
  - Safely run someone else's code on the user's computer
• Use cases
  - JavaScript
  - Flash
  - Downloaded programs on a mobile phone

Threat Model

• Assume the code the user is running is malicious
  - User may be tricked into visiting a malicious web site
  - That web site will run malicious code in the user's browser
  - Spam email may trick the user into clicking on a link to a malicious site
  - Malicious site could buy an advertisement on CNN.com
  - Adversary's page loads when the user visits CNN.com
  - On phones, users can also be misled into installing a malicious app
  - Adversaries create apps that match popular search terms
  - Adversaries mimic existing popular applications

Security Goals

• Privacy
  - User's data should not be disclosed
• Integrity
  - User's data should not be corrupted
• Availability
  - User should be able to use the service
• Example: Android

Virtual Machine

• Run each app in a separate virtual machine
  - E.g., phonebook VM stores user's contacts, email VM stores messages
• VMM ensures isolation between VMs
  - Strong isolation in case of a malicious app
  - Android uses a Java-style VM (Dalvik VM)
• Problem: applications may need to share data
  - E.g., a mapping app accesses the phone's location (GPS)
  - E.g., a phonebook app sends one entry via the user's Gmail account
  - E.g., the Facebook app adds an entry to the phone's calendar if the user accepts an invitation to some event via Facebook
• How to achieve controlled sharing?

Unix Security Mechanisms

• Principal: UID
  - Each user is assigned a 32-bit user ID (uid)
  - OS keeps track of the uid for each running process
• Objects: files
  - Operations: read/write file
  - Authorization: each file has an access control list
• How to enforce mobile code policies on a UNIX system?

Unix Security Mechanisms

• Hard to do
  - Unix was designed to protect users from each other
  - All of a user's processes run with the same uid
  - All of a user's files are accessible to that uid
  - Any program would have full access to the user's system
  - Mismatch between available mechanism and desired policy
• How to do better?
  - Define an application model where security mechanisms fit our policies

Goals

• Arbitrary applications, with different privileges
  - Might want apps to be the principals, rather than the user
• Arbitrary resources, operations
  - Unix only protects files, and its mechanism only controls read/write
  - Might have many kinds of resources: contacts in phonebook, events in Facebook app, GPS, photos
  - Resources and operations are defined by applications: modifying a contact, scheduling a meeting in calendar, responding to an event in Facebook app
• Arbitrary policies
  - Similarly defined by applications & user

Android application model

• Applications have names
  - E.g., com.android.contacts
  - Applications interact via messages, called "intents"
• Message (intent) contains:
  - name of target (string)
  - action (string)
  - data (string)
• Applications are broken up into components
  - Components receive different messages
  - Components are also named by strings
  - Message targets are actually components
  - E.g., "com.android.contacts / Database"

Security Model

• Applications: the principals
  - Application defines permission labels
• Permission label
  - Free-form string: e.g., android.perm.READ_CONTACTS
• Label on component
  - Application assigns a permission label to each component
  - E.g., contacts db with android.perm.READ_CONTACTS
  - This specifies the permission necessary to send messages to that component

Security Model

• Application specifies the permission labels that it requires
  - List of permission label strings
  - Each installed application maps to a list of permissions that it has

Mechanism & Policy

• Mechanism:
  - A can send a message to B.C if A has B.C's permission label
• Principal: application
• Resource: component
• Operation: any message
• Policy: assignment of labels to components & applications
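
The check "A can send a message to B.C if A has B.C's permission label" can be sketched in a few lines. This is only an illustration of the rule on this slide, not Android's actual implementation; the class and method names, the app names under com.example, and the deny-by-default choice for unknown targets are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intent:
    target: str   # component name, e.g. "com.android.contacts/Database"
    action: str
    data: str


class ReferenceMonitor:
    """Hypothetical reference monitor holding the policy tables."""

    def __init__(self):
        self.component_label = {}   # component name -> required permission label
        self.app_permissions = {}   # app name -> set of labels granted at install

    def register_component(self, component, label):
        self.component_label[component] = label

    def install_app(self, app, permissions):
        self.app_permissions[app] = set(permissions)

    def may_send(self, sender_app, intent):
        label = self.component_label.get(intent.target)
        if label is None:
            return False  # unknown target: deny by default (an assumption)
        return label in self.app_permissions.get(sender_app, set())


rm = ReferenceMonitor()
rm.register_component("com.android.contacts/Database", "android.perm.READ_CONTACTS")
rm.install_app("com.example.dialer", ["android.perm.READ_CONTACTS"])
rm.install_app("com.example.game", [])

query = Intent("com.android.contacts/Database", "read", "name=Bob")
allowed = rm.may_send("com.example.dialer", query)   # dialer holds the label
denied = rm.may_send("com.example.game", query)      # game does not
```

The two dictionaries are the policy; `may_send` is the mechanism. Changing who may talk to whom only changes the tables, never the check.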

Mechanism & Policy

• Who decides the policy?
  - Server app developer decides what permissions are important
  - Client app developer decides what permissions it may need
  - User decides whether to grant the client app those permissions
  - To help the user, the server app includes some text to explain each permission

Implementing Android's Model

• Messages are sent via a "reference monitor" (RM)
  - RM is in charge of checking permissions for all messages
  - Uses the mechanism defined above
  - RM runs as its own uid, distinct from any app
• What ensures complete interposition?
  - Built on top of Linux, but uses Unix security mechanisms in a new way
  - Each app (principal) has its own uid
  - Apps only listen for messages from the reference monitor
  - Linux kernel ensures isolation between different uids

Broadcast Messages

• Broadcast
  - May want to send a message to any application that's interested
  - E.g., GPS service may want to provide periodic location updates
• Android mechanism: broadcast intents
  - Components can subscribe to any broadcast message with a specific action
  - But now anyone can subscribe to receive location messages!
• Solution: think of the receiver as asking the sender for the message
  - Receiver would need the label for the sender's component
  - Sender includes a permission label in the broadcast messages it sends
  - Message is delivered only to components whose application has that permission
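
A minimal sketch of the broadcast rule above: the sender attaches a permission label, and the monitor delivers only to subscribers whose application holds it. The function name, the app and component names, and the label string android.perm.ACCESS_LOCATION are hypothetical, chosen to match the naming style used on earlier slides.

```python
def deliver_broadcast(required_label, subscribers, app_permissions):
    """Return the (app, component) pairs that should receive a broadcast.

    subscribers: (app, component) pairs subscribed to the broadcast's action.
    app_permissions: app name -> set of permission labels the app holds.
    """
    return [
        (app, component)
        for app, component in subscribers
        if required_label in app_permissions.get(app, set())
    ]


app_permissions = {
    "com.example.maps": {"android.perm.ACCESS_LOCATION"},
    "com.example.flashlight": set(),
}
subscribers = [
    ("com.example.maps", "LocationListener"),
    ("com.example.flashlight", "Snooper"),
]

receivers = deliver_broadcast(
    "android.perm.ACCESS_LOCATION", subscribers, app_permissions
)
```

Here only the maps app is in `receivers`; the flashlight app subscribed but never sees the location update.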

Authenticating source of messages

• How can an application tell where the message came from?
  - E.g., Android provides bootup / shutdown broadcast messages
  - Want to prevent a malicious app from sending a fake shutdown message
• Use labels: agree on a permission label for sending, e.g., system events
  - Component that wants to receive these events has the appropriate label
  - Only authentic events will be sent: others aren't allowed by the label

Delegation

• Scenario
  - E.g., run a spell checker on a post you're writing in some app
  - E.g., view an attachment from an email message
  - Ideally want to avoid sending data around all the time
• Android mechanism: delegation
  - One app can allow a second app to send specific kinds of messages, even if the second otherwise wouldn't be able to send them on its own
  - E.g., read a particular email attachment, edit a particular post, etc.
• Implementation: RM keeps track of all delegations
  - Delegation complicates reasoning about what principals can access a resource
  - May want to delegate access temporarily
  - Delegation usually requires a revocation mechanism
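
One way to picture "RM keeps track of all delegations" is a table of grants that can be added and revoked. This is a sketch under assumptions: the table layout, the method names, and the example app and resource names are invented for illustration and are not Android's API.

```python
class DelegationTable:
    """Hypothetical per-resource delegation records kept by the reference monitor."""

    def __init__(self):
        self._grants = set()   # (grantor, grantee, component, resource)

    def delegate(self, grantor, grantee, component, resource):
        self._grants.add((grantor, grantee, component, resource))

    def revoke(self, grantor, grantee, component, resource):
        self._grants.discard((grantor, grantee, component, resource))

    def is_delegated(self, grantee, component, resource):
        return any(g[1:] == (grantee, component, resource) for g in self._grants)


table = DelegationTable()
grant = ("com.example.mail", "com.example.viewer",
         "com.example.mail/Attachments", "attachment-7")

table.delegate(*grant)
while_granted = table.is_delegated(*grant[1:])   # viewer may read attachment-7

table.revoke(*grant)
after_revoke = table.is_delegated(*grant[1:])    # temporary access has ended
```

The grant names one specific resource, so the viewer never gains general access to the mail app's attachments.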

Authenticating Applications

• Can one app rely on another app's names?
  - E.g., can my camera app safely send a picture to "com.facebook/WallPost"?
• Not really
  - App names and permission names are first-come-first-served
  - Name maps to the first application that claimed that name
  - Important to get naming right for security!

What goes wrong?

• Model only works if permissions are granted to trustworthy apps
  - At best, the Android model limits the damage of malicious code to what the user allowed
  - Users don't have a way to tell whether an app is malicious or trustworthy
  - Users often install apps regardless of the permissions requested
  - Manifest vs. runtime prompts?
  - Some permissions are easy to overlook
• Trusted components have bugs
  - Linux kernel
  - Privileged applications (e.g., logging service on some HTC devices, 2011)

Enforcing Goals

• Enforce original goal (protect user's private data)
  - Hard to enforce, because this property spans the entire system, not just one app
  - Strawman: prohibit any app that has both READ_CONTACTS and INTERNET
  - App 1 might have READ_CONTACTS & export a component for leaking contacts info
  - App 2 might have INTERNET & talk to app 1 to get leaked contacts info

Enforcing Goals

• Complementary approach: examine application code
  - Apple's App Store for iPhone applications
  - Android's Market / "Google Play"

Research Topics

• User-friendly approach
  - User-driven Access Control
  - Biometric Authentication
  - Progressive Authentication

Secure Channel

• Cryptographic primitives
  - Encrypt/decrypt, MAC, sign/verify
• Key establishment
• MITM attacks
• Certificates

Secure Channel

• Problem: many networks do not provide security guarantees
  - Adversary can look at packets, corrupt them
  - Easy to do on a local network
  - Might be possible over the Internet, if the adversary changes DNS
• Adversary can inject arbitrary packets, from almost anywhere
  - Dropped packets: retransmit
  - Randomly corrupted packets: use checksum to drop
  - Carefully corrupted, injected, sniffed packets: need some new plan

Security Message

• Security goals for messages
  - Secrecy: adversary cannot learn message contents
  - Integrity: adversary cannot tamper with message contents

Cryptographic Primitives

• Encrypt(ke, m) -> c; Decrypt(ke, c) -> m
  - Ciphertext c is similar in length to m (usually slightly longer)
  - Hard to obtain plaintext m, given ciphertext c, without ke
  - But adversary may change c to c', which decrypts to some other m'
• MAC(ka, m) -> t
  - MAC stands for Message Authentication Code
  - Output t is fixed length, similar to a hash function (e.g., 256 bits)
  - Hard to compute t for message m, without ka
• Common keys today are 128 or 256 bits long
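
The point that encryption alone does not give integrity can be seen with a toy example. The cipher below is a made-up XOR stream built from SHA-256, used only to make the bit-flipping visible; it is not a vetted cipher and reuses its keystream for every message. The MAC is HMAC-SHA256 from Python's standard library, computed over the ciphertext (an assumed design choice; the slide does not say what the MAC covers).

```python
import hashlib
import hmac
import secrets


def toy_keystream(key: bytes, n: int) -> bytes:
    """Toy keystream: SHA-256 over key + counter. Illustration only."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]


def toy_encrypt(ke: bytes, m: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(m, toy_keystream(ke, len(m))))


toy_decrypt = toy_encrypt   # XOR with the same keystream undoes itself


def mac(ka: bytes, m: bytes) -> bytes:
    return hmac.new(ka, m, hashlib.sha256).digest()   # 256-bit tag


ke, ka = secrets.token_bytes(16), secrets.token_bytes(16)   # 128-bit keys
m = b"pay $100"
c = toy_encrypt(ke, m)
t = mac(ka, c)

# Adversary changes c to c' without knowing ke: flip "1" into "9" at byte 5.
c_bad = bytearray(c)
c_bad[5] ^= ord("1") ^ ord("9")
c_bad = bytes(c_bad)

m_bad = toy_decrypt(ke, c_bad)                         # some other m'
tag_ok = hmac.compare_digest(mac(ka, c), t)            # untouched ciphertext
tag_bad = hmac.compare_digest(mac(ka, c_bad), t)       # modified ciphertext
```

Decryption of the modified ciphertext succeeds and yields a different message; only the MAC check tells the receiver to discard it.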

Secure Channel Abstraction

• Send and receive messages, just as before
  - Use Encrypt to ensure secrecy of a message
  - Use MAC to ensure integrity (increases size of message)
• Complication: replay of messages
  - Include a sequence number in every message
  - Choose a new random sequence number for every connection
• Complication: reflection of messages
  - Recall: be explicit -- we're not explicit about what the MAC means
  - Use different keys in each direction
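
The replay and reflection defenses can be sketched with HMAC alone. Encryption is left out to keep the example short, and the packet layout (8-byte sequence number, payload, 32-byte tag), the class names, and the assumption that both ends already agreed on the keys and the starting sequence number are all illustrative choices.

```python
import hashlib
import hmac
import secrets

TAG_LEN = 32   # HMAC-SHA256 output size in bytes


class Sender:
    def __init__(self, ka: bytes, first_seq: int):
        self.ka, self.seq = ka, first_seq

    def seal(self, payload: bytes) -> bytes:
        header = self.seq.to_bytes(8, "big")
        self.seq += 1
        tag = hmac.new(self.ka, header + payload, hashlib.sha256).digest()
        return header + payload + tag


class Receiver:
    def __init__(self, ka: bytes, first_seq: int):
        self.ka, self.expected = ka, first_seq

    def open(self, packet: bytes):
        body, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
        good = hmac.new(self.ka, body, hashlib.sha256).digest()
        if not hmac.compare_digest(good, tag):
            return None                  # forged, corrupted, or wrong-direction key
        if int.from_bytes(body[:8], "big") != self.expected:
            return None                  # replayed or out of order
        self.expected += 1
        return body[8:]


# Different MAC keys in each direction, fresh random starting sequence number.
k_c2s, k_s2c = secrets.token_bytes(32), secrets.token_bytes(32)
start = secrets.randbits(32)

client_tx = Sender(k_c2s, start)
server_rx = Receiver(k_c2s, start)
client_rx = Receiver(k_s2c, start)

pkt = client_tx.seal(b"transfer 10")
first = server_rx.open(pkt)       # accepted
replayed = server_rx.open(pkt)    # rejected: sequence number already consumed
reflected = client_rx.open(pkt)   # rejected: client expects the server's key
```

Using separate keys per direction is one way of being explicit about what a tag means: a tag under `k_c2s` can only mean "the client said this to the server".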

Open vs. Closed Design

• Should the system designer keep the details of Encrypt, Decrypt, and MAC secret?
  - Argument for: harder for adversary to reverse-engineer the system?
  - Argument against: hard to recover once adversary learns algorithms
  - Argument against: very difficult to get all the details right by yourself
• Generally, want to make the weakest practical assumptions about adversary
  - Typically, assume adversary knows algorithms, but doesn't know the key
  - Advantage: get to reuse well-tested, proven crypto algorithms
  - Advantage: if key disclosed, relatively easy to change (unlike algorithm)
• Using an "open" design makes security assumptions clearer

Problem: Key Establishment

• Suppose a client wants to communicate securely with a server
  - How would a client get a secret key shared with some server?
• Broken approaches (an eavesdropper sees every value sent and can compute the same key):
  - Alice picks some random key, sends it to Bob
  - Alice and Bob pick some random values, send them to each other, use XOR

Diffie-Hellman Key Exchange

Diffie-Hellman Protocol

• Another cryptographic primitive
  - Crypto terminology: two parties, Alice and Bob, want to communicate
• Main properties of the protocol:
  - After exchanging messages, both parties end up with the same key k
  - Adversary cannot figure out k from g^a and g^b alone (if a and b are secret)
• This works well, as long as the adversary only observes packets
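
A toy run of the exchange, as a sketch of the standard protocol in which each side sends g^x mod p and raises the other side's value to its own secret. The parameters are assumptions picked for readability: p is the 127-bit Mersenne prime 2^127 - 1 and g = 3, which is far too small for real use (deployed systems use standardized groups of 2048 bits or more, or elliptic curves).

```python
import secrets

p = 2**127 - 1   # toy prime modulus, illustration only
g = 3            # toy base

a = secrets.randbelow(p - 3) + 2   # Alice's secret exponent, in [2, p-2]
b = secrets.randbelow(p - 3) + 2   # Bob's secret exponent, in [2, p-2]

A = pow(g, a, p)   # Alice -> Bob: g^a mod p
B = pow(g, b, p)   # Bob -> Alice: g^b mod p

k_alice = pow(B, a, p)   # (g^b)^a mod p
k_bob = pow(A, b, p)     # (g^a)^b mod p
```

Both sides arrive at g^(ab) mod p, while a passive observer only ever sees `A` and `B`.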

Man-in-the-middle Attack

Man-in-the-middle Attack

• Active adversary (Eve) intercepts messages between Alice and Bob
  - Adversary need not literally intercept packets: can subvert DNS instead
  - If adversary controls DNS, Alice may be tricked into sending packets to Eve
• Both Alice and Bob think they've established a key with each other
  - Unfortunately, they've both established a key with Eve
• What went wrong: no way for Alice to know who she's talking to
  - Need to authenticate messages during key exchange
  - In particular, given the name (Bob), need to know if (g^b mod p) is from Bob
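
The attack can be replayed with the same kind of toy parameters (p = 2^127 - 1 and g = 3 are assumptions for illustration, not safe values). Eve substitutes her own public values in both directions, so each victim unknowingly shares a key with her.

```python
import secrets

p = 2**127 - 1   # toy prime modulus, illustration only
g = 3


def keypair():
    """Return (secret exponent x, public value g^x mod p)."""
    x = secrets.randbelow(p - 3) + 2
    return x, pow(g, x, p)


a, A = keypair()     # Alice
b, B = keypair()     # Bob
e1, E1 = keypair()   # Eve's pair used toward Alice
e2, E2 = keypair()   # Eve's pair used toward Bob

# Eve intercepts A and B; Alice receives E1 and Bob receives E2 instead.
k_alice = pow(E1, a, p)        # Alice believes she shares this with Bob
k_bob = pow(E2, b, p)          # Bob believes he shares this with Alice

k_eve_alice = pow(A, e1, p)    # Eve's key with Alice
k_eve_bob = pow(B, e2, p)      # Eve's key with Bob
```

Eve can now decrypt what Alice sends, read or change it, and re-encrypt it for Bob. Nothing in the unauthenticated exchange ties `E1` to the name "Bob".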

New Primitive: Signatures

• User generates a public-private key pair: (PK, SK)
  - PK stands for Public Key, can be given to anyone
  - SK stands for Secret Key, must be kept private
• Two operations:
  - Sign(SK, m) -> sig
  - Verify(PK, m, sig) -> yes/no
• Property: hard to compute sig without knowing SK
  - "Better" than MAC: for MAC, two parties had to already share a secret key
  - With signatures, the recipient only needs to know the sender's _public_ key
• We will denote the pair {m, sig=Sign(SK, m)} as {m}_SK
  - Given {m}_SK and corresponding PK, know that m was signed by someone w/ SK
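
To make the Sign/Verify interface concrete, here is a toy based on textbook RSA with the tiny modulus 3233 = 61 * 53. The slides do not say which signature scheme is meant, so RSA is an assumption, and this version is completely insecure (tiny key, no padding); it only shows the shape of the two operations.

```python
import hashlib

N = 3233    # toy modulus, 61 * 53; part of the public key
PK = 17     # public exponent
SK = 2753   # secret exponent; 17 * 2753 = 1 (mod 3120)


def digest(m: bytes) -> int:
    """Hash the message and reduce it into the toy modulus."""
    return int.from_bytes(hashlib.sha256(m).digest(), "big") % N


def sign(sk: int, m: bytes) -> int:
    return pow(digest(m), sk, N)


def verify(pk: int, m: bytes, sig: int) -> bool:
    return pow(sig, pk, N) == digest(m)


msg = b"g^b mod p = 1234"
sig = sign(SK, msg)

ok = verify(PK, msg, sig)                   # genuine signature
forged = verify(PK, msg, (sig + 1) % N)     # any other value fails
```

The verifier needs only `PK` and `N`; unlike a MAC, nothing secret is shared between signer and verifier.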

Diffie-Hellman with Signatures

Idea 1: Remember Key

• Idea 1: Alice remembers the key used to communicate with Bob last time
  - Easy to implement, simple, effective against subsequent MITM attacks
  - ssh uses this approach
  - Doesn't protect against MITM attacks the first time around
  - Doesn't allow server to change its key later on
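
The "remember the key" idea fits in a small table lookup. The class name, return strings, and host name below are hypothetical; real ssh clients store this table in a known-hosts file, but the sketch keeps it in memory.

```python
class KnownHosts:
    """Remember the first public key seen for each server name."""

    def __init__(self):
        self._keys = {}   # server name -> remembered public key

    def check(self, name, presented_key):
        remembered = self._keys.get(name)
        if remembered is None:
            self._keys[name] = presented_key   # first contact: trusted blindly
            return "first-use"
        return "match" if remembered == presented_key else "MISMATCH"


hosts = KnownHosts()
r1 = hosts.check("bob.example.com", "PK_bob")   # nothing to compare against yet
r2 = hosts.check("bob.example.com", "PK_bob")   # same key as last time
r3 = hosts.check("bob.example.com", "PK_eve")   # MITM, or Bob changed his key
```

Both weaknesses from the slide show up directly: `r1` accepts whatever key arrives first, and `r3` cannot distinguish an attack from a legitimate key change.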

Idea 2: Consulting Authority

• Idea 2: consult some authority that knows everyone's public key
• Simple protocol
  - Authority server has a table: name <-> public key
  - Alice connects to authority server (using above key exchange protocol)
  - Client sends message asking for Bob's public key
  - Server replies with PK_bob
• Alice must already know the authority server's public key, PK_as
  - Otherwise, chicken-and-egg problem
• Works well, but doesn't scale
  - Client must ask the authority server for public key for every connection
  - .. or at least every time it sees a new public key for a given name

Idea 3: CA

• Idea 3: Pre-computing
  - Authority responds the same way every time
  - Public/private keys can be used for more than just key exchange
• New protocol:
  - Authority server creates signed message { Bob, PK_bob }_(SK_as)
  - Anyone can verify that the authority signed this message, given PK_as
  - When Alice wants to connect to Bob, she needs the signed message from the authority
• Authority's signed message is usually called a "certificate"
  - Certificate attests to a binding between the name (Bob) and key (PK_bob)
  - Authority is called a certificate authority (CA)
• Certificates are more scalable
  - Doesn't matter where certificate comes from, as long as signature is OK
  - Easy scalability solution: Bob sends his certificate to Alice
  - (Similarly, Alice sends her certificate to Bob.)
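
A certificate { Bob, PK_bob }_(SK_as) can be mocked up as a small record plus a signature. The signature below is an insecure textbook-RSA toy with modulus 3233, chosen only so the example runs with the standard library; the record layout and function names are assumptions, and real certificates follow the X.509 format.

```python
import hashlib

CA_N, CA_E = 3233, 17   # toy PK_as, known to every client in advance
CA_D = 2753             # toy SK_as, known only to the authority


def digest(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % CA_N


def issue_certificate(name: str, public_key: str) -> dict:
    """Run by the authority: sign the binding between name and key."""
    body = f"{name}|{public_key}".encode()
    return {"name": name, "public_key": public_key,
            "sig": pow(digest(body), CA_D, CA_N)}


def check_certificate(cert: dict, expected_name: str) -> bool:
    """Run by Alice: needs only PK_as, not a connection to the authority."""
    body = f"{cert['name']}|{cert['public_key']}".encode()
    sig_ok = pow(cert["sig"], CA_E, CA_N) == digest(body)
    return sig_ok and cert["name"] == expected_name


cert = issue_certificate("Bob", "PK_bob")       # done once, ahead of time
accepted = check_certificate(cert, "Bob")       # Bob hands cert to Alice
wrong_name = check_certificate(cert, "Alice")   # valid cert, but for another name
```

Because the check uses only the authority's public key, it does not matter who delivers the certificate, which is why Bob can send it himself.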

Who runs this CA?

• Today, a large number of certificate authorities
  - Show certificate for https://www.google.com/
  - Name in certificate is the web site's host name
  - Show list of certificate authorities in Firefox
• If any of the CAs signs a certificate, the browser will believe it
  - Somewhat problematic
  - Lots of CAs, controlled by many companies & governments
  - If any are compromised or malicious, mostly game over

Where does this list of CAs come from?

• Most of these CAs come with the browser
  - Web browser developers carefully vet the list of default CAs
  - Downloading list of CAs: need to already know someone's public key
• Bootstrapping: chicken-and-egg problem, as before
  - Computer came with some initial browser from the manufacturer
  - Manufacturer physically got a copy of Windows, including IE and its CAs

How does the CA get the name mapping?

• 1. How do we name principals?
  - Everyone must agree on what names will be used
  - Depends on what's meaningful to the application
  - Would having certificates for an IP address help a web browser?
    - Probably not: actually want to know if we're talking to the right server
    - Since DNS is untrusted, don't know what IP we want
    - Knowing that a key belongs to an IP is not useful
  - For web servers, certificate contains server's host name (e.g., google.com)
• 2. How to check if a key corresponds to a name?
  - Whatever mechanism the CA decides is sufficient proof
  - Some CAs send an email to root@domain asking if they approve the cert for the domain
  - Some CAs used to require faxed signed documents on company letterhead

What if a CA makes a mistake?

• Whoever controls the corresponding secret keys can now impersonate sites
  - Similarly problematic: attacker breaks into server, steals secret key
  - Need to revoke certificates that should no longer be accepted
  - Note this wasn't a problem when we queried the server for every connection

Certificate Authority Mistakes

• 2001: Verisign cert for Microsoft Corp.
• 2011: Comodo certs for mail.google.com, etc.
• 2011: DigiNotar cert for *.google.com

Tech-1: Expiration

• Technique 1: include an expiration time in certificate
  - Certificate: { Bob, 10-Aug-2011, PK_bob }_(SK_as)
  - Clients will not accept expired certificates
  - When a certificate is compromised, wait until its expiration time
  - Useful in the long term, but not so useful for immediate problems

Tech-2: Revocation

• Technique 2: publish a certificate revocation list (CRL)
  - Can work in theory
  - Clients need to periodically download the CRL from each CA
  - MSFT 2001 problem: VeriSign realized they forgot to publish their CRL address
  - Things are a little better now, but still many CRLs are empty
  - Principle: economy of mechanism, avoid rarely-used (untested) mechanisms
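
Expiration and revocation are both checks a client makes in addition to verifying the CA's signature. The sketch below shows only those two checks and assumes the signature was already verified. The record layout and the use of a serial number to identify revoked certificates are assumptions for illustration; the expiry date reuses the 10-Aug-2011 example from the expiration slide.

```python
from datetime import date


def certificate_acceptable(cert, today, revoked_serials):
    """Return (accepted, reason) for an already signature-checked certificate."""
    if today > cert["expires"]:
        return False, "expired"
    if cert["serial"] in revoked_serials:
        return False, "revoked"
    return True, "ok"


cert = {"name": "Bob", "serial": 42, "expires": date(2011, 8, 10)}
crl = {17, 42}   # hypothetical CRL contents downloaded from the CA

before = certificate_acceptable(cert, date(2011, 8, 1), set())    # empty CRL
revoked = certificate_acceptable(cert, date(2011, 8, 1), crl)     # listed in CRL
expired = certificate_acceptable(cert, date(2011, 9, 1), set())   # past expiry
```

The `before` case is the weak spot described above: with an empty or stale CRL, a compromised certificate stays acceptable until its expiration date.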

Tech-3: Query & Check

• Technique 3: query an online server to check certificate freshness
  - No need to download long CRL
  - Checking status might be less costly than obtaining certificate

Tech-4: Public Key

• Use public keys as names [SPKI/SDSI]
  - Trivially solves the problem of finding the public key for a "name"
  - Avoids the need for certificate authorities altogether
  - Might not work for names that users enter directly
• Can work well for names that users don't have to remember/enter
  - Application referring to a file
  - Web page referring to a link
• Additional attributes of a name can be verified by checking signatures
  - Suppose each user in a system is named by a public key
  - Can check user's email address by verifying a message signed by that key

Review of CSP

• Complex systems fail for complex reasons
  - Find the cause …
  - Find a second cause …
  - Keep looking …
  - Find the mind-set

United Airlines/Univac

• Automated reservations, ticketing, flight scheduling, fuel delivery, kitchens, and general administration
  - Started 1966, target 1968, scrapped 1970, spent $50M
• Second-system effect (First: SABRE) (Burroughs/TWA repeat)

CONFIRM

• Hilton, Marriott, Budget, American Airlines
• Hotel reservations linked with airline and car rental
• Started 1988, scrapped 1992, $125M
• Second system
• Dull tools (machine language)
• Bad-news diode

London Ambulance Service

• Ambulance dispatching
• Started 1991, scrapped in 1992 (20 lives lost in 2 days, 2.5M)
• Unrealistic schedule (5 months)
• Overambitious objectives
• Unidentifiable project manager
• Low bidder had no experience
• No testing/overlap with old system
• Users not consulted during design

More, too many to list

• Portland, Oregon, Water Bureau, 30M, 2002
• Washington D.C., Payroll system, 34M, 2002
• Southwick air traffic control system, $1.6B, 2002
• Sobey’s grocery inventory, 50M, 2002
• King’s County financial mgmt system, 38M, 2000
• Australian submarine control system, 100M, 1999
• California lottery system, 52M
• Hamburg police computer system, 70M, 1998
• Kuala Lumpur total airport management system, $200M, 1998
• UK Dept. of Employment tracking, $72M, 1994
• Bank of America Masternet accounting system, $83M, 1988
• FBI virtual case, 2004
• FBI Sentinel case management software, 2006
• UK National offender management IS, $155M, 2007 (restart)

Fighting back: control novelty

• Source of excessive novelty:
  - Second-system effect
  - Technology is better
  - Idea worked in isolation
  - Marketing pressure
• Some novelty is necessary; the difficult part is saying No
• Don’t be afraid to re-use existing components
  - Don’t reinvent the wheel
  - Even if it takes some massaging

Fighting back: adopt sweeping simplifications

• Processor, Memory, Communication
• Dedicated servers
• N-level memories
• Best-effort network
• Delegate administration
• Fail-fast, pair-and-compare
• Don’t overwrite
• Transactions
• Sign and encrypt

Fighting back: find flaws fast

• Plan, plan, plan (CHIPS, Intel processors)
• Simulate, simulate, simulate
  - Boeing 777 and F-16
• Design reviews, coding reviews, regression tests, daily/hourly builds, performance measurements
• Design the feedback system:
  - Alpha and beta tests
  - Incentives, not penalties, for reporting errors

Fighting back: design for iteration, iterate the design

• Something simple working soon
  - Find out what the real problems are
• One new problem at a time
• Use iteration-friendly design
  - E.g., failure/attack models
• Facebook: Keep Shipping!

Example: Linux

• 1995: Linux hobbyist project
• Now: Google, Amazon servers, Android run Linux
• Fast iterative software development

Fighting back: conceptual integrity

• One mind controls the design
  - Macintosh
  - Visicalc spreadsheet
  - UNIX
  - Linux
• Good esthetics yields more successful systems
  - Parsimonious, Orthogonal, Elegant, Readable, …
• A few top designers can be more productive than a larger group of average designers

Fighting back: learn from failures

• Take failures seriously and learn from them
• Example: Amazon outage [2011]
  - Elastic Block Store aggressively re-mirrors
  - Network configuration problem in NE availability zone affected primary and backup network
  - “Re-mirror storm” affected other regions
  - Took days to get under control
  - Amazon took failure analysis seriously
• Counterexamples: RSA, Sony PlayStation Network

Fighting back: Summary

• Principles that help avoid failure
  - Limit novelty
  - Adopt sweeping simplifications
  - Get something simple working soon
  - Iteratively add capability
  - Give incentives for reporting errors
  - Descope early
  - Give control to (and keep it in) a small design team
• Strong outside pressures to violate these principles
  - Need strong knowledgeable managers/designers

Summary

• Thank you all.
• Please give your advice on our QA site.
• What to do next?
  - Killer skill, at least one
  - Expand your vision
  - Ask the right question

Final Exam

• The 17th week (2013-01-04, Friday)
• 14:00 – 16:00
• East Lower Hall (东下), rooms 101/201