Brace yourselves, leap second is coming

43
Nati Cohen Fewbytes BRACE YOURSELVES LEAP SECOND IS COMING

Transcript of Brace yourselves, leap second is coming

Page 1: Brace yourselves, leap second is coming

Nati CohenFewbytes

BRACE YOURSELVES

LEAP SECONDIS COMING

Page 2: Brace yourselves, leap second is coming

Intro: Assumptions

Page 3: Brace yourselves, leap second is coming

Installing servers

1. Unbox2. Mount3. Connect to power4. Connect to network5. Power up6. Network boot7. …8. Profit

Page 4: Brace yourselves, leap second is coming

(Not) Installing servers

1. Unbox2. Mount3. Connect to power4. Connect to network5. Power up6. …7. …...8. ……...

Page 5: Brace yourselves, leap second is coming

Solving problems 101

1. Blame it on the network

2. DHCP issue?

3. PXE issue?

4. Problematic server?

Page 6: Brace yourselves, leap second is coming

We checked everything

at least anything that seemed plausible

But after 5 days...

Page 7: Brace yourselves, leap second is coming

Same MAC address

MAC addressA media access control address (MAC address) is a unique identifier assigned to network interfaces for communications on the physical network segment.

Page 9: Brace yourselves, leap second is coming

We all make assumptions

● Development● Debugging● Marketing● Support● …

Being informed, helps avoiding them!

Page 10: Brace yourselves, leap second is coming

Agenda

1. Intro: Assumptions

2. Missiles and Rounding Errors

3. Breaking the Internet in one second

4. Aviation Safety

Page 11: Brace yourselves, leap second is coming

Patriot missile defense system

1. Search

2. Validate

3. Track

Next position =Velocity (real) * Time (int -> real)

GAO/IMTEC-92-26 - Software Problem Led to System Failure at Dhahran, Saudi Arabia

Page 12: Brace yourselves, leap second is coming

to_seconds(ttos)

ttos = 28800 // 8*60*60*10ttos * 0.1 = ?

1. 2880.02. 28799.97253. 28799.9999998613974. ???

It depends on the representation...

Page 13: Brace yourselves, leap second is coming

Rational Numbers

Recall, integers:1337 = 00000101 00111001

Option 1: Rational Numbers● (numerator, denominator)● PROs: exact representation● CONs: Pi / sqrt(2), space, speed

Page 14: Brace yourselves, leap second is coming

Fixed Point

Option 2: Fixed-point● variable * (base ^ scaling factor)● base and scaling factor are fixed

○ binary vs decimal○ +/- exponent

● PROs: space, easy to compute● CONs: limited range of valueseg.variable = 30, base = 2, scaling factor = -3

00011.1102 = 21 + 20 + 2-1 + 2-2 = 3.75

Page 15: Brace yourselves, leap second is coming

Rounding errors

0.125 = 2-3

0.1 = 2-4 + 2-5 + 2-8 + 2-9 + 2-12 + 2-13 + ...

with 8 bit variable0.1 = 0.09375

with 24 bit variable0.1 = 0.09999990463256836

Page 16: Brace yourselves, leap second is coming

Floating Point

Option 3: Floating-point (IEEE 754)● (mantissa * 2^exponent)● eg. float- 1 sign, 23 mantissa, 8 exponent● PROs: wide range, fast w/ FPU● CONs: accuracyCaveats- NaNs, signed zero/infinity, denorm

rounding, ... IEEE Standards Association. "Standard for Floating-Point Arithmetic." IEEE 754-2008 (2008).

Goldberg, David. "What every computer scientist should know about floating-point arithmetic." ACM Computing Surveys (CSUR) 23.1 (1991): 5-48.

Page 17: Brace yourselves, leap second is coming

Picking the right tool

Rational Numbers - fractions.Fraction

Decimal Fixed-point - decimalBinary Fixed-point - spfpm module

Floating-point issues and limitation

Page 18: Brace yourselves, leap second is coming

Recall: Patriot system

1. Search

2. Validate

3. Track

Next position =Velocity (real) * Time (int -> real)

GAO/IMTEC-92-26 - Software Problem Led to System Failure at Dhahran, Saudi Arabia

Page 19: Brace yourselves, leap second is coming

Patriot system cont’d

● After 8 hours 0.0275 seconds error● After 100 hours 0.3433 seconds error

Scud velocity is ~ 1,676 meters per-second

687 meters error

Page 20: Brace yourselves, leap second is coming

Patriot system cont’d

On February 25, 1991, a Patriot missile defense system operating at Dhahran, Saudi Arabia, during Operation Desert Storm failed to track and intercept an incoming Scud.

This Scud subsequently hit an Army barracks, killing 28 Americans.

GAO/IMTEC-92-26 - Software Problem Led to System Failure at Dhahran, Saudi Arabia

Page 21: Brace yourselves, leap second is coming
Page 22: Brace yourselves, leap second is coming

Let’s talk about time

Page 23: Brace yourselves, leap second is coming

Time is simple, right?

1 Year = 365 days

Leap year?

Page 24: Brace yourselves, leap second is coming

Time is (not) simple

1 Year = 365 or 366 days1 Month = 28/29/30/31 days

Mostly true, except:in Britain 1752, September had 19 daysin Russia 1918, February had 15 daysin Greece 1923, February had 15 days

Page 25: Brace yourselves, leap second is coming

Time is (not) simple

1 Year = 365 or 366 days*1 Month = 28/29/30/31 days*1 Day = 24 hours

Don’t forget DST!it can also be 23/25or 23.5/24.5

Lord Howe Island, Australia

Page 26: Brace yourselves, leap second is coming

Time is (not) simple

1 Year = 365 or 366 days*1 Month = 28/29/30/31 days*1 Day = 24 hours**1 Minute = 60 Seconds

NO- Leap Second might cause a minute to have 61 seconds, up to twice a year...

June 30, 2015 at 23:59

Page 27: Brace yourselves, leap second is coming
Page 28: Brace yourselves, leap second is coming

What could possibly go wrong?

● 2005/8- most NTPs failed to get it right● 2012- Bugs in Linux

○ Reddit, LinkedIn, Yelp, Meetup, Foursquare○ 400 Qantas flights delayed○ Leaping seconds and looping servers○ Linux's leap-second deadlocks

● s += 3600○ "one hour from now" ?○ "same time, next hour" ?

● 1 second in Flight Control = 300 meter

Page 29: Brace yourselves, leap second is coming

Possible solutions

● Simulate leap second○ eg. using adjtimex(8)

● Slewing time (AWS, Google)○ ntptime -s 0 # reset kernel state○ ntpdate -B <some-ntp-server>

● Servers from the future (Google, Facebook)● Have good monitoring

○ NTP offset○ leap second status○ @statscraft

Page 31: Brace yourselves, leap second is coming

Let’s talk about planes

Page 32: Brace yourselves, leap second is coming

787 Boeing Dreamliner

● 30/4/2015- FAA requires operators to do electrical power deactivation at intervals which will not exceed 120 days

FAA- Airworthiness Directives; The Boeing Company Airplanes

Page 33: Brace yourselves, leap second is coming

But, why?

“Boeing ... identified during laboratory testing the software counter internal to the generator control units (GCUs) will overflow after 248 days of continuous power, ... resulting in a loss of all AC electrical power regardless of flight phase”

FAA- Airworthiness Directives; The Boeing Company Airplanes

Page 34: Brace yourselves, leap second is coming

Wait, 248 days?

● 231 / (100 * 60 * 60 * 24) = 248.551

● ie. 231 deciseonds are roughly 248 days

● signed integer?

Page 35: Brace yourselves, leap second is coming

Recall 2-complement

00000000 000000001 100000010 200000011 3…01111111 12710000000 -12810000001 -127…

Arithmetic operations are easy, but might overflow:

00000001 1+

01111111 127=

10000000 -128

Page 36: Brace yourselves, leap second is coming

F**k overflows, I’m using Python

>>> import sys

>>> print sys.maxint, type(sys.maxint)

9223372036854775807 <type 'int'>

>>> print sys.maxint + 1, type(sys.maxint + 1)

9223372036854775808 <type 'long'>

Page 37: Brace yourselves, leap second is coming

Arbitrary Precision

struct _longobject {PyObject_VAR_HEADdigit ob_digit[1];

};

● PROs: unlimited*● CONs: slow, harder to implement

○ think about multiplication● What about builtins? C modules?

○ eg. formatting, unicode, itertools, sqlite

PEP 237 -- Unifying Long Integers and IntegersPython 2.7.10 source code, “Include/longintrepr.h”

Page 38: Brace yourselves, leap second is coming

Know your language (ruby)

[1] pry(main)> a = 1337

=> 1337

[2] pry(main)> a.class

=> Fixnum

[3] pry(main)> a = 2**100

=> 1267650600228229401496703205376

[4] pry(main)> a.class

=> Bignum

Page 39: Brace yourselves, leap second is coming

Know your language (Scala)

scala> 2147483647 + 1

res1: Int = -2147483648

scala> Math.addExact(2147483647, 1)

java.lang.ArithmeticException: integer overflow

scala> Math.pow(2, 1024)

res4: Double = Infinity

Page 40: Brace yourselves, leap second is coming

Know your language (JavaScript)

> Number.MAX_VALUE

1.7976931348623157e+308

> Number.MAX_VALUE*2

Infinity

> Number.MAX_VALUE + 1 - Number.MAX_VALUE

0

> Number.MAX_VALUE - Number.MAX_VALUE + 1

1

Page 41: Brace yourselves, leap second is coming

From the news*

● January 2014:○ >67m players per month○ >27m players per day○ >7.5m concurrently during peak hours○ 946m dollar yearly revenue

● 12/6/2015- The EU West Spectator mode

crashed○ game counter just surpassed 2147483647 (2B)○ ie. 2**31 games...

League of Legends tops MMO revenue list, Hearthstone No. 10LEAGUE PLAYERS REACH NEW HEIGHTS IN 2014EUW spectator mode fell over recently – here’s why

Page 42: Brace yourselves, leap second is coming

Take home message

● MAC addresses aren’t always unique● Real numbers can be deadly● Time is not simple● Beware of the overflow

We all make assumptions, let’s make less