Unique ID generation in distributed systems
-
Upload
dave-gardner -
Category
Technology
-
view
22.864 -
download
4
description
Transcript of Unique ID generation in distributed systems
ID generation
PHP London 2012-08-02@davegardnerisme
@davegardnerisme
hailoapp.com/dave(for a £5 discount)
Web AppMySQL
DC 1
MySQL auto increment
1,2,3,4…
MySQL auto increment
• Numeric IDs
• Go up with time
• Not resilient
Web AppMySQL
DC 1
MySQL multi-master replication
MySQL
1,3,5,7…
2,4,6,8…
MySQL multi-master replication
• Numeric IDs
• Do not go up with time
• Some resilience
Going global…
DC 1
DC 2
DC 3
DC 4
DC 5
DC 6
Web App
DC 1
MySQL in multi DC setup
MySQL
Web App
DC 2
?
1,2,3…
WAN LINK
Web App
DC 1
Flickr MySQL ticket server
Ticket Server
Web App
DC 2
1,3,5…
WAN LINK
Ticket Server
4,6,8…
WAN link not required to generate an ID
Flickr MySQL ticket server
• Numeric IDs
• Do not go up with time
• Resilient and distributed
• ID generation separated from data store
DC
The anatomy of a ticket server
Web App
Web App
Web App
Web App
Ticket Server
DC
Making things simpler
ID gen
Web App
ID gen
Web App
ID gen
Web App
ID gen
Web App
UUIDs
• 128 bits
• Could use type 4 (Random) or type 1 (MAC address with time component)
• Can generate on each machine with no co-ordination
Type 4 – random
xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx
f47ac10b-58cc-4372-a567-0e02b2c3d479
version
variant (8, 9, A or B)
5.3 x 1036
possible values for a type 4 UUID
1.1 x 1019
UUIDs we could generate per second since the Universe began
2.1 x 1027
Olympic swimming pools filled if each possible value contributed a millilitre
Type 1 – MAC address
51063800-dc76-11e1-9fae-001c42000009
• Time component is based on 100 nanosecond intervals since October 15, 1582
• Most significant bits of timestamp shifted to least significant bits of UUID
Type 1 – MAC address
• The address (MAC) of the computer that generated the ID is encoded into it
• Lexical ordering essentially meaningless
• Deterministically unique
There are some other options…
No co-ordination needed
Deterministically unique
K-ordered (time-ordered lexically)
Twitter Snowflake
• Under 64 bits
• No co-ordination (after startup)
• K-ordered
• Scala service, Thrift interface, uses Zookeeper for configuration
Twitter Snowflake
41 bits Timestampmillisecond precision,
bespoke epoch
10 bits Configured machine ID
12 bits Sequence number
Twitter Snowflake
77669839702851584
= (timestamp << 22) | (machine << 12) | sequence
Boundary Flake
• 128 bits
• No co-ordination at all
• K-ordered
• Erlang service
Boundary Flake
64 bits Timestampmillisecond precision,
1970 epoch
48 bits MAC address
16 bits Sequence number
PHP Cruftflake
• Based on Twitter Snowflake
• No co-ordination (after startup)
• K-ordered
• PHP, ZeroMQ interface, uses Zookeeper for configuration
Questions?
References
Flickr distributed ticket serverhttp://code.flickr.com/blog/2010/02/08/ticket-servers-distributed-unique-primary-keys-on-the-cheap/
UUIDshttp://tools.ietf.org/html/rfc4122
How random are random UUIDs?http://stackoverflow.com/a/2514722/15318
Twitter Snowflakehttps://github.com/twitter/snowflake
Boundary Flakehttps://github.com/boundary/flake
PHP Cruftflakehttps://github.com/davegardnerisme/cruftflake
private function mintId64($timestamp, $machine, $sequence){ $timestamp = (int)$timestamp; $value = ($timestamp << 22) | ($machine << 12) | $sequence; return (string)$value;}
private function mintId32($timestamp, $machine, $sequence){ $hi = (int)($timestamp / pow(2,10)); $lo = (int)($timestamp * pow(2, 22)); // stick in the machine + sequence to the low bit $lo = $lo | ($machine << 12) | $sequence;
// reconstruct into a string of numbers $hex = pack('N2', $hi, $lo); $unpacked = unpack('H*', $hex); $value = $this->hexdec($unpacked[1]); return (string)$value;}
public function generate(){ $t = floor($this->timer->getUnixTimestamp() - $this->epoch); if ($t !== $this->lastTime) { $this->sequence = 0; $this->lastTime = $t; } else { $this->sequence++; if ($this->sequence > 4095) { throw new \OverflowException('Sequence overflow'); } } if (PHP_INT_SIZE === 4) { return $this->mintId32($t, $this->machine, $this->sequence); } else { return $this->mintId64($t, $this->machine, $this->sequence); }}