Michael Laing, Architect, New York Times, michael.laing@nytimes.com

Page 1: Michael Laing Architect New York Times michael.laing@nytimes.com Michael Laing Architect New York Times michael.laing@nytimes.com.

A Global Mesh with a Memory

Message-based: WebSocket, AMQP, SockJS

If in doubt:
• Resend
• Reconnect
• Reread

Idempotent:
• Replicating
• Racy
• Resolving

Classes of service:
• Gold: replicate/race/resolve
• Silver: prioritize
• Bronze: queueable

Millions of users

Event-driven: async using libev
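
The "resend / replicate / race / resolve" bullets above only work if receiving the same message twice is harmless. A minimal sketch of one way to get that property, assuming each message carries a unique ID (the class and method names here are hypothetical, not taken from the slides):

```python
import uuid


class IdempotentReceiver:
    """Delivers each message at most once, keyed by its unique ID."""

    def __init__(self):
        self._seen = set()      # IDs that have already been delivered
        self.delivered = []     # bodies handed to the application, in order

    def receive(self, message_id, body):
        """Return True if the message was delivered, False if it was a duplicate."""
        if message_id in self._seen:
            return False        # a resent or replicated copy that lost the race
        self._seen.add(message_id)
        self.delivered.append(body)
        return True


receiver = IdempotentReceiver()
msg_id = uuid.uuid1()
first = receiver.receive(msg_id, "breaking news")
second = receiver.receive(msg_id, "breaking news")  # replica or resend
```

The first copy to arrive wins and later copies are dropped, so senders can resend freely. A real receiver would also need to bound the set of remembered IDs.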

Message: an event with data

Envelope: Routing while in motion & Locating when at rest

[Diagram: a Message in its Envelope, with Metadata and a Body; the Body is opaque to us and may be absent]

Message: an event with data

            RabbitMQ               WebSocket            S3 / CloudFront   Cassandra
Envelope    Routing Key            Gateway Connection   UUID              “Path” & UUID
Metadata    Headers: Map / Array   JSON                 HTTP Headers      JSON
Body        Blob                   Blob                 Blob              Blob
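
The three parts of a message can be sketched as plain data types. This is only an illustration under assumptions: the class and field names are hypothetical, and the envelope here carries just the “Path” and UUID that the slides mention.

```python
from dataclasses import dataclass, field
from typing import Optional
import uuid


@dataclass
class Envelope:
    path: str                                   # e.g. "feeds.breaking-news.56"
    message_id: uuid.UUID = field(default_factory=uuid.uuid1)


@dataclass
class Message:
    envelope: Envelope
    metadata: dict = field(default_factory=dict)
    body: Optional[bytes] = None                # opaque blob; may be absent


msg = Message(Envelope("feeds.breaking-news.56"), {"content-type": "text/plain"}, b"hello")
event_only = Message(Envelope("feeds.breaking-news.56"))  # an event with no data
```

Keeping the body as uninterpreted bytes mirrors the "Blob" row: only the envelope and metadata need a transport-specific encoding.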

Publish

[Diagram: publish flow among Message, Core, Cassandra, S3 / CloudFront, Gateway, and Device; labels: Init, AMQP, CQL, WebSocket, HTTP, sync]

Subscribe

[Diagram: subscribe flow among Message, Core, Cassandra, S3 / CloudFront, Gateway, and Device; labels: Init, AMQP, CQL, WebSocket, HTTP]

Dismiss

[Diagram: dismiss flow among Message, Core, Cassandra, Gateway, and Device; labels: Init, AMQP, CQL, WebSocket]

Connect

[Diagram: many Core, Gateway, Device, Message, Cassandra, and S3 / CloudFront instances connected together; quantity labels: several, dozens, millions]

Envelope – 2 forms of addressing

“Path”:
1) Routing a message to a user
2) Finding a message for a user

[Diagram: Message entering the nyt⨍aбrik]

“Postoffice”: Routing a message internally in the nyt⨍aбrik

[Diagram: Message entering the nyt⨍aбrik, which contains Core and Gateway instances]

The Path hierarchy

Path elements are text (utf-8 but “.” is reserved) – the 1st element is the “category”

“category”: “feeds”,
“2nd element”: “breaking-news”,
“3rd element”: “0012345”

The elements are joined by “.” for routing

“path”: “feeds.breaking-news.00123456”
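
Joining the elements is simple, but the reserved “.” has to be rejected first. A minimal sketch, assuming empty elements are also invalid (that rule and the function names are assumptions, not from the slides):

```python
def join_path(elements):
    """Join path elements with "." after checking none is empty or contains "."."""
    for element in elements:
        if not element or "." in element:
            raise ValueError(f"invalid path element: {element!r}")
    return ".".join(elements)


def category(path):
    """Return the first element of a path, which the slides call the category."""
    return path.split(".", 1)[0]


path = join_path(["feeds", "breaking-news", "0012345"])
```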

Deeper into the Path hierarchy

For persistence, the path denotes a sorted “folder” containing messages in reverse datetime order (using the timestamp from the version 1 UUID that uniquely identifies each message)

“feeds.breaking-news.56”/bd1961f5-1062-11e4-a630-406c8f1838fa
“feeds.breaking-news.56”/b94e8b45-1062-11e4-900d-406c8f1838fa
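
The ordering can be checked with Python's standard `uuid` module, which exposes the 60-bit timestamp of a version 1 UUID as `UUID.time`. A small sketch using the two UUIDs above (the helper names are hypothetical):

```python
import uuid
from datetime import datetime, timedelta, timezone

# Version 1 UUID timestamps count 100 ns intervals since 1582-10-15 (RFC 4122).
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def uuid1_datetime(u):
    """Convert the timestamp embedded in a version 1 UUID to a UTC datetime."""
    return GREGORIAN_EPOCH + timedelta(microseconds=u.time // 10)


def newest_first(ids):
    """Sort version 1 UUIDs in reverse datetime order."""
    return sorted(ids, key=lambda u: u.time, reverse=True)


folder = [
    uuid.UUID("b94e8b45-1062-11e4-900d-406c8f1838fa"),
    uuid.UUID("bd1961f5-1062-11e4-a630-406c8f1838fa"),
]
ordered = newest_first(folder)
```

Sorting on `u.time` rather than on the UUID string matters, because the string begins with the low bits of the timestamp.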

Subscribing to a path is done by “binding”, typically with wildcards: “*” matches any one element, “#” matches any sequence of elements

All breaking-news messages: “feeds.breaking-news.#”
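
The wildcard rules can be sketched as a small matcher. This is an illustrative implementation, not the one used by RabbitMQ or the gateway; it assumes the AMQP topic convention that “#” may also match zero elements.

```python
def matches(binding, path):
    """Return True if a dot-separated path matches a binding pattern.

    "*" matches exactly one element; "#" matches zero or more elements.
    """
    def match(pattern, elements):
        if not pattern:
            return not elements
        head, rest = pattern[0], pattern[1:]
        if head == "#":
            # Let "#" absorb 0, 1, 2, ... leading elements and try the rest.
            return any(match(rest, elements[i:]) for i in range(len(elements) + 1))
        if not elements:
            return False
        return (head == "*" or head == elements[0]) and match(rest, elements[1:])

    return match(binding.split("."), path.split("."))
```

For example, `matches("feeds.breaking-news.#", "feeds.breaking-news.56")` is true, while `matches("feeds.*", "feeds.breaking-news.56")` is false because “*” covers only one element.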

More on subscribing & retrieving

Retrieving from persistent storage can be done by path, e.g. the “latest” breaking-news messages for item 56:

“feeds.breaking-news.56”

But retrieval can also be done using trailing wildcards:

“feeds.breaking-news.#” will return the “latest” breaking-news messages for all “current” items

The Cassandra data store is designed to return hierarchical queries with a single request and in the desired order

A notable simplification:

Paths for subscribing to messages and paths for retrieving persisted messages, including the use of wildcards, are the same, e.g.:

When a user logs in, she is “subscribed” using her ID; messages “published” to her are received while her “persisted” messages and subscription preferences are being retrieved (a few tens of milliseconds)

Once subscription preferences arrive, she is “subscribed” to them and any corresponding “persisted” messages are retrieved

The same paths are used for subscription and retrieval

Special Paths for individual routing

Our subscribers (millions of them) have numeric IDs – using those IDs directly for routing, specifically for the “binding” function, would be inefficient

“id.prefs.09067832” (namespace of 3rd element is too large)

Instead, we convert the ID to base62 elements and take advantage of the Patricia trie search structures built into RabbitMQ and our gateway

“id.prefs.c.2.x.M” (equivalent to the above, used for routing)
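
The conversion can be reproduced with a few lines. This sketch assumes the digit order 0-9, A-Z, a-z, which is the order that turns 09067832 into c.2.x.M; the function names are hypothetical.

```python
import string

# Assumed digit order: 0-9, A-Z, a-z. It reproduces the example on the slide.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def to_base62(number):
    """Return the base62 digits of a non-negative integer, most significant first."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return [ALPHABET[0]]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return digits[::-1]


def id_routing_path(prefix, subscriber_id):
    """Build a routing path with one element per base62 digit of the ID."""
    return ".".join([prefix, *to_base62(int(subscriber_id))])


routing_path = id_routing_path("id.prefs", "09067832")
print(routing_path)  # id.prefs.c.2.x.M
```

Each path element now has at most 62 possible values, so a trie keyed on elements branches at most 62 ways per level instead of millions of ways at a single level.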

Postoffice addressing

The “postoffice” is a logical “bus” that connects all the services in all the nyt⨍aбrik instances globally

[Diagram: logical view of the postoffice connecting Core and Gateway instances]

It is physically segmented and the segments are connected using RabbitMQ “federation”

Postoffice address elements

Each nyt⨍aбrik service has 3 basic uniquifying elements:

“region”: “us-west-2”,
“instance”: “i-123”,
“pid”: “12”

And some additional qualifiers:

“product”: “search”,
“service”: “route”

Postoffice routing key

Each routing key has a “from” address embedded in it:

“region”: “us-west-2”,
“instance”: “i-123”,
“pid”: “12”,
“product”: “search”,
“service”: “resolve”

And a “to” address:

“region”: “us-west-2”,
“instance”: “-”,
“pid”: “-”,
“product”: “search”,
“service”: “route”

(the “-” means “any”)

And an “action”:

“action”: “route”

Postoffice routing key detail

And they are put together as an ordered sequence like this:

<action>.<from address>.<to address>

“route.\us-west-2.search.resolve.i-123.12.\us-west-2.search.route.-.-”

Meaning: This is a request for a “route” action from a specific invocation of the “search” product “resolve” service addressed to any “search” product “route” service in region “us-west-2”
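
Assembling such a key from its parts might look like the sketch below. The element order within an address (region, product, service, instance, pid) is read off the example key; the `Address` class and `routing_key` function are hypothetical names. The backslashes in the example key are assumed to be line-continuation marks from the slide, so they are left out here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    region: str
    product: str
    service: str
    instance: str = "-"   # "-" means "any"
    pid: str = "-"

    def elements(self):
        # Element order taken from the example key on the slide.
        return [self.region, self.product, self.service, self.instance, self.pid]


def routing_key(action, sender, recipient):
    """Build <action>.<from address>.<to address> as one dot-separated key."""
    return ".".join([action, *sender.elements(), *recipient.elements()])


sender = Address("us-west-2", "search", "resolve", "i-123", "12")
recipient = Address("us-west-2", "search", "route")
key = routing_key("route", sender, recipient)
print(key)  # route.us-west-2.search.resolve.i-123.12.us-west-2.search.route.-.-
```

Because every key has the same 11 elements in the same positions, bindings can select on any position with a fixed-length pattern.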

Postoffice binding

Each service invocation “binds” (subscribes) to the postoffice using its unique address to get messages specifically directed to it, e.g. asynchronous RPC responses

<any action>.<any address>.<my address>

“*.\*.*.*.*.*.\us-west-2.search.route.i-123.12”

Postoffice binding for services

Each service invocation also “binds” to the postoffice using addresses that will select messages appropriate for its service

<my action>.<my domain>.<my service>

“route.\us-west-2.*.*.*.*.\*.*.route.*.*”

All this address manipulation is handled by common methods in the nyt⨍aбrik
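
What such common methods could look like is sketched below. The helpers are hypothetical and only rebuild the two binding patterns shown on the slides, again treating the backslashes as line-continuation marks. A simple element-wise matcher is enough here because these bindings use only “*”.

```python
def direct_binding(my_address):
    """<any action>.<any address>.<my address>: messages sent to this invocation."""
    return ".".join(["*"] * 6 + list(my_address))


def service_binding(action, region, service):
    """Mirror the slide's service example: action, sender region, recipient service."""
    sender = [region, "*", "*", "*", "*"]
    recipient = ["*", "*", service, "*", "*"]
    return ".".join([action, *sender, *recipient])


def binding_matches(binding, key):
    """Element-wise match where "*" in the binding accepts any single element."""
    pattern, elements = binding.split("."), key.split(".")
    return len(pattern) == len(elements) and all(
        p == "*" or p == e for p, e in zip(pattern, elements)
    )


me = ("us-west-2", "search", "route", "i-123", "12")
key = "route.us-west-2.search.resolve.i-123.12.us-west-2.search.route.-.-"
to_service = binding_matches(service_binding("route", "us-west-2", "route"), key)
to_me_only = binding_matches(direct_binding(me), key)
```

The example key is addressed to any “route” service (“-.-”), so it matches the service binding but not the direct binding of one particular invocation.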

Routing in the Core

For load balancing on entry to the nyt⨍aбrik Core

[Diagram: Message → Core or Core]

For replication of important (gold service) messages

[Diagram: Message → Core and Core]

For distribution to all consumers

[Diagram: Cores distributing to Gateways, which deliver to Devices]

Problems with Core instances

Complex connectivity: N(N-1) federation + clustering + …

Many services: input, process, resolve, reject, cache_push, …

Hence, problematic to manage

And difficult to autoscale

Possible solution: refactor and simplify

A new component, the Rabbit Router, to focus on connectivity and routing

A New Core, with a focus on services

Everything connected to a Rabbit Router

The “bus” becomes a “star”
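
A rough count shows why the change helps. The N(N-1) figure is from the earlier slide; the star count below assumes a single Rabbit Router with one connection per component, which is a simplification not stated in the slides.

```python
def mesh_links(n):
    """Directed federation links when each of n Core instances links to every other."""
    return n * (n - 1)


def star_links(n):
    """Connections when each of n components connects only to one central router."""
    return n


for n in (3, 10, 30):
    print(n, mesh_links(n), star_links(n))
```

The mesh grows quadratically with the number of instances while the star grows linearly, which is what makes the star easier to manage and to autoscale.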
