NATS and Netlify: Building out a data plane for a globally distributed system
@ry_boflavin @netlify
About Me: Ryan Neal
- Head of Infrastructure at Netlify
  - Simultaneously fixing and breaking everything
- Senior Dev at Yelp
  - Internal tools and metrics team
  - Used to about 400k metrics/sec
  - 12-18k pageviews/sec
- FDE at Palantir
  - Developed counter-terrorism software
  - 4 billion records / day
A developer’s toolkit for deploying git-backed,
browser-driven sites to an intelligent CDN
- Global CDN
- CI cluster
- Redundant DNS
- Prerender cluster
- Mongo cluster
- Rails cluster
- 4 cloud providers
- 14 PoPs
[Diagram: API cluster, Global CDN, Pre-Render cluster, CI cluster]
Distributed systems are cool
[Diagram: multiple buildbot, API, and CDN nodes, plus DB land]
The problem
[Diagram, built up over several slides. Components added one at a time: your request, CDN Node, db, proxy, origin, api. The last slide shows only your request, CDN Node, and db, with X marks.]
Unity
- Cohesive view of system
- Traceability between services
- Build for now, not later
The Naive Solution
[Diagram: two random services, their logs, a Papertrail daemon, and papertrail]
Immediate Problem
- Make the logs searchable
- Easy to add more logs
Long Term Vision
- A generic system to let services push data out
- An easy way to access that data for new and fun uses
Tool Requirements
- Easy installation
- Good scaling factors
- Secure
Spec before building
And so the story begins...
RabbitMQ
- Existing infrastructure
- Didn’t need enterprise messaging features
- Data was only metrics, telemetry and logs
Kafka
- Didn’t want to run ZooKeeper
- Didn’t need rewind or buffering
Creating the Data plane
[Diagram, built up over several slides. Components shown: random services, logs, nats, streamer, elastinats (one instance, later four), es, and two taps.]
Elastinats lessons
- Don’t block the consumer

// Variant 1: hand each message to its own goroutine, so the handler returns immediately.
func(m *nats.Msg) {
	stats.IncrementMessagesConsumed()
	go func() {
		payload := message.NewPayload(string(m.Data), m.Subject)
		// maybe it is json!
		_ = json.Unmarshal(m.Data, payload)
		c <- payload
	}()
}

// Variant 2: parse and send inline; the send on c blocks the handler whenever c is full.
func(m *nats.Msg) {
	stats.IncrementMessagesConsumed()
	payload := message.NewPayload(string(m.Data), m.Subject)
	// maybe it is json!
	_ = json.Unmarshal(m.Data, payload)
	c <- payload
}
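One further way to keep a handler from blocking, not taken from the slides, is to hand messages to a bounded queue and count whatever does not fit. The sketch below uses only the standard library; `Handoff`, `Offer`, and `Dropped` are hypothetical names chosen for illustration, and dropping on overflow is an assumed policy that only suits data that can tolerate loss.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Handoff is a hypothetical bounded queue between a message handler and
// slower downstream workers.
type Handoff struct {
	dropped int64 // kept first so atomic access is 64-bit aligned on 32-bit platforms
	queue   chan []byte
}

// NewHandoff creates a queue that holds at most capacity messages.
func NewHandoff(capacity int) *Handoff {
	return &Handoff{queue: make(chan []byte, capacity)}
}

// Offer enqueues data without ever blocking the caller. It returns false and
// counts a drop when the queue is full.
func (h *Handoff) Offer(data []byte) bool {
	select {
	case h.queue <- data:
		return true
	default:
		atomic.AddInt64(&h.dropped, 1)
		return false
	}
}

// Dropped reports how many messages were discarded so far.
func (h *Handoff) Dropped() int64 {
	return atomic.LoadInt64(&h.dropped)
}

func main() {
	h := NewHandoff(2)
	for i := 0; i < 5; i++ {
		h.Offer([]byte(fmt.Sprintf("msg %d", i)))
	}
	// prints: queued: 2 dropped: 3
	fmt.Println("queued:", len(h.queue), "dropped:", h.Dropped())
}
```

Exporting the drop counter as a metric makes the loss visible instead of silent.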
- Use ES’s Bulk API
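As a rough illustration of the Bulk API point, the sketch below builds the newline-delimited JSON body that Elasticsearch's `_bulk` endpoint expects: one action line followed by one document line per item, each terminated by a newline. `Payload`, `BuildBulkBody`, and the `logs` index name are hypothetical and are not taken from elastinats; older Elasticsearch versions also expected a `_type` field in the action line, which is omitted here.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// Payload is a hypothetical stand-in for one log message.
type Payload struct {
	Subject string `json:"subject"`
	Msg     string `json:"msg"`
}

// bulkAction is the metadata line that precedes each document in a bulk body.
type bulkAction struct {
	Index struct {
		Index string `json:"_index"`
	} `json:"index"`
}

// BuildBulkBody encodes a batch as alternating action and document lines.
func BuildBulkBody(index string, batch []Payload) ([]byte, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf) // Encode appends "\n" after every value

	var action bulkAction
	action.Index.Index = index

	for _, p := range batch {
		if err := enc.Encode(action); err != nil {
			return nil, err
		}
		if err := enc.Encode(p); err != nil {
			return nil, err
		}
	}
	return buf.Bytes(), nil
}

func main() {
	body, err := BuildBulkBody("logs", []Payload{
		{Subject: "logs.api", Msg: "started"},
		{Subject: "logs.cdn", Msg: "cache miss"},
	})
	if err != nil {
		panic(err)
	}
	// The result would be sent as the body of a single POST to /_bulk.
	fmt.Print(string(body))
}
```

Sending one request per batch rather than one per message is the point of the lesson; the batch size is a tuning choice.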
- Add error reporting
// Called by the client for asynchronous errors.
handle := func(nc *nats.Conn, sub *nats.Subscription, err error) {
	log.Warn(err)
}

nc, err := nats.Connect(serverString, nats.Secure(tlsConfig), nats.ErrorHandler(handle))
if err != nil {
	panic(err)
}
- Use buffering
// Option 1: deliver messages into a large buffered channel.
ch := make(chan *nats.Msg, 100000)
sub, err := nc.ChanSubscribe(subject, ch)
if err != nil {
	log.Fatal("Failed to subscribe")
}
defer sub.Unsubscribe()

// Option 2: subscribe synchronously and raise the subscription's pending limits.
sub, err := nc.SubscribeSync(subject)
if err != nil {
	log.Fatal("Failed to subscribe")
}
defer sub.Unsubscribe()
err = sub.SetPendingLimits(numMsgs, numBytes)
Future Work
- Use a nats_metrics library to measure and push to nats
- Add more taps for log analysis
- Migrate legacy services to push based metrics and logs
Questions?
Links
https://github.com/netlify/elastinats
https://github.com/netlify/streamer
https://github.com/rybit/nats_metrics
https://github.com/rybit
Check out the slides on slideshare!!