Frontend at Scale - The Tumblr Story
What is Tumblr?
→ Platform for you to express yourself
→ ~200 million blogs
→ 83+ billion posts
→ HQ in NYC
→ Founded in 2007
→ 100+ engineers
What is Tumblr?
→ Three ways to surface content:
→ The dashboard
→ Search
→ Blog network
(Example: http://16-bitch.tumblr.com/)
Who am I?
→ Chris Miller
→ Product Engineering Manager
→ Content Consumption (a.k.a., The Dashboard)
Our stack
→ Frontend
→ Backbone (+ lodash, underscore, etc.)
→ jQuery (+ some plugins)
→ SASS (+ Bourbon)
→ a bit of VelocityJS
→ Gulp for build
Our stack
→ Backend
→ PHP application layer
→ Some specialized services (Scala, C, etc.)
→ Data: MySQL, Redis, memcache, HDFS
How does it work?
→ Thousands of servers
→ Deploy dozens of times per day
→ Monitor and measure everything
→ Hadoop
→ OpenTSDB (backed by HBase)
Our process
→ Teams are small
→ Iterate quickly
→ Release early and often, usually to a percentage of users
→ Two code-review “OKs” required for every pull request
Feature Flagging
What is it?
→ Segment your users so that only some of them see a given feature
→ Control who sees what (and when)
Implementation
→ Server-side feature flagging
→ Client-side feature flagging
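One way the two halves could fit together, as a minimal sketch: the server evaluates the flags for the current user and embeds the result in the page, and the client only reads that snapshot. The function name createFeatureFlags, the flag names, and the snapshot shape are assumptions made for illustration; the talk does not show Tumblr's actual implementation.

```javascript
// Hypothetical sketch: the server decides which flags are on for this user
// and serializes them into the page; the client never decides on its own.
function createFeatureFlags(snapshot) {
  // Copy the snapshot so later changes to the original object cannot flip a flag.
  const flags = Object.assign({}, snapshot);
  return {
    // Anything other than an explicit `true` counts as "off",
    // so unknown or misspelled flags keep unfinished code hidden.
    isEnabled(name) {
      return flags[name] === true;
    },
  };
}

// Example snapshot with made-up flag names.
const flags = createFeatureFlags({ new_dashboard_header: true, beta_search: false });

if (flags.isEnabled("new_dashboard_header")) {
  // render the new code path
} else {
  // fall back to the production code path
}
```

Treating every unknown flag as off is a deliberate choice here: a typo in a flag name then hides a feature rather than exposing it.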
Usage
→ Provides
→ A/B testing
→ Run beta code alongside production code
→ Kill switch
A/B Testing
→ Injected recommendations
→ A/B(/*) testing of positioning
→ Which position is the best? Why?
A/B Test Results
[Chart: results for injected recommendations at Position 2 through Position 9]
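A position test like this needs each user to land in the same variant on every page load. The sketch below shows one common way to do that, by hashing the user ID into one of the positions 2 through 9. The hash choice (FNV-1a), the function names, and the experiment name are assumptions for illustration, not details from the talk.

```javascript
// FNV-1a over the string's UTF-16 code units; any stable hash would do.
function hashString(str) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash;
}

// The variants under test: the slot where the recommendation is injected.
const POSITIONS = [2, 3, 4, 5, 6, 7, 8, 9];

function assignPosition(userId, experimentName) {
  // Salting with the experiment name keeps separate experiments uncorrelated.
  const bucket = hashString(experimentName + ":" + userId) % POSITIONS.length;
  return POSITIONS[bucket];
}

function injectRecommendation(posts, recommendation, position) {
  // Position is 1-based; clamp so a short feed still gets the item at the end.
  const index = Math.min(position - 1, posts.length);
  return posts.slice(0, index).concat([recommendation], posts.slice(index));
}

// Usage: the same user always gets the same slot for this experiment.
const position = assignPosition("user-42", "recommendation-position");
const feed = injectRecommendation(["p1", "p2", "p3"], "rec", position);
```

Because the assignment is a pure function of the user ID and experiment name, no per-user state has to be stored to keep the experience consistent.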
Ramping & Kill Switch
→ Ramping new features
→ Deploy to only “admin” (staff)
→ …then 1% of users… then 5%… 10%… 25%…
→ Kill switch
→ Completely turn off a feature that’s breaking the site… poof
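The ramp and the kill switch can be expressed as a single check. This is a minimal sketch assuming each user already carries a stable bucket number from 0 to 99 (for example, derived from a hash of the user ID) and that a rule looks like { killed, percent }; both shapes are hypothetical.

```javascript
// Hypothetical rule shape: { killed: boolean, percent: number from 0 to 100 }.
// Hypothetical user shape: { isAdmin: boolean, bucket: integer from 0 to 99 }.
function isFeatureOn(rule, user) {
  // Kill switch wins over everything, including staff access.
  if (!rule || rule.killed) return false;
  // Staff ("admin") see the feature first, even at 0%.
  if (user.isAdmin) return true;
  // Buckets below the threshold are in; raising percent only ever adds users.
  return user.bucket < rule.percent;
}

// Ramp: 0 (staff only) -> 1 -> 5 -> 10 -> 25 ...
const rule = { killed: false, percent: 5 };
isFeatureOn(rule, { isAdmin: false, bucket: 3 });  // in the 5% group
isFeatureOn(rule, { isAdmin: false, bucket: 40 }); // not yet
```

Comparing a fixed bucket against a growing threshold means a user who got the feature at 1% keeps it at 5%, 10%, and 25%, which avoids features flickering on and off during a ramp.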
Use Carefully
→ Feature flagging certain functionality can give a mixed experience
→ Can cause user confusion:
→ “Why does my mom see this and I don’t?” — Confused teenager
→ Easy to build complex dependencies — don’t
Error Logging
Launching Features
→ New features usually have bugs
→ (Well, not my code)
→ (just kidding)
→ Server-side errors, easy to find
→ Client-side errors, also easy to find…
→ …on my browser
→ Client-side errors, not easy to find on your browser
→ …until recently
Capture Errors
→ We built: exceptions.js
→ Really, it’s just: window.onerror
→ Build dependency-free
→ Build to be defensive
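A rough idea of what a dependency-free, defensive window.onerror logger can look like is sketched below. This is not the contents of exceptions.js, which the talk does not show; the function name installErrorLogger, the payload fields, the per-page cap, and the "/log/js-error" endpoint are all assumptions.

```javascript
// Sketch only: installs an onerror handler on a window-like object and hands
// each error to `send`. No libraries are used, and the handler never throws.
function installErrorLogger(target, send) {
  const previous = target.onerror;
  const MAX_PER_PAGE = 10; // cap so an error in a loop cannot flood the logger
  let sent = 0;

  target.onerror = function (message, source, lineno, colno, error) {
    try {
      if (sent < MAX_PER_PAGE) {
        sent += 1;
        send({
          message: String(message),
          source: String(source || ""),
          line: lineno || 0,
          column: colno || 0,
          stack: error && error.stack ? String(error.stack) : "",
        });
      }
    } catch (ignored) {
      // Defensive: a failure while logging must not raise a second error.
    }
    // Chain to any handler that was installed before this one.
    if (typeof previous === "function") {
      return previous.apply(this, arguments);
    }
    return false; // false lets the browser's default error reporting continue
  };
}

// Browser wiring; skipped when there is no window (for example, in Node).
if (typeof window !== "undefined") {
  installErrorLogger(window, function (payload) {
    // An image beacon needs no libraries. The URL is a hypothetical endpoint.
    new Image().src = "/log/js-error?d=" + encodeURIComponent(JSON.stringify(payload));
  });
}
```

Passing the target and the send function in as arguments keeps the handler testable outside a browser and makes the transport easy to swap.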
→ Where you send the logs doesn’t matter; it’s how you use them
→ We log errors to Scribe…
→ …throw them into Hadoop
→ …and count frequency with OpenTSDB
Error Data
→ With Hive, we can query Hadoop
→ With this, I can see we log around 1.4 million errors per day
→ With OpenTSDB we can plot the frequency of logs
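To make "frequency" concrete, here is a small sketch that rolls raw error events up into per-minute counts. The output objects use the metric, timestamp, value, tags shape that OpenTSDB accepts for data points; the metric name, the "feature" tag, and the input event shape are assumptions, and this is an illustration rather than a description of Tumblr's pipeline.

```javascript
// Hypothetical input: [{ timestampMs: number, feature: string }, ...].
// Output: one data point per (minute, feature) pair.
function countErrorsPerMinute(events, metric) {
  const byKey = new Map();
  for (const event of events) {
    // Round down to the start of the minute, expressed in Unix seconds.
    const timestamp = Math.floor(event.timestampMs / 60000) * 60;
    const key = timestamp + "|" + event.feature;
    let point = byKey.get(key);
    if (!point) {
      point = { metric: metric, timestamp: timestamp, value: 0, tags: { feature: event.feature } };
      byKey.set(key, point);
    }
    point.value += 1;
  }
  // A Map iterates in insertion order, so points come out in first-seen order.
  return Array.from(byKey.values());
}

// Usage with a made-up metric name.
const points = countErrorsPerMinute(
  [
    { timestampMs: 1000, feature: "dashboard" },
    { timestampMs: 59999, feature: "dashboard" },
    { timestampMs: 60000, feature: "dashboard" },
  ],
  "js.errors.count"
);
```

Tagging each point with the feature that raised the error is what would let a graph show a newly ramped feature's error rate separately from the baseline.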
We Love Graphs
→ We make pretty graphs with OpenTSDB, and we graph everything
Getting it Right
→ Sometimes we find errors before our users do.
→ Sometimes.
→ And it makes us feel good.
→ So we dance.
Thank You
Email - [email protected]
me - ee99ee.com