Logical sharding at Etsy
![Page 1: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/1.jpg)
“Just” shard it: Logical sharding at Etsy
Maggie Zhou @zmagg
ScaleConf 2015
![Page 2: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/2.jpg)
![Page 3: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/3.jpg)
![Page 4: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/4.jpg)
What’s the infrastructure?
![Page 5: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/5.jpg)
L A M P
![Page 6: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/6.jpg)
L A M P
![Page 7: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/7.jpg)
![Page 8: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/8.jpg)
2019
http://surge.omniti.com/2011/speakers/ross-snyder
![Page 9: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/9.jpg)
L A M P
yay, databases!
![Page 10: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/10.jpg)
tickets index
shard 1 shard 2 shard 3 shard n
![Page 11: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/11.jpg)
??
![Page 12: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/12.jpg)
master-master replication
![Page 13: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/13.jpg)
tickets index
shard 1 shard 2 shard 3 shard n
ORM
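A minimal sketch of how the pieces on this slide could fit together, assuming the tickets database hands out unique IDs, the index maps each ID to a shard, and the ORM hides both lookups from application code. The class and method names (`TicketServer`, `ShardedStore`, `insert`, `find`) are hypothetical and not taken from the talk; in-memory dicts stand in for the real databases.

```python
import itertools


class TicketServer:
    """Hands out unique IDs; stands in for the 'tickets' database."""

    def __init__(self, start=1, step=1):
        self._counter = itertools.count(start, step)

    def next_id(self):
        return next(self._counter)


class ShardedStore:
    """Toy ORM-style layer: callers never pick a shard when reading."""

    def __init__(self, shard_count):
        self.tickets = TicketServer()
        self.index = {}                                  # object id -> shard number
        self.shards = [{} for _ in range(shard_count)]   # one dict per shard

    def insert(self, row, shard):
        obj_id = self.tickets.next_id()      # globally unique ID from the ticket server
        self.index[obj_id] = shard           # record where the row lives
        self.shards[shard][obj_id] = row
        return obj_id

    def find(self, obj_id):
        shard = self.index[obj_id]           # one index lookup, then one shard query
        return self.shards[shard][obj_id]


store = ShardedStore(shard_count=3)
shop_id = store.insert({"name": "example shop"}, shard=2)
print(store.find(shop_id))
```

Because every read goes through the index, moving a row only requires changing its index entry, which is what the row-level migrations described later rely on.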
![Page 14: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/14.jpg)
tickets index
shard 1 shard 2 shard 3 shard n
shard n+1
![Page 15: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/15.jpg)
Capacity planning included setting aside 2 months for load balancing
![Page 16: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/16.jpg)
2 months??
![Page 17: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/17.jpg)
2010’s solution is Not Scaling
![Page 18: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/18.jpg)
[Diagram: row shop_2 (one of shop_1, shop_2, shop_3, shop_4, …) moves from shard N to shard N+1; the index holds an entry for shop_2; writes are locked during the move]
1)
2)
3) update index, remove lock
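The transcript preserves only the last step of this slide (update index, remove lock) and the note that writes are locked. The sketch below is one plausible shape for a row-at-a-time migration under those constraints. The lock and copy steps are assumptions, and `migrate_shop`, `index`, `shards`, and `locks` are hypothetical names, not Etsy's tooling.

```python
def migrate_shop(shop_id, index, shards, locks, dest):
    """Move a single shop's row to shard `dest` (row-level migration)."""
    src = index[shop_id]
    locks.add(shop_id)                        # assumption: block writes to this shop
    try:
        # assumption: copy the row to the destination shard
        shards[dest][shop_id] = dict(shards[src][shop_id])
        index[shop_id] = dest                 # from the slide: update index ...
    finally:
        locks.discard(shop_id)                # ... remove lock
    # The source copy is deliberately left in place: the talk says deletes
    # were expensive and skipped, which is how orphaned rows accumulate.


shards = {1: {"shop_2": {"name": "shop_2"}}, 2: {}}
index = {"shop_2": 1}
locks = set()

migrate_shop("shop_2", index, shards, locks, dest=2)
print(index["shop_2"], "shop_2" in shards[1])
```

After the call the index points at the new shard, yet the old shard still holds a copy of the row. Any full table scan of the old shard will pick that copy up.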
![Page 19: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/19.jpg)
Migrations were
• error prone
• arbitrary
• developers had to be aware
• created orphaned data
• locked shops & users out of changes for up to hours
• slow
![Page 20: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/20.jpg)
Migrations were
Error prone? We can fix the errors! We can make the script more robust!
Arbitrary? We can write tooling that figures out which rows are right to migrate off for optimal balance!
Developers had to be aware? We can write better interfaces!
![Page 21: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/21.jpg)
Orphaned data?
• Deletes are expensive, so we didn’t do them.
• Migrations created orphaned data on old hosts that were still picked up by full table scans (downstream systems: search, analytics).
![Page 22: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/22.jpg)
Migrations were
• error prone
• arbitrary
• developers had to be aware
• created orphaned data
• locked shops & users out of changes for up to hours
• slow
![Page 23: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/23.jpg)
Let’s talk about slowness…
What if we could move more than one row at a time?
![Page 24: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/24.jpg)
<<enter logical sharding>>
![Page 25: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/25.jpg)
Well, okay, why didn’t we do this in the first place?
![Page 26: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/26.jpg)
You have to run your site to learn your data access patterns.
![Page 27: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/27.jpg)
photo from gizmodo http://gizmodo.com/5632095/justin-bieber-has-dedicated-servers-at-twitter
![Page 28: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/28.jpg)
listing from https://www.etsy.com/shop/NausicaaDistribution
![Page 29: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/29.jpg)
listing from https://www.etsy.com/shop/Mugnificentart
![Page 30: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/30.jpg)
Scaling Pinterest, 2012: “to increase capacity, a server is replicated and the new replica is responsible for some DBs”
slide 32, https://speakerdeck.com/yashh/scaling-pinterest
Sharding & IDs at Instagram, 2012: “simply by moving a set of logical shards from one database to another”
http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram
![Page 31: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/31.jpg)
db_host1: shard1, shard2, … shard10
db_host2: shard11, shard12, … shard20
db_host3: shard21, shard22, … shard30
db_hostN: …
![Page 32: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/32.jpg)
db_host1: shard1, shard2, … shard10
db_host2: shard2
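A small sketch of the idea these two diagrams suggest: rows belong to a logical shard, and the logical shard is what gets assigned to a physical host. The layout of 30 shards at 10 per host mirrors the diagram. The function names `host_for` and `move_shard` are hypothetical, and the sketch covers only the routing change, not the data copy.

```python
# Layout from the diagram: shard1..shard10 on db_host1, shard11..shard20 on
# db_host2, shard21..shard30 on db_host3.
shard_to_host = {f"shard{n}": f"db_host{(n - 1) // 10 + 1}" for n in range(1, 31)}


def host_for(shard):
    """Resolve a logical shard to the physical host currently serving it."""
    return shard_to_host[shard]


def move_shard(shard, new_host):
    """Repoint one logical shard; every row in it moves as a unit."""
    shard_to_host[shard] = new_host


before = host_for("shard2")
move_shard("shard2", "db_host2")
after = host_for("shard2")
print(before, "->", after)
```

Nothing that refers to `shard2` by name has to change when it moves. Only the shard-to-host mapping does, in contrast to the per-row index updates of the older scheme.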
![Page 33: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/33.jpg)
![Page 34: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/34.jpg)
![Page 35: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/35.jpg)
![Page 36: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/36.jpg)
![Page 37: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/37.jpg)
How did we move onto this new architecture?
![Page 38: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/38.jpg)
We used the old row-based migration framework, one last time.
![Page 39: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/39.jpg)
We migrated all the sharded data (120TB) without downtime or developer disruption.
![Page 40: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/40.jpg)
![Page 41: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/41.jpg)
It took us 5 months to move that data using the old way.
![Page 42: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/42.jpg)
Today it would take us a few hours.
![Page 43: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/43.jpg)
What’d we build?
• made tooling logical shard aware
• generated database configs so that we could have two-way O(1) mappings between logical shards and physical hosts.
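One way to read "two-way O(1) mappings" is a generated config that contains both directions as hash maps, built from a single list of assignments. This is a sketch under that assumption. `build_config` and the sample assignments are hypothetical and do not reflect Etsy's actual config format.

```python
from collections import defaultdict


def build_config(assignments):
    """Generate both lookup directions from one source of truth."""
    shard_to_host = dict(assignments)              # logical shard -> physical host
    host_to_shards = defaultdict(list)             # physical host -> logical shards
    for shard, host in sorted(shard_to_host.items()):
        host_to_shards[host].append(shard)
    return shard_to_host, dict(host_to_shards)


assignments = {1: "db_host1", 2: "db_host2", 3: "db_host1"}
shard_to_host, host_to_shards = build_config(assignments)

print(shard_to_host[2])             # which host serves logical shard 2?
print(host_to_shards["db_host1"])   # which logical shards live on db_host1?
```

Generating both maps at config-build time keeps each lookup a single dictionary access, at the cost of regenerating the config whenever a shard is reassigned.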
![Page 44: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/44.jpg)
DNS-related outage…
![Page 45: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/45.jpg)
Nice side effects!
• schema changes (alters) now run in parallel, significantly faster!
• no more orphaned data!
• downstream analytics systems replicate faster at the shard-by-shard level!
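The talk does not say how the parallel alters are run. As a hedged illustration, if each logical shard can be altered independently, a worker pool can fan one statement out across all of them. `run_alter` is a hypothetical placeholder that only formats a string, and the table and column names are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def run_alter(shard, statement):
    """Placeholder for executing `statement` against one logical shard."""
    return f"{shard}: applied {statement!r}"


def alter_all(shards, statement, workers=4):
    """Apply the same schema change to every logical shard concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order even though work runs concurrently
        return list(pool.map(lambda s: run_alter(s, statement), shards))


results = alter_all(
    [f"shard{n}" for n in range(1, 6)],
    "ALTER TABLE shops ADD COLUMN note TEXT",
)
for line in results:
    print(line)
```

The speedup in practice depends on how many shards share a host, since alters running on the same machine still compete for its disk and CPU.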
![Page 46: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/46.jpg)
The Future
![Page 47: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/47.jpg)
Do you have any limits to how many shard hosts you can add?
![Page 48: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/48.jpg)
Well, yes… It’s 999… because of technical debt.
Code base riddled with checks like this:
[code example not preserved in this transcript]
(We can support ~16x data growth, on current-day hardware)
![Page 49: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/49.jpg)
2019
http://surge.omniti.com/2011/speakers/ross-snyder
![Page 50: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/50.jpg)
Resources
• Using a tickets database for ID generation: http://code.flickr.net/2010/02/08/ticket-servers-distributed-unique-primary-keys-on-the-cheap/
• Using master-master replication: https://codeascraft.com/2012/04/20/two-sides-for-salvation/
• Etsy’s shard architecture, the 2012 edition: http://www.percona.com/live/mysql-conference-2012/sessions/etsy-shard-architecture-starts-s-and-ends-hard
• Scaling Etsy, 2011: http://surge.omniti.com/2011/speakers/ross-snyder
• Instagram’s shard & id architecture: http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram
• Scaling Pinterest: https://speakerdeck.com/yashh/scaling-pinterest
• Morgue, Etsy’s postmortem tool: https://github.com/etsy/morgue
![Page 51: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/51.jpg)
Questions?
![Page 52: Logical sharding at Etsy](https://reader034.fdocuments.net/reader034/viewer/2022042507/55a782691a28ab1f3e8b47bb/html5/thumbnails/52.jpg)
“Just” shard it: Logical sharding at Etsy
Maggie Zhou @zmagg
ScaleConf 2015