Riak at Shareaholic

Slides from my talk on using Riak at Shareaholic

Transcript of Riak at Shareaholic

  • 1. Riak @ Shareaholic: Robby Grossman, robby@shareaholic.com, @freerobby

  • 2. Agenda: Shareaholic: Product & Tech; Why Riak: The Search for a Big Data Store; Transitioning to Riak; Riak Use Cases; Deploying to EC2
  • 3. What's Shareaholic?
  • 4. Browser Tools
  • 5. Sharing Buttons
  • 6. Recommendations
  • 7. Social Analytics
  • 8. Monthly @ Shareaholic: Thousands of developers hitting the API; Hundreds of thousands of publishers; Tens of millions of shares & clicks; Hundreds of millions of pageviews & events
  • 9. Tech @ Shareaholic: JRuby on Rails (via Torquebox); MySQL (Master, Read Slave); Elastic MapReduce (similar to Hadoop); Redis; Formerly Mongo, Now Riak
  • 10. Why Not Mongo? Working set needs to fit in memory; Global write lock blocks all queries, despite not having transactions/joins; Standbys not hot
  • 11. Why Riak?
  • 12. Next @ Shareaholic. Options: HBase; Cassandra; Riak. Goals: Linear scalability; Full-text search; Flexible indexing; Easier DevOps
  • 13. HBase. Pros: Battle tested; High performance. Cons: Complex architecture; SPOFs; Requires Hive for indexing/querying; Expensive to deploy at small scale
  • 14. Cassandra. Pros: Native secondary indices; Linear scalability; Tunable CAP. Cons: Known users all domain experts; Search requires Lucene; Heavyweight MapReduce
  • 15. Riak. Pros: Operationally simpler; Linear scalability; Integrated search; Secondary indices; Tunable CAP; Vector clocks solve time-sync problems. Cons: Multi-data center replication requires Enterprise product; leveldb puts high strain on CPU
  • 16. From Mongo to Riak
  • 17. Migration Goals: No time where database goes offline; Product parity throughout migration
  • 18. Migration Process: 1. App writes to Mongo and Riak; 2. Verify data integrity; 3. Import historical data; 4. App reads from Riak; 5. Decommission Mongo
  • 19. Use Cases
  • 20. Share API: Save shared content; Uses MapReduce to populate user dashboard
  • 21. Recommendations: Sets of related pages; Generated on-demand
  • 22. Publisher Analytics: Generated nightly via Hadoop; Typical stored document (JSON): 80 KB-1 MB
  • 23. Riak Successes
  • 24. MapReduce: Handy for querying; Runs at web page speed; Easy to re-reduce for complex queries; Easy to test via curl
  • 25. Tunable CAP @ Shareaholic: Replication: primary/secondary authority; Read failure tolerance: speed/consistency; Write failure tolerance
  • 26. Full Text Search: Built on Lucene; Make user content searchable; Make arbitrary keys queryable; Just turn it on; Hiccup: corrupt merge indexes
  • 27. Query Example: Who's our oldest user who's shared something in the last minute?

    curl -XPOST http://localhost:8098/mapred -H "Content-Type: application/json" -d '{
      "inputs": {"bucket": "links", "query": "timestamp:[1346350877 TO 1346350937}"},
      "query": [
        {"map": {"language": "javascript", "source": "function(riakObject) { return [[Riak.mapValuesJson(riakObject)[0].user_id]]; }"}},
        {"reduce": {"language": "javascript", "name": "Riak.reduceMin"}}
      ]
    }'

    The timestamp range covers a 60 second period. Riak.reduceMin reduces [[2],[5],[9],[13]] to [[2]]. Result: [[2197]]
  • 28. Riak on EC2
  • 29. In a Nutshell: EC2 specs poorly proportioned for leveldb; Multiple AZs in one location works well; Scale vertically for better latency & consistency; Scale horizontally for more throughput/$
  • 30. Benchmarks: Top Graph: c1.medium (1.7G, 5 CPU); Middle: m1.large (7.5G, 4 CPU); Bottom: cc1.4xlarge (23G, 33.5 CPU)
  • 31. Throughput
  • 32. Latency (Typical)
  • 33. Latency (Worst Case)
  • 34. Calculations: c1.medium (1.7G, 5 CPU): 1758 IOPS/$-hr, worst 1% of queries: 300ms/800ms; m1.large (7.5G, 4 CPU): 1167 IOPS/$-hr, worst 1% of queries: 110ms/200ms; cc1.4xlarge (23G, 33.5 CPU): 872 IOPS/$-hr, worst 1% of queries: 47ms/139ms
  • 35. Benchmark Takeaways: You can't go by spec; IO is the limiting factor; RAM is never the limiting factor for 1% of keyspace to be in memory
  • 36. Fin. Questions? Thanks: Tom Santero; Justin Sheehy; Ryan Zezeski; Reid Draper; #freenode riak crew. We're Hiring! Robby Grossman, robby@shareaholic.com, @freerobby
  • 37. Fin.
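The migration process on slide 18 can be sketched as a dual-write wrapper. This is a minimal illustration under stated assumptions, not Shareaholic's actual code: the `DualWriteStore` class and its method names are hypothetical, and plain dictionaries stand in for the Mongo and Riak clients.

```python
class DualWriteStore:
    """Hypothetical wrapper: writes go to both stores, reads come from one."""

    def __init__(self, old_store, new_store, read_from_new=False):
        self.old_store = old_store      # stand-in for Mongo
        self.new_store = new_store      # stand-in for Riak
        self.read_from_new = read_from_new

    def put(self, key, value):
        # Step 1: every application write goes to both databases.
        self.old_store[key] = value
        self.new_store[key] = value

    def mismatched_keys(self):
        # Step 2: keys that are missing from, or differ in, the new store.
        return sorted(
            key for key, value in self.old_store.items()
            if self.new_store.get(key) != value
        )

    def backfill(self):
        # Step 3: copy records written before dual writes were enabled,
        # without overwriting anything the application has already written.
        for key, value in self.old_store.items():
            self.new_store.setdefault(key, value)

    def get(self, key):
        # Step 4: flip read_from_new once the data has been verified.
        store = self.new_store if self.read_from_new else self.old_store
        return store[key]


mongo = {"link:1": "historical record"}   # data that predates the migration
riak = {}
store = DualWriteStore(mongo, riak)
store.put("link:2", "new record")
store.backfill()
store.read_from_new = True
```

Because reads keep going to the old store until the flag is flipped, the database never goes offline during the migration, which is the first goal on slide 17.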
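The trade-offs on slide 25 can be made concrete with standard quorum arithmetic. The sketch below is an assumption about how the slide's three knobs map to numbers: `n` is the replica count, `r` the number of replicas that must answer a read, and `w` the number that must acknowledge a write. The `quorum_profile` function is hypothetical and only does arithmetic; it does not talk to Riak.

```python
def quorum_profile(n, r, w):
    """Summarise what a given (n, r, w) choice tolerates.

    In a strict quorum, r + w > n means every read set overlaps every
    write set, so a read sees at least one replica with the latest write.
    """
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return {
        "read_failures_tolerated": n - r,    # replicas that may be down for reads
        "write_failures_tolerated": n - w,   # replicas that may be down for writes
        "reads_overlap_writes": r + w > n,
    }


fast = quorum_profile(n=3, r=1, w=1)       # favours speed and availability
balanced = quorum_profile(n=3, r=2, w=2)   # favours consistency
```

Lower `r` and `w` values tolerate more failed replicas and respond faster, at the cost of possibly reading stale data. That is the speed/consistency trade-off the slide names.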
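The request body from the query example on slide 27 can also be built programmatically, so that the 60 second window moves with the clock. This is a sketch: `build_mapred_job` and `MAP_SOURCE` are hypothetical names, the bucket, field, and function names are copied from the slide, and the result still has to be sent to the `/mapred` endpoint shown there.

```python
import json

# JavaScript map function copied from the slide: emit each link's user_id.
MAP_SOURCE = (
    "function(riakObject) { "
    "return [[Riak.mapValuesJson(riakObject)[0].user_id]]; }"
)


def build_mapred_job(bucket, start_ts, window_seconds=60):
    """Build the MapReduce job for links shared in [start_ts, start_ts + window)."""
    end_ts = start_ts + window_seconds
    return {
        "inputs": {
            "bucket": bucket,
            # Range syntax kept exactly as on the slide.
            "query": "timestamp:[%d TO %d}" % (start_ts, end_ts),
        },
        "query": [
            {"map": {"language": "javascript", "source": MAP_SOURCE}},
            {"reduce": {"language": "javascript", "name": "Riak.reduceMin"}},
        ],
    }


job = build_mapred_job("links", 1346350877)
body = json.dumps(job)  # string to pass as the POST body
```

Passing `int(time.time()) - 60` as `start_ts` would reproduce the "last minute" query at any moment.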
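The figures on slide 34 are easier to compare as ratios. The sketch below uses only the numbers from that slide. The slide does not say what the two worst-case latency figures in each pair denote, so the comparison uses the first figure of each pair and makes no claim about its meaning. The `relative_to` helper is hypothetical.

```python
# Figures copied from slide 34.
BENCHMARKS = {
    "c1.medium": {"iops_per_dollar_hour": 1758, "worst_1pct_ms": (300, 800)},
    "m1.large": {"iops_per_dollar_hour": 1167, "worst_1pct_ms": (110, 200)},
    "cc1.4xlarge": {"iops_per_dollar_hour": 872, "worst_1pct_ms": (47, 139)},
}


def relative_to(baseline, benchmarks=BENCHMARKS):
    """Express each instance type as a multiple of the baseline instance."""
    base = benchmarks[baseline]
    rows = {}
    for name, stats in benchmarks.items():
        rows[name] = {
            "throughput_per_dollar": round(
                stats["iops_per_dollar_hour"] / base["iops_per_dollar_hour"], 2
            ),
            # First figure of each latency pair only (see lead-in).
            "worst_case_latency": round(
                stats["worst_1pct_ms"][0] / base["worst_1pct_ms"][0], 2
            ),
        }
    return rows


ratios = relative_to("cc1.4xlarge")
# ratios["c1.medium"] -> {"throughput_per_dollar": 2.02, "worst_case_latency": 6.38}
# ratios["m1.large"]  -> {"throughput_per_dollar": 1.34, "worst_case_latency": 2.34}
```

On these figures, c1.medium delivers about twice the throughput per dollar of cc1.4xlarge, while its worst-case latency is roughly six times higher. That is consistent with slide 29: scale horizontally for throughput per dollar, vertically for latency.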