Cabs, Cassandra, and Hailo (at Cassandra EU)

  • Date posted: 01-Dec-2014
  • Category: Technology

description

My talk from #CassandraEU covering Hailo's use of Cassandra including insight from developers, operations and management, plus lessons learned.

Transcript of Cabs, Cassandra, and Hailo (at Cassandra EU)

  • 1. Cabs, Cassandra, and Hailo
      • David Gardner, Architect at Hailo
      • #CASSANDRAEU CASSANDRASUMMITEU

  • 4. 0.6 to 1.2
      • 1,352 changed files with 235,413 additions and 47,487 deletions
      • 7,429 commits
      • 1,653 tickets completed
      • https://github.com/apache/cassandra/compare/cassandra-0.6.0...cassandra-1.2
      • https://github.com/apache/cassandra/blob/trunk/CHANGES.txt
  • 5. What this talk is about
      • Cassandra adoption at Hailo from three perspectives:
          1. Development
          2. Operational
          3. Management
  • 6. What is Hailo?
      • Hailo is The Taxi Magnet. Use Hailo to get a cab wherever you are, whenever you want.
  • 10. What is Hailo?
      • The world's highest-rated taxi app, with over 11,000 five-star reviews
      • Over 500,000 registered passengers
      • A Hailo hail is accepted around the world every 4 seconds
      • Hailo operates in 15 cities on 3 continents, from Tokyo to Toronto, in nearly 2 years of operation
  • 11. Hailo is growing
      • Hailo is a marketplace that facilitates over $100M in run-rate transactions and is making the world a better place for passengers and drivers
      • Hailo has raised over $50M in financing from the world's best investors including Union Square Ventures, Accel, the founder of Skype (via Atomico), Wellington Partners (Spotify), Sir Richard Branson, and our CEO's mother, Janice
  • 12. The history
      • The story behind Cassandra adoption at Hailo
  • 13. Hailo launched in London in November 2011
      • Launched on AWS
      • Two PHP/MySQL web apps plus a Java backend
      • Mostly built by a team of 3 or 4 backend engineers
      • MySQL multi-master for single AZ resilience
  • 14. Why Cassandra?
      • A desire for greater resilience, to become a utility: Cassandra is designed for high availability
      • Plans for international expansion around a single consumer app: Cassandra is good at global replication
      • Expected growth: Cassandra scales linearly for both reads and writes
      • Prior experience: I had experience with Cassandra and could recommend it
  • 15. The path to adoption
      • Largely unilateral decision by developers, a result of a startup culture
      • Replacement of key consumer app functionality, splitting up the PHP/MySQL web app into a mixture of global PHP/Java services backed by a Cassandra data store
      • Launched into production in September 2012, originally just powering North American expansion, before gradually switching over Dublin and London
  • 16. One year on...
      • Further breakdown of functionality into Go/Java SOA
      • Migrating all online databases to Cassandra
  • 17. Development perspective
  • 18. "Cassandra just works" (Dom W, Senior Engineer)
  • 19. Use cases
      1. Entity storage
      2. Time series data
  • 20. CF = customers
      • Row key: 126007613634425612
          createdTimestamp: 1370465412
          email: dave@cruft.co
          givenName: Dave
          familyName: Gardner
          locale: en_GB
          phone: +447911111111
  • 21. Considerations for entity storage
      • Do not read the entire entity, update one property and then write back a mutation containing every column
      • Only mutate columns that have been set
      • This avoids read-before-write race conditions
  • 23. CF = stats_db
      • Row key: 2013-06-01
          55374fa0-ce2b-11e2-8b8b-0800200c9a66: {action:
          a48bd800-ce2b-11e2-8b8b-0800200c9a66: {action:
          b0e15850-ce2b-11e2-8b8b-0800200c9a66: {action:
          bfac6c80-ce2b-11e2-8b8b-0800200c9a66: {action:
  • 24.
      CF = stats_db
      • Row key: LON123456
          13b247f0-ce2c-11e2-8b8b-0800200c9a66: {action:
          20f70a40-ce2c-11e2-8b8b-0800200c9a66: {action:
          2b44d3b0-ce2c-11e2-8b8b-0800200c9a66: {action:
          338a22f0-ce2c-11e2-8b8b-0800200c9a66: {action:
  • 26. Considerations for time series storage
      • Choose row key carefully, since this partitions the records
      • Think about how many records you want in a single row
      • Denormalise on write into many indexes
  • 27. Client libraries
      • Gossie (Go)
      • Astyanax (Java)
      • phpcassa (PHP)
  • 28. Analytics
      • With Cassandra we lost the ability to carry out analytics, e.g. COUNT, SUM, AVG, GROUP BY
      • We use Acunu Analytics to give us this ability in real time, for preplanned query templates
      • It is backed by Cassandra and therefore highly available, resilient and globally distributed
      • Integration is straightforward
  • 29. Diagram labels: events, NSQ, Acunu, C*
  • 30. AQL
          SELECT SUM(accepted), SUM(ignored), SUM(declined), SUM(withdrawn)
          FROM Allocations
          WHERE timestamp BETWEEN '1 week ago' AND 'now'
          AND driver='LON123456789'
          GROUP BY timestamp(day)
  • 32. Operational perspective
  • 33. "Allows a team of 2 to achieve things they wouldn't have considered before Cassandra existed" (Chris H, Operations Engineer)
  • 35. Cluster layout
      • 6 machines per region
      • 3 regions: us-east-1, eu-west-1, ap-southeast-1
      • Clusters: Operational Cluster and Stats Cluster (stats cluster is a long story)
  • 36. Diagram: regions eu-west-1, us-east-1 and ap-southeast-1, each with nodes in AZ1, AZ2 and AZ3
  • 37. Cluster details
      • AWS VPCs with OpenVPN links
      • 3 AZs per region
      • m1.large machines
      • Provisioned IOPS EBS
      • Stats Cluster: ~1TB/node
      • Operational Cluster: ~200GB/node
  • 38.
      Backups
      • SSTable snapshot
      • Used to upload to S3, but this was taking >6 hours and consuming all our network bandwidth
      • Now take EBS snapshot of the data volumes
  • 39. Encryption
      • Requirement for NYC launch
      • We use dmcrypt to encrypt the entire EBS volume
      • Chose dmcrypt because it is uncomplicated
      • Our tests show a 1% hit in disk performance, which concurs with what Amazon suggest
  • 40. DataStax OpsCenter is a quick win
  • 41. Multi DC
      • Something that Cassandra makes trivial
      • Would have been very difficult to accomplish active-active inter-DC replication with a team of 2 without Cassandra
      • Rolling repair needed to make it safe (we use LOCAL_QUORUM)
      • We schedule narrow repairs on different nodes in our cluster each night
  • 42. Compression
      • Our stats cluster was running at ~1.5TB per node
      • We didn't want to add more nodes
      • With compression, we are now back to ~600GB
      • Easy to accomplish: `nodetool upgradesstables` on a rolling schedule
  • 43. Management perspective
  • 44. "The days of the quick and dirty are over" (Simon V, EVP Operations)
  • 45. Technically, everything is fine
      • Our COO feels that C* is technically good and beautiful, a perfectly good option
      • Our EVPO says that C* reminds him of a time series database in use at Goldman Sachs that had very good performance
      • ...but there are concerns
  • 46. Diagram labels: "People who can attempt to query MySQL", "People who can attempt to query Cassandra"
  • 48. Lessons learned
  • 49. There might be a gulf in experience
  • 50. Chart: average years' experience per team member, MySQL vs Cassandra (axis tick: 10)
  • 51.
      Lesson learned
      • Have an advocate: get someone who will sell the vision internally
      • Learn the theory: teach each team member the fundamentals
      • Make an effort to get everyone on board
  • 52. Things can drift into failure
  • 58. Lesson learned
      • Be pro-active with Cassandra, even if it seems to be running smoothly
      • Peer-review data models, take time to think about them
      • Big rows are bad: use cfstats to look for them
      • Mixed workloads can cause problems: use cfhistograms and look out for signs of data modeling problems
      • Think about the compaction strategy for each CF
  • 59. EBS is terrible
  • 60. Lessons learned
      • EBS is nearly always the cause of Amazon outages
      • EBS is a single point of failure (it will fail everywhere in your cluster)
      • EBS is slow
      • EBS is expensive
      • EBS is unnecessary!
  • 61. Management need to know the trade-offs
  • 62. Lessons learned
      • Keep the business informed: explain the trade-offs in simple terms
      • Sing from the same hymn sheet
      • Make sure there are solutions in place for every use case from the beginning
  • 63. Diagram labels: "People who can attempt to query MySQL", "People who can attempt to query Cassandra"
  • 64. Conclusions
  • 65. We like Cassandra
      • Solid design
      • HA characteristics
      • Easy multi-DC setup
      • Simplicity of operation
  • 66. Lessons for successful adoption
      • Have an advocate, sell the dream
      • Learn the fundamentals, get the best out of Cassandra
      • Invest in tools to make life easier
      • Keep management in the loop, explain the trade-offs
  • 67.
      The future
      • We will continue to invest in Cassandra as we expand globally
      • We will hire people with experience running Cassandra
      • We will focus on expanding our reporting facilities
      • We aspire to extend our network (1M consumer installs, wallet) beyond cabs
      • We
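
The time series slides (23 to 26) show the same events stored under a day row key and under a driver row key, with the advice to denormalise on write into many indexes. A minimal sketch of that idea follows. It is an illustration rather than Hailo's code: the function name `index_writes` and the event field names `timestamp`, `driver` and `action` are assumptions, and the function only computes what would be written; it does not talk to Cassandra.

```python
import json
import uuid
from datetime import datetime, timezone


def index_writes(event, event_id=None):
    """Return one (row_key, column_name, value) tuple per index row."""
    # Assumption: a time-based UUID as the column name, like those on the slides.
    column = str(event_id or uuid.uuid1())
    # Day bucket as row key: bounds how many records land in a single row.
    day = datetime.fromtimestamp(event["timestamp"], tz=timezone.utc).strftime("%Y-%m-%d")
    value = json.dumps(event, sort_keys=True)
    return [
        (day, column, value),              # index 1: everything that happened that day
        (event["driver"], column, value),  # index 2: everything for one driver
    ]


event = {"timestamp": 1370465412, "driver": "LON123456", "action": "accepted"}
for row_key, column, value in index_writes(event):
    print(row_key, column, value)
```

Each event costs two writes, but either index can then be read with a single row lookup. Note that the per-driver row in this sketch grows without bound; adding a time bucket to that key (for example driver plus month) is one way to keep rows small, in line with the "big rows are bad" lesson.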