Designing Resilient Application Platforms with Apache Cassandra - Hayato Shimizu (DataStax)
Building Highly Available Services Using Cassandra
USE jax_london;
SELECT * FROM presenters WHERE name = 'Hayato Shimizu';

 name           | title               | company  | area
----------------+---------------------+----------+------
 Hayato Shimizu | Solutions Architect | DataStax | EMEA
Apache Cassandra
• Created by Avinash Lakshman and Prashant Malik at Facebook
• Amazon Dynamo + Google BigTable
• Highly distributed database with data replication for redundancy
• Active-Active Multi DC, master-less design – no single point of failure
• High throughput!
• Linearly scalable – volume, throughput
• Used by many mission critical applications and services
• 2.0 is out!
©2013 DataStax Confidential. Do not distribute without consent. 2
C* Architecture – Data Replication
• Token range 0 -> 2^127 - 1 in ring formation
• Consistent hashing algorithm
• Replica nodes placed clockwise around the ring
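The ring placement above can be sketched in a few lines of Python. This is a toy model, not Cassandra's partitioner: the MD5-based `token_for` function, the node names, and the evenly spaced tokens are assumptions made for illustration. It shows a key being hashed to a token, the owning node being found, and the remaining replicas being taken clockwise.

```python
import bisect
import hashlib

RING_SIZE = 2**127  # tokens fall in 0 .. 2**127 - 1, as on the slide


def token_for(key: str) -> int:
    """Map a partition key to a token (MD5-based, illustrative only)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def replicas_for(key: str, ring: list, rf: int) -> list:
    """Return rf distinct nodes, walking clockwise from the key's token.

    ring is a list of (token, node_name) pairs sorted by token,
    with one token per node.
    """
    tokens = [t for t, _ in ring]
    # First node whose token is >= the key's token; wrap past the end.
    start = bisect.bisect_left(tokens, token_for(key)) % len(ring)
    count = min(rf, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


# Four hypothetical nodes with evenly spaced tokens.
ring = [(i * RING_SIZE // 4, f"node{i + 1}") for i in range(4)]
owners = replicas_for("user:42", ring, rf=3)
print(owners)
```

Adding a node only changes ownership for the token range next to it, which is why consistent hashing suits a cluster that grows over time.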
C* Architecture - No Single Point of Failure
• The client load balances across nodes
• Do not use a hardware LB
C* Architecture - Multi DC Replication
C* Architecture – Data Consistency
• C* offers TUNABLE consistency
• Client decides consistency per query
• ANY, ONE, TWO, THREE, QUORUM, LOCAL_QUORUM, EACH_QUORUM, ALL
• QUORUM = floor(replication_factor / 2) + 1
• Replication Factor = 3 can maintain Quorum with tolerance of 1 node failure
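The quorum arithmetic can be checked with a short Python sketch. The function names are illustrative, not part of any driver API. The division is an integer division, which is why a replication factor of 2 tolerates no failures at QUORUM.

```python
def quorum(replication_factor: int) -> int:
    """QUORUM = floor(replication_factor / 2) + 1."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be down while a quorum is still reachable."""
    return replication_factor - quorum(replication_factor)


for rf in (1, 2, 3, 5):
    print(f"RF={rf}: quorum={quorum(rf)}, tolerates {tolerated_failures(rf)} down")
```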
Setting Up Cassandra for Multi DC
On each node – edit the following file:
conf/cassandra-rackdc.properties
With the following entries:
dc=DC1
rack=RACK1
On each node – edit the following file:
conf/cassandra.yaml
With the following entry:
endpoint_snitch: GossipingPropertyFileSnitch
Create keyspace:
CREATE KEYSPACE new_keyspace WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1' : 3, 'DC2' : 3};
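For the keyspace above, the number of replicas that must acknowledge a request depends on the consistency level. The Python sketch below works through the arithmetic; the helper names are illustrative. QUORUM is computed over the sum of the replication factors in all DCs, so with 3 + 3 replicas it needs 4 acknowledgements and cannot be met from one DC alone, whereas LOCAL_QUORUM needs only 2 replicas in the coordinator's DC.

```python
replication = {"DC1": 3, "DC2": 3}  # mirrors the CREATE KEYSPACE statement


def quorum(rf: int) -> int:
    return rf // 2 + 1


def global_quorum(replication: dict) -> int:
    """QUORUM: majority of all replicas across every DC."""
    return quorum(sum(replication.values()))


def local_quorum(replication: dict, local_dc: str) -> int:
    """LOCAL_QUORUM: majority of the replicas in the coordinator's DC."""
    return quorum(replication[local_dc])


def each_quorum(replication: dict) -> dict:
    """EACH_QUORUM: a majority in every DC."""
    return {dc: quorum(rf) for dc, rf in replication.items()}


print("QUORUM       :", global_quorum(replication))
print("LOCAL_QUORUM :", local_quorum(replication, "DC1"))
print("EACH_QUORUM  :", each_quorum(replication))
```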
C* Architecture – Data Centre Configuration
[Diagram: Data 1, Data 2, Data 3 and Data 4, each stored as three replicas on different nodes]
Cassandra Architecture - Writes
[Diagram: INSERT INTO… -> Commit log -> memtable -> SSTable]
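The write path in the diagram can be modelled as a toy in Python. Everything here is a simplification for illustration: `ToyStore`, its `flush_threshold`, and the in-memory "SSTables" are hypothetical and say nothing about Cassandra's on-disk formats. The sketch shows the order of operations: append to the commit log, update the memtable, then flush the memtable to an immutable sorted table.

```python
class ToyStore:
    """Toy model of commit log -> memtable -> SSTable (illustrative only)."""

    def __init__(self, flush_threshold=3):
        self.commit_log = []  # append-only record of unflushed writes
        self.memtable = {}    # in-memory, latest value per key
        self.sstables = []    # immutable, key-sorted snapshots
        self.flush_threshold = flush_threshold

    def insert(self, key, value):
        self.commit_log.append((key, value))  # 1. record for durability
        self.memtable[key] = value            # 2. write to memory
        if len(self.memtable) >= self.flush_threshold:
            self.flush()                      # 3. flush when full

    def flush(self):
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        # Simplification: flushed writes no longer need replaying.
        self.commit_log.clear()

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):  # newest table first
            for k, v in table:
                if k == key:
                    return v
        return None


store = ToyStore(flush_threshold=2)
store.insert("b", 1)
store.insert("a", 2)  # second key triggers a flush
store.insert("a", 3)  # newer value lives in the memtable
print(store.read("a"), store.read("b"))
```

Writes never modify an existing table in place, which is what keeps the write path sequential and fast.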
Tooling
• Cassandra Download (http://planetcassandra.org)
• DataStax Enterprise Download (http://www.datastax.com/download)
• DataStax Java Driver (http://github.com/datastax/java-driver)
• DataStax DevCenter (http://www.datastax.com)
Building Highly Available Services
Single Data Centre
• Resiliency through C* Data Replication
Multi DC – Active/Passive
• Wasteful
• Do you test this? Does it actually work when it fails over?
• What is the decision point for failing over?
• Do you try to fix the problem in the active DC first?
• Is it a manual process?
• How long does it take to fail over to the passive DC?
• How many people and which departments will need to be involved?
• Incident managers?
Active-Everywhere is the Norm
[Diagram: Datacenter and Cloud]
Source: http://www.datastax.com/resources/whitepapers/bigdata
Design Considerations - Active-Everywhere DC Strategies
• 24 x 7 services are what businesses and consumers now expect
• Service failure costs money and causes reputational damage
• 99.999+% service uptime?
• Data Replication Strategies
  • Consistent data replication across all DCs
  • Eventually consistent replication across DCs
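To put the 99.999% figure in perspective, the following Python sketch converts an availability target into a yearly downtime budget. It assumes a 365-day year; the function name is illustrative. Five nines leaves roughly 5.26 minutes of downtime per year, planned or unplanned.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Yearly downtime allowed by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {downtime_minutes_per_year(target):.2f} min/year")
```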
Design Considerations - Data Replication Strategies
• Latency is not going away – embrace it
• Possible Solutions
  • Sharded users
  • Full data consistency in all DCs
  • Eventual consistency to other DCs
Design Considerations - Full Consistency Across All Data Centres
• Does your service really require this?
• Performance considerations
• Think about your service usage patterns
• Failure scenarios
  • WAN Link failure
Design Considerations - Eventual Consistency Across DCs
• Identify data access patterns for each service
• Data access patterns
  • Write-Only
  • Read-Only
  • Mixture of both
  • Access frequency
Design Considerations - Failure Scenarios
• Data centre total failure – natural disaster, power, etc.
• Network storm
• Network kit firmware upgrade failure
• SAN Upgrades – wrong Fibre Channel cable pulled out
• WAN link failure
• Service dependency failure
• Etc, etc
• Failure probabilities - do your maths!
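One way to "do the maths" is sketched below in Python. The probabilities used are made-up illustrative inputs, not measurements, and the model assumes failures are independent, which is a strong assumption: a shared WAN link or a common dependency makes real failures correlated. The sketch combines several failure modes into a per-DC failure probability, then computes the chance that all DCs are down at once.

```python
import math


def p_any(probabilities) -> float:
    """Probability that at least one of several independent events occurs."""
    return 1 - math.prod(1 - p for p in probabilities)


def p_all(probabilities) -> float:
    """Probability that all of several independent events occur together."""
    return math.prod(probabilities)


# Hypothetical per-DC failure modes (power, network, storage).
dc_failure = p_any([0.001, 0.002, 0.001])

print(f"one DC down : {dc_failure:.6f}")
print(f"both DCs down: {p_all([dc_failure, dc_failure]):.9f}")
```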
User Session Persistence to One DC
[Diagram: DC1 (Service + C*) serves Session 1; DC2 (Service + C*) serves Session 2; async replication between the two DCs]
DC Session Persistence Technique 1
• GTM – Global Traffic Management
• DNS based solution
• Hardware / SaaS solutions
• Traffic weighting for each DC
• Persistence guaranteed in private network using hardware
• Internet facing slightly more difficult – DNS RFC spec
DC Session Persistence Technique 2
• A famous company providing edge based load balancing
• Users connect via their service
• Cookie / query string based
[Diagram: Edge Load Balancer routes https traffic to DC1 and DC2; async replication between the DCs]
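Cookie-based DC persistence can be sketched as a small routing function in Python. This is not any vendor's algorithm: the cookie name `dc`, the `choose_dc` function, and the hash-based fallback are assumptions for illustration. A client stays on the DC named in its cookie while that DC is healthy, and is reassigned deterministically otherwise.

```python
import hashlib

DC_COOKIE = "dc"  # hypothetical cookie name


def choose_dc(cookies: dict, client_id: str, healthy_dcs: list) -> str:
    """Pin a client to one DC; reassign only if that DC is unavailable."""
    if not healthy_dcs:
        raise ValueError("no healthy data centre available")
    pinned = cookies.get(DC_COOKIE)
    if pinned in healthy_dcs:
        return pinned
    # No usable pin: pick deterministically so retries land on the same DC.
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    return healthy_dcs[int.from_bytes(digest[:8], "big") % len(healthy_dcs)]


print(choose_dc({"dc": "DC2"}, "alice", ["DC1", "DC2"]))  # stays on DC2
print(choose_dc({"dc": "DC2"}, "alice", ["DC1"]))         # DC2 down, moves to DC1
```

Because replication between the DCs is asynchronous, keeping a session on one DC means the user reads their own writes without waiting on the WAN.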
DC Session Persistence Technique 3
Application Tier Resilience
• Make it fault tolerant – stateless.
• Make it horizontally scalable
• Load balancer stickiness – really?
• Use C* to store sessions - sessions will recover in a DR scenario.
[Diagram: App Tier instances in each DC all read Session1, which Cassandra replication copies between the DCs]
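The stateless app tier idea can be sketched in Python as follows. The `SessionStore` class is an in-memory stand-in for a replicated Cassandra table, and `AppInstance` and the basket example are hypothetical. Because no instance keeps session state of its own, any instance in any DC can pick up a session that another instance started.

```python
import copy


class SessionStore:
    """In-memory stand-in for a replicated Cassandra table (hypothetical)."""

    def __init__(self):
        self._rows = {}

    def save(self, session_id, data):
        self._rows[session_id] = copy.deepcopy(data)

    def load(self, session_id):
        return copy.deepcopy(self._rows.get(session_id, {}))


class AppInstance:
    """Holds no session state of its own; everything lives in the store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def handle(self, session_id, item):
        session = self.store.load(session_id)
        session.setdefault("basket", []).append(item)
        self.store.save(session_id, session)
        return session


store = SessionStore()
dc1_app = AppInstance("dc1-app", store)
dc2_app = AppInstance("dc2-app", store)

dc1_app.handle("Session1", "book")
# dc1-app is lost; dc2-app continues the same session.
result = dc2_app.handle("Session1", "pen")
print(result)
```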
Seamless Application Releases & System Maintenance
• 99.999+% SLA includes maintenance!
• C* rolling upgrades
• Kernel patching etc.
• Schema Changes – C* will help
• Code should now handle the data structure versions
• Code deployment - statelessness will help here again!
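Handling data structure versions in code might look like the Python sketch below. The schema change is invented for illustration: an old release stored a single `name` column, and a new release adds `first_name` and `last_name`, which are null on rows written before the change. During a rolling release both kinds of row exist, so the reader accepts either.

```python
def display_name(row: dict) -> str:
    """Read a user row written by either the old or the new release.

    Hypothetical change: the old schema had a single `name` column; the new
    release adds `first_name` and `last_name`, which are null on old rows.
    """
    first, last = row.get("first_name"), row.get("last_name")
    if first is not None and last is not None:
        return f"{first} {last}"
    return row.get("name") or ""


old_row = {"name": "Hayato Shimizu", "first_name": None, "last_name": None}
new_row = {"name": None, "first_name": "Hayato", "last_name": "Shimizu"}
print(display_name(old_row), "|", display_name(new_row))
```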
Business Intelligence
Embracing the Cloud
• High demand can kill your service – make it scalable
• Bursting into the cloud for peak load
• Flexible provisioning model
• DR on the cheap
Conclusions
Conclusion
• Developers - think about your infrastructure. Don't just leave it to the Ops or DevOps teams.
• Ops / DevOps Engineers – think about the applications and learn how they work.
• Collaborate with each other.
• Building out resilient infrastructure is not that hard. It just requires some thought, communication, and execution.
• Think about scale.
• Keep IT Simple
• Use great tools like Cassandra
Thank You
Twitter @hayato_shimizu @datastaxEU @planetcassandra @cassandraEUROPE
Downloads
http://planetcassandra.org
http://www.datastax.com
http://cassandra.apache.org
Thank You
Q&A