A Quick Look At Cassandra

This is from a presentation I gave at the Phoenix Java User Group on 11/10/2010.

Transcript of A Quick Look At Cassandra

  • 1. A Quick Look At Cassandra (Bryan Williams)

  • 2. No SQL

  • 3. History: Created at Facebook in 2007; open sourced in 2008; currently version 0.6.6; version 0.7 in Beta 3

  • 4. CAP Theorem: Consistency; Availability; Partition Tolerance

  • 5. Scaling: Vertical (more RAM, faster CPU, faster HD); Horizontal (more servers, shared load)

  • 6. Features: Decentralized (peer to peer); Elastic; Shared Nothing Architecture; Tuneable Consistency; Always Writeable; Optimized for excellent throughput on writes

  • 7. Influences: BigTable (column family data model, high-throughput writes); Dynamo (high availability, scalability, tuneable eventual consistency)

  • 8. Data Model: Cluster; Keyspace; Column Families; Super Columns; Columns

  • 9. Cassandra's CLI (Command Line Interface)

  • 10. Secondary Indexes: Use another column family with reverse lookup; Specify metadata on the column family and set the index name and type; Support coming in 0.7

  • 11. Writes: Commit Logs; Memtable; SSTable; Hinted Handoff; Bloom Filter; Tombstone

  • 12. Partitioning: Random Partitioner; Order Preserving Partitioner; Collating Order Preserving Partitioner; Byte Order Partitioner

  • 13. Snitches: Simple Snitch; Property File Snitch

  • 14. Column Sorting: AsciiType; BytesType; LexicalUUIDType; LongType; IntegerType; TimeUUIDType; UTF8Type; Custom

  • 15. Replication Factor: Set per keyspace; Specified in the server's config file; Indicates how many nodes you want to store a value in on every write

  • 16. Consistency Level: Set per query; Specified by the client; Indicates how many nodes the client has decided must respond for a successful read/write; Based on the replication factor, not on the number of nodes in the system

  • 17. Write Consistency Levels: Zero: no response required; Any: first response from any node; One: first response (counting hints); Quorum: N/2 + 1; All: all replicas must respond

  • 18. Read Consistency Levels: One: the first response is taken; Quorum: N/2 + 1 replicas are required to respond; All: all replicas are required to respond

  • 19. Gossiper: Protocol used for intra-ring communication; Runs every second on a timer; Used by hinted handoff

  • 20. Anti-Entropy: Replica synchronization mechanism; Ensures data on different nodes is up to date; Merkle trees; Happens after each update

  • 21. Read Repair: Triggered when a read operation finds inconsistent data on different nodes; Timestamps for all replicas are checked; All replicas are updated based on the most recent value; Weak vs. strong consistency comes down to whether Read Repair happens before or after returning results

  • 22. Replication Strategies: Simple Strategy; Old Network Topology Strategy; Network Topology Strategy

  • 23. Java Client Options: Thrift (http://incubator.apache.org/thrift); Avro (http://avro.apache.org); Hector (https://github.com/rantav/hector); Pelops (http://code.google.com/p/pelops); Kundera (http://code.google.com/p/kundera); More (http://wiki.apache.org/cassandra/ClientOptions)

  • 24. Cassandra: The Definitive Guide: Author: Eben Hewitt; Publisher: O'Reilly; Release: late November

  • 25. Thanks For Coming: Bryan Williams; Email: Bwilliams@integrallis.com; Twitter: @BryWilliams; Links: http://cassandra.apache.org http://wiki.apache.org https://github.com/ericflo/twissandra
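The consistency level slides (15 to 18) boil down to a little arithmetic on the replication factor. The sketch below works that arithmetic through in Python. It is only an illustration: the function names are hypothetical and are not part of Cassandra or any client library, and it reads the slides' "N/2 + 1" as integer division with N being the replication factor.

```python
# Illustrative only: these names are hypothetical, not Cassandra's API.

def quorum(replication_factor: int) -> int:
    """Replicas needed at Quorum: N/2 + 1 (integer division), N = replication factor."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def required_responses(level: str, replication_factor: int) -> int:
    """Number of responses the client waits for at a given consistency level."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    level = level.lower()
    if level == "zero":            # write-only level: no response required
        return 0
    if level in ("any", "one"):    # the first response is enough
        return 1
    if level == "quorum":
        return quorum(replication_factor)
    if level == "all":             # every replica must respond
        return replication_factor
    raise ValueError(f"unknown consistency level: {level}")


if __name__ == "__main__":
    # Show One / Quorum / All for a few replication factors.
    for rf in (1, 3, 5):
        counts = [required_responses(lv, rf) for lv in ("one", "quorum", "all")]
        print(f"replication factor {rf}: one/quorum/all = {counts}")
```

Note that the result depends only on the replication factor, never on the total number of nodes in the ring, which is the point slide 16 makes.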
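The Read Repair slide (21) describes comparing timestamps across replicas and updating the stale ones with the most recent value. A minimal sketch of that comparison follows, assuming each replica answers with a (value, timestamp) pair. The function name, the replica names, and the data layout are all made up for illustration; this is not how Cassandra implements read repair internally.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: replica name -> (value, write timestamp).
Responses = Dict[str, Tuple[str, int]]


def plan_read_repair(responses: Responses) -> Tuple[str, int, List[str]]:
    """Return the newest value, its timestamp, and the replicas that need updating."""
    if not responses:
        raise ValueError("at least one replica response is required")
    # The response with the highest timestamp wins.
    latest_value, latest_ts = max(responses.values(), key=lambda vt: vt[1])
    # Any replica holding an older timestamp is stale and should be rewritten.
    stale = sorted(name for name, (_, ts) in responses.items() if ts < latest_ts)
    return latest_value, latest_ts, stale


if __name__ == "__main__":
    replies = {"node-a": ("v1", 10), "node-b": ("v2", 20), "node-c": ("v1", 10)}
    value, ts, stale = plan_read_repair(replies)
    print(f"return {value!r} (timestamp {ts}); repair {stale}")
```

Whether the client gets its answer before or after the stale replicas are rewritten is the weak versus strong distinction the slide mentions; the sketch only computes what would need repairing.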