Cassandra at Arkivum

Transcript of Cassandra at Arkivum

  • 1. Cassandra at Arkivum. Richard Lowe, Principal Engineer, Arkivum
  • 2. About Arkivum
      • We offer a safe, secure archive service for digital data
      • We use data archiving expertise to keep data for the long term: for years, decades or forever
      • Our service allows our customers to meet their compliance needs and asset retention goals whilst focusing on their core business
    Arkivum Limited, 2012
  • 3. Our architecture
      • A gateway appliance running our software is installed at the customer site and talks across the WAN, over a secure VPN, to our software in our data centres
      • File data is encrypted and stored on a variety of storage media, including SSD, hard disk and tape
      • The focus is on maintaining long-term data integrity, not low latency or high availability
  • 4. Legacy design
      • The original code used an SQL database
      • Our knowledge was biased towards RDBMS: normalization, JDBC, ACID, mature platform
      • The software design assumed SQL
      • Indexes and ad-hoc queries gave basic search functionality for relatively little extra effort
  • 5. Relational model of a file system
        CREATE TABLE files (
            file_id       VARCHAR  NOT NULL PRIMARY KEY,
            parent_id     VARCHAR  NOT NULL,
            name          VARCHAR  NOT NULL,
            size          BIGINT   DEFAULT 0,
            created_date  DATETIME DEFAULT CURRENT_TIMESTAMP,
            modified_date DATETIME DEFAULT CURRENT_TIMESTAMP,
            owner_uid     INT      DEFAULT 0,
            owner_gid     INT      DEFAULT 0,
            file_mode     INT      DEFAULT 493,
            file_attr     INT      DEFAULT 0,
            UNIQUE (parent_id, name)
        );
  • 6. Relational model of a file system
      • Get a file by id
        SELECT * FROM files
        WHERE file_id = 'f90b3e92-0e96-482f-b4e5-f1ca071f26d6';
      • List all files in a particular directory
        SELECT * FROM files
        WHERE parent_id = 'e98eaaaa-07a6-4ffa-bd21-f3975529718b';
      • List all files modified in April 2010 and sort by size
        SELECT * FROM files
        WHERE modified_date >= '2010-04-01' AND modified_date < '2010-05-01'
        ORDER BY size DESC;
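The three relational queries above can be tried end to end with Python's built-in `sqlite3` module. This is only a sketch under assumptions: it uses an in-memory SQLite database rather than whatever RDBMS the original system ran on, keeps a subset of the columns, stores dates as ISO-8601 text, and the sample rows and helper names (`get_file`, `list_directory`, `modified_between`) are invented for illustration.

```python
import sqlite3

# In-memory SQLite stand-in for the relational "files" table (assumption:
# the original database engine is not named in the slides).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE files (
        file_id       TEXT NOT NULL PRIMARY KEY,
        parent_id     TEXT NOT NULL,
        name          TEXT NOT NULL,
        size          INTEGER DEFAULT 0,
        modified_date TEXT DEFAULT CURRENT_TIMESTAMP,
        UNIQUE (parent_id, name)
    )
""")

# Hypothetical sample data.
conn.executemany(
    "INSERT INTO files (file_id, parent_id, name, size, modified_date) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("f1", "dir-a", "report.doc", 300, "2010-04-02 09:00:00"),
        ("f2", "dir-a", "photo.jpg", 9000, "2010-04-20 17:30:00"),
        ("f3", "dir-b", "notes.txt", 50, "2010-06-01 12:00:00"),
    ],
)


def get_file(file_id):
    """Get a file by id (primary-key lookup)."""
    return conn.execute(
        "SELECT file_id, parent_id, name, size FROM files WHERE file_id = ?",
        (file_id,),
    ).fetchone()


def list_directory(parent_id):
    """List the names of all files in a particular directory."""
    rows = conn.execute(
        "SELECT name FROM files WHERE parent_id = ? ORDER BY name",
        (parent_id,),
    )
    return [name for (name,) in rows]


def modified_between(start, end):
    """List names of files modified in [start, end), largest first."""
    rows = conn.execute(
        "SELECT name FROM files "
        "WHERE modified_date >= ? AND modified_date < ? "
        "ORDER BY size DESC",
        (start, end),
    )
    return [name for (name,) in rows]
```

The third query is the kind of ad-hoc search that comes almost for free in SQL and that the later slides show to be awkward in Cassandra.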
  • 7. Why Cassandra?
      • Scalability
          • Meets our need to scale to billions of records
          • Designed for high-availability, high-throughput environments
      • Replication
          • Data safety is paramount to us
          • Cassandra replication is a really strong feature
      • Stability
          • Well supported and used worldwide in high-profile, high-end production systems
  • 8. Cassandra model of a file system
    Approach 1: Pretend we're using a relational database
      Files CF
        row key   "parent_id"  "name"  "size"  "modified"  "accessed"  "gid"  "uid"  "mode"
        file_id   UUID         UTF8    Long    Long        Long        Long   Long   Long
      • Use column families as if they're tables
      • Use CQL because it's like SQL
      • Create secondary indexes for everything in case we want to query on it later
  • 9. Cassandra model of a file system
    Approach 1 doesn't work
      • Column families are not tables
      • CQL looks like SQL, but isn't
        SELECT * FROM Files WHERE modified > 2010-03-31;
      • Secondary indexes aren't cheap
      • Can't sort based on column values, only on column names
  • 10. Cassandra model of a file system
    Approach 2: Use composite types and blobs
      Files CF
        row key:       (parent_id, file_id)
        column name:   (name, size, mtime, atime, gid, uid, mode)
        column value:  file_blob
      • Serialize the file record and store it as a single object instead of multiple values
      • Use actual values as part of the composite column name, so we can search and sort based on them
  • 11. Cassandra model of a file system
    Approach 2 doesn't work either
      • Need to know all the values for a composite to query based on it; otherwise it means a range query, which is expensive
        file_exists = len(list(files_cf.get_range(
            start=CompositeType(MIN_UUID, file_id),
            finish=CompositeType(MAX_UUID, file_id),
            row_count=1))) == 1
      • Sorting compares the entire composite, not each field
        [CompositeType('apples', 6),
         CompositeType('bananas', 2),
         CompositeType('oranges', 5),
         CompositeType('pears', 4)]
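The sorting point can be illustrated without a cluster. Python tuples compare component by component from the left, which is used here as an analogy for composite ordering; it is not the Cassandra or pycassa API, and the fruit data simply mirrors the slide. The helper `find_by_second` is hypothetical and shows why a lookup that only knows a trailing component degenerates into a scan.

```python
# Plain tuples as a stand-in for (name, count) composites (analogy only).
entries = [("pears", 4), ("apples", 6), ("oranges", 5), ("bananas", 2)]

# Ordering by the whole composite: the first component dominates, so the
# result is alphabetical by name and the counts end up unordered.
composite_order = sorted(entries)

# Ordering by the second field needs a different key, i.e. a different
# layout of the stored data, not just a different query.
by_count = sorted(entries, key=lambda entry: entry[1])


def find_by_second(items, value):
    """Find entries by trailing component only: every item must be examined."""
    return [item for item in items if item[1] == value]
```

Here `composite_order` comes out alphabetical by name, while `by_count` needs its own sort key, which in Cassandra terms means storing the data a second time under a different column name.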
  • 12. Cassandra model of a file system
    Approach 3: De-normalize
      Files CF
        row key:       file_id
        column name:   "file"
        column value:  file_blob
      Directories CF
        row key:       parent_id
        column name:   name
        column value:  file_blob
      • Look at the most common queries and optimize for those
      • Most lookups should require just a single get or slice query
      • Speed vs. space: do we really care if a record is stored twice?
  • 13. Cassandra model of a file system
    Approach 3 works
      • Get a file by id
        file = unpackFile(
            files_cf.get(key=file_id, columns=['file']))
      • List all files in a particular directory
        files = unpackFiles(list(
            directories_cf.get(key=directory_id)))
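The denormalized layout can be sketched with two in-memory dictionaries standing in for the Files and Directories column families. Everything here is an assumption made for illustration: the dictionaries replace real column families, JSON replaces whatever serialization the file blob actually used, and `put_file`, `get_file` and `list_directory` are hypothetical helpers rather than the author's code.

```python
import json

# Stand-ins for the two column families (assumption: plain dicts, no cluster).
files_cf = {}        # row key file_id   -> {"file": blob}
directories_cf = {}  # row key parent_id -> {name: blob}


def pack_file(record):
    """Serialize a file record into a single blob (JSON is an assumption)."""
    return json.dumps(record, sort_keys=True)


def unpack_file(blob):
    return json.loads(blob)


def put_file(record):
    """Write the same blob twice: once per query we want to be cheap."""
    blob = pack_file(record)
    files_cf[record["file_id"]] = {"file": blob}
    directories_cf.setdefault(record["parent_id"], {})[record["name"]] = blob


def get_file(file_id):
    """Get a file by id: a single-row, single-column read."""
    return unpack_file(files_cf[file_id]["file"])


def list_directory(parent_id):
    """List a directory: one row read, columns ordered by file name."""
    row = directories_cf.get(parent_id, {})
    return [unpack_file(row[name]) for name in sorted(row)]
```

The cost of the design shows up in `put_file`: every write touches both structures, and a rename or move has to update the directory row as well as the file row.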
  • 14. Lessons learned
      • CQL isn't necessarily the easiest or best interface
      • Break the golden rule
      • Composites are useful under limited circumstances
      • Avoid wide rows, they can lead to pain
      • Should focus on the queries that are most important
      • Post-processing or Map/Reduce can be used to meet the needs of less common queries
  • 15. Cassandra and network usage: 10 Mbit connection, replicating to 2 nodes
  • 16. Cassandra and network usage
    So how can it be used on a slow WAN?
      • Tune down the message and packet size
          • rpc_send_buff_size_in_bytes
          • rpc_recv_buff_size_in_bytes
          • thrift_framed_transport_size_in_mb
          • thrift_max_message_length_in_mb
      • Be prepared for higher failure rates when things get busy
          • rpc_timeout_in_ms
      • Use an additional cache layer to reduce network I/O
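The slides do not describe how the additional cache layer was built. One common shape for such a layer is a small read-through LRU cache in front of the remote read call, sketched below. The class name, the `fetch` callback and the eviction policy are all assumptions for illustration, not Arkivum's design.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Minimal read-through LRU cache (hypothetical sketch).

    `fetch` is any callable that performs the expensive remote read,
    for example a function wrapping a column family get.
    """

    def __init__(self, fetch, max_entries=1024):
        self._fetch = fetch
        self._max_entries = max_entries
        self._entries = OrderedDict()
        self.misses = 0  # number of reads that went to the backend

    def get(self, key):
        if key in self._entries:
            # Cache hit: mark as most recently used, no network I/O.
            self._entries.move_to_end(key)
            return self._entries[key]
        self.misses += 1
        value = self._fetch(key)
        self._entries[key] = value
        if len(self._entries) > self._max_entries:
            # Evict the least recently used entry.
            self._entries.popitem(last=False)
        return value

    def invalidate(self, key):
        """Drop a key after a write so the next read refetches it."""
        self._entries.pop(key, None)
```

A cache like this only helps read-heavy workloads, and writers need to call `invalidate` (or update the entry) so that readers do not see stale records.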
  • 17. Cassandra and network usage: 10 Mbit connection, replicating to 2 nodes, after tuning
  • 18. Cassandra and network usage: Cassandra replication is better than the DIY alternative
  • 19. Configuring Cassandra is key
    Cassandra has lots of configuration options. Taking time to understand and tweak them is worth the effort. Leaving them as defa