Cassandra summit

Friday, August 10, 12

Transcript of Cassandra summit

Page 6: Cassandra summit

Flexible schema

Easy to scale, increased redundancy

Fast enough for web requests

Consolidate existing services

Hadoop support

Page 9: Cassandra summit

FUD

No ad-hoc queries

No Indexes

No range queries

Limited tooling

Code complexity

Page 12: Cassandra summit

Thrift

CQL

REST

Page 13: Cassandra summit

SOLR Schema

<?xml version="1.0" encoding="UTF-8" ?>
<schema name="my_column_family" version="1.0">
  <types>
    <fieldType name="string" class="solr.StrField"/>
    <fieldType name="date" class="solr.DateField"/>
  </types>
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="name" type="string" indexed="true" stored="true"/>
    <field name="released_at" type="date" indexed="true" stored="true"/>
  </fields>
  <uniqueKey>id</uniqueKey>
  <defaultSearchField>name</defaultSearchField>
</schema>

Page 14: Cassandra summit

Basic Queries

http://localhost:8983/solr/my_keyspace.my_column_family/select?q=name:foo

SELECT * FROM my_column_family WHERE solr_query='name:foo';
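
The two query forms above carry the same Solr query string, once over HTTP and once through CQL. A minimal Python sketch of building both follows; the helper names `solr_select_url` and `solr_cql` are hypothetical, and the host, port, keyspace, and column family are simply the values shown on the slide.

```python
from urllib.parse import urlencode


def solr_select_url(keyspace, column_family, query, host="localhost", port=8983):
    """Build the HTTP select URL for the keyspace.column_family Solr core.

    Hypothetical helper: the URL layout is copied from the slide.
    """
    core = f"{keyspace}.{column_family}"
    # safe=":" leaves the field:value separator unescaped, as on the slide.
    params = urlencode({"q": query}, safe=":")
    return f"http://{host}:{port}/solr/{core}/select?{params}"


def solr_cql(column_family, query):
    """Build the CQL statement that passes the same query via solr_query.

    Hypothetical helper: single quotes are doubled, the usual CQL string escape.
    """
    escaped = query.replace("'", "''")
    return f"SELECT * FROM {column_family} WHERE solr_query='{escaped}';"


if __name__ == "__main__":
    print(solr_select_url("my_keyspace", "my_column_family", "name:foo"))
    print(solr_cql("my_column_family", "name:foo"))
```

String building is used here only to show the correspondence; application code would normally go through a Solr or Cassandra client library.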

Page 15: Cassandra summit

Wide Rows

<?xml version="1.0" encoding="UTF-8" ?>
<schema name="my_column_family" version="1.0">
  <types>
    <fieldType name="string" class="solr.StrField"/>
    <fieldType name="date" class="solr.DateField"/>
  </types>
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="name" type="string" indexed="true" stored="true"/>
    <field name="released_at" type="date" indexed="true" stored="true"/>
    <dynamicField name="wide_*" type="string" indexed="true" stored="true"/>
  </fields>
  <uniqueKey>id</uniqueKey>
  <defaultSearchField>name</defaultSearchField>
</schema>
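
The `wide_*` dynamic field is what lets columns that are not declared one by one still be indexed. The Python sketch below approximates that matching with a glob; `classify_column` and the sample column names are hypothetical, and Solr's own matching rules may differ in edge cases.

```python
from fnmatch import fnmatchcase

# Field names taken from the schema on this slide.
STATIC_FIELDS = {"id", "name", "released_at"}
DYNAMIC_PATTERNS = ["wide_*"]


def classify_column(column_name):
    """Return how a column would be picked up by the schema, or None.

    Hypothetical helper: explicitly declared fields are checked first,
    then the dynamic field patterns, using a glob as an approximation.
    """
    if column_name in STATIC_FIELDS:
        return "static"
    if any(fnmatchcase(column_name, pattern) for pattern in DYNAMIC_PATTERNS):
        return "dynamic"
    return None


if __name__ == "__main__":
    for column in ["name", "wide_2012_08_10", "other"]:
        print(column, classify_column(column))
```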

Page 16: Cassandra summit

Fuzzy Search

<schema name="my_column_family" version="1.0">
  <types>
    <fieldType name="string" class="solr.StrField"/>
    <fieldType name="ngram" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" preserveOriginal="1"/>
        <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="15"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      </analyzer>
    </fieldType>
  </types>
  <fields>
    <field name="id" type="string" indexed="true" stored="true" />
    <field name="name" type="string" indexed="true" stored="true" />
    <field name="name_fuzzy" type="ngram" indexed="true" stored="true" />
  </fields>
  <copyField source="name" dest="name_fuzzy"/>
  <uniqueKey>id</uniqueKey>
  <defaultSearchField>name</defaultSearchField>
</schema>
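
To see why the `ngram` field type gives substring-style matching, it helps to look at what an n-gram filter with `minGramSize="2"` and `maxGramSize="15"` emits at index time. The Python sketch below is a simplified illustration only: `ngrams` and `matches` are hypothetical names, and the word-delimiter filter from the schema is ignored.

```python
def ngrams(token, min_size=2, max_size=15):
    """Return every substring of token whose length is in [min_size, max_size]."""
    grams = set()
    longest = min(max_size, len(token))
    for size in range(min_size, longest + 1):
        for start in range(len(token) - size + 1):
            grams.add(token[start:start + size])
    return grams


def matches(indexed_value, query_term):
    """True if the query term equals one of the grams indexed for the value.

    Simplified: the query side is not n-grammed, mirroring the whitespace-only
    query analyzer, and no lowercasing is applied since the schema has none.
    """
    return query_term in ngrams(indexed_value)


if __name__ == "__main__":
    print(sorted(ngrams("cass")))
    print(matches("cassandra", "sand"))
```

Because only the index side is expanded into grams, a query term shorter than two characters or longer than fifteen cannot match in this model.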

Page 17: Cassandra summit

• Full-text indexing

• Trigrams

• Rich data formats (PDF, Word, HTML)

• Easy interop (REST, CSV, XML, JSON)

• Geo-spatial search

• Highlighting

• Auto-suggest

• Faceted search and filtering

Page 19: Cassandra summit

Storm

Page 20: Cassandra summit

Storm

Page 21: Cassandra summit

Increased performance by 700% while growing data by 500%

Page 22: Cassandra summit

Reduced operational costs by 40%

Page 23: Cassandra summit

Deleted 15,000 lines of code
