Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

Transcript of Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

Page 1: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

Eric Redmond @coderoshi

Riak Search 2.0

Page 2: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

After two years, Riak 2.0 is out

Page 3: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

┬─┬ ( ^_^ノ)

Setup

Page 4: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

## To enable Search set this 'on'.
search = off

## The port number which Solr binds to.
search.solr_port = 10014

## The port number which Solr JMX binds to.
search.solr_jmx_port = 10013

## The arguments to pass to the Solr JVM. Non-standard
## arguments, i.e. -XX, may not be portable across JVM
## implementations. E.g. -XX:+UseCompressedStrings.
search.solr_jvm_args = -Xms1g -Xmx1g -XX:+UseStringCache -XX:+UseCompressedOops

riak.conf

Page 5: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

## To enable Search set this 'on'.
search = on

## The port number which Solr binds to.
search.solr_port = 10014

## The port number which Solr JMX binds to.
search.solr_jmx_port = 10013

## The arguments to pass to the Solr JVM. Non-standard
## arguments, i.e. -XX, may not be portable across JVM
## implementations. E.g. -XX:+UseCompressedStrings.
search.solr_jvm_args = -Xms1g -Xmx1g -XX:+UseStringCache -XX:+UseCompressedOops

riak.conf
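As a quick sanity check on the settings above, here is a minimal Ruby sketch that parses riak.conf-style `key = value` lines and reports whether search is switched on. The helper name `parse_riak_conf` and the inline sample text are illustrative assumptions; in practice the text would come from the node's own riak.conf.

```ruby
# Parse riak.conf-style "key = value" lines into a Hash, skipping comments.
# parse_riak_conf is a hypothetical helper, not part of Riak or its clients.
def parse_riak_conf(text)
  text.each_line.with_object({}) do |line, settings|
    line = line.strip
    next if line.empty? || line.start_with?('#')
    key, value = line.split('=', 2)
    next if value.nil?
    settings[key.strip] = value.strip
  end
end

# Sample text mirroring the slide; normally this would be File.read of riak.conf.
SAMPLE_CONF = <<~CONF
  search = on
  ## The port number which Solr binds to.
  search.solr_port = 10014
  search.solr_jvm_args = -Xms1g -Xmx1g -XX:+UseStringCache -XX:+UseCompressedOops
CONF

settings = parse_riak_conf(SAMPLE_CONF)
puts "search enabled: #{settings['search'] == 'on'}"
puts "Solr port: #{settings['search.solr_port']}"
```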

Page 6: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

riak-admin cluster join [email protected]
... and so on...
riak-admin cluster plan
riak-admin cluster commit

Page 7: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

riak-admin cluster join [email protected]
... and so on...
riak-admin cluster plan
riak-admin cluster commit

riak-admin security add-user eric 12345
riak-admin security add-user admin 123456
riak-admin security grant search.query ON index simple TO any
riak-admin security grant search.admin ON schema TO admin

Page 8: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

require 'riak'
client = Riak::Client.new

# create index
client.create_search_index('simple')

# tie the index to bucket 'cats'
bucket = Riak::Bucket.new(client, 'cats')
bucket.props = { search_index: 'simple' }
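
The same two steps can also be expressed as plain HTTP requests, which may help when the Ruby client is not available. The sketch below only builds the requests with Ruby's standard library and prints them; the `localhost:8098` address and the helper names are assumptions, and sending is left commented out because it needs a running node.

```ruby
require 'net/http'
require 'json'
require 'uri'

# Assumed node address; 8098 is the usual Riak HTTP port.
RIAK = URI('http://localhost:8098')

# Request that creates a search index using the default schema.
def create_index_request(index)
  Net::HTTP::Put.new("/search/index/#{index}", 'Content-Type' => 'application/json')
end

# Request that ties a bucket to an index through its bucket properties.
def bind_bucket_request(bucket, index)
  req = Net::HTTP::Put.new("/buckets/#{bucket}/props", 'Content-Type' => 'application/json')
  req.body = JSON.generate({ props: { search_index: index } })
  req
end

requests = [create_index_request('simple'), bind_bucket_request('cats', 'simple')]
requests.each { |req| puts "#{req.method} #{req.path} #{req.body}" }

# To actually send them (requires a running node):
# Net::HTTP.start(RIAK.host, RIAK.port) { |http| requests.each { |r| http.request(r) } }
```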

Page 9: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

Indexing Datatypes

Page 10: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

UTF8中搜索

חיפוש ב UTF8

Αναζήτηση σε UTF8

Искать в UTF8

Search in UTF8

Page 11: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

Advanced Search

aka. Next Level Search

Page 12: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

solr = RSolr.connect(url: 'http://yokozuna01.bos1:8098/solr/docs')

resp = solr.get('select', params: {q: '*:*'})
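
A query like this returns Solr documents rather than Riak objects, so a common next step is to map each hit back to its bucket and key. The sketch below assumes the response carries the `_yz_rb` (bucket) and `_yz_rk` (key) fields; the sample hash is hypothetical and merely shaped like a Solr JSON response, not real query output.

```ruby
# Hypothetical sample shaped like a Solr JSON response; not real query output.
sample_resp = {
  'response' => {
    'numFound' => 2,
    'docs' => [
      { '_yz_rb' => 'cats', '_yz_rk' => 'stuart' },
      { '_yz_rb' => 'mountains', '_yz_rk' => 'hood' }
    ]
  }
}

# Turn each matching Solr document into a [bucket, key] pair.
# riak_locations is an illustrative helper name.
def riak_locations(resp)
  resp.fetch('response').fetch('docs').map { |doc| [doc['_yz_rb'], doc['_yz_rk']] }
end

riak_locations(sample_resp).each { |bucket, key| puts "#{bucket}/#{key}" }
```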

Page 13: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014
Page 14: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

Facets, Stats, and stuff
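
Facets and stats are switched on through ordinary Solr request parameters. As a rough sketch, the Ruby below builds such a query URL without sending it; the base URL is the one from the RSolr slide, while the helper name and the `age_i` field are assumptions for illustration.

```ruby
require 'uri'

# Base URL taken from the RSolr slide; the stats field name is hypothetical.
SOLR_SELECT = 'http://yokozuna01.bos1:8098/solr/docs/select'

# Build a select URL that asks for facet counts and field statistics only.
def facet_stats_url(query, facet_field:, stats_field:)
  params = {
    'q' => query,
    'wt' => 'json',
    'rows' => 0, # only the aggregate sections are wanted
    'facet' => 'true',
    'facet.field' => facet_field,
    'stats' => 'true',
    'stats.field' => stats_field
  }
  "#{SOLR_SELECT}?#{URI.encode_www_form(params)}"
end

puts facet_stats_url('*:*', facet_field: 'language_s', stats_field: 'age_i')
```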

Page 15: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

eDisMax

defType=edismax
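
To make the parameter concrete, here is a small sketch of an eDisMax request. `qf` (query fields with boosts) and `mm` (minimum should match) are standard eDisMax parameters; the field names, boost, and `mm` value are hypothetical choices, not taken from the talk.

```ruby
require 'uri'

# Assemble eDisMax parameters for free-text user input.
# Field names and boost values here are hypothetical.
def edismax_params(user_input)
  {
    'defType' => 'edismax',
    'q' => user_input,
    'qf' => 'title_s^2 body_t',
    'mm' => '75%',
    'wt' => 'json'
  }
end

query_string = URI.encode_www_form(edismax_params('riak search'))
puts "select?#{query_string}"
```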

Page 16: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

Highlighting

Page 17: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

Languages

Page 18: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

Map/Reduce

Page 19: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

{"inputs": {
   "module":"yokozuna",
   "function":"mapred_search",
   "arg":["docs","title_s:Key* AND language_s:en"]
 },
 "query":[
   {"map":{
      "language":"javascript",
      "keep":false,
      "source":"function(v) { return [1]; }"}},
   {"reduce":{
      "language":"javascript",
      "keep":true,
      "name":"Riak.reduceSum"
   }}
 ]
}
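
Building the job from a Ruby hash avoids hand-written JSON slips such as a missing colon. This is a sketch only: `search_count_job` is an illustrative helper name, and the job is printed rather than submitted.

```ruby
require 'json'

# Build the MapReduce job from the slide: count documents matching a search.
def search_count_job(index, query)
  {
    inputs: { module: 'yokozuna', function: 'mapred_search', arg: [index, query] },
    query: [
      { map: { language: 'javascript', keep: false,
               source: 'function(v) { return [1]; }' } },
      { reduce: { language: 'javascript', keep: true, name: 'Riak.reduceSum' } }
    ]
  }
end

job = search_count_job('docs', 'title_s:Key* AND language_s:en')
puts JSON.pretty_generate(job)

# The job would be POSTed as application/json to a node's /mapred endpoint.
```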

Page 20: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014
Page 21: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

bucket = Riak::Bucket.new(client, 'people')
bucket.props = { search_index: 'faces' }
bucket.get('stuart').store

bucket = Riak::Bucket.new(client, 'cats')
bucket.props = { search_index: 'faces' }
bucket.get('stuart').store

bucket = Riak::Bucket.new(client, 'mountains')
bucket.props = { search_index: 'faces' }
bucket.get('hood').store

Page 22: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

riak-admin bucket-type create faces '{"props":{"search_index":"faces"}}'

bucket = Riak::Bucket.new(client, 'people')
bucket.get('stuart', type:'faces').store

bucket = Riak::Bucket.new(client, 'cats')
bucket.get('stuart', type:'faces').store

bucket = Riak::Bucket.new(client, 'mountains')
bucket.get('stuart', type:'faces').store

Page 23: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

Schemas

Page 24: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

<schema name="default" version="1.5">
  <fields>
    <field name="kinds" type="string" indexed="true" stored="false" multiValued="true" />
    <field name="name" type="string" indexed="true" stored="true" />

    <dynamicField name="*_ss" type="string" indexed="true" stored="true" multiValued="true"/>

    <field name="_yz_ed" type="_yz_str" indexed="true" stored="false"/>
    <field name="_yz_pn" type="_yz_str" indexed="true" stored="false"/>
    <field name="_yz_fpn" type="_yz_str" indexed="true" stored="false"/>
    <field name="_yz_vtag" type="_yz_str" indexed="true" stored="false"/>
    <field name="_yz_node" type="_yz_str" indexed="true" stored="false"/>
    <field name="_yz_rk" type="_yz_str" indexed="true" stored="true"/>
    <field name="_yz_rb" type="_yz_str" indexed="true" stored="true"/>
  </fields>
</schema>
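
To illustrate how a document field lines up with this schema, the sketch below classifies a field name as matching an explicit `field`, the `*_ss` `dynamicField` glob, or neither. It is a simplification: only the two user-facing fields and the one glob from the slide are modeled, the internal `_yz_*` fields are left out, and `classify_field` plus the sample names are hypothetical.

```ruby
# Field names from the slide's schema: two explicit fields and one dynamicField glob.
EXPLICIT_FIELDS = %w[kinds name].freeze
DYNAMIC_PATTERNS = %w[*_ss].freeze

# Return :explicit, :dynamic, or :unmatched for a candidate field name.
# Explicit fields are checked first, then the dynamic globs.
def classify_field(name)
  return :explicit if EXPLICIT_FIELDS.include?(name)
  return :dynamic if DYNAMIC_PATTERNS.any? { |glob| File.fnmatch(glob, name) }
  :unmatched
end

%w[name kinds nicknames_ss age].each do |field|
  puts "#{field}: #{classify_field(field)}"
end
```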

Page 25: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

Fallen Nodes

Page 26: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

Why?

Page 27: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

Thermocline

Page 28: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

Thermocline of Replication

Page 29: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

Goals

Page 30: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

But... why?

Page 31: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014
Page 32: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

How

Page 33: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

Yokozuna = Glue

Page 34: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

Extractors

yz_extractor.erl yz_json_extractor.erl yz_noop_extractor.erl yz_text_extractor.erl yz_xml_extractor.erl yz_doc.erl
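
An extractor turns a stored value into field/value pairs that Solr can index. The following is a rough Ruby illustration of that idea for JSON, not a port of yz_json_extractor.erl: the `extract_fields` helper, the `.` separator for nested keys, and the sample document are all assumptions.

```ruby
require 'json'

# Flatten nested JSON into [field_name, value] pairs, joining nested keys
# with a separator. The '.' default is an assumption, not taken from the slide.
def extract_fields(value, prefix = nil, separator: '.')
  case value
  when Hash
    value.flat_map do |key, child|
      name = prefix ? "#{prefix}#{separator}#{key}" : key.to_s
      extract_fields(child, name, separator: separator)
    end
  when Array
    # Each array element becomes another value under the same field name.
    value.flat_map { |child| extract_fields(child, prefix, separator: separator) }
  else
    [[prefix, value]]
  end
end

# Hypothetical sample document.
doc = JSON.parse('{"name":"stuart","kinds":["cat","pet"],"info":{"visits":23}}')
extract_fields(doc).each { |field, val| puts "#{field} = #{val}" }
```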

Page 35: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

Interfaces

yz_pb_admin.erl yz_pb_search.erl yz_schema.erl yz_wm_extract.erl yz_wm_index.erl yz_wm_schema.erl yz_wm_search.erl

Page 36: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

Security

Page 37: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

Permissions

riak-admin security grant search.admin ON index TO admin

riak-admin security grant search.query ON index TO user

riak-admin security grant search.admin ON schema TO admin

riak-admin security grant search.query ON index wiki TO user

Page 38: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

Administration

yz_pb_admin.erl yz_schema.erl yz_wm_index.erl yz_wm_schema.erl

Page 39: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

Searches

yz_pb_search.erl yz_wm_search.erl

Page 40: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

Backend Magic

yokozuna.erl yz_app.erl yz_general_sup.erl yz_kv.erl yz_misc.erl yz_sup.erl

Page 41: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

Distribution

yz_cover.erl yz_events.erl

Page 42: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

AAE

yz_entropy.erl yz_entropy_mgr.erl yz_exchange_fsm.erl yz_index.erl yz_index_hashtree.erl yz_index_hashtree_sup.erl

Page 43: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

Solr code

EntropyData.java Monitor.java yz_solr.erl yz_solr_proc.erl yz_solr_sup.erl

Page 44: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

Stats

yz_stat.erl yz_stat_worker.erl

Page 45: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

“Testing only shows the presence, not the absence, of bugs”

Dijkstra

aae_test.erl yokozuna_essential.erl yz_errors.erl yz_fallback.erl yz_flag_transitions.erl yz_index_admin.erl yz_languages.erl yz_mapreduce.erl yz_monitor_solr.erl yz_pb.erl yz_rs_migration.erl yz_rt.erl yz_schema_admin.erl yz_siblings.erl yz_solr_start_timeout.erl yz_stat_test.erl yz_wm_extract_test.erl yz_component_tests.erl yz_json_extractor_tests.erl yz_kv_tests.erl yz_misc_tests.erl yz_text_extractor_tests.erl yz_xml_extractor_tests.erl yz_driver.erl yz_file_terms.erl

Page 46: Eric Redmond – Distributed Search on Riak 2.0 - NoSQL matters Barcelona 2014

Thanks

Eric Redmond @coderoshi