Administering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud
Rafał Kuć – Sematext Group, Inc.
@kucrafal @sematext sematext.com
Ta me…
Sematext consultant & engineer
Solr.pl co-founder
Father and husband
SolrCloud Concepts
[Diagram: an application and four Solr servers hosting Shard1, Shard2, a Shard1 replica, and a Shard2 replica]
Local SolrCloud Cluster
java -Dbootstrap_confdir=./solr/revolution/conf -Dcollection.configName=revolution -DzkRun -DnumShards=1 -jar start.jar
Runs embedded ZooKeeper
Bootstraps collection with 1 shard
Starts Solr
Starting Solr Cluster
[Diagram: three ZooKeeper instances and four Solr servers; each Solr server is started with the ZooKeeper ensemble address and holds no collection yet]
-DzkHost=192.168.1.2:2181,192.168.1.1:2181,192.168.1.3:2181
-DzkHost=192.168.1.1:2181,192.168.1.2:2181,192.168.1.3:2181
-DzkHost=192.168.1.3:2181,192.168.1.1:2181,192.168.1.2:2181
-DzkHost=192.168.1.3:2181,192.168.1.1:2181,192.168.1.2:2181
Uploading Collection Configuration
./zkcli.sh -cmd upconfig -zkhost 192.168.1.1:2181 -confdir ./conf/ -confname revolution
[Diagram: the collection configuration is stored in the three ZooKeeper instances and read by Solr]
Collections API
Create
Delete
Reload
Split
Create Alias
Delete Alias
Shard Creation/Deletion
http://wiki.apache.org/solr/SolrCloud
Collection Creation
curl 'http://solrhost:8983/solr/admin/collections?action=CREATE&name=revolution&numShards=3&replicationFactor=4'
name
numShards
replicationFactor
maxShardsPerNode
createNodeSet
collection.configName
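The parameters listed above can also be assembled programmatically. The following is a minimal sketch, not an official client: the helper names `build_create_url` and `min_shards_per_node` are hypothetical, the cluster size of four nodes is an assumption, and nothing is sent over the network. It also shows the arithmetic behind maxShardsPerNode: a collection needs numShards × replicationFactor cores in total.

```python
import math
from urllib.parse import urlencode

def build_create_url(base, name, num_shards, replication_factor, extra=None):
    """Return a Collections API CREATE URL (the request is not sent)."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
    }
    # Optional parameters such as maxShardsPerNode, createNodeSet
    # or collection.configName can be passed in a dict.
    params.update(extra or {})
    return f"{base}/admin/collections?{urlencode(params)}"

def min_shards_per_node(num_shards, replication_factor, live_nodes):
    """Smallest maxShardsPerNode that lets every core find a home."""
    total_cores = num_shards * replication_factor
    return math.ceil(total_cores / live_nodes)

url = build_create_url("http://solrhost:8983/solr", "revolution", 3, 4)
print(url)
# Assumed cluster of 4 nodes: 3 shards x 4 copies = 12 cores to place.
print(min_shards_per_node(3, 4, live_nodes=4))
```

With three shards and a replication factor of four, a four-node cluster would need maxShardsPerNode of at least 3 for all cores to be placed.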
Collection Split Example
$ curl 'http://solr1:8983/solr/admin/collections?action=CREATE&name=collection1&numShards=2&replicationFactor=1'
Collection Split Example
$ curl 'http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=collection1&shard=shard1'
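Conceptually, splitting a shard divides its hash range between two sub-shards. The sketch below only illustrates that idea with an even split of a signed 32-bit range; the starting range is a hypothetical example, the function names are made up for illustration, and Solr's actual partitioning code may differ in details.

```python
def to_hex(value):
    """Format a signed 32-bit integer as unsigned hexadecimal."""
    return format(value & 0xFFFFFFFF, "x")

def split_range(low, high):
    """Split an inclusive signed 32-bit hash range into two equal halves."""
    size = high - low + 1
    mid = low + size // 2
    return (low, mid - 1), (mid, high)

# Hypothetical parent range: the negative half of the 32-bit hash space.
parent = (-2**31, -1)
left, right = split_range(*parent)
print(to_hex(left[0]), to_hex(left[1]))
print(to_hex(right[0]), to_hex(right[1]))
```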
Getting Deeper – CoreAdmin API
curl 'http://solrhost:8983/solr/admin/cores?action=CREATE&name=newcore&collection=revolution&shard=shard2'
collection
shard
numShards
collection.configName
Schema – the API
Reading (Solr 4.2)
Fields
Dynamic fields
Types
Copy fields
Name (4.3)
Version (4.3)
Unique Key (4.3)
Similarity (4.3)
Writing (Solr 4.4)
Adding new fields
Adding copy fields
Reading Your Schema
curl -XGET 'http://solrhost:8983/solr/rev/schema/fields/name'
Full reference: http://wiki.apache.org/solr/SchemaRESTAPI
{
  "responseHeader" : {
    "status" : 0,
    "QTime" : 5
  },
  "field" : {
    "name" : "name",
    "type" : "text_general",
    "indexed" : true,
    "stored" : true
  }
}
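A response like the one above is plain JSON, so it is easy to inspect from a script. This is a small sketch using only the standard library; the function name `describe_field` is hypothetical and the response body is copied from the slide rather than fetched from a server.

```python
import json

# Response body as shown on the slide (not fetched from a live server).
response_body = """
{ "responseHeader": {"status": 0, "QTime": 5},
  "field": {"name": "name", "type": "text_general",
            "indexed": true, "stored": true} }
"""

def describe_field(body):
    """Summarize one field definition returned by the schema API."""
    data = json.loads(body)
    if data["responseHeader"]["status"] != 0:
        raise ValueError("Solr reported a non-zero status")
    field = data["field"]
    flags = [flag for flag in ("indexed", "stored") if field.get(flag)]
    return f'{field["name"]} ({field["type"]}): {", ".join(flags)}'

summary = describe_field(response_body)
print(summary)
```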
Dynamic Schema Modifications
<schemaFactory class="ManagedIndexSchemaFactory">
  <bool name="mutable">true</bool>
  <str name="managedSchemaResourceName">managed-schema</str>
</schemaFactory>
curl -XPUT 'http://solrhost:8983/solr/rev/schema/fields/content' -d '{ "type" : "text", "stored" : "false", "copyFields" : ["catchAll"]}'
curl -XPOST 'http://solrhost:8983/solr/rev/schema/copyFields' -d '[ { "source" : "name", "dest" : [ "text", "personal" ] }]'
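The same kind of write request can be prepared from code. The sketch below builds, but does not send, a PUT request for a new field. The helper name `add_field_request` is hypothetical, the Content-Type header is an assumption, and the payload uses a JSON boolean for "stored" where the slide passes the string "false".

```python
import json
from urllib.request import Request

def add_field_request(base, core, field_name, definition):
    """Build (but do not send) a PUT request that adds one field."""
    body = json.dumps(definition).encode("utf-8")
    return Request(
        f"{base}/{core}/schema/fields/{field_name}",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed header
        method="PUT",
    )

req = add_field_request(
    "http://solrhost:8983/solr", "rev", "content",
    {"type": "text", "stored": False, "copyFields": ["catchAll"]},
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Sending the request (for example with `urllib.request.urlopen(req)`) only works when the schema is managed and mutable, as configured above.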
The Right Directory
_0.fdt _0.fdx _0.fnm _0.nvd
_1.fdt _1.fdx _1.fnm _1.nvd
StandardDirectory
SimpleFSDirectory
NIOFSDirectory
MMapDirectory
NRTCachingDirectory
RAMDirectory
<directoryFactory name="DirectoryFactory" class="solr.NRTCachingDirectoryFactory" />
Segment Merging
[Diagram: segments labeled a through g arranged across Level 0 and Level 1]
Segment Merge Under Control
Merge policy
Merge scheduler
Merge factor
Merge policy configuration
https://cwiki.apache.org/confluence/display/solr/IndexConfig+in+SolrConfig
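To get a feel for what the merge factor controls, here is a deliberately simplified simulation of level-based merging: once a level holds mergeFactor segments, they are merged into one segment on the next level. It is a sketch of the idea behind log-style merge policies, not Solr's implementation; Solr 4 defaults to TieredMergePolicy, which selects merges differently. The function name `add_segment` is hypothetical.

```python
def add_segment(levels, merge_factor):
    """Flush one new segment into level 0 and cascade merges upward.

    levels[i] is the number of segments currently sitting at level i.
    """
    levels[0] += 1
    level = 0
    while levels[level] >= merge_factor:
        levels[level] -= merge_factor   # these segments are merged away...
        if level + 1 == len(levels):
            levels.append(0)
        levels[level + 1] += 1          # ...into one larger segment
        level += 1
    return levels

levels = [0]
for _ in range(25):
    add_segment(levels, merge_factor=10)
print(levels)       # segment count per level
print(sum(levels))  # total segments a search has to visit
```

A lower merge factor means fewer segments to search but more merge work during indexing; a higher one shifts the cost the other way.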
Autocommit or Not?
Automatic data flush (hard commit):
<autoCommit>
  <maxTime>15000</maxTime>
  <maxDocs>1000</maxDocs>
  <openSearcher>false</openSearcher>
</autoCommit>
Automatic index view refresh:
<autoSoftCommit>
  <maxTime>1000</maxTime>
</autoSoftCommit>
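The interplay of maxTime and maxDocs in the hard commit settings can be sketched with a small simulation. It assumes a simplified model: a commit fires when maxDocs updates are pending, or maxTime milliseconds after the first pending update, whichever comes first. The function name and the arrival patterns are hypothetical.

```python
def hard_commit_times(arrivals_ms, max_time_ms=15000, max_docs=1000):
    """Return the times (ms) at which a hard commit would fire.

    arrivals_ms must be sorted timestamps of added documents.
    """
    commits = []
    pending = 0
    deadline = None
    for t in arrivals_ms:
        # The time limit expired before this document arrived.
        if deadline is not None and t >= deadline:
            commits.append(deadline)
            pending, deadline = 0, None
        if pending == 0:
            deadline = t + max_time_ms  # timer starts with first pending doc
        pending += 1
        # The document-count limit was reached.
        if pending >= max_docs:
            commits.append(t)
            pending, deadline = 0, None
    if deadline is not None:
        commits.append(deadline)        # leftover docs wait for the timer
    return commits

steady = hard_commit_times(range(0, 20000, 100))  # 10 docs/s for 20 s
burst = hard_commit_times(range(2500))            # 2500 docs in 2.5 s
print(steady)
print(burst)
```

Under steady, slow indexing the time limit drives commits; during a burst the document limit takes over.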
Caches
q=lucene+revolution
fq=city:Dublin
Solr Cache
Refreshed with IndexSearcher
Configurable
Different purposes
Different implementations
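Because caches are refreshed together with the IndexSearcher, a cached entry is only valid for one view of the index. The toy LRU cache below illustrates that lifecycle. It is a sketch, not one of Solr's cache implementations; the class name is hypothetical, and it simply drops all entries on a new searcher, whereas Solr can also warm the new caches.

```python
from collections import OrderedDict

class SearcherCache:
    """Tiny LRU cache that is valid for a single index view."""

    def __init__(self, size):
        self.size = size
        self.entries = OrderedDict()
        self.hits = 0
        self.lookups = 0

    def get(self, key, compute):
        """Return the cached value, computing and storing it on a miss."""
        self.lookups += 1
        if key in self.entries:
            self.hits += 1
            self.entries.move_to_end(key)     # mark as recently used
            return self.entries[key]
        value = compute(key)
        self.entries[key] = value
        if len(self.entries) > self.size:
            self.entries.popitem(last=False)  # evict least recently used
        return value

    def new_searcher(self):
        """A new index view invalidates everything cached for the old one."""
        self.entries.clear()

    def hit_ratio(self):
        return self.hits / self.lookups if self.lookups else 0.0

cache = SearcherCache(size=2)
matching_docs = lambda fq: {1, 2, 3}           # stand-in for real filtering
cache.get("fq=city:Dublin", matching_docs)     # miss
cache.get("fq=city:Dublin", matching_docs)     # hit
cache.new_searcher()                           # commit opened a new searcher
cache.get("fq=city:Dublin", matching_docs)     # miss again
print(cache.hits, cache.lookups)
```

Frequent searcher reopening therefore lowers the hit ratio, which is one reason to watch cache statistics alongside commit settings.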
Monitoring Importance
What to Pay Attention to?
Cluster State
Health
Shards and replica status
Shard placement
Failing nodes
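A check for the items above can be scripted against the cluster state. The sketch below works on an illustrative, hand-written state document: its layout, the field names, and the node names are assumptions for the example, not output captured from a real cluster, and the function name is hypothetical.

```python
import json

# Illustrative cluster state; layout and field names are assumptions.
cluster_state = json.loads("""
{
  "revolution": {
    "shards": {
      "shard1": {
        "replicas": {
          "core_node1": {"state": "active", "node_name": "192.168.1.1:8983_solr"},
          "core_node2": {"state": "recovering", "node_name": "192.168.1.2:8983_solr"}
        }
      },
      "shard2": {
        "replicas": {
          "core_node3": {"state": "active", "node_name": "192.168.1.3:8983_solr"}
        }
      }
    }
  }
}
""")
live_nodes = {"192.168.1.1:8983_solr", "192.168.1.2:8983_solr"}

def unhealthy_replicas(state, live):
    """List replicas that are not active or whose node is not live."""
    problems = []
    for collection, cdata in state.items():
        for shard, sdata in cdata["shards"].items():
            for replica, rdata in sdata["replicas"].items():
                if rdata["node_name"] not in live:
                    reason = "node not live"
                elif rdata["state"] != "active":
                    reason = rdata["state"]
                else:
                    continue
                problems.append((collection, shard, replica, reason))
    return sorted(problems)

problems = unhealthy_replicas(cluster_state, live_nodes)
for problem in problems:
    print(problem)
```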
Indexing Related Metrics
Index throughput
Document distribution
I/O subsystem metrics
Merging
Search-Related Metrics
Count
Latency
Distribution among nodes
Anomalies and spikes
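Averages tend to hide the anomalies and spikes mentioned above, which is why latency is usually tracked with percentiles as well. The following sketch computes nearest-rank percentiles over a made-up list of query times; the sample values and the function name are hypothetical.

```python
import math

def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct is 1..100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical query times in milliseconds, with one slow outlier.
qtimes_ms = [12, 15, 11, 14, 13, 250, 12, 16, 14, 13]

mean = sum(qtimes_ms) / len(qtimes_ms)
p50 = percentile(qtimes_ms, 50)
p99 = percentile(qtimes_ms, 99)
print(mean, p50, p99)
```

Here the mean is pulled up by a single slow query, while the median shows what most users experience and the 99th percentile exposes the spike.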
Monitoring Memory and GC
Heap details
Pool size
Pool utilization
Garbage collection count
Garbage collection time
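Garbage collection count and time are typically exposed as cumulative counters, so the useful numbers come from the difference between two samples. The sketch below shows that arithmetic; the sample values, the tuple layout, and the function name are hypothetical.

```python
def gc_overhead(prev, curr):
    """Compare two samples of cumulative GC counters.

    Each sample is (wall_clock_ms, gc_count, gc_time_ms).
    """
    elapsed = curr[0] - prev[0]
    collections = curr[1] - prev[1]
    gc_time = curr[2] - prev[2]
    return {
        "collections": collections,
        "avg_pause_ms": gc_time / collections if collections else 0.0,
        "overhead_pct": 100.0 * gc_time / elapsed if elapsed else 0.0,
    }

# Two hypothetical samples taken one minute apart.
stats = gc_overhead(prev=(0, 120, 3400), curr=(60000, 150, 4600))
print(stats)
```

A rising overhead percentage or average pause is often an early sign of heap pressure.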
Monitoring OS Related Metrics
CPU details
Load
I/O activity
Network usage
Solr Administration Panel
Solr & JMX
<jmx />
java -Dcom.sun.management.jmxremote -jar start.jar
Solr & JMX
SPM
Index statistics
Request # and latency
Caches and warmup
CPU
JVM Memory and OS Memory
Garbage collector
OS related statistics
SPM Dashboard
Other Monitoring Tools
Ganglia http://ganglia.sourceforge.net/
New Relic http://www.newrelic.com/
Opsview http://www.opsview.com
Too much is too much
Too hot
Caches
We Are Hiring!
Dig Search?
Dig Analytics?
Dig Big Data?
Dig Performance?
Dig working with and in open-source?
We’re hiring worldwide!
http://sematext.com/about/jobs.html
Rafał Kuć @kucrafal
Sematext @sematext http://sematext.com http://blog.sematext.com
SPM discount code: LR2013SPM20
Thank You!
@ Sematext booth ;)