Lucene Revolution 2013
SIMPLE & “CHEAP” SOLR CLUSTER
Stéphane Gamard, Searchbox CTO
Searchbox - Search as a Service
“We are in the business of providing search engines on demand”
Solr Provisioning
High Availability
• Redundancy
• Sustained QPS
• Monitoring
• Recovery
Index Provisioning
• Collection creation
• Cluster resizing
• Node distribution
Solr Clustering
Before 4.x:
• Master/Slave
• Custom Routing
• Complex Provisioning
[Diagram: load balancers (LB), masters with slaves and backups, monitoring]
Solr Clustering
After 4.x:
• Nodes
• Automatic Routing
• Simple Provisioning
[Diagram: load balancers (LB), nodes, ZooKeeper (ZK) instances, monitoring]
Thank you to the SolrCloud Team !!!
What is SolrCloud?
Backward compatibility
• Plain old Solr (with Lucene 4.x)
• Same schema
• Same solrconfig
• Same plugins
Some plugins might need an update (distrib)
What is SolrCloud?
Centralized configuration
• /conf
• /conf/schema.xml
• /conf/solrconfig.xml
• numShards
• replicationFactor
• ...
What is SolrCloud?
Configuration & Architecture Agnostic Nodes
• ZK driven configuration
• Shard (1 core)
• ZK driven role:
  • Leader
  • Replica
• Peer & Replication
• Disposable
What is SolrCloud?
Automatic Routing
• Smart clients connect to ZK
• Any node can forward a request to a node that can process it
What is SolrCloud?
Collection API
• Abstraction level
• An index is a collection
• A collection is a set of shards
• A shard is a set of cores
• CRUD API for collection
“Collections represent a set of cores with identical configuration. The set of cores of a collection covers the entire index”
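As a rough illustration of the "create" part of that CRUD API, the sketch below builds a Collections API CREATE request URL. The host, port and the collection name "pubmed" are assumptions made for the example; only the URL is built, nothing is sent to a server.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards, replication_factor):
    """Build a Collections API CREATE URL (the request itself is not sent)."""
    params = {
        "action": "CREATE",
        "name": name,                             # hypothetical collection name
        "numShards": num_shards,                  # scaling factor for collection size
        "replicationFactor": replication_factor,  # scaling factor for QPS
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Assumed local Solr instance; adjust the base URL for a real cluster.
url = create_collection_url("http://localhost:8983/solr", "pubmed", 3, 2)
print(url)
```

Issuing a GET on the resulting URL against any node of the cluster should be enough, since the cluster decides where the cores are placed.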
What is SolrCloud?
• Collection: Abstraction level of interaction & config
• Shard: Scaling factor for collection size (numShards)
• Core: Scaling factor for QPS (replicationFactor)
• Node: Scaling factor for cluster size (liveNodes)
=> SolrCloud is highly geared toward horizontal scaling
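The arithmetic behind these scaling factors can be sketched as follows. This is a minimal illustration, assuming one core per shard replica and an even spread of cores over live nodes; the function name is made up for the example.

```python
import math


def cluster_layout(num_shards, replication_factor, live_nodes):
    """Return (total cores, cores per node) for one collection.

    Assumes each shard is served by replication_factor cores and that
    cores are spread as evenly as possible over the live nodes.
    """
    total_cores = num_shards * replication_factor
    cores_per_node = math.ceil(total_cores / live_nodes)
    return total_cores, cores_per_node


# 3 shards x 2 replicas spread over 6 live nodes.
print(cluster_layout(3, 2, 6))
```

Adding shards grows the index capacity, adding replicas grows query capacity, and adding nodes lowers the number of cores each machine has to host.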
That’s SolrCloud
High Availability
• Redundancy: # replicas
• Sustained QPS: # replicas & # shards
• Monitoring: ZK (clusterstatus, livenodes)
• Recovery: peer & replication
nodes => Single effort for scalability
SolrCloud - Design
[Diagram: Collection, Shards, Cores, Nodes]
Key metrics
• Collection size & complexity
• JVM requirement
• Node requirement
SolrCloud - Collection Metrics
Pubmed Index
• ~12M documents
• 7 indexed fields
• 2 TF fields
• 3 sorted fields
• 5 stored fields
A note on sharding “The magic sauce of webscale”
RAM requirement effect
[Chart: RAM versus number of shards]
A note on sharding “The magic sauce of webscale”
Disk requirement effect
[Chart: disk space versus number of shards]
“hidden quote for the book”
SolrCloud - Collection Configuration
Pubmed Index
• ~12M documents
• 7 indexed fields
• 2 TF fields
• 3 sorted fields
• 5 stored fields
Configuration
• numShards: 3
• replicationFactor: 2
• JVM RAM: ~3G
• Disk: ~15G
SolrCloud - Core Sizing
Heuristically inferred from “experience”
• Size on shard, not collection
• Do NOT starve resources on nodes
• Settle for JVM/Disk sizing
• Large amount of spare disk (optimize)
RAM: 3 G
Disk: 60 G
SolrCloud - Cluster Availability
Depends on the nodes!!!
Instance      ram    disk   $/h    Nodes  Min  Size  $/core/m
m1.medium     3.75   410    0.12   1      6    6     87
m1.large      7.5    850    0.24   2      6    12    87
m1.xlarge     15     1690   0.48   5      6    30    70
m2.xlarge     17.1   420    0.41   5      6    30    60
m2.2xlarge    34.2   850    0.82   11     6    66    54
m1.medium     3.75   410    0.12   3      6    18    28
CCtrl (paas)  1.02   420    -      1      6    6     75
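The slide does not say how the table's figures were derived. One plausible reading, offered here only as an assumption, is that the Nodes column counts how many 3 G cores fit in an instance's RAM, and that $/core/m is the hourly price over a 30-day month divided by that count. The sketch below follows that reading; it comes close to most rows but is not guaranteed to match the author's rounding.

```python
def cores_per_instance(ram_gb, core_ram_gb=3.0):
    """How many cores of core_ram_gb fit in ram_gb (assumed sizing rule)."""
    return int(ram_gb // core_ram_gb)


def monthly_cost_per_core(price_per_hour, cores, hours_per_month=24 * 30):
    """Approximate monthly cost of one core on an instance hosting `cores` cores."""
    return price_per_hour * hours_per_month / cores


# (instance, ram, $/h) taken from the table above.
instances = [
    ("m1.medium", 3.75, 0.12),
    ("m1.large", 7.5, 0.24),
    ("m1.xlarge", 15, 0.48),
    ("m2.xlarge", 17.1, 0.41),
    ("m2.2xlarge", 34.2, 0.82),
]

for name, ram, price in instances:
    cores = cores_per_instance(ram)
    print(name, cores, round(monthly_cost_per_core(price, cores), 2))
```

Under this reading, the second m1.medium row (3 cores on 3.75 of RAM) would correspond to packing more cores on the instance than the 3 G rule allows, which is what brings its cost per core down.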
SolrCloud - Monitoring
Solr Monitoring
• clusterstate.json
• /livenodes
Node Monitoring *
• load average
• core-to-resource consumption (Core to CPU)
• collection-to-node consumption (LB logs)
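A monitoring check built on those two ZooKeeper entries could look like the sketch below. It assumes the cluster state has already been read and parsed into a dict, and that it follows the collection / shards / replicas nesting used by Solr 4.x; the sample data and function names are invented for the example.

```python
def active_replicas_per_shard(cluster_state, live_nodes):
    """Count, per (collection, shard), the active replicas hosted on live nodes."""
    report = {}
    for collection, coll in cluster_state.items():
        for shard, shard_info in coll.get("shards", {}).items():
            replicas = shard_info.get("replicas", {}).values()
            report[(collection, shard)] = sum(
                1
                for r in replicas
                if r.get("state") == "active" and r.get("node_name") in live_nodes
            )
    return report


def shards_at_risk(cluster_state, live_nodes, min_replicas=2):
    """List the shards served by fewer than min_replicas healthy replicas."""
    report = active_replicas_per_shard(cluster_state, live_nodes)
    return sorted(key for key, count in report.items() if count < min_replicas)


# Hypothetical parsed cluster state: shard2 has one replica on a dead node.
sample_state = {
    "pubmed": {
        "shards": {
            "shard1": {"replicas": {
                "core_node1": {"state": "active", "node_name": "n1:8983_solr"},
                "core_node2": {"state": "active", "node_name": "n2:8983_solr"},
            }},
            "shard2": {"replicas": {
                "core_node3": {"state": "active", "node_name": "n3:8983_solr"},
                "core_node4": {"state": "active", "node_name": "n4:8983_solr"},
            }},
        }
    }
}
sample_live_nodes = {"n1:8983_solr", "n2:8983_solr", "n3:8983_solr"}

print(shards_at_risk(sample_state, sample_live_nodes))
```

Cross-checking replica state against the live nodes matters because a replica can still be recorded as active after its node has dropped out.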
SolrCloud - Provisioning
Stand-by nodes
• Automatically assigned as replica
• Provides a metric of HA
Node addition * (self healing)
• Scheduled check on cluster congestion
• Automatically spawn new nodes per need
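The scheduled congestion check could be reduced to a small decision function like the one below. The slides do not describe the actual policy, so the load threshold, the use of load average as the only signal, and the function name are all assumptions for illustration.

```python
def nodes_to_spawn(load_averages, max_load=4.0, standby_nodes=0):
    """Decide how many new nodes to request.

    load_averages: mapping of node name -> load average (hypothetical input).
    A node counts as congested when its load exceeds max_load; stand-by
    nodes are assumed to absorb one congested node each.
    """
    congested = sum(1 for load in load_averages.values() if load > max_load)
    return max(0, congested - standby_nodes)


# Two congested nodes, one stand-by node already available.
loads = {"node-a": 5.2, "node-b": 1.1, "node-c": 6.5}
print(nodes_to_spawn(loads, max_load=4.0, standby_nodes=1))
```

Run on a schedule, a check of this kind only requests new machines once the stand-by pool is exhausted.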
SolrCloud - Conclusion
Using SolrCloud is like juggling
• Gets better with practice
• There is always some magic left
• Could become very overwhelming
• When it fails you lose your balls
Test -> Test -> Test -> some more Tests -> Test
Next Steps
What would make our current SolrCloud cluster even more awesome:
• Balance/distribute cores based on machine load
• Standby cores (replicas not serving requests and auto-shutting down)
Further Information
• Requirement for solrCloud:
• Solr Mailing list: solr-[email protected]
• blogs & feed: http://www.searchbox.com/blog/
• Searchbox email: [email protected]
CONFERENCE PARTY
The Tipsy Crow: 770 5th Ave
Starts after Stump The Chump
Your conference badge gets you in the door
TOMORROW
Breakfast starts at 7:30
Keynotes start at 8:30
CONTACT
Stephane [email protected]