Alfresco Tech Talk Live on Solr, August 2012


© Copyright 2012 Ixxus Ltd. Commercial in Confidence

Thanks for joining today’s episode of Tech Talk Live on Solr.

We will start just after 9am PST, 12pm EST, 5pm UK, 6pm CET.

Please use the dial-in numbers or your PC (VoIP) for the audio portion.

All lines will be on mute during the webinar. Please use the chat box to submit questions.


Speaker Bio

•  Principal ECM Architect at Ixxus; has been working on Alfresco solutions for the past 5 years
•  Blogs frequently about Alfresco and Solr at ecmstuff.blogspot.com
•  Author of the book Alfresco 3 Business Solutions
•  Alfresco Certified Engineer
•  At DevCon 2011 he presented a session about customizing the Upload File(s) dialog in Alfresco Share
•  Currently working on an Alfresco and Solr project for a major London client



Agenda

•  What’s Solr?
•  How is it integrated with Alfresco?
•  Why is Alfresco using it?
•  What content is searchable when using Solr?
•  When not to use Solr?
•  Demo



What’s Solr?


•  Apache Solr is an open source enterprise full-text search server
   •  Created by Yonik Seeley for CNET
   •  Contributed to Apache Jan 2006
   •  Version 1.4 released Nov 2009
•  Uses Apache Lucene Java search library at its core for full-text indexing and search
•  Runs within a Servlet container such as Apache Tomcat
•  It supports any programming language since it has REST-like HTTP/XML and JSON APIs
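
Because the API is plain HTTP, a client in any language mostly needs to build a URL. The following is a minimal Python sketch, not taken from the talk: it assumes a Solr instance at the hypothetical address http://localhost:8080/solr and uses the standard /select handler with the q, rows and wt request parameters.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, rows=10, response_format="json"):
    """Build a Solr search URL. base_url is a hypothetical Solr location."""
    params = [("q", query), ("rows", rows), ("wt", response_format)]
    # urlencode escapes characters such as ':' so the query is URL-safe
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Assumed host, port and field name, for illustration only
print(build_select_url("http://localhost:8080/solr", "title:alfresco"))
```

Switching response_format to "xml" would request the XML response format instead of JSON.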


What’s Solr? (continued)


•  Written in Java and provides plug-in interfaces for building extensions to the search server, such as:
   •  Faceted searches and navigation
   •  "Did you mean" spell checking on queries
   •  Query completion
   •  Finding documents that are "More like this"
   •  Results highlighting
   •  Geospatial search for filtering and sorting by distance
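
As an illustration of the first item, faceting is switched on with the facet and facet.field request parameters, and in the default JSON response each facet comes back as a flat list of alternating values and counts. The Python sketch below shows both sides under those assumptions; the field name "mimetype" and the sample counts are made up for the example.

```python
def facet_params(fields):
    """Return Solr request parameters that enable faceting on the given fields."""
    params = [("facet", "true")]
    params += [("facet.field", name) for name in fields]
    return params


def pair_facet_counts(flat):
    """Turn a flat [value, count, value, count, ...] list into a dict."""
    return dict(zip(flat[0::2], flat[1::2]))


# Hypothetical field name and made-up counts, for illustration only
print(facet_params(["mimetype"]))
print(pair_facet_counts(["pdf", 12, "docx", 7]))
```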


What’s Solr? (continued)


•  Conceptually, Solr can be broken down into four main areas:
   •  Schema (schema.xml)
      •  Field types and fields (including dynamic fields)
   •  Configuration (solrconfig.xml)
      •  Lucene index parameters
      •  Request handler mappings
      •  Cache settings
      •  Plug-ins
   •  Indexing
   •  Searching
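
To make the schema area more concrete, the Python sketch below parses a heavily trimmed, hypothetical schema.xml and lists its field types, fields and dynamic fields. The XML fragment and every field name in it are assumptions for illustration; it is not Alfresco's custom schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed schema.xml; names are illustrative only
SCHEMA_XML = """
<schema name="example" version="1.2">
  <types>
    <fieldType name="string" class="solr.StrField"/>
    <fieldType name="text" class="solr.TextField"/>
  </types>
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="title" type="text" indexed="true" stored="true"/>
    <dynamicField name="*_s" type="string" indexed="true" stored="true"/>
  </fields>
</schema>
"""


def summarize_schema(xml_text):
    """Collect the names of field types, fields and dynamic fields."""
    root = ET.fromstring(xml_text.strip())
    return {
        "field_types": [e.get("name") for e in root.iter("fieldType")],
        "fields": [e.get("name") for e in root.iter("field")],
        "dynamic_fields": [e.get("name") for e in root.iter("dynamicField")],
    }


print(summarize_schema(SCHEMA_XML))
```

A dynamic field such as "*_s" matches any field name with that suffix, so documents can carry fields that were not declared one by one.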


What’s Solr? (continued)


•  Security?
   •  Relies on server or container security
   •  Totally open by default, you need to secure it!
   •  No standard document level security model


What’s a Solr Core?


•  A Solr core holds one Lucene index and the supporting Solr configuration for that index

•  Sometimes the word "core" is used synonymously with "index"


How is Solr integrated with Alfresco?

•  Alfresco uses a patched Apache Solr 1.4 version
•  Custom Schema
•  2 cores: [diagram not reproduced]
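
In a multi-core Solr setup, each core is addressed under its own path segment of the Solr web application. The Python sketch below shows that URL pattern. The core names "alfresco" and "archive" and the host and port are assumptions for illustration, since the slide's diagram naming the cores is not available here.

```python
def core_select_url(base_url, core_name):
    """Return the search URL for one core of a multi-core Solr instance."""
    return f"{base_url.rstrip('/')}/{core_name}/select"


# Assumed core names and Solr location, for illustration only
for core in ("alfresco", "archive"):
    print(core_select_url("https://localhost:8443/solr", core))
```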



Why is Alfresco using Solr?

•  The following are some of the reasons why Alfresco decided to move from an embedded Lucene search engine to the stand-alone Solr search server:
   •  Scale content search independently
   •  Make clustering easier
   •  Improve performance
   •  Support cross-locale ordering



What Alfresco Repo content is searchable?

•  What content in the Alfresco repository is searchable via Solr?
   •  All content in the Workspace store (the main store used for most of the live content)
   •  All content in the Archive store (content that has been soft deleted)
•  What’s not searchable?
   •  Content in other stores such as AVM (deprecated and being phased out) is not supported
   •  Multi-tenant searches are not supported in version 4.0.0 (supposed to be supported in later 4.x versions)
   •  Records Management (RM 2.0 will support Solr)



When not to use Solr?

•  If you are not running in Tomcat
•  If you want to use the Records Management module
•  If you require in-transaction indexing
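
The three conditions above can be read as a simple decision rule. The Python sketch below encodes it, assuming the search subsystem is selected with the index.subsystem.name property in alfresco-global.properties, with the values "solr" and "lucene". The function and parameter names are hypothetical.

```python
def choose_index_subsystem(runs_in_tomcat,
                           needs_records_management,
                           needs_in_transaction_indexing):
    """Pick the search subsystem according to the three conditions on the slide."""
    if (not runs_in_tomcat
            or needs_records_management
            or needs_in_transaction_indexing):
        return "lucene"
    return "solr"


def as_property_line(subsystem):
    """Format the choice as a line for alfresco-global.properties (assumed property name)."""
    return "index.subsystem.name=" + subsystem


print(as_property_line(choose_index_subsystem(True, False, False)))
```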



Demo



Cluster Configuration

•  Example of cluster configuration: [diagram not reproduced]



Additional Resources

•  http://docs.alfresco.com/4.0/index.jsp?topic=%2Fcom.alfresco.enterprise.doc%2Fconcepts%2Fsolr-intro.html
•  http://wiki.alfresco.com/wiki/Alfresco_And_SOLR
•  http://lucene.apache.org/solr/
•  http://ecmstuff.blogspot.se/2012/05/getting-going-with-solr-in-alfresco-4.html
