Lucene Case Studies ApacheCon EU 2009

Lucene/Solr Case Studies. Presented by Erik Hatcher, March 25, 2009, at ApacheCon EU, Amsterdam.


Lucene Development with Ant

<index index="${index.base.dir}/index" xmlns="antlib:org.apache.lucene.ant">
  <fileset dir="${files.dir}"/>
</index>


Rossetti Archive

• Purpose: Dramatically improve findability and serendipitous discovery of Rossetti materials

• Data Source: TEI-like XML

• Challenges: case-sensitive search, academic relevance tuning

• Technologies: Lucene, Tapestry

• http://www.rossettiarchive.org/rose/


Collex

• Purpose: Build a search/discover/share platform for scholarly objects, starting with the 19th-century domain (NINES) but aiming to be general purpose.

• Data Sources: RDF and MARC

• Challenges: Tagging update speed

• Technologies: Solr, Ruby on Rails

• http://www.collex.org


Blacklight

• Purpose: Open source, scalable, clean, next-generation library discovery interface.

• Data Sources: MARC, Fedora, EAD, ... anything

• Challenges: academia, competitors

• Technologies: Solr, Ruby, Rails, Java indexer (SolrMarc)

• "A process, not a product"

• http://code4lib.org/node/177

• http://blacklightopac.org/


Blacklight story

• Bethany Nowviskie, Bess Sadler, and Erik Hatcher. “Adapting an Open Source, Scholarly Web 2.0 System for Findability in Library Catalogs.” Library 2.0 Initiatives in Academic Libraries. (Laura Cohen, ed.). Association of College and Research Libraries: Chicago, 2008.


Flare

• Distilled from Blacklight development, proof-of-concept Rails plugin

• Features: suggest, saved searches, pie chart faceting, Simile Timeline/Exhibit integration

• http://wiki.apache.org/solr/Flare


Solritas

• Lightweight, Velocity-templated Solr output

• Rapid prototyping

• http://wiki.apache.org/solr/Solritas
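As a rough illustration of what "Velocity templated Solr output" means in practice, the sketch below builds a Solr request URL that asks for the Velocity response writer. The base URL, the `/select` handler path, and the `browse` template name are assumptions for illustration only; the `wt=velocity` and `v.template` parameter names are as recalled from the VelocityResponseWriter documentation and should be checked against the Solritas wiki page for the Solr version in use.

```python
from urllib.parse import urlencode


def solritas_url(base_url: str, query: str, template: str = "browse") -> str:
    """Build a Solr request URL that selects Velocity-templated output.

    base_url and template are hypothetical values; the handler path
    (/select) is an assumption and may differ in a given deployment.
    """
    params = {
        "q": query,              # the user's search query
        "wt": "velocity",        # ask for the Velocity response writer
        "v.template": template,  # name of the Velocity template to render
    }
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(solritas_url("http://localhost:8983/solr", "lucene ant"))
```

Because the response is rendered server-side from a template, a prototype UI can be iterated on by editing template files rather than writing a separate front-end application.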


LucidFind

• Purpose: Company technology showcase, community focused service. Indexed lucene.apache.org/*: wiki, web, code, issues, e-mail, nice UI

• Challenges: None to speak of

• Technologies: Solr, PHP, Ant

• http://www.lucidimagination.com/search


[Diagram: Solr and Search Results, with content sources Lucene Email, Lucene Wiki, Lucene Web, Lucene Code, Lucene Issue Tracker, Lucid Blog, and Lucid CMS]


Powered by Ant

$ ant -p
Buildfile: build.xml

Main targets:

 archive-focus-logs  Archive rolling log files for posterity
 commit              Commit to Solr
 delete-source       Delete specified source from index
 index-code          Index Lucene projects code
 index-issues        Index JIRA issues and comments
 index-lia           Index Lucene in Action (1st edition)
 index-lucid         Index Lucid site and articles
 index-mail          Index mail
 index-web           Index Lucene web content
 index-wiki          Index Lucene wiki
 optimize            Optimize Solr index
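The slides do not show how the commit, optimize, and delete-source targets are implemented. As a minimal sketch of what such targets might send, the functions below build the standard Solr XML update messages for those three operations. The `source` field name and the idea that each indexed source is tagged with such a field are assumptions for illustration, not details from the talk.

```python
from xml.sax.saxutils import escape


def commit_message() -> str:
    """XML update message asking Solr to commit pending changes."""
    return "<commit/>"


def optimize_message() -> str:
    """XML update message asking Solr to optimize the index."""
    return "<optimize/>"


def delete_by_query_message(field: str, value: str) -> str:
    """XML update message deleting every document matching field:"value".

    The field name is hypothetical; the value is assumed not to contain
    double quotes. XML special characters in the query are escaped.
    """
    query = f'{field}:"{value}"'
    return f"<delete><query>{escape(query)}</query></delete>"


if __name__ == "__main__":
    # Messages a delete-source run for a hypothetical "wiki" source might post.
    print(delete_by_query_message("source", "wiki"))
    print(commit_message())
```

Each message would be sent as the body of an HTTP POST to the Solr update handler; the exact URL depends on the deployment.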


Questions?


Answer: "it depends"


lucidimagination.com


e-book now available! Print coming this summer

http://www.manning.com/hatcher3
