Building SaaS Solutions for Online Media Using Apache Solr
![Page 1: Building SaaS Solutions for Online Media Using Apache Solr](https://reader031.fdocuments.net/reader031/viewer/2022020101/555097f5b4c90595208b46f4/html5/thumbnails/1.jpg)
Building SaaS solutions with Apache Solr
Alberto Mijares, Canoo Engineering AG, 26/05/2011
Twitter: @lemaiol
![Page 2: Building SaaS Solutions for Online Media Using Apache Solr](https://reader031.fdocuments.net/reader031/viewer/2022020101/555097f5b4c90595208b46f4/html5/thumbnails/2.jpg)
Bullet point time!
![Page 3: Building SaaS Solutions for Online Media Using Apache Solr](https://reader031.fdocuments.net/reader031/viewer/2022020101/555097f5b4c90595208b46f4/html5/thumbnails/3.jpg)
What I Will Cover
§ Practical applications of Apache Solr and Apache Lucene: how to increase the time a user spends on a website and do website “cross-selling”.
§ Use case: how Canoo helped Axel Springer Switzerland increase page impressions, time on site and traffic in their online financial newspapers.
§ Key concepts:
• How to achieve this using Lucene & Solr
• How to profit from a SaaS business model
![Page 4: Building SaaS Solutions for Online Media Using Apache Solr](https://reader031.fdocuments.net/reader031/viewer/2022020101/555097f5b4c90595208b46f4/html5/thumbnails/4.jpg)
Who I am
§ Alberto Mijares
§ Canoo Engineering AG
§ Background in web applications and standards:
• Participated in the W3C Semantic Web interest group (SWEO)
• Led the development of web standards compliance tools (Web Accessibility and Mobile Web)
• Led enterprise information retrieval projects in the recent past
• Currently coaching Google Web Toolkit development projects
![Page 5: Building SaaS Solutions for Online Media Using Apache Solr](https://reader031.fdocuments.net/reader031/viewer/2022020101/555097f5b4c90595208b46f4/html5/thumbnails/5.jpg)
Who is Canoo
§ People:
• Dirk Koenig: Groovy founder
• Andres Almiray: Griffon project lead and Java Champion
• Hamlet D’Arcy: Groovy committer and enthusiast
• … almost 40 more top software engineers
§ Products:
• WebTest: framework for web functional testing
• RIA Suite (aka ULC): Java-based RIA framework
• FindIT: information retrieval and search tools
• WMTrans: language analysis tools
![Page 6: Building SaaS Solutions for Online Media Using Apache Solr](https://reader031.fdocuments.net/reader031/viewer/2022020101/555097f5b4c90595208b46f4/html5/thumbnails/6.jpg)
Canoo FindIT
http://www.canoo.com/videos/FindIT.html
![Page 7: Building SaaS Solutions for Online Media Using Apache Solr](https://reader031.fdocuments.net/reader031/viewer/2022020101/555097f5b4c90595208b46f4/html5/thumbnails/7.jpg)
Stop “bullet-pointing”!
![Page 8: Building SaaS Solutions for Online Media Using Apache Solr](https://reader031.fdocuments.net/reader031/viewer/2022020101/555097f5b4c90595208b46f4/html5/thumbnails/8.jpg)
The facts
Axel Springer group is a market leader
Bilanz, Handelszeitung and Stocks
In Switzerland financials are important!
Financial language is German
Online media is the future
![Page 10: Building SaaS Solutions for Online Media Using Apache Solr](https://reader031.fdocuments.net/reader031/viewer/2022020101/555097f5b4c90595208b46f4/html5/thumbnails/10.jpg)
The gap
Make the online versions more profitable
Make all newspapers “market leaders”
![Page 12: Building SaaS Solutions for Online Media Using Apache Solr](https://reader031.fdocuments.net/reader031/viewer/2022020101/555097f5b4c90595208b46f4/html5/thumbnails/12.jpg)
The how
Workshop
“Related articles”
“Cross-selling”
![Page 14: Building SaaS Solutions for Online Media Using Apache Solr](https://reader031.fdocuments.net/reader031/viewer/2022020101/555097f5b4c90595208b46f4/html5/thumbnails/14.jpg)
The analysis
Find a funding model
Use Lucene’s “More like this”
Integrate back the suggestions
Implement a selection mechanism
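As a rough illustration of the “More like this” step: Solr exposes Lucene's MoreLikeThis through request parameters such as `mlt`, `mlt.fl` and `mlt.count`. The sketch below only builds such a request URL. The host, the `id`, `title` and `body` field names and the threshold values are assumptions for illustration, not details of the Axel Springer setup.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; replace with the real host and path.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_mlt_url(article_id, fields=("title", "body"), count=5):
    """Build a Solr query URL asking for documents similar to one article.

    The field names and the mintf/mindf thresholds are illustrative guesses.
    """
    params = [
        ("q", f'id:"{article_id}"'),   # the article to find neighbours for
        ("mlt", "true"),               # switch on the MoreLikeThis component
        ("mlt.fl", ",".join(fields)),  # fields used to compute similarity
        ("mlt.mintf", "2"),            # ignore terms rarer than this in the source doc
        ("mlt.mindf", "2"),            # ignore terms found in fewer docs than this
        ("mlt.count", str(count)),     # number of related articles to return
        ("wt", "json"),                # ask for a JSON response
    ]
    return SOLR_SELECT_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_mlt_url("article-42"))
```

The selection mechanism mentioned on the slide would then pick from the returned candidates; how that was done in the project is not described here.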
![Page 16: Building SaaS Solutions for Online Media Using Apache Solr](https://reader031.fdocuments.net/reader031/viewer/2022020101/555097f5b4c90595208b46f4/html5/thumbnails/16.jpg)
The issues
“More like this” was “experimental”
Works out-of-the-box only in English
Without “semantics”, the suggestions do not always make sense
Indexing full pages produces noise
![Page 18: Building SaaS Solutions for Online Media Using Apache Solr](https://reader031.fdocuments.net/reader031/viewer/2022020101/555097f5b4c90595208b46f4/html5/thumbnails/18.jpg)
The key
![Page 20: Building SaaS Solutions for Online Media Using Apache Solr](https://reader031.fdocuments.net/reader031/viewer/2022020101/555097f5b4c90595208b46f4/html5/thumbnails/20.jpg)
The functional requirements
Discover and index articles
Extract only content
Simple and flexible query service
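The “extract only content” requirement can be sketched with Python's standard `html.parser`: keep the text inside the article container and drop navigation, footers and scripts before indexing. The `article` CSS class used as the container marker is a hypothetical convention, since the slides do not say how the real pages were marked up.

```python
from html.parser import HTMLParser


class ArticleTextExtractor(HTMLParser):
    """Collect text only from inside <div class="article"> (hypothetical marker)."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self, container_class="article"):
        super().__init__()
        self.container_class = container_class
        self.div_depth = 0   # > 0 while inside the article container
        self.skip_depth = 0  # > 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            if self.div_depth > 0:
                self.div_depth += 1      # nested <div> inside the container
            elif self.container_class in classes:
                self.div_depth = 1       # entering the container
        elif tag in self.SKIPPED_TAGS and self.div_depth > 0:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "div" and self.div_depth > 0:
            self.div_depth -= 1
        elif tag in self.SKIPPED_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.div_depth > 0 and self.skip_depth == 0:
            self.chunks.append(text)


def extract_article_text(html):
    """Return the article text of a page as one whitespace-joined string."""
    parser = ArticleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

Only the extracted text would be sent to the index, which addresses the noise that comes from indexing full pages.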
![Page 22: Building SaaS Solutions for Online Media Using Apache Solr](https://reader031.fdocuments.net/reader031/viewer/2022020101/555097f5b4c90595208b46f4/html5/thumbnails/22.jpg)
The funding model
![Page 23: Building SaaS Solutions for Online Media Using Apache Solr](https://reader031.fdocuments.net/reader031/viewer/2022020101/555097f5b4c90595208b46f4/html5/thumbnails/23.jpg)
The business model
SaaS
![Page 24: Building SaaS Solutions for Online Media Using Apache Solr](https://reader031.fdocuments.net/reader031/viewer/2022020101/555097f5b4c90595208b46f4/html5/thumbnails/24.jpg)
The “other” requirements
Lucene-based analysis pipeline
Web oriented platform
Multi-application platform
Reliable, fast and scalable
Plan B?
![Page 26: Building SaaS Solutions for Online Media Using Apache Solr](https://reader031.fdocuments.net/reader031/viewer/2022020101/555097f5b4c90595208b46f4/html5/thumbnails/26.jpg)
The search
Wraps Lucene in a nice way
It is mature and Open Source
Supports scheduling, a REST API, the DataImportHandler (DIH), …
Scalability out-of-the-box
Well documented and has professional support
![Page 28: Building SaaS Solutions for Online Media Using Apache Solr](https://reader031.fdocuments.net/reader031/viewer/2022020101/555097f5b4c90595208b46f4/html5/thumbnails/28.jpg)
The plan
From POC to PROD in “80 days”
![Page 30: Building SaaS Solutions for Online Media Using Apache Solr](https://reader031.fdocuments.net/reader031/viewer/2022020101/555097f5b4c90595208b46f4/html5/thumbnails/30.jpg)
The results
Google Analytics
![Page 32: Building SaaS Solutions for Online Media Using Apache Solr](https://reader031.fdocuments.net/reader031/viewer/2022020101/555097f5b4c90595208b46f4/html5/thumbnails/32.jpg)
The conclusions
![Page 33: Building SaaS Solutions for Online Media Using Apache Solr](https://reader031.fdocuments.net/reader031/viewer/2022020101/555097f5b4c90595208b46f4/html5/thumbnails/33.jpg)
The Q&A
Thanks!
![Page 34: Building SaaS Solutions for Online Media Using Apache Solr](https://reader031.fdocuments.net/reader031/viewer/2022020101/555097f5b4c90595208b46f4/html5/thumbnails/34.jpg)
Sources
§ Links
• http://people.canoo.com/share
• http://www.canoo.com
• http://www.canoo.net
• http://www.leo.org
• http://www.bilanz.ch
• http://www.handelszeitung.ch
• http://www.stocks.ch
![Page 36: Building SaaS Solutions for Online Media Using Apache Solr](https://reader031.fdocuments.net/reader031/viewer/2022020101/555097f5b4c90595208b46f4/html5/thumbnails/36.jpg)
Architecture
Platform: Apache Solr 1.4.1
Architecture:
Solr container | Web container
Springer Solr | Springer WebApp
Customer 2 Solr | Customer 2 WebApp
Customer 3 Solr | Customer 3 WebApp
Diagram labels: external access, internal access, requests
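The per-customer layout above, one Solr instance paired with one web application, can be sketched as a small routing table. This is only one reading of the diagram: it assumes the web applications are the externally reachable part and the Solr instances are reachable only internally. All host names and paths are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tenant:
    name: str
    webapp_path: str  # externally visible path of the customer's web app (hypothetical)
    solr_url: str     # internal-only Solr endpoint for that customer (hypothetical)


# One entry per customer, mirroring the rows of the architecture diagram.
TENANTS = {
    "springer": Tenant("springer", "/springer", "http://solr.internal:8983/springer"),
    "customer2": Tenant("customer2", "/customer2", "http://solr.internal:8983/customer2"),
    "customer3": Tenant("customer3", "/customer3", "http://solr.internal:8983/customer3"),
}


def route_request(path):
    """Return the tenant whose web application path prefixes the request path."""
    for tenant in TENANTS.values():
        if path == tenant.webapp_path or path.startswith(tenant.webapp_path + "/"):
            return tenant
    raise LookupError(f"no tenant configured for path {path!r}")


def solr_select_url(path):
    """Map an external request path to the internal Solr select URL of its tenant."""
    return route_request(path).solr_url + "/select"
```

Keeping each customer's index separate means a new customer is one more table entry rather than a change to an existing one, which fits the SaaS model described in the talk.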