Amazon Cloudsearch Session With Elsevier: re:Invent 2013
![Page 1: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/1.jpg)
© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
Enrich Search User Experience For Different Parts of Your Application Using Amazon CloudSearch
Jon Handler, CloudSearch Solution Architect
November 15, 2013
![Page 2: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/2.jpg)
Agenda • Sourcing your documents • Retrieval and ranking • Search user interface • Performance and Scale
• Developer example: Peter Simpkin, Solution Architect, Elsevier
![Page 3: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/3.jpg)
![Page 4: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/4.jpg)
![Page 5: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/5.jpg)
Architecting with CloudSearch
![Page 6: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/6.jpg)
Hands-Off Operation
[Diagram: a grid of search instances, one per index partition (1 to n) and copy (1 to n). One axis is labeled "Document Quantity and Size", the other "Search Request Volume and Complexity".]
![Page 7: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/7.jpg)
MovieMate Application
![Page 8: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/8.jpg)
Multiple Sources Multiple Functions
![Page 9: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/9.jpg)
Screen 1: search results for "Iron Man", with a Cancel button
• Iron Man (2008): When wealthy industrialist Tony Stark is forced to build an armored suit after a life-threatening incident, he ultimately decides to use its technology to fight against evil.
• Iron Man 2 (2010): Tony Stark has declared himself Iron Man and installed world peace... or so he thinks. He soon realizes that not only is there a mad man...
• Iron Man 3 (2013): When Tony Stark's world is torn apart by a formidable terrorist called the Mandarin, he starts an odyssey of rebuilding and retribution.
• The Man With The Iron Fists (2012): On the hunt for a fabled treasure of gold, a band of warriors, assassins, and a rogue British soldier descend upon a village in feudal China, where a humble blacksmith...
Screen 2: "Iron Man", with a Done button
Tab bar on both screens: Movies, Search, Social, Account, Nearby
Mobile Experience
![Page 10: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/10.jpg)
Agenda • Sourcing your documents • Retrieval and ranking • Search user interface • Performance and Scale
• Developer example: Peter Simpkin, Elsevier Oxford
Sourcing your documents Retrieval and ranking Search user interface Performance and Scale Developer example
![Page 11: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/11.jpg)
CloudSearch Documents • Unique identifier • Version • Fields
– Indexed according to configuration – Source of matches
![Page 12: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/12.jpg)
Amazon RDS
Application Content Movie data Theater data User reviews, lists etc.
DynamoDB
User actions
Amazon S3
Help files Media (clips, images) Articles
![Page 13: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/13.jpg)
Bootstrap Strategy
Source System
Processing Script
Queuing Batching
Amazon EC2
Amazon EC2
Amazon CloudSearch
Amazon SQS
![Page 14: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/14.jpg)
Document Construction
• One source will be the master
for each record
    determine doc id and version
    create fields
    for each auxiliary source
        gather additional data
    send or queue the document
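The construction loop above can be sketched in Python. This is a minimal illustration, not the speaker's code: the record layout, the `auxiliary_sources` callables, and the function names are assumptions. The output shape follows the 2011-02-01 Search Data Format, where each add operation carries `type`, `id`, `version`, `lang`, and `fields`.

```python
import json


def build_sdf_document(record, auxiliary_sources, version):
    """Assemble one SDF 'add' operation from a master record.

    record: dict from the master source; must contain an "id" key (assumed).
    auxiliary_sources: callables that take the record and return extra fields.
    """
    # Lowercase the id, since 2011-02-01 document ids use lowercase characters.
    doc_id = str(record["id"]).lower()
    # Start with the master record's own fields, skipping empty values.
    fields = {k: v for k, v in record.items() if k != "id" and v is not None}
    # Gather additional data from each auxiliary source.
    for lookup in auxiliary_sources:
        fields.update(lookup(record))
    return {
        "type": "add",
        "id": doc_id,
        "version": version,
        "lang": "en",
        "fields": fields,
    }


def build_batch(records, auxiliary_sources, version):
    """Serialize a list of records into one JSON batch, ready to send or queue."""
    docs = [build_sdf_document(r, auxiliary_sources, version) for r in records]
    return json.dumps(docs)
```

A call such as `build_batch([{"id": "M1", "title": "Iron Man"}], [lambda r: {"t_name": "Galaxy"}], version=1)` yields a one-document batch whose fields merge the master record with the looked-up theater name.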
![Page 15: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/15.jpg)
Relational DB
Movie: Title, Description, TheaterID
Theater: Name, AddressesID, ShowtimesID
Addresses: Street, City, State
Showtimes: Date, Time, State
![Page 16: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/16.jpg)
S3 • Clips, images, reviews • Apache Tika to extract content • S3 Metadata for additional fields
![Page 17: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/17.jpg)
DynamoDB

| DynamoDB | CloudSearch |
|---|---|
| Table | Domain |
| Item | Document |
| Attribute | Field |
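As a rough illustration of the item-to-document mapping, the sketch below converts one DynamoDB item, in DynamoDB's typed JSON form (`{"S": ...}`, `{"N": ...}`, `{"SS": [...]}`, `{"NS": [...]}`), into a flat dictionary of field values. The function name and the choice to lowercase attribute names and truncate numbers to integers are assumptions made for this example.

```python
def item_to_fields(item):
    """Convert a DynamoDB-typed item into a flat dict of field values.

    Only string (S), number (N), string-set (SS) and number-set (NS)
    attributes are handled; other types are skipped in this sketch.
    """
    fields = {}
    for name, typed_value in item.items():
        # Each attribute is a one-entry dict: {type_code: value}.
        type_code, value = next(iter(typed_value.items()))
        field = name.lower()
        if type_code == "S":
            fields[field] = value
        elif type_code == "N":
            # DynamoDB transmits numbers as strings; keep the integer part.
            fields[field] = int(float(value))
        elif type_code == "SS":
            fields[field] = list(value)
        elif type_code == "NS":
            fields[field] = [int(float(v)) for v in value]
    return fields
```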
![Page 18: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/18.jpg)
![Page 19: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/19.jpg)
Searching Show Times

| id | title | description | t_name | t_street | date | time |
|---|---|---|---|---|---|---|
| 1 | Iron Man | ... | Galaxy | Main | 11/11 | 12:30pm |
| 2 | Iron Man | ... | Galaxy | Main | 11/11 | 1:15pm |
| 3 | Iron Man | ... | Galaxy | Main | 11/11 | 2:45pm |
| 4 | Iron Man | ... | Galaxy | Main | 11/11 | 6:00pm |
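The show-time rows above are denormalized: one searchable document per movie, theater, and show time. A minimal sketch of that flattening follows; the input dictionaries and the sequential ids are assumptions chosen to mirror the slide.

```python
def showtime_documents(movie, theaters):
    """Flatten a movie and its theaters' show times into one row per show time."""
    docs = []
    for theater in theaters:
        for date, time in theater["showtimes"]:
            docs.append({
                "id": len(docs) + 1,  # sequential id, as in the slide
                "title": movie["title"],
                "description": movie["description"],
                "t_name": theater["name"],
                "t_street": theater["street"],
                "date": date,
                "time": time,
            })
    return docs
```

The cost of this layout is repetition: the title and description are stored once per show time, which trades index size for the ability to filter on theater and time in a single query.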
![Page 20: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/20.jpg)
Heterogeneous Data
![Page 21: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/21.jpg)
Multi Domain
![Page 22: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/22.jpg)
Updating CloudSearch
Amazon EC2 Amazon CloudSearch
Amazon SQS Amazon EC2
Amazon S3 DynamoDB Amazon RDS
Web Server
Users
Update Processor
![Page 23: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/23.jpg)
Section Summary • Multiple sources • Bootstrap / Update • Heterogeneous data
![Page 24: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/24.jpg)
Agenda • Sourcing your documents • Retrieval and ranking • Search user interface • Performance and Scale
• Developer example: Peter Simpkin, Elsevier Oxford
![Page 25: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/25.jpg)
Good Matches
[Mobile search screen for the query "Iron Man", listing Iron Man (2008), Iron Man 2 (2010), Iron Man 3 (2013), and The Man With The Iron Fists (2012).]
![Page 26: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/26.jpg)
The Search Algorithm
• Locate documents that satisfy Boolean constraints
– Usually intersection
• Relevance rank those documents
– Differentiates from databases by relevance
![Page 27: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/27.jpg)
Document Structure Movie
title
description
user_rating
likes
release_date
latitude
longitude
![Page 28: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/28.jpg)
Configuring for Search
• Text fields for individual word search
– User-generated and external text
– titles, descriptions
• Literal fields for exact matches
– Application-generated text like facets
• Integer fields for range searching and ranking
![Page 29: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/29.jpg)
Searching Text
http(s)://<endpoint>/2011-02-01/search?
• Simple searches
– q=<text>
• Filtering
– bq=(or title:'iron' (and description:'iron' description:'man'))
• Filtering with integer ranges
– bq=(and 'iron man' year:..2010)
• Geo filtering
– bq=(and 'iron man' latitude:12700..12900 longitude:5700..5800)
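Query strings like these need URL encoding before they are sent. The sketch below builds a request URL with Python's standard library; the `search_url` helper and the example endpoint are hypothetical, while the `/2011-02-01/search` path and the `q` and `bq` parameters come from the slide.

```python
from urllib.parse import urlencode

API_VERSION = "2011-02-01"


def search_url(endpoint, q=None, bq=None, **extra):
    """Build a search URL for a domain's search endpoint.

    endpoint: host name of the search endpoint (placeholder in the example).
    q: simple text query; bq: Boolean query; extra: any other parameters.
    """
    params = {}
    if q is not None:
        params["q"] = q
    if bq is not None:
        params["bq"] = bq
    params.update(extra)
    # urlencode percent-escapes quotes, colons and parentheses for us.
    return f"https://{endpoint}/{API_VERSION}/search?{urlencode(params)}"
```

For example, `search_url("search-movies.example.com", bq="(and 'iron man' year:..2010)")` produces a URL whose `bq` value is safely escaped.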
![Page 30: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/30.jpg)
Search Results
{
  "rank": "-text_relevance",
  "match-expr": "(label 'iron man')",
  "hits": {
    "found": 204,
    "start": 0,
    "hit": [
      { "id": "sontsst12cf5f88b42" },
      { "id": "sopvopr12ab017f082" },
      { "id": "sorzrpw12ac468a13b" }
    ]
  },
  ...
}
![Page 31: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/31.jpg)
Relevant Results
[Mobile search screen for the query "Iron Man", listing Iron Man (2008), Iron Man 2 (2010), Iron Man 3 (2013), and The Man With The Iron Fists (2012).]
![Page 32: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/32.jpg)
Customizing Ranking
• text_relevance and cs.text_relevance
• Rank expressions
– Compute a score for each document
– &rank=<function>
• Defined in the console
• Defined at query-time
– &q='iron-man'&rank-recency=text_relevance + year &rank=recency
![Page 33: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/33.jpg)
Field Weighting
![Page 34: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/34.jpg)
Field Weighting
• Adjust relative importance of fields
• &rank-title=cs.text_relevance({"weights":{"title":4.0}, "default_weight":1})
![Page 35: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/35.jpg)
Popularity
![Page 36: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/36.jpg)
Popularity
• Convert floating point to integer
• Weight by the number of ranks
• rank-pop=text_relevance + log10(user-rating * number-user-ranks) * 10 + metascore * 3
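To see how the popularity expression behaves, it can be evaluated offline. The sketch below mirrors the slide's formula; the `to_uint` helper (scaling a floating-point rating by 10 before storing it) and the guard against a zero product are assumptions added for this example.

```python
import math


def to_uint(rating, scale=10):
    """Store a floating-point rating as an integer, e.g. 7.9 -> 79 (assumed scale)."""
    return round(rating * scale)


def popularity_score(text_relevance, user_rating, number_user_ranks, metascore):
    """text_relevance + log10(user_rating * number_user_ranks) * 10 + metascore * 3."""
    # Guard: log10 is undefined at 0, so unrated documents contribute nothing.
    product = max(1, user_rating * number_user_ranks)
    return text_relevance + math.log10(product) * 10 + metascore * 3
```

Because of the logarithm, each tenfold increase in rating times vote count adds only 10 points, so a heavily voted title cannot swamp text relevance.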
![Page 37: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/37.jpg)
Freshness
![Page 38: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/38.jpg)
Freshness
• Exponential decay function: $r = c\,e^{-\lambda t}$
• &rank-decay=text_relevance + 200*Math.exp(-0.1*days_ago)
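A short numeric check makes the decay concrete. The sketch below evaluates the boost term with the slide's constants (c = 200, λ = 0.1, t = days_ago); the function name is an assumption.

```python
import math


def freshness_boost(days_ago, c=200.0, decay=0.1):
    """Exponential decay boost: c * exp(-decay * days_ago)."""
    return c * math.exp(-decay * days_ago)
```

With these constants the boost starts at 200 for a document published today and halves roughly every 6.9 days, since the half-life is ln 2 / 0.1 ≈ 6.93.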
![Page 39: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/39.jpg)
[Mobile app screen for "Iron Man" with a Done button and the tab bar: Movies, Search, Social, Account, Nearby.]
Location Sort
![Page 40: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/40.jpg)
Location Sort
• Latitude and longitude expressed as integers
• Denormalized for particular theaters with locations
Movie
title
description
user_rating
likes
release_date
latitude
longitude
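Since the fields must hold unsigned integers, coordinates need an encoding. One possibility, consistent with the ranges in the geo filtering example (latitude:12700..12900, longitude:5700..5800) but not confirmed by the slides, is to shift degrees to be non-negative and keep two decimal places. The function names, offsets, and scale below are assumptions.

```python
def to_uint_coordinate(degrees, offset, scale=100):
    """Shift a signed coordinate by offset and scale it to an integer."""
    return int(round((degrees + offset) * scale))


def encode_location(lat, lon):
    """Encode latitude (-90..90) and longitude (-180..180) as unsigned integers."""
    return to_uint_coordinate(lat, 90), to_uint_coordinate(lon, 180)
```

Under this assumed encoding, a point at latitude 37.77 and longitude -122.42 becomes (12777, 5758).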
![Page 41: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/41.jpg)
Location Sort
• Cartesian distance function: $\sqrt{(lat - lat_{user})^2 + (lon - lon_{user})^2}$
• &rank-geo=sqrt(pow(latitude - lat, 2) + pow(longitude - lon, 2))
• &rank=-geo
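The distance expression can be tried out locally before it is used as a rank expression. This sketch applies the same Cartesian formula to integer-encoded coordinates and orders theaters nearest first; the dictionary layout and function names are assumptions.

```python
import math


def geo_distance(latitude, longitude, user_lat, user_lon):
    """Planar distance between a document's coordinates and the user's."""
    return math.sqrt((latitude - user_lat) ** 2 + (longitude - user_lon) ** 2)


def nearest_first(theaters, user_lat, user_lon):
    """Return theaters ordered by increasing distance from the user."""
    return sorted(
        theaters,
        key=lambda t: geo_distance(t["latitude"], t["longitude"], user_lat, user_lon),
    )
```

A planar distance ignores the fact that a degree of longitude shrinks away from the equator, which is usually acceptable when the goal is only to order nearby results.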
![Page 42: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/42.jpg)
![Page 43: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/43.jpg)
Rank Expressions: Combined
• &rank-combined=text_relevance + 2.0 * geo + 0.5 * popularity + 0.3 * freshness
• &rank=combined
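Tuning the weights is easier with a local model of the combined expression. The sketch below reproduces the weighted sum from the slide and sorts candidate documents by it; the per-document component values are assumed to have been computed already, and the function names are illustrative.

```python
def combined_rank(text_relevance, geo, popularity, freshness):
    """Weighted sum from the slide: 1.0, 2.0, 0.5 and 0.3 respectively."""
    return text_relevance + 2.0 * geo + 0.5 * popularity + 0.3 * freshness


def rank_documents(docs):
    """Order documents by descending combined score.

    Each doc is a dict with text_relevance, geo, popularity and freshness keys.
    """
    return sorted(
        docs,
        key=lambda d: combined_rank(
            d["text_relevance"], d["geo"], d["popularity"], d["freshness"]
        ),
        reverse=True,
    )
```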
![Page 44: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/44.jpg)
Section Summary • Search API basics • Customizing ranking
– Field weighting, popularity, freshness, GEO, combined
• Rank expression comparison tool
![Page 45: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/45.jpg)
Agenda • Sourcing your documents • Retrieval and ranking • Search user interface • Performance and Scale
• Developer example: Peter Simpkin, Elsevier Oxford
![Page 46: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/46.jpg)
Facets
![Page 47: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/47.jpg)
Facets
![Page 48: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/48.jpg)
Simple Faceting: Document
Movie
title
description
genre
![Page 49: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/49.jpg)
Simple Faceting: Configuration
![Page 50: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/50.jpg)
Simple Faceting: Query
q=iron+man&facet=genre
{
  "rank": "-text_relevance",
  "match-expr": "(label 'star wars')",
  "hits": { "found": 7, "start": 0, "hit": [] },
  "facets": {
    "genre": {
      "constraints": [
        { "value": "Family", "count": 62 },
        { "value": "Action/Adventure", "count": 21 },
        { "value": "Drama", "count": 5 },
![Page 51: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/51.jpg)
Simple Faceting: UI
<div class='facet'>
  <ul class='facet_list'>
    <?php
    // Each constraint is an object with a value (the genre name) and a count.
    $genres = $resultsObj->facets->genre->constraints;
    for ($i = 0; $i < count($genres); $i++) {
      $curGenre = $genres[$i]->value;
      $curCount = $genres[$i]->count;
    ?>
    <li class='facet_item'>
      <div class='facet_name'><?=$curGenre?></div>
      <div class='facet_count'><?=$curCount?></div>
    </li>
    <?php } ?>
  </ul>
</div>
![Page 52: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/52.jpg)
Facets
![Page 53: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/53.jpg)
Document
• title: Lincoln
• description: ...
• oscar1: Awards
• oscar2: Awards/Best Actor
• oscar3: Awards/Best Actor/Daniel Day Lewis
Movie: title, description, oscar1, oscar2, oscar3
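The oscar1, oscar2, oscar3 pattern stores every prefix of a category path in its own field, one field per depth. A minimal sketch of generating those fields follows; the function name and the list-of-parts input are assumptions.

```python
def hierarchy_fields(prefix, path_parts):
    """Return one field per depth, each holding the path down to that depth.

    hierarchy_fields("oscar", ["Awards", "Best Actor"]) gives
    {"oscar1": "Awards", "oscar2": "Awards/Best Actor"}.
    """
    return {
        f"{prefix}{depth}": "/".join(path_parts[:depth])
        for depth in range(1, len(path_parts) + 1)
    }
```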
![Page 54: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/54.jpg)
Query
&q=lincoln&facet=oscar1,oscar2,oscar3
{
  "rank": "-text_relevance",
  "hits": {...},
  "facets": {
    "oscar1": { "constraints": [
      { "value": "Awards", "count": 23 },
      { "value": "Nominations", "count": 124 }]},
    "oscar2": { "constraints": [
      { "value": "Awards/Best Actor", "count": 6 },
      { "value": "Awards/Best Actress", "count": 3 }...]},
    "oscar3": { "constraints": [
      { "value": "Awards/Best Actor/Daniel Day Lewis", "count": 1 },
      { "value": "Awards/Best Actor/Denzel Washington", "count": 2 }...]},
![Page 55: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/55.jpg)
Drilldown
• bq=oscar1:'Awards'
• bq=oscar2:'Awards/Best Actor'
• bq=oscar3:'Awards/Best Actor/Daniel Day Lewis'
• bq=(and 'star' oscar2:'Awards/Best Actor')
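When a user clicks a facet value, the application has to pick the field that matches the depth of the selected path. The sketch below derives the field name from the number of path segments and optionally combines it with a text term; the function name is hypothetical, and quote escaping inside values is deliberately not handled.

```python
def drilldown_bq(prefix, selected_path, text=None):
    """Build a bq value that restricts results to one facet path.

    The field index is the depth of the path: "Awards/Best Actor" has
    two segments, so it is matched against <prefix>2.
    """
    depth = selected_path.count("/") + 1
    constraint = f"{prefix}{depth}:'{selected_path}'"
    if text:
        return f"(and '{text}' {constraint})"
    return constraint
```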
![Page 56: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/56.jpg)
Section Summary • Simple faceting • Hierarchical faceting • Hierarchical data handling
![Page 57: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/57.jpg)
Agenda • Sourcing your documents • Retrieval and ranking • Search user interface • Performance and Scale
• Developer example: Peter Simpkin, Elsevier Oxford
![Page 58: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/58.jpg)
The Search Algorithm
• Locate documents that satisfy Boolean constraints
– Usually intersection
• Relevance rank those documents
– Differentiates from databases by relevance
![Page 59: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/59.jpg)
Performance Best Practices • Match set size • Text queries perform better than integer queries • Complex relevance functions
![Page 60: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/60.jpg)
Optimizing Index Size • Trade off literal and uint for cost/performance • Result fields matter most • Enabling faceting increases size
![Page 61: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/61.jpg)
Wrap Up • Sourcing documents from various locations • Building queries and ranking • UI Components for faceting • Getting the most out of your index
![Page 62: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/62.jpg)
Agenda • Sourcing your documents • Retrieval and ranking • Search user interface • Performance and Scale
• Developer example: Peter Simpkin, Elsevier Oxford
![Page 63: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/63.jpg)
Agenda
• Elsevier Intro • Search Problem Statement • Enterprise Content Search • Hints and Tips • CloudSearch Observations
![Page 64: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/64.jpg)
• 7,000+ employees in 26 countries
• 2,200 journals / article market share 25%
• $3B revenue
• Scientific, Technical & Medical
![Page 65: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/65.jpg)
Customers Products Academic Research Institutions
Government & Health
Corporate Research Labs
Individual Researchers
![Page 66: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/66.jpg)
Content Systems
Content Challenges:
• No central place for consumers to discover content
• It is not currently possible to search and retrieve atomic assets
• Assets are not reusable across products
Consumer Platforms
![Page 67: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/67.jpg)
Enterprise Content Search Engine
Search Opportunities:
• Create a comprehensive inventory to easily discover content Elsevier owns
• Provide access to Granular / Modular content they want at will
• Assets must be uniquely addressable
Empower our product development partners
![Page 68: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/68.jpg)
Enterprise Content Search eco-system
Federated Content Warehouse Product Platform Data center
E.U Corporate Data center
U.S Corporate Data center
Amazon S3 DynamoDB
Amazon SWF Amazon CloudSearch
SDF metadata
![Page 69: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/69.jpg)
Simple Search UI
![Page 70: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/70.jpg)
Elsevier Technical Drivers & Approach
• Fully-managed, full featured search service in the cloud
• Automatically scales for data & traffic
• Easy to set up and use
• PoC created in days
• Search Engine as a Service
• Pay-as-you-go pricing model
![Page 71: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/71.jpg)
Hints & Tips (and issn:'0022-1694'
(and type:'1.2'
(and (not action:'D')
(or (and pubstartdate:..2013176 pubenddate:2005002..)
(or (and pubstartdate:2005001
(and pubstarttime:0.. pubstarttime:..235959))
(or (and pubstartdate:2013177 pubstarttime:..235959)
(or (and pubenddate:2005001 pubendtime:0..)
(and pubenddate:2013177
(and pubendtime:..235959 pubendtime:0..)))))))))
• Query Response Time = 5 seconds
![Page 72: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/72.jpg)
Optimising Nested Queries (and issn:'0022-1694' type:'1.2'
(not action:'D')
(or (and pubstartdate:..2013176 pubenddate:2005002..)
(and pubstartdate:2005001 pubstarttime:0..235959)
(and pubstartdate:2013177 pubstarttime:0..235959)
(and pubenddate:2005001 pubendtime:0..)
(and pubenddate:2013177 pubendtime:0..235959)))
• Response Time = 2.5 seconds
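Part of this rewrite is mechanical: a nested `(and a (and b c))` or `(or a (or b c))` can be collapsed into a single operator with several operands. The sketch below does that on a small expression tree built from Python lists. The tree representation and the function names are assumptions, and it does not merge adjacent range clauses on the same field, which the slide's version also does by hand.

```python
def flatten(expr):
    """Collapse directly nested and/or nodes that share the same operator.

    expr is either a string term such as "type:'1.2'" or a list of the
    form [operator, operand, operand, ...].
    """
    if isinstance(expr, str):
        return expr
    op, *args = expr
    flat = [op]
    for arg in (flatten(a) for a in args):
        if isinstance(arg, list) and arg[0] == op and op in ("and", "or"):
            flat.extend(arg[1:])  # hoist the child's operands into the parent
        else:
            flat.append(arg)
    return flat


def render(expr):
    """Render an expression tree in the parenthesized bq syntax."""
    if isinstance(expr, str):
        return expr
    return "(" + " ".join(render(e) for e in expr) + ")"
```

Applied to a tree shaped like the original query, `render(flatten(tree))` yields one `and` with the `issn`, `type`, and `not` clauses as siblings and one `or` with all date alternatives side by side.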
![Page 73: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/73.jpg)
Optimised Nested Query
(and (not action:'D')
(or (and issn:'0022-1694' type:'1.2'
pubstartdate:..2013176 pubenddate:2005002..)
(and issn:'0022-1694' type:'1.2'
pubstartdate:2005001 pubstarttime:0..235959)
(and issn:'0022-1694' type:'1.2'
pubstartdate:2013177 pubstarttime:0..235959)
(and issn:'0022-1694' type:'1.2'
pubenddate:2005001 pubendtime:0..)
(and issn:'0022-1694' type:'1.2'
pubenddate:2013177 pubendtime:0..235959)))
• Response Time = 0.17ms
![Page 74: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/74.jpg)
CloudSearch Observations facilitate knowledge sharing on content matters across Elsevier’s product platforms
ability to leverage content infrastructure and capabilities across Elsevier’s divisions
easy to integrate with existing on-premise Content Systems
speed to market, allows developers to focus building other core Content Strategy components
need to spend time optimising queries to maximise performance
![Page 75: Amazon Cloudsearch Session With Elsevier: re:Invent 2013](https://reader034.fdocuments.net/reader034/viewer/2022051609/547f8c3bb4af9f8f7c8b4707/html5/thumbnails/75.jpg)
Thank You
Please give us your feedback on this presentation
As a thank you, we will select prize winners daily for completed surveys!
SVC302