Introduction to Azure Search
Uploaded by: radoslav-gatev
Why Is Search Important?
Azure Search
A search-as-a-service solution that lets developers incorporate great search experiences into applications without managing infrastructure or needing to become search experts.
Scenarios
If an app offers a lot of content, users will be more effective searching than browsing:
• Ecommerce
• Social content
• Line-of-business applications
Using Azure Search
Create service
Create index
Index data
Search
Tune results
PlaceFinder
1. Provisioning Search Services
A search service:
• Is the scope of capacity, billing, and authentication
• Is managed through the portal or the management API
• May have one or more indexes
• Maps its service name to the API root URL:
https://mysvc.search.windows.net
2. Defining indexes
A search index:
• Is a searchable collection of documents
• Has a schema
• Has various options, e.g. scoring profiles, CORS
• Maps its index name to an API URL:
https://mysvc.search.windows.net/indexes/myindex
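The name-to-URL mapping described above can be sketched in a few lines of Python. The helper names service_root and index_url are made up for illustration; only the URL shapes come from the slides.

```python
from urllib.parse import quote


def service_root(service_name: str) -> str:
    """Return the API root URL for a search service (shape taken from the slides)."""
    return f"https://{service_name}.search.windows.net"


def index_url(service_name: str, index_name: str) -> str:
    """Return the URL of one index inside a service."""
    # quote() guards against characters that are not valid in a URL path segment.
    return f"{service_root(service_name)}/indexes/{quote(index_name, safe='')}"


if __name__ == "__main__":
    print(service_root("mysvc"))
    print(index_url("mysvc", "myindex"))
```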
Index schema
List of fields and their configuration:
• Data types: string, int, double, datetime, boolean, geo-point
• Single valued or collections
Each field can be used for:
• Search
• Suggestions
• Filters
• Sorting
• Facets
• Results
API: Create an index
PUT /indexes/myindex?api-version=2015-02-28-Preview
Host: mysvc.search.windows.net
api-key: [YOUR_ADMIN_KEY]
Content-Type: application/json

{
  "fields": [
    {"name": "placeId", "type": "Edm.String", "key": true},
    {"name": "nameBg", "type": "Edm.String"},
    {"name": "type", "type": "Edm.String"}
  ],
  "corsOptions": { "allowedOrigins": [ "*" ] }
}
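As a rough sketch, the same create-index call can be prepared from Python with only the standard library. The function name build_create_index_request is an assumption for illustration, and the request is built but never sent, so no real service or key is needed. It uses PUT against the index URL, which creates or updates the named index.

```python
import json
from urllib.request import Request


def build_create_index_request(service, index, admin_key, fields,
                               api_version="2015-02-28-Preview"):
    """Build (but do not send) a create-or-update request for an index."""
    url = (f"https://{service}.search.windows.net"
           f"/indexes/{index}?api-version={api_version}")
    body = {
        "fields": fields,
        # Same permissive CORS setting as on the slide; tighten for production.
        "corsOptions": {"allowedOrigins": ["*"]},
    }
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"api-key": admin_key, "Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    fields = [
        {"name": "placeId", "type": "Edm.String", "key": True},
        {"name": "nameBg", "type": "Edm.String"},
    ]
    req = build_create_index_request("mysvc", "myindex", "[YOUR_ADMIN_KEY]", fields)
    print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen would send it; that step is left out here on purpose.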
3. Indexing data
• Data is indexed in batches
o Up to 1000 operations per batch: upload, merge, delete or mergeOrUpload
o POST to …/indexes/myindex/docs/index
• A success response ensures durability
o The client needs to check the response body for the status of each individual operation
• Data becomes searchable a few seconds later
o The data must be indexed first; the delay depends on how busy the system is
API: Batch Upload
POST /indexes/myindex/docs/index?api-version=2015-02-28
Host: mysvc.search.windows.net
api-key: [YOUR_ADMIN_KEY]
Content-Type: application/json

{
  "value": [
    {
      "@search.action": "upload",
      "placeId": "3620764888",
      "nameBg": "Университет по ...",
      "placeType": "university",
      ...
    },
    ...
  ]
}
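The batching rule (at most 1000 operations per request, with per-operation status in the response) might be handled along these lines. The helper names make_batches and failed_keys are hypothetical, and the response shape assumed here (a "value" list of items with "key" and "status") should be checked against the API version in use.

```python
MAX_BATCH = 1000  # per-request operation limit quoted on the slide


def make_batches(docs, action="upload", size=MAX_BATCH):
    """Yield request bodies of at most `size` operations each."""
    batch = []
    for doc in docs:
        # Each operation is the document plus an @search.action marker.
        batch.append({"@search.action": action, **doc})
        if len(batch) == size:
            yield {"value": batch}
            batch = []
    if batch:
        yield {"value": batch}


def failed_keys(response_body):
    """Return keys of operations whose status is falsy (assumed response shape)."""
    return [item["key"] for item in response_body.get("value", [])
            if not item.get("status")]


if __name__ == "__main__":
    docs = ({"placeId": str(i)} for i in range(2500))
    print([len(b["value"]) for b in make_batches(docs)])
```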
Indexing approaches
• Push
o .NET SDK
o REST API
• Pull
o Azure SQL Database
o SQL Server hosted in an Azure VM
o DocumentDB
o Blob Storage
4. Searching
• The scope of a search is an index
• The search API offers a number of options
o Full-text search including user-friendly operators
o Query support: strict filters, sorting, paging and field selection
o Faceting
o Hit highlighting
• Results include scores plus the requested fields
API: Search
• Simple search:
…/docs?search=your search query goes there
• Search with a strict filter:
…/docs?search=Sofia&$filter=placeType eq 'university'
• Search with sorting, paging, field selection:
…/docs?search=Sofia&$orderby=nameBg asc&$top=5&$select=nameBg
• Faceting:
…/docs?search=Sofia&facet=placeType
• Hit highlighting:
…/docs?search=Sofia&highlight=nameEn
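The query strings above contain spaces and quotes, so in practice they need URL encoding. A minimal sketch, assuming a hypothetical helper search_url and the 2015-02-28 version string used elsewhere in the slides:

```python
from urllib.parse import quote, urlencode


def search_url(service, index, params, api_version="2015-02-28"):
    """Build an encoded search URL from a dict of query parameters."""
    # quote_via=quote encodes spaces as %20; safe="$" keeps OData names like $filter readable.
    query = urlencode({**params, "api-version": api_version},
                      quote_via=quote, safe="$")
    return f"https://{service}.search.windows.net/indexes/{index}/docs?{query}"


if __name__ == "__main__":
    print(search_url("mysvc", "myindex", {
        "search": "Sofia",
        "$filter": "placeType eq 'university'",
    }))
    print(search_url("mysvc", "myindex", {
        "search": "Sofia",
        "$orderby": "nameBg asc",
        "$top": 5,
        "$select": "nameBg",
    }))
```

Search requests also need an api-key header (a query key is enough for read-only access); that part is omitted here.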
Geospatial Search
• Search in documents within X km of my location
• Sort results by distance from my location
• Search for documents within a given polygon
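The three geospatial cases map onto the OData functions geo.distance and geo.intersects in $filter and $orderby expressions. The sketch below only builds the expression strings; the field name "location" and the helper names are assumptions, and the polygon helper does not check the counterclockwise point order that the service is understood to expect.

```python
def distance_expr(field, lon, lat):
    """Distance in km from a point; usable in $orderby or inside a $filter."""
    return f"geo.distance({field}, geography'POINT({lon} {lat})')"


def within_km(field, lon, lat, km):
    """$filter expression: documents within `km` kilometres of a point."""
    return f"{distance_expr(field, lon, lat)} le {km}"


def within_polygon(field, ring):
    """$filter expression: documents inside a polygon given as (lon, lat) pairs."""
    ring = list(ring)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # a polygon ring must be closed
    points = ", ".join(f"{lon} {lat}" for lon, lat in ring)
    return f"geo.intersects({field}, geography'POLYGON(({points}))')"


if __name__ == "__main__":
    # Approximate coordinates for central Sofia, used only as sample input.
    print(within_km("location", 23.32, 42.7, 5))
    print(distance_expr("location", 23.32, 42.7) + " asc")
    print(within_polygon("location", [(23.2, 42.6), (23.4, 42.6), (23.4, 42.8), (23.2, 42.8)]))
```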
Search Suggestions
• Building block for auto-complete
• Tricky balance of speed and features
• Suggestions come from document data
5. Tuning
• Default: scoring based on text relevance
• Scoring profiles for tuning scores
o Field weights: relative importance of fields
o Scoring functions: describe what matters to you
• One or more scoring profiles for different scenarios
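To make field weights and scoring functions concrete, here is a sketch of a scoring profile as it might appear in the "scoringProfiles" array of an index definition. The profile name "nearby", the field "location", the parameter name "currentLocation" and the numeric values are all illustrative assumptions; nameBg comes from the slides.

```python
import json


def distance_boost(field, parameter, boost, km):
    """Scoring function that boosts documents close to a reference point."""
    return {
        "type": "distance",
        "fieldName": field,
        "boost": boost,
        "interpolation": "linear",
        "distance": {
            "referencePointParameter": parameter,  # supplied at query time
            "boostingDistance": km,
        },
    }


def scoring_profile(name, weights, functions=()):
    """Combine field weights and scoring functions into one profile."""
    return {
        "name": name,
        "text": {"weights": dict(weights)},
        "functions": list(functions),
    }


if __name__ == "__main__":
    profile = scoring_profile(
        "nearby",
        {"nameBg": 3},  # matches in the name count three times as much
        [distance_boost("location", "currentLocation", boost=2, km=10)],
    )
    print(json.dumps({"scoringProfiles": [profile]}, indent=2))
```

A query would then select the profile by name, so different scenarios can use different profiles on the same index.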
PlaceFinder
• Indexed OpenStreetMap data for Sofia
• Used the Azure SQL indexer, which is powered by a view
• Technologies used:
o ASP.NET Web API
o Azure Search .NET SDK
o KnockoutJS
• Project repository:
https://github.com/RadoslavGatev/PlaceFinder
Scaling
• Partitions (more documents, more storage, parallelism): 1, 2, 3, 4, 6, 12
• Replicas (more queries, more HA): 1 - 12
• R x P <= 36
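The scaling limits above amount to a small validity check. A sketch, using only the numbers from this slide (function names are made up for illustration):

```python
ALLOWED_PARTITIONS = (1, 2, 3, 4, 6, 12)
MAX_REPLICAS = 12
MAX_SEARCH_UNITS = 36  # the R x P <= 36 limit


def search_units(replicas, partitions):
    """Return R x P, or raise ValueError if the combination is not allowed."""
    if partitions not in ALLOWED_PARTITIONS:
        raise ValueError(f"partitions must be one of {ALLOWED_PARTITIONS}")
    if not 1 <= replicas <= MAX_REPLICAS:
        raise ValueError(f"replicas must be between 1 and {MAX_REPLICAS}")
    units = replicas * partitions
    if units > MAX_SEARCH_UNITS:
        raise ValueError(f"{replicas} x {partitions} = {units} exceeds {MAX_SEARCH_UNITS}")
    return units


def valid_combinations():
    """List every (replicas, partitions) pair that satisfies the limits."""
    return [(r, p)
            for p in ALLOWED_PARTITIONS
            for r in range(1, MAX_REPLICAS + 1)
            if r * p <= MAX_SEARCH_UNITS]


if __name__ == "__main__":
    print(search_units(3, 12))        # largest replica count at 12 partitions
    print(len(valid_combinations()))
```

For example, 12 partitions leave room for at most 3 replicas, while 3 or fewer partitions allow the full 12.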
Service tiers

Service tier        Free      Basic (Preview)   Standard S1
Availability SLA    No        Yes*              Yes*
Max documents       10,000    1 million         180** million (15 million/partition)
Max partitions      N/A       1                 12
Max replicas        N/A       3                 12
Max storage         50 MB     2 GB              300** GB (25 GB/partition)

* Minimum two replicas for read-SLA, three replicas for read-write-SLA
** Can be increased by calling Azure support
A few ideas
• Use the cache-aside pattern and Redis Cache
o Reduce monetary costs
o Maintain high availability
• Multi-service scaling
o Use load balancing
o Prevent whole-region failures
o Handle extreme search workloads
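A minimal sketch of the cache-aside idea: look in the cache first and call the search service only on a miss. An in-memory dict stands in for Redis Cache here, and the class name, the TTL value and the search_fn callback are assumptions for illustration.

```python
import time


class CacheAsideSearch:
    """Wrap a search function with a simple time-limited cache."""

    def __init__(self, search_fn, ttl_seconds=60, clock=time.monotonic):
        self._search_fn = search_fn   # the call that actually hits the search service
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}              # query -> (expiry time, results)

    def search(self, query):
        now = self._clock()
        entry = self._cache.get(query)
        if entry is not None and entry[0] > now:
            return entry[1]           # cache hit: no request to the service
        results = self._search_fn(query)          # cache miss: query the service
        self._cache[query] = (now + self._ttl, results)
        return results


if __name__ == "__main__":
    calls = []

    def fake_search(query):
        calls.append(query)
        return [f"result for {query}"]

    cached = CacheAsideSearch(fake_search, ttl_seconds=60)
    cached.search("Sofia")
    cached.search("Sofia")
    print(len(calls))  # the second lookup is served from the cache
```

Repeated queries then cost no search requests until the entry expires, and cached results remain available if the service is briefly unreachable.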
Upcoming events
SQLSaturday #519 in May!
http://www.sqlsaturday.com/519/