Elasticsearch 5 in Amazon Elasticsearch Service


Transcript of Elasticsearch 5 in Amazon Elasticsearch Service

Page 1: Elasticsearch 5 in Amazon Elasticsearch Service

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Elasticsearch 5 in Amazon Elasticsearch Service

Darin Briskman, Amazon Web Services Technical Evangelist

[email protected] or @briskmad

15 Feb 2017

Jon Handler, AWS Principal Solutions Architect

[email protected] or @_searchgeek

Page 2: Elasticsearch 5 in Amazon Elasticsearch Service

Get started at https://aws.amazon.com/elasticsearch-service/

Amazon Search Services

Amazon CloudSearch

Amazon Elasticsearch Service

Page 3: Elasticsearch 5 in Amazon Elasticsearch Service

Amazon Elasticsearch Service

• Open source distributed index

• Managed service using Elasticsearch and Kibana

• Fully managed; zero admin

• Highly available and reliable

• RESTful API for easy integration

Page 4: Elasticsearch 5 in Amazon Elasticsearch Service

Amazon Elasticsearch Service Leading Use Cases

Log Analytics & Operational Monitoring

• Monitor the performance of applications, web servers, and hardware

• Easy to use, powerful data visualization tools to detect issues quickly

• Dig into logs in an intuitive, fine-grained way

• Kibana provides fast, easy visualization

Search

• Application or website provides search capabilities over diverse documents

• Tasked with making this knowledge base searchable and accessible

• Text matching, faceting, filtering, fuzzy search, autocomplete, highlighting, and other search features

• Query API to support application search
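As a rough illustration of the search features listed above, the sketch below builds one request body that combines fuzzy text matching, a filter, a facet-style aggregation, and highlighting. The field names (`title`, `category`), the index contents, and the misspelled search text are hypothetical and are not taken from the talk.

```python
import json


def build_search_body(text, category):
    """Return a search body using match, filter, aggregation, and highlight."""
    return {
        "query": {
            "bool": {
                # Fuzzy text matching tolerates small typos in the query text.
                "must": {"match": {"title": {"query": text, "fuzziness": "AUTO"}}},
                # Filtering narrows the results without affecting the score.
                "filter": {"term": {"category": category}},
            }
        },
        # A terms aggregation gives facet-style counts per category.
        "aggs": {"by_category": {"terms": {"field": "category"}}},
        # Highlighting returns the matching fragments of the title field.
        "highlight": {"fields": {"title": {}}},
    }


body = build_search_body("elasticsaerch", "docs")
print(json.dumps(body, indent=2))
```

A body like this would be sent to the `_search` endpoint of an index; how the request is signed and sent is left out here.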

Page 5: Elasticsearch 5 in Amazon Elasticsearch Service

Leading enterprises trust Amazon Elasticsearch Service for their search and analytics applications

Media & Entertainment

Online Services

Technology

Other

Page 6: Elasticsearch 5 in Amazon Elasticsearch Service

Adobe Developer Platform (Adobe I/O)

P R O B L E M

• Cost-effective monitoring for an XL amount of log data

• Over 200,000 API calls per second at peak: destinations, response times, bandwidth

• Integrate seamlessly with other components of the AWS ecosystem

S O L U T I O N

• Log data is routed with Amazon Kinesis to Amazon Elasticsearch Service, then displayed using AES Kibana

• The Adobe team can easily see traffic patterns and error rates, quickly identifying anomalies and potential challenges

B E N E F I T S

• Management and operational simplicity

• Flexibility to try out different cluster configurations during dev and test

[Diagram labels: Data Sources, Amazon Kinesis Streams, Spark Streaming, Amazon Elasticsearch Service]

Page 7: Elasticsearch 5 in Amazon Elasticsearch Service

McGraw Hill Education

P R O B L E M

• Supporting a wide catalog across multiple services in multiple jurisdictions

• Over 100 million learning events each month

• Tests, quizzes, learning modules begun / completed / abandoned

S O L U T I O N

• Search and analyze test results, student/teacher interaction, teacher effectiveness, student progress

• Analytics of applications and infrastructure are now integrated to understand operations in real time

B E N E F I T S

• Confidence to scale throughout the school year: from 0 to 32 TB in 9 months

• Focus on their business, not their infrastructure

Page 8: Elasticsearch 5 in Amazon Elasticsearch Service

Amazon Elasticsearch Service Benefits

Easy to Use

• Deploy a production-ready Elasticsearch cluster in minutes

• Simplifies time-consuming management tasks such as software patching, failure recovery, backups, and monitoring

Open

• Get direct access to the Elasticsearch open-source API

• Fully compatible with the open source Elasticsearch API, for all code and applications

Secure

• Secure Elasticsearch clusters with AWS Identity and Access Management (IAM) policies with fine-grained access control for users and endpoints

• Automatically applies security patches without disruption, keeping Elasticsearch environments secure

Available

• Provides high availability using Zone Awareness, which replicates data between two Availability Zones

• Monitors the health of clusters and automatically replaces failed nodes, without service disruption

AWS Integrated

• Integrates with Amazon Kinesis Firehose, AWS IoT, and Amazon CloudWatch Logs for seamless data ingestion

• AWS CloudTrail for auditing, AWS Identity and Access Management (IAM) for security, and AWS CloudFormation for cloud orchestration

Scalable

• Scale clusters from a single node up to 20 nodes

• Configure clusters to meet performance requirements by selecting from a range of instance types and storage options, including SSD-powered EBS volumes

Page 9: Elasticsearch 5 in Amazon Elasticsearch Service

Easy to use and scalable

AWS SDK

AWS CLI

AWS CloudFormation

Elastic Load Balancing

AWS IAM

Amazon CloudWatch

AWS CloudTrail

Page 10: Elasticsearch 5 in Amazon Elasticsearch Service

Open

• Drop-in replacement

• Zero-change, no-risk migration to or from open source Elasticsearch

Page 11: Elasticsearch 5 in Amazon Elasticsearch Service

Secure

• Control access based on originating IP or Principal

• Mix policies to provide application access and Kibana access

• Use IAM roles to provide access for other services
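To make the first two bullets concrete, here is a minimal sketch of a resource-based access policy that mixes a principal-based statement (for an application role) with an IP-based statement (for example, for Kibana users in a browser). The account ID, domain name, role name, and CIDR range are made-up placeholders, not values from the talk.

```python
import json

# Hypothetical placeholders, not values from the talk.
DOMAIN_ARN = "arn:aws:es:us-east-1:123456789012:domain/example-domain/*"
APP_ROLE_ARN = "arn:aws:iam::123456789012:role/example-app-role"
OFFICE_CIDR = "192.0.2.0/24"


def build_access_policy(domain_arn, role_arn, cidr):
    """Return a policy with one principal-based and one IP-based statement."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Signed requests made with the application's IAM role.
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": "es:*",
                "Resource": domain_arn,
            },
            {
                # Unsigned requests are allowed only from one IP range.
                "Effect": "Allow",
                "Principal": {"AWS": "*"},
                "Action": "es:*",
                "Resource": domain_arn,
                "Condition": {"IpAddress": {"aws:SourceIp": [cidr]}},
            },
        ],
    }


policy = build_access_policy(DOMAIN_ARN, APP_ROLE_ARN, OFFICE_CIDR)
print(json.dumps(policy, indent=2))
```

In practice the `es:*` action would likely be narrowed to the specific operations each caller needs.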

Page 12: Elasticsearch 5 in Amazon Elasticsearch Service

Available

[Diagram: an Amazon Elasticsearch Service cluster with Instance 1 through Instance 4 split across Availability Zone 1 and Availability Zone 2, with copies of shards 1, 2, and 3 distributed across the instances]

Page 13: Elasticsearch 5 in Amazon Elasticsearch Service

AWS integrated

[Diagram labels: Logstash, REST, CWL Agent, EC2 Instances, Amazon Kinesis, Amazon RDS, Amazon DynamoDB, Amazon SQS Queue, Logstash Cluster, Amazon Elasticsearch Service, Amazon CloudWatch, AWS Lambda, AWS CloudTrail, Access Logs, Amazon VPC Flow Logs, Amazon S3 bucket, AWS IoT, Amazon Kinesis Firehose, Amazon ECS]

Page 14: Elasticsearch 5 in Amazon Elasticsearch Service

Dedicated master nodes improve stability

[Diagram: an Amazon ES cluster with dedicated master nodes alongside data nodes (Instance 1, Instance 2, Instance 3) that hold copies of shards 1, 2, and 3. Data nodes: queries and updates.]

Page 15: Elasticsearch 5 in Amazon Elasticsearch Service

Firehose delivery architecture with transformations

[Diagram labels: data source, source records, Firehose delivery stream, transformed records, Amazon Elasticsearch Service, intermediate Amazon S3 bucket, backup S3 bucket, transformation failure, delivery failure]
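The transformation step in this architecture is typically an AWS Lambda function that receives a batch of base64-encoded records and returns one result per record. The sketch below shows that request and response shape; the `processed` flag it adds is a hypothetical transformation chosen only for illustration.

```python
import base64
import json


def lambda_handler(event, context):
    """Transform each Firehose record and report a per-record result."""
    output = []
    for record in event["records"]:
        try:
            doc = json.loads(base64.b64decode(record["data"]).decode("utf-8"))
            if not isinstance(doc, dict):
                raise ValueError("expected a JSON object")
            # Hypothetical transformation: tag the document before indexing.
            doc["processed"] = True
            data = base64.b64encode(json.dumps(doc).encode("utf-8")).decode("ascii")
            output.append({"recordId": record["recordId"], "result": "Ok", "data": data})
        except ValueError:
            # Unparseable input is returned unchanged and marked as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}


# Local usage example with one valid and one invalid record.
sample_event = {
    "records": [
        {"recordId": "1", "data": base64.b64encode(b'{"status": 200}').decode("ascii")},
        {"recordId": "2", "data": base64.b64encode(b"not json").decode("ascii")},
    ]
}
response = lambda_handler(sample_event, None)
print([r["result"] for r in response["records"]])
```

Records marked `ProcessingFailed` correspond to the "transformation failure" path in the diagram.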

Page 16: Elasticsearch 5 in Amazon Elasticsearch Service

Repository Search

• File metadata and possibly file contents for traditional search

• Lambda to keep the repository current

• Good for up to ~60TB of metadata/source data (current limits)

See also: Indexing S3 Metadata blog post by Amit Sharma
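One way the "Lambda to keep the repository current" idea could work is to map each S3 event notification to an index or delete action on a metadata document. The sketch below covers only that mapping, and sending the actions to the domain is left out. The document fields and the ID scheme are assumptions, not taken from the talk or from the referenced blog post.

```python
from urllib.parse import unquote_plus


def s3_event_to_actions(event):
    """Map S3 event records to (action, doc_id, document) tuples."""
    actions = []
    for rec in event.get("Records", []):
        bucket = rec["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(rec["s3"]["object"]["key"])
        doc_id = f"{bucket}/{key}"  # hypothetical ID scheme
        if rec["eventName"].startswith("ObjectRemoved"):
            actions.append(("delete", doc_id, None))
        else:
            doc = {
                "bucket": bucket,
                "key": key,
                "size": rec["s3"]["object"].get("size"),
                "last_modified": rec.get("eventTime"),
            }
            actions.append(("index", doc_id, doc))
    return actions


# Local usage example with a made-up event.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "eventTime": "2017-02-15T12:00:00.000Z",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "reports/q1+summary.pdf", "size": 1024},
            },
        },
        {
            "eventName": "ObjectRemoved:Delete",
            "s3": {"bucket": {"name": "example-bucket"}, "object": {"key": "old.txt"}},
        },
    ]
}
actions = s3_event_to_actions(sample_event)
print(actions)
```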

Page 17: Elasticsearch 5 in Amazon Elasticsearch Service

Amazon Elasticsearch Service support for Elasticsearch 5

Page 18: Elasticsearch 5 in Amazon Elasticsearch Service

What to do with a terabyte of logs?

Page 19: Elasticsearch 5 in Amazon Elasticsearch Service

Visualize it with Kibana 5!

Page 20: Elasticsearch 5 in Amazon Elasticsearch Service

Scripting with Amazon Elasticsearch Service

Scripting is fully supported using the Painless language. With scripts you can:

• Change the precedence of search results

• Delete index fields by query

• Modify search results to return specific fields

• Alter elements in a field

Painless is explicitly designed for Elasticsearch and is both performant and secure.
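Two of the uses above can be sketched as request bodies that embed short Painless scripts. The field names (`title`, `popularity`, `debug_info`) and the `boost` parameter are hypothetical. The `inline` key is the script form used by the Elasticsearch 5.0 to 5.5 releases, so later versions may expect a different key.

```python
import json

# Change the precedence of results: scale the relevance score by a numeric field.
rerank_body = {
    "query": {
        "function_score": {
            "query": {"match": {"title": "elasticsearch"}},
            "script_score": {
                "script": {
                    "lang": "painless",
                    "inline": "_score * (1 + params.boost * doc['popularity'].value)",
                    "params": {"boost": 0.1},
                }
            },
        }
    }
}

# Delete a field by query: a body for the _update_by_query endpoint that
# removes one field from every document that has it.
remove_field_body = {
    "query": {"exists": {"field": "debug_info"}},
    "script": {
        "lang": "painless",
        "inline": "ctx._source.remove('debug_info')",
    },
}

print(json.dumps(rerank_body, indent=2))
print(json.dumps(remove_field_body, indent=2))
```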

Page 21: Elasticsearch 5 in Amazon Elasticsearch Service

Ingest Pipelines and Processors

When you index documents, you can specify a pipeline. The pipeline can have a series of processors that pre-process the data before indexing. Twenty processors are available. Some are simple:

{ "append": { "field": "field1", "value": ["item2", "item3", "item4"] } }

Others are more complex, like the Grok processor for regex with aliased expressions.
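A pipeline that chains a Grok processor with an append processor could be defined and used roughly as follows. This sketch only builds the requests as (method, path, body) tuples. The pipeline ID, index name, type name, and tag values are hypothetical, and `%{COMBINEDAPACHELOG}` is one of the pattern aliases that ship with Grok.

```python
import json

PIPELINE_ID = "apache-logs"  # hypothetical pipeline name

pipeline = {
    "description": "Parse Apache access log lines and tag them",
    "processors": [
        # Grok: split the raw line into named fields using an aliased regex.
        {"grok": {"field": "message", "patterns": ["%{COMBINEDAPACHELOG}"]}},
        # Append: add values to an array field (created if missing).
        {"append": {"field": "tags", "value": ["apache", "access-log"]}},
    ],
}


def pipeline_requests(pipeline_id, index, doc_type, doc):
    """Return the requests that create the pipeline and index one document through it."""
    return [
        ("PUT", f"/_ingest/pipeline/{pipeline_id}", pipeline),
        ("POST", f"/{index}/{doc_type}?pipeline={pipeline_id}", doc),
    ]


log_line = '127.0.0.1 - - [15/Feb/2017:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "curl/7.50"'
for method, path, body in pipeline_requests(PIPELINE_ID, "weblogs", "log", {"message": log_line}):
    print(method, path)
    print(json.dumps(body))
```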

Page 22: Elasticsearch 5 in Amazon Elasticsearch Service

Lots of New Elasticsearch APIs

/_alias
/_aliases
/_all
/_analyze
/_bulk
/_cache/clear (Index only)
/_cat
/_cluster/allocation/explain
/_cluster/health
/_cluster/pending_tasks
/_cluster/settings (PUT only): indices.breaker.fielddata.limit, indices.breaker.request.limit, indices.breaker.total.limit
/_cluster/state
/_cluster/stats
/_count
/_delete_by_query*
/_explain
/_field_stats
/_flush
/_forcemerge (Index only)
/_mapping
/_mget
/_msearch
/_mtermvectors
/_nodes
/_plugin/kibana
/_recovery (Index only)
/_refresh
/_reindex*
/_rollover
/_search
/_search profile
/_segments (Index only)
/_shard_stores
/_shrink
/_snapshot
/_stats
/_status
/_tasks
/_template
/_termvectors
/_update_by_query*
/_validate

Page 23: Elasticsearch 5 in Amazon Elasticsearch Service

Shrink and Rollover

Shrink an index to a single shard:

POST source_index/_shrink/target_index

Very useful for time-series indexes once ingestion is done!

Roll over an index based on number of documents:

POST logs_index/_rollover

{ "conditions": { "max_docs": 100000 } }
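The two calls above can be assembled programmatically, as in the sketch below, which only builds (method, path, body) tuples. The write-block step reflects the open source Elasticsearch 5 requirement that the source index be read-only before shrinking. That documentation also expects a copy of every shard on a single node, which is not handled here. The `max_age` condition is included as an assumption about what a caller might want alongside `max_docs`.

```python
import json


def shrink_requests(source, target):
    """Requests to block writes on the source index and shrink it to one shard."""
    return [
        # The source index must be read-only before it can be shrunk.
        ("PUT", f"/{source}/_settings", {"index.blocks.write": True}),
        ("POST", f"/{source}/_shrink/{target}",
         {"settings": {"index.number_of_shards": 1}}),
    ]


def rollover_request(alias, max_docs=None, max_age=None):
    """Request to roll over the index behind an alias when a condition is met."""
    conditions = {}
    if max_docs is not None:
        conditions["max_docs"] = max_docs
    if max_age is not None:
        conditions["max_age"] = max_age
    return ("POST", f"/{alias}/_rollover", {"conditions": conditions})


requests_to_send = shrink_requests("source_index", "target_index")
requests_to_send.append(rollover_request("logs_index", max_docs=100000))
for method, path, body in requests_to_send:
    print(method, path, json.dumps(body))
```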

Page 24: Elasticsearch 5 in Amazon Elasticsearch Service

Supported Elasticsearch 5 Plugins

• Smart Chinese Analysis plugin

• Stempel Polish Analysis plugin

• Ingest Processor Attachment plugin

• Ingest GeoIP Processor plugin

• Ingest User Agent Processor plugin

• Mapper Murmur3 plugin

中文 (Chinese), Polskie (Polish)

Page 25: Elasticsearch 5 in Amazon Elasticsearch Service

Testing Ingest Performance

• Load generator: m4.large, single process, single thread

• Amazon Elasticsearch Service: 1 instance, 1 primary, no replicas, EBS gp2 storage

• Data: 1.8 million Apache web log lines, comprising 196 MB

• _bulk API calls with 10K lines per call

• Monitoring data gathered from the load generator process and from the Amazon Elasticsearch Service domain
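A load generator along these lines might batch log lines into `_bulk` request bodies of 10,000 documents each. The sketch below shows only the batching; the index name, type name, and the single `message` field are assumptions, and the test's actual document format is not given in the slides.

```python
import json


def bulk_bodies(lines, index, doc_type, batch_size=10_000):
    """Yield newline-delimited _bulk bodies with batch_size documents each."""
    batch = []
    for line in lines:
        # Each document takes two lines: an action line, then the source.
        batch.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        batch.append(json.dumps({"message": line.rstrip("\n")}))
        if len(batch) == 2 * batch_size:
            yield "\n".join(batch) + "\n"
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        yield "\n".join(batch) + "\n"


# Usage example with three made-up lines and a tiny batch size.
sample_lines = ["GET /a 200\n", "GET /b 404\n", "GET /c 200\n"]
bodies = list(bulk_bodies(sample_lines, "weblogs", "log", batch_size=2))
for body in bodies:
    print(body)
```

Each body must end with a newline, which is why one is appended after joining the lines.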

Page 26: Elasticsearch 5 in Amazon Elasticsearch Service

Ingest Performance Test Results

Amazon Elasticsearch Service with v2.3 Engine

Instance      Avg Index    Docs/sec
m3.medium     3.93 ms      2811
m3.2xlarge    11.83 ms     3966
r3.large      8.87 ms      3932
r3.8xlarge    10.58 ms     4404
i2.2xlarge    11.2 ms      5305

Amazon Elasticsearch Service with v5.1 Engine

Instance      Avg Index    Docs/sec
m3.medium     3.12 ms      3629
m3.2xlarge    11.1 ms      5816
r3.large      8.76 ms      7221
r3.8xlarge    9.59 ms      7726
i2.2xlarge    10.3 ms      9676

Up to 82% more documents per second!
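The per-instance throughput gain can be recomputed from the two docs/sec columns. The short sketch below does that arithmetic using the figures as they appear in this transcript.

```python
# Docs/sec figures copied from the two result tables.
V23_DOCS_PER_SEC = {
    "m3.medium": 2811,
    "m3.2xlarge": 3966,
    "r3.large": 3932,
    "r3.8xlarge": 4404,
    "i2.2xlarge": 5305,
}
V51_DOCS_PER_SEC = {
    "m3.medium": 3629,
    "m3.2xlarge": 5816,
    "r3.large": 7221,
    "r3.8xlarge": 7726,
    "i2.2xlarge": 9676,
}

# Percentage increase in documents per second from v2.3 to v5.1.
gains = {
    name: (V51_DOCS_PER_SEC[name] / V23_DOCS_PER_SEC[name] - 1) * 100
    for name in V23_DOCS_PER_SEC
}

for name, gain in gains.items():
    print(f"{name}: {gain:.1f}% more docs/sec")
```

With these figures the i2.2xlarge row works out to about 82%, in line with the headline number, while the r3.large row comes out slightly higher.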

Page 27: Elasticsearch 5 in Amazon Elasticsearch Service

Migrating from v2.3 to v5.1

The easy way:

1. Create a new Amazon Elasticsearch Service v5.1 cluster

2. Snapshot your v2.3 indexes

3. Restore the indexes to the v5.1 cluster

… but this won’t get most of the benefits of v5.1

There are many breaking changes in v5, documented at https://www.elastic.co/guide/en/elasticsearch/reference/5.1/breaking-changes.html
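The snapshot and restore steps of the easy way could be laid out as the sequence of REST calls below. This is a sketch under several assumptions: the repository name, snapshot name, bucket, region, and role ARN are placeholders, an S3-backed snapshot repository is assumed, and the request signing that Amazon Elasticsearch Service typically requires is left out.

```python
import json

REPO = "migration-repo"     # hypothetical repository name
SNAPSHOT = "pre-upgrade-1"  # hypothetical snapshot name


def migration_plan(bucket, region, role_arn, indices):
    """Return (domain, method, path, body) steps for snapshot and restore."""
    repo_body = {
        "type": "s3",
        "settings": {"bucket": bucket, "region": region, "role_arn": role_arn},
    }
    return [
        # On the v2.3 domain: register the repository and take the snapshot.
        ("v2.3", "PUT", f"/_snapshot/{REPO}", repo_body),
        ("v2.3", "PUT", f"/_snapshot/{REPO}/{SNAPSHOT}", {"indices": indices}),
        # On the v5.1 domain: register the same repository and restore from it.
        ("v5.1", "PUT", f"/_snapshot/{REPO}", repo_body),
        ("v5.1", "POST", f"/_snapshot/{REPO}/{SNAPSHOT}/_restore", {"indices": indices}),
    ]


plan = migration_plan(
    bucket="example-snapshot-bucket",
    region="us-east-1",
    role_arn="arn:aws:iam::123456789012:role/example-snapshot-role",
    indices="logs-*",
)
for domain, method, path, body in plan:
    print(domain, method, path, json.dumps(body))
```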

Page 28: Elasticsearch 5 in Amazon Elasticsearch Service

Three Things to Remember

• Amazon Elasticsearch Service is a drop-in replacement for new and existing Elasticsearch workloads

• Deploy, manage, and scale Elasticsearch more easily in the AWS cloud

• Support for Elasticsearch 5.1 brings scripting, additional plugins, and better performance to Amazon Elasticsearch Service

Page 29: Elasticsearch 5 in Amazon Elasticsearch Service

Find out more: https://aws.amazon.com/elasticsearch-service/

AWS Centralized Logging: https://aws.amazon.com/answers/logging/centralized-logging/

Elasticsearch at the AWS Database Blog: https://aws.amazon.com/blogs/database/category/elasticsearch/

Or ask your Solutions Architect!
