Page 1

© 2020, Amazon Web Services, Inc. or its Affiliates.

Pete Naylor, Sr Database Specialist SA - AWS

Introduction to Amazon DynamoDB - and some use case examples

Page 2

Agenda

• DynamoDB’s place in the history of purpose-built databases

• Key concepts

• How do I use DynamoDB?

- Basic data models and controls

• Challenges with changing and imbalanced loads

- And how DynamoDB deals with them

• Integrated DynamoDB architecture and data flows

Page 3

Database evolution and DynamoDB

Page 4

DynamoDB history: How did we get here?

2004 – Amazon.com operational database scaling woes
https://www.cnet.com/news/amazon-com-hit-with-outages/

2007 – Whitepaper describing internal distributed database solution
http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf

2012 – Reimagined implementation made available via AWS as DynamoDB
https://aws.amazon.com/about-aws/whats-new/2012/01/18/aws-announces-dynamodb/

2018 – Amazon.com almost entirely migrated to Redshift, Aurora, DynamoDB
https://twitter.com/ajassy/status/1060979175098437632

2019 – Continuing rapid innovation informed by unmatched experience
http://amzn.to/2fAJXH9

Page 5

DynamoDB history: Motivations at Amazon (and elsewhere)

1. Reduce dependence on commercially licensed relational database engines

2. Minimize operational complexity and administrative overhead

3. Provide best possible customer experience across all data indexing needs

“A one size fits all database doesn't fit anyone” - Werner Vogels

Page 6

Internet-scale applications

Users: 1M+

Data volume: TB-PB-EB

Locality: Global

Performance: Milliseconds-microseconds

Request rate: Millions

Access: Mobile, IoT, devices

Scale: Incrementally based on traffic

Economics: Pay as you go

Developer access: Instant API access

Page 7

By: Arturo Pardavila III
https://www.flickr.com/photos/apardavila/albums/72157674624352641

Page 8

Snap (Snapchat)

Database writes peak seconds after the Chicago Cubs win the World Series.

Page 9

A migration from mainframe

• Retail business ran on mainframe, which became a bottleneck

• Migrated financial transaction data to DynamoDB

• Unbounded scale for customers and app developers

“We built a secure and resilient cloud infrastructure that could solve the scalability and reliability problems with a serverless architecture.”

Srini Uppalapati, Capital One

Page 10

SQL and DynamoDB side by side

Traditional SQL          DynamoDB
Optimized for storage    Optimized for compute
Normalized/relational    Denormalized/hierarchical
Ad hoc queries           Instantiated views for known patterns
Scale vertically         Scale horizontally
Good for OLAP            Built for OLTP at scale

Page 11

Core strengths of DynamoDB: Sounds cool, but when should I use it?

Priorities:
✓ Security
✓ Durability
✓ Scalability
✓ Availability
✓ Low Latency
✓ Easy to Use
✓ Easy to Manage

Considerations:
• SQL / NoSQL
• Relational / Non-Relational
• Operational / Analytical

DynamoDB solves for:
• Horizontal scaling
• Decoupled compute/storage
• Asynchronous roll-ups/aggregations
• Availability/Durability/Latency
• Well-known access patterns
• Schema flexibility

Page 12

Key concepts

Page 13

Scaling databases

Traditional SQL: scale up (DB → larger DB)

NoSQL: scale out to many shards (DB partitions P1, P2, P3, … Pn)

Page 14

Scaling NoSQL databases

Most NoSQL databases: servers and clusters

DynamoDB: partitions

[Diagram: a cluster of DB servers on one side; a grid of DynamoDB partitions (P1 … P30) on the other]

Basic premise: There is a way to shard data that’s horizontally scalable.

Page 15

Incremental scaling with DynamoDB

Workload: data volume, reads, writes

DynamoDB resources: storage, read, and write capacity

Page 16

You work with tables… DynamoDB does the rest under the hood…

[Diagram: Table 1, Table 2, Table 3; the partitions of Table 1 (T1.p1 … T1.pn) are placed on Server 1 … Server N]

Per partition: 1K WU or 3K RU, up to 10 GB

Page 17

Terminology

DynamoDB table

Items

Attributes

Primary key – unique item identifier

Partition key
- 1:1 relationships
- distribute traffic
- collection identity

Sort key (optional)
- 1:many relationships
- collect related items
- efficient filtering
- sorting

Sort key conditions:
- all items
- ==, <, >, >=, <=
- “begins with”
- “between”
- sorted results
- counts
- top/bottom N

Page 18

Global secondary index (GSI)

Alternate partition (+sort) key for alternate materialized view

Table: A1 (partition), A2, A3, A4, A5

GSIs:
- KEYS_ONLY: A2 (part.), A1 (table key)
- INCLUDE A3: A5 (part.), A4 (sort), A1 (table key), A3 (projected)
- ALL: A4 (part.), A5 (sort), A1 (table key), A2 (projected), A3 (projected)

Online indexing

Capacity provisioned separately from table

Up to 20 GSIs per table
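The three projection types above can be sketched in a few lines of plain Python. This is only an illustration of which attributes each index entry would carry; `project_item` and its parameters are hypothetical names, not part of any AWS SDK, and the rule that an item missing an index key attribute does not appear in the index is stated here as background, not taken from the slide.

```python
from typing import Any, Dict, Optional, Sequence

Item = Dict[str, Any]


def project_item(
    item: Item,
    table_key: Sequence[str],
    index_key: Sequence[str],
    projection: str,
    include: Sequence[str] = (),
) -> Optional[Item]:
    """Return the attributes an index entry would hold for `item` (hypothetical helper)."""
    # An item that lacks an index key attribute is not present in the index.
    if any(name not in item for name in index_key):
        return None
    if projection == "ALL":
        return dict(item)
    names = list(index_key) + list(table_key)
    if projection == "INCLUDE":
        names += list(include)
    elif projection != "KEYS_ONLY":
        raise ValueError(f"unknown projection type: {projection}")
    return {name: item[name] for name in names if name in item}


item = {"A1": 1, "A2": 2, "A3": 3, "A4": 4, "A5": 5}
keys_only = project_item(item, ["A1"], ["A2"], "KEYS_ONLY")
include_a3 = project_item(item, ["A1"], ["A5", "A4"], "INCLUDE", include=["A3"])
everything = project_item(item, ["A1"], ["A4", "A5"], "ALL")
```

A narrower projection stores fewer attributes per index entry; a query that needs an attribute the index does not carry has to go back to the table.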

Page 19

Sharding/partitioning

DynamoDB table: Orders (partition key: OrderId)

OrderId: 1, CountryCode: 1, ASIN: [B00X4WHP5E]    Hash(1) = 7B
OrderId: 2, CountryCode: 1, ASIN: [B00OQVZDJM]    Hash(2) = 48
OrderId: 3, CountryCode: 1, ASIN: [B00U3FPN4U]    Hash(3) = CD

Keyspace: Hash.MIN = 0 to Hash.MAX = FF
Partition A: 00 to 55
Partition B: 55 to AA
Partition C: AA to FF

Denormalized record: related data is stored together for efficient access
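A minimal sketch of the keyspace lookup on this slide, assuming the three partitions split the 00..FF keyspace at 55 and AA with each boundary value belonging to the higher partition (the slide does not say which side owns a boundary). `toy_hash` uses the first byte of an MD5 digest purely as a stand-in; DynamoDB's real hash function is internal and is not reproduced here.

```python
import hashlib

# (name, inclusive lower bound, exclusive upper bound) over a one-byte keyspace.
PARTITIONS = [("A", 0x00, 0x55), ("B", 0x55, 0xAA), ("C", 0xAA, 0x100)]


def toy_hash(partition_key_value) -> int:
    """Stand-in hash: map a key value to 0..255 (assumption, not DynamoDB's hash)."""
    digest = hashlib.md5(str(partition_key_value).encode("utf-8")).digest()
    return digest[0]


def partition_for_hash(hash_value: int) -> str:
    """Find the partition whose keyspace range contains the hash value."""
    for name, low, high in PARTITIONS:
        if low <= hash_value < high:
            return name
    raise ValueError(f"hash value out of keyspace: {hash_value:#x}")


def partition_for_key(partition_key_value) -> str:
    return partition_for_hash(toy_hash(partition_key_value))


# The hash values shown on the slide land in B, A, and C respectively.
slide_placement = [partition_for_hash(h) for h in (0x7B, 0x48, 0xCD)]
```

The point of the sketch is that placement depends only on the hash of the partition key, so any client or server can locate an item without a directory lookup.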

Page 20

A view “from a different angle”

CustomerOrdersTable

OrderId: 1, CustomerId: 1, ASIN: [B00X4WHP5E]

Hash(1) = 7B

Three replicas across 3 Availability Zones in the Region - one replica is elected to serve as “leader”.

[Diagram: Hosts 1 to 9 spread across Availability Zones A, B, and C; three replicas each of Partition A, Partition B, and Partition C]

Page 21

Modeling data for DynamoDB

Page 22

Partition key

• A good sharding (partitioning) scheme affords even distribution of both data and workload as they grow

• Key concept: partition key as the dimension of scalability
- Distribute traffic and data across partitions – horizontal scaling

• Ideal scaling conditions:
- The partition key is from a high cardinality set (that grows)
- Requests are evenly spread over the key space
- Requests are evenly spread over time
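To make the "evenly spread over the key space" condition concrete, here is a small simulation under simplifying assumptions: three partitions, placement by MD5 hash modulo the partition count (a stand-in for the real scheme), and made-up key names such as `user-42`. It compares a high-cardinality workload with one where every request hits the same key.

```python
import hashlib
from collections import Counter
from typing import Iterable, List


def partition_of(key: str, partitions: int = 3) -> int:
    """Toy placement: hash the partition key and take it modulo the partition count."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % partitions


def traffic_share(request_keys: Iterable[str], partitions: int = 3) -> List[float]:
    """Fraction of requests that each partition receives."""
    counts = Counter(partition_of(key, partitions) for key in request_keys)
    total = sum(counts.values())
    return [counts.get(p, 0) / total for p in range(partitions)]


# 3000 requests over 3000 distinct keys versus 3000 requests for one hot key.
even = traffic_share(f"user-{i}" for i in range(3000))
hot = traffic_share("user-42" for _ in range(3000))
```

With many distinct keys each partition takes roughly a third of the traffic; with a single hot key one partition takes all of it, no matter how many partitions exist.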

Page 23

Denormalization

Traditional database - normalized schema: multiple items, one table per entity type

CART
CartId: 1, Version: v3

CART_ITEM
ItemId: 2, Qty: 1, CartId: 1

DynamoDB - denormalized: one item with flexible schema

CART
CartId: 1, Version: v3, Items: [{ID: 2, Qty: 1}, {ID: 5, Qty: 2}]
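The move from the normalized rows to the single DynamoDB item can be sketched as a small transformation. `denormalize_cart` is a hypothetical helper, and the second CART_ITEM row (ItemId 5) is inferred from the denormalized item on the slide, since only one row is shown on the relational side.

```python
from typing import Any, Dict, List


def denormalize_cart(cart_row: Dict[str, Any], cart_item_rows: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Fold the CART_ITEM rows that belong to a cart into one item with a nested list."""
    items = [
        {"ID": row["ItemId"], "Qty": row["Qty"]}
        for row in cart_item_rows
        if row["CartId"] == cart_row["CartId"]
    ]
    return {**cart_row, "Items": items}


cart = {"CartId": 1, "Version": "v3"}
cart_items = [
    {"ItemId": 2, "Qty": 1, "CartId": 1},
    {"ItemId": 5, "Qty": 2, "CartId": 1},  # inferred from the DynamoDB item on the slide
]
cart_item = denormalize_cart(cart, cart_items)
```

Reading the cart then takes one lookup by `CartId` and no join.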

Page 24

Ensuring data consistency on updates

Optimistic concurrency control

Example: add/remove shopping cart items

1. get cart => vread = Version
2. update cart:
   IF Version = vread
      add/remove cart items
      ++Version
   ELSE go back to Step 1.

Use ConditionExpression in DynamoDB

Page 25

Updating cart: data consistency using OCC

CART
CartId: 2, Version: 3, CartItems: [{ID: 2, Qty: 1}, {ID: 5, Qty: 2}]

✓ Use conditions to implement optimistic concurrency control ensuring data consistency
✓ Single-item operations are ACID
✓ GetItem call can be eventually consistent

1. Get the cart: GetItem

{
  "TableName": "Cart",
  "Key": {"CartId": {"N": "2"}}
}

2. Update the cart: conditional PutItem (or UpdateItem)

{
  "TableName": "Cart",
  "Item": {
    "CartId": {"N": "2"},
    "Version": {"N": "4"},
    "CartItems": {…}
  },
  "ConditionExpression": "Version = :ver",
  "ExpressionAttributeValues": {":ver": {"N": "3"}}
}
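The read, modify, conditional-write loop can be tried without an AWS account using an in-memory stand-in. Everything below is a sketch: `InMemoryCartTable`, `ConditionFailed`, and `add_to_cart` are hypothetical names, and the `expected_version` argument plays the role of the `ConditionExpression` in the request above.

```python
import copy


class ConditionFailed(Exception):
    """Raised when the stored Version does not match the expected one."""


class InMemoryCartTable:
    """Hypothetical stand-in for a table keyed by CartId."""

    def __init__(self):
        self._items = {}

    def get_item(self, cart_id):
        return copy.deepcopy(self._items.get(cart_id))

    def put_item(self, item, expected_version=None):
        # Conditional write: succeed only if the stored version is the one we read.
        current = self._items.get(item["CartId"])
        current_version = current["Version"] if current else None
        if current_version != expected_version:
            raise ConditionFailed(f"expected {expected_version}, found {current_version}")
        self._items[item["CartId"]] = copy.deepcopy(item)


def add_to_cart(table, cart_id, item_id, qty, max_attempts=5):
    """Optimistic concurrency control; assumes the cart already exists."""
    for _ in range(max_attempts):
        cart = table.get_item(cart_id)  # Step 1: read and remember the version
        vread = cart["Version"]
        cart["CartItems"].append({"ID": item_id, "Qty": qty})
        cart["Version"] = vread + 1
        try:
            table.put_item(cart, expected_version=vread)  # Step 2: conditional write
            return cart
        except ConditionFailed:
            continue  # another writer won the race: go back to Step 1
    raise RuntimeError("gave up after repeated version conflicts")
```

A writer that holds a stale version never overwrites newer data; it fails the condition, re-reads, and reapplies its change.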

Page 26

Yes, you can have strongly consistent read-after-write and concurrency control with DynamoDB

Page 27

One-to-one relationships or key-values

• Use a table or GSI with a partition key
• Use GetItem or BatchGetItem API (on the table; a GSI is read with the Query API)

Example: Given a user or email, get attributes

Users table
Partition key    Attributes
UserId = bob     Email = [email protected], JoinDate = 2011-11-15
UserId = fred    Email = [email protected], JoinDate = 2011-12-01

Users-Email-GSI
Partition key              Attributes
Email = [email protected]    UserId = bob, JoinDate = 2011-11-15
Email = [email protected]    UserId = fred, JoinDate = 2011-12-01

Page 28

One-to-many relationships or parent-children

• Use a table or GSI with a partition and sort key
• Use the Query API to get multiple items

Example: Given a device, find all readings between epoch X, Y

Device-measurements
Part. key       Sort key            Attributes
DeviceId = 1    epoch = 5513A97C    Temperature = 30, pressure = 90
DeviceId = 1    epoch = 5513A9DB    Temperature = 30, pressure = 90
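Why a sort key makes the "between epoch X, Y" lookup cheap can be shown with a sorted list and binary search. This is an in-memory sketch, not the Query API: `DeviceMeasurements` is a hypothetical class, epochs are stored as integers parsed from the hex values on the slide, and both bounds are treated as inclusive.

```python
from bisect import bisect_left, bisect_right


class DeviceMeasurements:
    """Items grouped by partition key (DeviceId) and kept ordered by sort key (epoch)."""

    def __init__(self):
        self._by_device = {}  # device id -> list of (epoch, attributes), sorted by epoch

    def put(self, device_id, epoch, attributes):
        rows = self._by_device.setdefault(device_id, [])
        epochs = [e for e, _ in rows]
        i = bisect_left(epochs, epoch)
        if i < len(rows) and rows[i][0] == epoch:
            rows[i] = (epoch, attributes)  # same primary key: overwrite the item
        else:
            rows.insert(i, (epoch, attributes))

    def query_between(self, device_id, low, high):
        """Return the readings for one device with low <= epoch <= high, in sort order."""
        rows = self._by_device.get(device_id, [])
        epochs = [e for e, _ in rows]
        return rows[bisect_left(epochs, low):bisect_right(epochs, high)]


readings = DeviceMeasurements()
readings.put(1, 0x5513A9DB, {"Temperature": 30, "pressure": 90})
readings.put(1, 0x5513A97C, {"Temperature": 30, "pressure": 90})
readings.put(2, 0x5513A97C, {"Temperature": 25, "pressure": 88})  # made-up second device
in_range = readings.query_between(1, 0x5513A97C, 0x5513A9DB)
```

The query touches only one device's items and only the slice inside the range, which is what keeps a one-to-many read efficient as the table grows.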

Page 29

Many-to-many relationships

• Use a table and GSI with the partition and sort key elements switched

• Use the Query API

Example: Given a user, find all games. Or given a game, find all users.

User-Games-Table
Part. key        Sort key
UserId = bob     GameId = Game1
UserId = fred    GameId = Game2
UserId = bob     GameId = Game3

Game-Users-GSI
Part. key         Sort key
GameId = Game1    UserId = bob
GameId = Game2    UserId = fred
GameId = Game3    UserId = bob
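The switched-key index is the same data grouped the other way round. A minimal sketch with the rows from the slide; `build_view` is a hypothetical helper that groups rows by whichever column acts as the partition key and orders each group by the other.

```python
from collections import defaultdict

# (UserId, GameId) pairs as stored in User-Games-Table.
user_games = [("bob", "Game1"), ("fred", "Game2"), ("bob", "Game3")]


def build_view(rows, partition_pos, sort_pos):
    """Group rows by the chosen partition key; keep each group ordered by the sort key."""
    view = defaultdict(list)
    for row in rows:
        view[row[partition_pos]].append(row[sort_pos])
    for sort_keys in view.values():
        sort_keys.sort()
    return dict(view)


games_by_user = build_view(user_games, partition_pos=0, sort_pos=1)  # the table
users_by_game = build_view(user_games, partition_pos=1, sort_pos=0)  # the GSI
```

"Given a user, find all games" reads one group from the first view; "given a game, find all users" reads one group from the second.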

Page 30

Yes, you can model complex data relationships with DynamoDB

Page 31

Challenges with growing datasets, variable throughput, imbalanced workloads

Page 32

Time-To-Live (TTL)

Enabled TTL on Table
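TTL works from a numeric attribute holding an expiry time in Unix epoch seconds. A small sketch of computing and checking such a value; the attribute name `ExpiresAt` and both helper functions are assumptions for illustration, and actual deletion of expired items happens in the background some time after the timestamp passes.

```python
import time
from typing import Any, Dict, Optional

SECONDS_PER_DAY = 24 * 60 * 60


def expires_at(now_epoch_seconds: int, ttl_days: int) -> int:
    """Epoch-seconds timestamp to store in the item's TTL attribute."""
    return now_epoch_seconds + ttl_days * SECONDS_PER_DAY


def is_expired(item: Dict[str, Any], ttl_attribute: str = "ExpiresAt", now: Optional[int] = None) -> bool:
    """True when the item carries a numeric TTL value that is already in the past."""
    now = int(time.time()) if now is None else now
    value = item.get(ttl_attribute)
    return isinstance(value, (int, float)) and value < now


# Hypothetical session item that should live for seven days.
session = {"SessionId": "abc", "ExpiresAt": expires_at(1_000_000, ttl_days=7)}
```

Because expired items can linger briefly before removal, readers that must never see them can apply a check like `is_expired` on the client side.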

Page 33

Provisioned Mode with Auto Scaling

Page 34

On-demand Mode: Spiky workloads

Page 35

Imbalanced load and DynamoDB Adaptive Capacity

Poor distribution of traffic across the indexed data

[Diagram: DB shards 1, 2, 3, … n, with item {k=X, v=Y}]

Page 36

Dealing with concentrated read throughput

DynamoDB Accelerator (DAX)

Fully managed, highly available: handles all software management, fault tolerant, replication across multiple AZs within a Region

DynamoDB API compatible: seamlessly caches DynamoDB API calls, no application re-writes required

Write-through: DAX handles caching for writes

[Diagram: Your applications → DynamoDB Accelerator → DynamoDB]
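The write-through idea can be illustrated with a generic cache in front of a store. This sketches the caching pattern only, not DAX internals: `CountingStore` and `WriteThroughCache` are hypothetical classes, and eviction and item expiry are left out.

```python
class CountingStore:
    """Stand-in for the backing database; counts reads so cache hits are visible."""

    def __init__(self):
        self.data = {}
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.data.get(key)

    def put(self, key, value):
        self.data[key] = value


class WriteThroughCache:
    """Reads are served from the cache when possible; writes go to the store and the cache."""

    def __init__(self, store):
        self.store = store
        self.cache = {}

    def get(self, key):
        if key in self.cache:
            return self.cache[key]  # cache hit: no read reaches the store
        value = self.store.get(key)  # cache miss: read through to the store
        if value is not None:
            self.cache[key] = value
        return value

    def put(self, key, value):
        self.store.put(key, value)  # write to the backing store first
        self.cache[key] = value  # then keep the cache in step with it


store = CountingStore()
cache = WriteThroughCache(store)
cache.put("cart#2", {"Version": 4})
hit = cache.get("cart#2")
```

Repeated reads of a hot key are absorbed by the cache, which is how concentrated read traffic is kept away from a single partition.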

Page 37

Integrating DynamoDB into your data flow

Page 38

Complex queries and analytics

• Use the service that best meets the requirements
- Amazon Athena, Amazon Redshift, Amazon Elasticsearch Service…

• Deliver data updates reliably with DynamoDB Streams
- Integrates with AWS Lambda
- End-to-end serverless

[Diagram: DynamoDB table → DynamoDB Streams → AWS Lambda]
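A minimal sketch of a Lambda handler consuming a DynamoDB Streams batch to maintain a roll-up. The aggregate (orders per customer) and the `CustomerId` attribute are illustrative assumptions, the stream is assumed to be configured to include new and old images, and the in-memory `Counter` stands in for whatever durable store a real function would update.

```python
from collections import Counter

# Hypothetical aggregate: number of live orders per CustomerId.
orders_per_customer = Counter()


def handler(event, context=None):
    """Apply each stream record to the aggregate; MODIFY records leave the count unchanged."""
    records = event.get("Records", [])
    for record in records:
        event_name = record["eventName"]  # INSERT, MODIFY, or REMOVE
        change = record["dynamodb"]
        if event_name == "INSERT":
            customer = change["NewImage"]["CustomerId"]["N"]
            orders_per_customer[customer] += 1
        elif event_name == "REMOVE":
            customer = change["OldImage"]["CustomerId"]["N"]
            orders_per_customer[customer] -= 1
    return {"processed": len(records)}


sample_event = {
    "Records": [
        {"eventName": "INSERT", "dynamodb": {"NewImage": {"OrderId": {"N": "1"}, "CustomerId": {"N": "1"}}}},
        {"eventName": "INSERT", "dynamodb": {"NewImage": {"OrderId": {"N": "2"}, "CustomerId": {"N": "1"}}}},
        {"eventName": "REMOVE", "dynamodb": {"OldImage": {"OrderId": {"N": "1"}, "CustomerId": {"N": "1"}}}},
    ]
}
```

The same shape of handler can forward each change to a search or analytics service instead of counting, which is how the table feeds the other query engines listed above.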

Page 39

Querying in microservices architectures

Different materialized views exposed as microservices

[Diagram: Command side: Amazon DynamoDB → DynamoDB Streams → AWS Lambda. Query side: Query service 1, Query service 2, and Query service 3, backed by Amazon DynamoDB, Amazon ES, and Amazon Redshift]

Page 40

Closing summary

• Data management at internet scale gave rise to DynamoDB

- and to polyglot persistence: use the right database for each job

• DynamoDB: highly automated (serverless) distributed database

- ideal for mission-critical OLTP use cases

• Remove scaling concerns – distribute your data and traffic

• You can have data consistency and integrity in DynamoDB

• You can model relationships beyond key-value in DynamoDB

• Distributed databases are hard to operate - use DynamoDB!

Page 41

Thank you!