(DAT202) Managed Database Options on AWS

© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Pranav Nambiar, Sr. Manager of Product Management, AWS Database Services Jeongsang Baek, VP of Engineering, IGAWorks October 2015 | Las Vegas, NV DAT 202 Managed Database Options on AWS

Transcript of (DAT202) Managed Database Options on AWS

Page 1: (DAT202) Managed Database Options on AWS

© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Pranav Nambiar, Sr. Manager of Product Management, AWS Database Services

Jeongsang Baek, VP of Engineering, IGAWorks

October 2015 | Las Vegas, NV

DAT 202

Managed Database Options on AWS

Page 2: (DAT202) Managed Database Options on AWS

One size fits all … doesn’t quite work

Page 3: (DAT202) Managed Database Options on AWS

How can we optimize for scale, performance, cost?

Scale

Cost

Performance

Page 4: (DAT202) Managed Database Options on AWS

How we wish …

This is a

worry-free zone

WORRY

Page 5: (DAT202) Managed Database Options on AWS

What to expect from the session

• Why managed database services?

• SQL vs NoSQL

• AWS database options

• Amazon DynamoDB—A nonrelational managed database

• Amazon RDS—A relational managed database

• Amazon ElastiCache—A managed in-memory cache

• Amazon Redshift—A managed data warehouse

• Useful insights from IGAWorks

• Wrap-up

Page 6: (DAT202) Managed Database Options on AWS

Why managed database services?

Page 7: (DAT202) Managed Database Options on AWS

If you host your databases on-premises

Power, HVAC, net

Rack and stack

Server maintenance

OS patches

DB s/w patches

Database backups

Scaling

High availability

DB s/w installs

OS installation

you

App optimization

Page 9: (DAT202) Managed Database Options on AWS

If you host your databases in Amazon EC2

Power, HVAC, net

Rack and stack

Server maintenance

OS patches

DB s/w patches

Database backups

Scaling

High availability

DB s/w installs

OS installation

you

App optimization

Page 10: (DAT202) Managed Database Options on AWS

If you host your databases in Amazon EC2

You manage:
• OS patches
• DB s/w patches
• Database backups
• Scaling
• High availability
• DB s/w installs
• App optimization

AWS manages:
• Power, HVAC, net
• Rack and stack
• Server maintenance
• OS installation

Page 11: (DAT202) Managed Database Options on AWS

If you choose a managed DB service

Power, HVAC, net

Rack and stack

Server maintenance

OS patches

DB s/w patches

Database backups

App optimization

High availability

DB s/w installs

OS installation

you

Scaling

Page 12: (DAT202) Managed Database Options on AWS

Quick summary of the options

• Self-Managed—You are responsible for the hardware,

OS, security, updates, backups, replication etc., but have

full control over it.

• EC2 Instances—You only need to focus on the database

level updates, patches, replication, backups etc. and

don’t have to worry about the hardware or the OS

installation.

• Fully Managed—You get features such as backup and

replication etc. as a package service and don’t have to

bother with patching and updates.

Page 13: (DAT202) Managed Database Options on AWS

What are the AWS managed DB

options?

Page 14: (DAT202) Managed Database Options on AWS

A managed service for each major DB type

Amazon DynamoDB: Document and Key-Value Store

Amazon RDS: SQL Database Engines

Amazon ElastiCache: In-Memory Key-Value Store

Amazon Redshift: Data Warehouse

Page 15: (DAT202) Managed Database Options on AWS

Pick the best tool for the job

Page 16: (DAT202) Managed Database Options on AWS

Decisions

• NoSQL vs. SQL

• Aurora vs. MySQL

• DynamoDB vs. Mongo

Page 17: (DAT202) Managed Database Options on AWS

NoSQL vs. SQL for a new app: how to choose?

NoSQL
• Schema-less, easy reads and writes, simple data model
• Scaling is easy
• Focus on performance and availability at any scale

SQL
• Strong schema, complex relationships, transactions and joins
• Scaling is difficult
• Focus on consistency over scale and availability

Page 18: (DAT202) Managed Database Options on AWS

What is Amazon DynamoDB?

Page 19: (DAT202) Managed Database Options on AWS

Amazon DynamoDB

NoSQL database

Fully managed

Single-digit millisecond latency

Massive and seamless scalability

Low cost

Page 20: (DAT202) Managed Database Options on AWS

Popular use cases

• Ad Tech: Ad serving, retargeting, ID lookup, user profile management, session tracking, RTB

• IoT: Tracking state, metadata and readings from millions of devices, real-time notifications

• Gaming: Recording game details, leaderboards, session information, usage history, and logs

• Mobile & Web: Storing user profiles, session details, personalization settings, entity-specific metadata

Page 21: (DAT202) Managed Database Options on AWS

Predictable, low latency performance

Consistent single-digit millisecond latency even at massive scales

Page 22: (DAT202) Managed Database Options on AWS

Automatic replication for rock-solid durability and availability

Writes
• Replicated continuously to 3 AZs
• Persisted to disk (custom SSD)

Reads
• Strongly or eventually consistent
• No latency trade-off
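The choice between strongly and eventually consistent reads is made per request. The sketch below only builds the request parameters; `TableName`, `Key`, and `ConsistentRead` are parameters of the DynamoDB GetItem API, while the table name, key attribute, and helper function are hypothetical and chosen for illustration.

```python
def build_get_item_request(table_name, key, strongly_consistent=False):
    """Build keyword arguments for a DynamoDB GetItem call.

    ConsistentRead=False asks for an eventually consistent read;
    ConsistentRead=True asks for a strongly consistent read.
    """
    return {
        "TableName": table_name,
        "Key": key,
        "ConsistentRead": strongly_consistent,
    }


request = build_get_item_request(
    "ParticipationLog",                 # hypothetical table name
    {"DeviceId": {"S": "device-123"}},  # hypothetical key attribute
    strongly_consistent=True,
)

# With boto3 installed and AWS credentials configured, the request could be
# sent as: boto3.client("dynamodb").get_item(**request)
print(request["ConsistentRead"])
```

Keeping the flag in one helper makes it easy to default to eventually consistent reads and opt in to strong consistency only where the application needs it.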

Page 23: (DAT202) Managed Database Options on AWS

Amazon DynamoDB is a schemaless database

Schema-less: schema is defined per item

Diagram labels: Table, Items, Attributes, Item Key
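A rough illustration of "schema is defined per item", using plain Python dictionaries rather than any AWS API. The attribute names and values are made up; the only thing every item shares is the item key.

```python
# A "table" as a list of items. Only the key attribute (user_id here,
# a hypothetical name) must be present on every item.
table = [
    {"user_id": "u1", "name": "Ana", "level": 7},
    {"user_id": "u2", "email": "bo@example.com"},
    {"user_id": "u3", "name": "Cy", "tags": ["beta", "ios"]},
]

# Every item carries the key...
every_item_has_key = all("user_id" in item for item in table)

# ...but the remaining attributes differ from item to item.
attribute_sets = {frozenset(item) - {"user_id"} for item in table}

print(every_item_has_key, len(attribute_sets))
```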

Page 24: (DAT202) Managed Database Options on AWS

Define the desired performance using provisioned

throughput

• Read capacity units

• Write capacity units

A sustained 1 request per second (RPS) adds up to more than 2.5 M requests in a month
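The "1 RPS > 2.5 M requests in a month" figure is simple arithmetic. A minimal sketch, assuming a 30-day month (the function name is illustrative only):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def requests_per_month(requests_per_second, days=30):
    """Total requests served at a steady rate over a month of `days` days."""
    return requests_per_second * SECONDS_PER_DAY * days


print(requests_per_month(1))  # 2592000
```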

Page 25: (DAT202) Managed Database Options on AWS

You pay for the resources that you use

Monthly bill = Storage consumed (GB) + Write capacity units (WCUs) + Read capacity units (RCUs)

Pricing varies by region. Further details at http://aws.amazon.com/dynamodb/pricing/

Free tier:

• Generous free tier of 25 GB, 25 WCUs, and 25 RCUs

• That is, you get over 60M read requests and 60M write requests for free in a month

• The free tier is indefinite—you benefit from this every month
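The billing formula and the free tier can be sketched as follows. This is an illustration under stated assumptions, not the actual pricing model: unit prices are caller-supplied placeholders, a 30-day month is assumed, and one capacity unit is treated as roughly one request per second, as the provisioned-throughput slide suggests.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month

# Free tier figures from the slide.
FREE_STORAGE_GB = 25
FREE_WCUS = 25
FREE_RCUS = 25


def billable_units(storage_gb, wcus, rcus):
    """Return the (GB, WCU, RCU) amounts that exceed the free tier."""
    return (
        max(0, storage_gb - FREE_STORAGE_GB),
        max(0, wcus - FREE_WCUS),
        max(0, rcus - FREE_RCUS),
    )


def monthly_bill(storage_gb, wcus, rcus, gb_price, wcu_price, rcu_price):
    """Monthly bill = storage + write capacity + read capacity.

    Prices are hypothetical per-unit monthly rates passed in by the caller.
    """
    gb, w, r = billable_units(storage_gb, wcus, rcus)
    return gb * gb_price + w * wcu_price + r * rcu_price


# Requests covered by 25 free read capacity units at about 1 request/sec each.
free_read_requests = FREE_RCUS * SECONDS_PER_MONTH
print(free_read_requests)  # 64800000
```

The result, 64.8 million, is consistent with the "over 60M" figure on the slide.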

Page 27: (DAT202) Managed Database Options on AWS

What is Amazon RDS?

Page 28: (DAT202) Managed Database Options on AWS

Relational databases

Fully managed

Fast, predictable performance

Simple and fast to scale

Low cost, pay for what you use

Amazon RDS

Amazon Aurora

Page 29: (DAT202) Managed Database Options on AWS

Use cases

Applicable wherever you need relational databases

• eCommerce
• Gaming
• Websites
• IT Solutions
• Apps
• Reporting

Page 30: (DAT202) Managed Database Options on AWS

RDS feature matrix

Feature | Aurora | MySQL | PostgreSQL | Oracle | SQL Server

VPC

High availability

Instance scaling

Encryption (one entry marked "Coming soon")

Read replicas (one entry marked "Oracle GoldenGate")

Cross region

Max storage | 64 TB | 6 TB | 6 TB | 6 TB | 4 TB

Scale storage (one entry marked "Auto Scaling")

Provisioned IOPS | NA | 30,000 | 30,000 | 30,000 | 20,000

Largest instance | R3.8XL | R3.8XL | R3.8XL | R3.8XL | R3.8XL

Page 31: (DAT202) Managed Database Options on AWS

Amazon Aurora: Fast, available, and MySQL-compatible

Diagram labels: SQL, Transactions, Caching; AZ 1, AZ 2, AZ 3; Amazon S3

• 5x faster than MySQL on same hardware
• Sysbench: 100K writes/sec and 500K reads/sec
• Designed for 99.99% availability
• 6-way replicated storage across 3 AZs
• Scale to 64 TB and 15 read replicas

Page 32: (DAT202) Managed Database Options on AWS

Amazon RDS is simple and fast to scale

Database instance types

offer a range of CPU and

memory selections

Scale up or down among

instance types on demand

Database storage is

scalable on demand

Page 33: (DAT202) Managed Database Options on AWS

Amazon RDS offers fast, predictable storage

General Purpose

(SSD) for most

workloads

Provisioned IOPS

(SSD) for OLTP

workloads up to

30,000 IOPS

Magnetic for small

workloads with

infrequent access

Page 34: (DAT202) Managed Database Options on AWS

High availability Multi-AZ deployments

Enterprise-grade fault tolerance solution for

production databases

Page 35: (DAT202) Managed Database Options on AWS

Choose cross-region replication for better data locality and easier migration

Even faster recovery in the

event of disaster

Bring data close to your

customers

Promote to a master for

easy migration

Page 36: (DAT202) Managed Database Options on AWS

You pay for the resources that you use

Monthly bill = N × Duration for which DB instances were used + GB of Storage consumed

(Instance price depends on type of DB instance; storage price depends on type of storage)

Further details at http://aws.amazon.com/rds/pricing/

Free tier (for first 12 months)
• 750 micro DB instance hours
• 20 GB of DB storage
• 20 GB for backups
• 10 million I/O operations
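A small sketch of the RDS billing formula above. The rates used in the example call are placeholders, not real prices; see the pricing page for actual figures. The last lines check that 750 free instance hours cover one micro instance running for an entire 31-day month.

```python
def rds_monthly_bill(instances, hours_each, hourly_rate, storage_gb, gb_month_rate):
    """Monthly bill = N x instance hours x hourly rate + GB stored x storage rate."""
    instance_cost = instances * hours_each * hourly_rate
    storage_cost = storage_gb * gb_month_rate
    return instance_cost + storage_cost


# Hypothetical rates, for illustration only.
example_bill = rds_monthly_bill(
    instances=2, hours_each=100, hourly_rate=0.5,
    storage_gb=20, gb_month_rate=0.25,
)

FREE_TIER_HOURS = 750
hours_in_longest_month = 31 * 24  # 744
covers_full_month = hours_in_longest_month <= FREE_TIER_HOURS

print(example_bill, covers_full_month)
```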

Page 37: (DAT202) Managed Database Options on AWS

Selected Amazon RDS customers

Page 38: (DAT202) Managed Database Options on AWS

What is Amazon ElastiCache?

Page 39: (DAT202) Managed Database Options on AWS

In-memory key-value store

High-performance

Memcached and Redis

Fully managed; zero admin

Amazon ElastiCache

Page 40: (DAT202) Managed Database Options on AWS

Popular use cases

• Caching layer for performance or cost optimization of an underlying database

• Storage of ephemeral key-value data

• High-performance application patterns such as leaderboards (for gaming users), session management, event counters, in-memory lists
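The caching-layer use case is often implemented as a cache-aside pattern: read from the cache first and fall back to the database only on a miss. The sketch below uses an in-process dictionary as a stand-in for a Memcached or Redis node, and the class, keys, and values are hypothetical.

```python
class CacheAside:
    """Cache-aside reads: check the cache, load from the database on a miss."""

    def __init__(self, load_from_db):
        self._cache = {}                 # stand-in for a cache node
        self._load_from_db = load_from_db
        self.db_reads = 0                # counts how often the database is hit

    def get(self, key):
        if key in self._cache:
            return self._cache[key]      # cache hit: no database access
        self.db_reads += 1               # cache miss: read through to the DB
        value = self._load_from_db(key)
        self._cache[key] = value
        return value


# A dictionary stands in for the underlying database.
database = {"item-1": "first record", "item-2": "second record"}
cache = CacheAside(database.get)

first = cache.get("item-1")   # miss, loads from the database
second = cache.get("item-1")  # hit, served from the cache
print(first, cache.db_reads)
```

Repeated reads of the same key reach the database only once, which is where the performance and cost savings come from.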

Page 41: (DAT202) Managed Database Options on AWS

Key ElastiCache features

Memcached engine
• Fully managed
• Cache node auto-discovery
• Multi-AZ node placement

Redis engine
• Fully managed
• Multi-AZ with auto-failover
• Persistence
• Read replicas

Page 42: (DAT202) Managed Database Options on AWS

How ElastiCache billing works

Monthly bill = N (Number of nodes) × Duration for which the nodes were used

(Price depends on type of node)

Further details at http://aws.amazon.com/elasticache/pricing/

Free tier (for first 12 months): 750 micro cache node hours

Page 43: (DAT202) Managed Database Options on AWS

Selected ElastiCache customers

Page 44: (DAT202) Managed Database Options on AWS

What is Amazon Redshift?

Page 45: (DAT202) Managed Database Options on AWS

Amazon

Redshift

a lot faster

a lot cheaper

a whole lot simpler

Relational data warehouse

Massively parallel; petabyte scale

Fully managed

HDD and SSD platforms

$1,000/TB/year; starts at $0.25/hour

Page 46: (DAT202) Managed Database Options on AWS

Popular use cases

Traditional enterprises
• 10x cheaper
• Easy to provision
• Higher DBA productivity

Companies with big data
• 10x faster
• No programming
• Easily leverage BI tools, Hadoop, machine learning, streaming

SaaS companies
• Analysis in-line with process flows
• Pay as you go, grow as you need
• Managed availability and disaster recovery

Page 47: (DAT202) Managed Database Options on AWS

Amazon Redshift architecture

Leader node
• Simple SQL endpoint
• Stores metadata
• Optimizes query plan
• Coordinates query execution

Compute nodes
• Local columnar storage
• Parallel/distributed execution of all queries, loads, backups, restores, resizes

Start at just $0.25/hour, grow to 2 PB (compressed)
• DC1: SSD; scale 160 GB–326 TB
• DS2: HDD; scale 2 TB–2 PB

Diagram labels: JDBC/ODBC; 10 GigE (HPC); Ingestion, Backup, Restore

Page 48: (DAT202) Managed Database Options on AWS

Amazon Redshift is fast

Dramatically less I/O

Column storage

Data compression

Zone maps

Direct-attached storage

Large data block sizes

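Zone map example (the minimum and maximum value of each sorted block is tracked):
• 10 | 13 | 14 | 26 | … | 100 | 245 | 324 (min 10, max 324)
• 375 | 393 | 417 | … | 512 | 549 | 623 (min 375, max 623)
• 637 | 712 | 809 | … | 834 | 921 | 959 (min 637, max 959)

Sample table:
ID | Age | State | Amount
123 | 20 | CA | 500
345 | 25 | WA | 250
678 | 40 | FL | 125
957 | 37 | WA | 375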
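Zone maps reduce I/O by letting a query skip any block whose minimum and maximum values rule out a match. The sketch below illustrates the idea with the block boundaries from the slide; it contains only the values that are visible there (the elided ones are left out) and is not a description of Redshift internals.

```python
from typing import NamedTuple, Tuple


class Block(NamedTuple):
    min_value: int
    max_value: int
    values: Tuple[int, ...]


# Blocks and their min/max metadata, using the values visible on the slide.
blocks = [
    Block(10, 324, (10, 13, 14, 26, 100, 245, 324)),
    Block(375, 623, (375, 393, 417, 512, 549, 623)),
    Block(637, 959, (637, 712, 809, 834, 921, 959)),
]


def find(value):
    """Return (found, blocks_scanned), scanning only blocks that may match."""
    scanned = 0
    for block in blocks:
        # Skip the block entirely if the value is outside its min/max range.
        if not (block.min_value <= value <= block.max_value):
            continue
        scanned += 1
        if value in block.values:
            return True, scanned
    return False, scanned


print(find(549))  # (True, 1)
print(find(350))  # (False, 0)
```

A lookup for 549 reads one block out of three, and a lookup for 350 reads none, because no block's range contains it.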

Page 49: (DAT202) Managed Database Options on AWS

Fully managed, continuous/incremental backups

Multiple copies within cluster

Continuous and incremental backups

to Amazon S3

Continuous and incremental backups

across regions

Streaming restore

Diagram labels: Amazon S3 in Region 1, Amazon S3 in Region 2

Page 50: (DAT202) Managed Database Options on AWS

Amazon Redshift offers rock-solid fault tolerance

Diagram labels: Amazon S3 in Region 1, Amazon S3 in Region 2

Disk failures

Node failures

Network failure

AZ/region level disasters

Page 51: (DAT202) Managed Database Options on AWS

You pay for what you use

Further details at https://aws.amazon.com/redshift/pricing/

Monthly bill = N (Number of nodes) × Duration for which the nodes were used

(Price depends on type of node)

• 2 month free trial
• Leader node is free
• No upfront costs, pay as you go
• Price includes three data copies
• Backup storage is free up to 100% of provisioned storage
• 3x data compression on average
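As a rough worked example of the formula above, the sketch below prices compute nodes only (the leader node is free) and estimates how much uncompressed data fits at the average 3x compression quoted on the slide. The $0.25/hour rate is the starting price mentioned in this deck; the function names and the 30-day month are assumptions.

```python
def redshift_monthly_bill(compute_nodes, hours, hourly_rate):
    """Monthly bill = N compute nodes x hours used x hourly rate per node."""
    return compute_nodes * hours * hourly_rate


def uncompressed_capacity_tb(provisioned_tb, compression_ratio=3):
    """Uncompressed data that fits in the provisioned storage at a given ratio."""
    return provisioned_tb * compression_ratio


# One node at the $0.25/hour starting price, running for a 30-day month.
print(redshift_monthly_bill(1, 24 * 30, 0.25))  # 180.0
print(uncompressed_capacity_tb(2))              # 6
```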

Page 52: (DAT202) Managed Database Options on AWS

Redshift has a large ecosystem

• Data Integration
• Systems Integrators
• Business Intelligence

Page 53: (DAT202) Managed Database Options on AWS

Selected Amazon Redshift customers

Page 54: (DAT202) Managed Database Options on AWS

Jeongsang Baek, VP of Engineering, IGAWorks

October 2015

IGAWorks

Re-architecting Your Application at the Speed of AWS Innovation

Page 55: (DAT202) Managed Database Options on AWS

No. 1 mobile business platform in Korea

Page 56: (DAT202) Managed Database Options on AWS

IGAWorks provides

• Adbrix: App analytics and marketing attribution

• Adpopcorn: Monetization

• Live Operation: Operating tools for in-app campaigns

• Nanoo, Jiver: In-app engagement

All services are offered at no cost

Page 57: (DAT202) Managed Database Options on AWS

Architecture of legacy service

Components (AWS Tokyo region):
• Adbrix User, Mobile Device
• Amazon Route 53
• EC2: Analytics
• MSSQL Databases: Analytics
• EC2: Tracking API
• MSSQL Databases: Activity Storage

Scale:
• Hundreds of EC2 instances
• Dozens of MSSQL instances
• Over 1 PB of EBS

Page 58: (DAT202) Managed Database Options on AWS

Challenges

• Cost burden

• Operational burden

• Performance improvements

Page 59: (DAT202) Managed Database Options on AWS

Use case: Adpopcorn

• Any app can be media for incentivized ads

• Reward a user in exchange for completing an action such as

installing or running the advertiser’s app

• Types

• Offerwall

• Lock screen ads

Page 60: (DAT202) Managed Database Options on AWS

Participating in incentivized ads

Components: Mobile device, Ad serve API, Ad inventory, Participation logs

1. Open offerwall
2. Request available ads
3. Read available ads
4. Check participation logs and de-duplicate ads
5. Respond with available offers
6. Install and run advertiser’s app
7. Send the first-run activity
8. Put a participation log
9. Give promised reward
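The de-duplication in step 4 and the log write in step 8 can be sketched as below. This is an illustration only: in-memory dictionaries and sets stand in for the ad inventory and the participation log, and all identifiers are hypothetical.

```python
def available_offers(ad_inventory, participation_log, device_id):
    """Steps 3 to 5: read the inventory, drop ads this device already completed."""
    completed = participation_log.get(device_id, set())
    return [ad_id for ad_id in ad_inventory if ad_id not in completed]


def record_participation(participation_log, device_id, ad_id):
    """Step 8: put a participation log entry for the device."""
    participation_log.setdefault(device_id, set()).add(ad_id)


ad_inventory = ["ad-1", "ad-2", "ad-3"]  # hypothetical ad IDs
participation_log = {}                   # device_id -> set of completed ad IDs

before = available_offers(ad_inventory, participation_log, "device-123")
record_participation(participation_log, "device-123", "ad-2")
after = available_offers(ad_inventory, participation_log, "device-123")

print(before, after)
```

Because the participation log is read on every offerwall request and written on every completed action, it needs high read/write throughput and low latency.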

Page 61: (DAT202) Managed Database Options on AWS

Points to improve performance

• Ad inventory
- Store complex relational data
- Boost DB read requests

• Participation log
- High read/write throughput
- Low latency

Page 62: (DAT202) Managed Database Options on AWS

Re-Architecting Adpopcorn

Components (AWS Tokyo region):
• Mobile Device
• Route 53
• AWS Elastic Beanstalk: Ad Serve API
• ElastiCache: Ad Inventory
• Amazon RDS: Ad Inventory
• DynamoDB: Participation Log
• Amazon Kinesis: Participation Stream
• Elastic Beanstalk: ETL Worker
• Amazon RDS: Monetization Report

Page 63: (DAT202) Managed Database Options on AWS

Use case: Adbrix

• Legacy
- Stored all activities in MSSQL on EC2 instances
- Expensive to store raw data in Amazon Elastic Block Store
- Hard to scale out and distribute data
- If one EC2 instance went down, the whole service failed

• Storage size limitation
- Needed to constantly monitor whether the storage was full

Page 64: (DAT202) Managed Database Options on AWS

Re-architecting Adbrix

Regions: AWS Tokyo region, AWS N. Virginia region (linked by Cross Region Replication)

Components:
• Adbrix User, Mobile Device
• Route 53
• EC2: Adbrix Analytics
• Database: Adbrix Analytics
• Elastic Beanstalk: Activity Tracker
• Amazon Kinesis
• Elastic Beanstalk: Activity Process
• Amazon S3: Activity Storages
• AWS Lambda: Micro-batch loading
• Amazon Redshift: BI Analysis
• EMR-Spark: Daily Batch Analysis
• ElastiCache: Ad Inventory
• DynamoDB: Participation Log
• Amazon RDS: Ad Inventory

Page 65: (DAT202) Managed Database Options on AWS

• Amazon RDS:

- For ad inventory with strong schema, complex relationships, queryable data

- High availability Multi-AZ deployments

• Amazon DynamoDB:

- For participation log with heavy read/write load

- Single-digit millisecond latency

• Amazon ElastiCache:

- Redis/Memcached for fast, complex caching of ad inventory

- Offloads massive read requests from RDS

• Amazon Redshift:

- For petabyte-scale big data analysis

- Export business insights easily by using reporting tools

DB heroes!

Fully-managed! Low cost! High performance!

Page 66: (DAT202) Managed Database Options on AWS

Monthly cost report

Chart: IGAWorks Cost Trend in 2015 (months Jan to Aug; series: Amazon ElastiCache, Amazon RDS, Amazon DynamoDB, Amazon Redshift, Others)

Page 67: (DAT202) Managed Database Options on AWS

Result

• Reduced the cost of analysis by 40%

• Scaled out more easily to support 130 million devices

• Guaranteed 2-digit latency for ad serve API responses

+ Recruitment policy has changed

Page 68: (DAT202) Managed Database Options on AWS

Lesson learned

Start your business today.

You may face a difficult problem.

However, AWS already has the solutions.

Page 69: (DAT202) Managed Database Options on AWS

To sum up…

Page 70: (DAT202) Managed Database Options on AWS

Review: AWS managed DB services

Amazon DynamoDB: Document and Key-Value Store

Amazon RDS: SQL Database Engines

Amazon ElastiCache: In-Memory Key-Value Store

Amazon Redshift: Data Warehouse

Page 71: (DAT202) Managed Database Options on AWS

Benefits of AWS managed database services

• Pay only for what you use: No up-front cost

• Fully managed services: AWS handles installs, patching, restarts

• Easy to scale: Grow as you need

• Designed for use with other AWS services: AWS Data Pipeline, Amazon EC2, Amazon S3, Amazon CloudWatch, Amazon SNS, Amazon VPC

Page 72: (DAT202) Managed Database Options on AWS

Related Sessions

• DAT201 - Introduction to Amazon Redshift

Oct 7 – 1:30pm – 2:30pm

• DAT204 - NoSQL? No Worries: Building Scalable

Applications on AWS NoSQL Services

Oct 7 – 1:30pm – 2:30pm

• DAT301 - Amazon Aurora Deep Dive

Oct 7 – 2:40pm – 3:45pm

• DAT407 - Amazon ElastiCache: Deep Dive

Oct 8 – 11am – 12pm

Page 73: (DAT202) Managed Database Options on AWS

Thank you!

Page 74: (DAT202) Managed Database Options on AWS

Remember to complete

your evaluations!