GettingStartedWithAurora SF Loft - Meetupfiles.meetup.com/8763012/DatabaseDay_Overview of Amazon...
©2015, Amazon Web Services, Inc. or its affiliates. All rights reserved
Getting Started with Amazon Aurora
KD Singh, AWS Solutions Architect
June 2016 | San Francisco, CA
Meet Amazon Aurora: databases reimagined for the cloud
• Speed and availability of high-end commercial databases
• Simplicity and cost-effectiveness of open source databases
• Drop-in compatibility with MySQL
• Simple pay-as-you-go pricing
Delivered as a managed service
Relational databases were not designed for cloud
Multiple layers of functionality all in a monolithic stack
SQL
Transactions
Caching
Logging
Not much has changed in the last 30 years
Even when you scale it out, you’re still replicating the same stack
[Diagram: the same SQL / Transactions / Caching / Logging stack repeated on every instance, between the Application layer and Storage]
Reimagining the relational database
What if you were inventing the database today?
You wouldn’t design it the way we did in 1970.
You’d build something:
• that can scale out
• that is self-healing
• that leverages existing AWS services
A service-oriented architecture applied to the database
Moved the logging and storage layer into a multi-tenant, scale-out database-optimized storage service
Integrated with other AWS services like Amazon EC2, Amazon VPC, Amazon DynamoDB, Amazon SWF, and Amazon Route 53 for control plane operations
Integrated with Amazon S3 for continuous backup with 99.999999999% durability
[Diagram: data plane (SQL, Transactions, Caching, Logging + Storage) next to the control plane (Amazon DynamoDB, Amazon SWF, Amazon Route 53), with Amazon S3; callouts 1 to 3 mark the three points above]
Fastest growing service in AWS history
Business applications
Web and mobile
Content management
E-commerce, retail
Internet of Things
Search, advertising
Business intelligence, analytics
Games, media
Aurora customer adoption
Expedia: On-line travel marketplace
• Real-time business intelligence and analytics on a growing corpus of on-line travel marketplace data.
• Current SQL Server-based architecture is too expensive. Performance degrades as data volume grows.
• Cassandra with Solr index requires a large memory footprint and hundreds of nodes, adding cost.
Aurora benefits:
• Aurora meets scale and performance requirements at much lower cost.
• 25,000 inserts/sec with peaks up to 70,000. 30 ms average response time for writes and 17 ms for reads, with 1 month of data.
World’s leading online travel company, with a portfolio that includes 150+ travel sites in 70 countries.
Thomas Publishing: Connecting buyers and suppliers
• Have been using Oracle for production database.
• Rapidly growing volumes of data, need to increase efficiency and deliver results on shorter timelines.
• Required significant upfront investment in both infrastructure and Oracle license expense.
Aurora benefits:
• Migrated their production database from Oracle to Amazon Aurora using the AWS Database Migration Service and Schema Conversion Tool.
• Entire migration process was completed in less than 4 weeks.
In business for over a century, connecting buyers and suppliers across all industrial sectors, evolving from an industrial trade print publisher into industry’s most respected group of digital-friendly businesses.
Alfresco: Enterprise content management
• Scaling Alfresco document repositories to billions of documents.
• Support user applications that require sub-second response times.
Aurora benefits:
• Scaled to 1 billion documents with a throughput of 3 million per hour, which is 10 times faster than their current environment.
• Moving from large data centers to cost-effective management with AWS and Aurora.
Leading the convergence of enterprise content management and business process management. More than 1,800 organizations in 195 countries rely on Alfresco, including leaders in financial services, healthcare, and the public sector.
Higher Performance, Lower Cost
• Safe.com lowered their bill by 40% by switching from sharded MySQL to a single Aurora instance.
• Double Down Interactive (gaming) lowered their bill by 67% while also achieving better latencies (most queries ran faster) and lower CPU utilization.
Aurora benefits:
• Due to high performance and large storage support, sharded MySQL instances can be consolidated into fewer Aurora instances
• High performance allows for smaller instances
• Automated storage provisioning removes the need for storage “headroom”
• No additional storage for read replicas
“When we ran Alfresco’s workload on Aurora, we were blown away to find that Aurora was 10X faster than our MySQL environment” said John Newton, Founder and CTO of Alfresco. “Speed matters in our business and Aurora has been faster, cheaper, and considerably easier to use than MySQL”
Amazon Aurora is fast
SQL benchmark results
WRITE PERFORMANCE: 4 client machines with 1,000 connections each
READ PERFORMANCE: Single client machine with 1,600 connections
Using MySQL SysBench with Amazon Aurora R3.8XL with 32 cores and 244 GB RAM
Reproducing these results
https://d0.awsstatic.com/product-marketing/Aurora/RDS_Aurora_Performance_Assessment_Benchmarking_v1-2.pdf
[Diagram: four R3.8XLARGE SysBench clients driving one R3.8XLARGE Amazon Aurora instance]
1. Create an Amazon VPC (or use an existing one).
2. Create 4 EC2 R3.8XL client instances to run the SysBench client. All 4 should be in the same Availability Zone (AZ).
3. Enable enhanced networking on your clients.
4. Tune Linux settings (see the whitepaper referenced above).
5. Install SysBench version 0.5.
6. Launch an r3.8xlarge Amazon Aurora DB instance in the same VPC and AZ as your clients.
7. Start your benchmark!
Beyond benchmarks
If only real-world applications saw benchmark performance.
POSSIBLE DISTORTIONS
• Real-world requests contend with each other.
• Real-world metadata rarely fits in the data dictionary cache.
• Real-world data rarely fits in the buffer cache.
• Real-world production databases need to run at high availability.
Scaling user connections
SysBench OLTP workload, 250 tables
Connections   Amazon Aurora   Amazon RDS MySQL, 30 K IOPS (single AZ)
50            40,000          10,000
500           71,000          21,000
5,000         110,000         13,000
UP TO 8x FASTER
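The headline ratio can be checked directly from the figures in the table. The sketch below is only illustrative arithmetic on the numbers as printed on the slide; the names `RESULTS` and `speedups` are made up for this example.

```python
# Throughput figures copied from the table above:
# (connections, Amazon Aurora, Amazon RDS MySQL with 30 K IOPS).
RESULTS = [
    (50, 40_000, 10_000),
    (500, 71_000, 21_000),
    (5_000, 110_000, 13_000),
]


def speedups(rows):
    """Return {connections: Aurora throughput / MySQL throughput}."""
    return {conns: aurora / mysql for conns, aurora, mysql in rows}


ratios = speedups(RESULTS)
for conns, ratio in ratios.items():
    print(f"{conns:>5} connections: {ratio:.1f}x")
```

The largest ratio comes from the 5,000-connection row, where MySQL throughput falls while Aurora keeps climbing.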
Scaling table count
SysBench write-only workload, 1,000 connections, default settings
Number of write operations per second
Tables   Amazon Aurora   MySQL I2.8XL local SSD   MySQL I2.8XL RAM disk   RDS MySQL 30 K IOPS (single AZ)
10       60,000          18,000                   22,000                  25,000
100      66,000          19,000                   24,000                  23,000
1,000    64,000          7,000                    18,000                  8,000
10,000   54,000          4,000                    8,000                   5,000
UP TO 11x FASTER
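Another way to read these figures is to ask how much write throughput each configuration keeps as the table count grows from 10 to 10,000. The sketch below does that with the numbers from the slide; `WRITES_PER_SEC` and `retained` are illustrative names, not part of any tool.

```python
# Writes per second from the slide, keyed by configuration and table count.
WRITES_PER_SEC = {
    "Amazon Aurora": {10: 60_000, 100: 66_000, 1_000: 64_000, 10_000: 54_000},
    "MySQL I2.8XL local SSD": {10: 18_000, 100: 19_000, 1_000: 7_000, 10_000: 4_000},
    "MySQL I2.8XL RAM disk": {10: 22_000, 100: 24_000, 1_000: 18_000, 10_000: 8_000},
    "RDS MySQL 30 K IOPS": {10: 25_000, 100: 23_000, 1_000: 8_000, 10_000: 5_000},
}


def retained(series, low=10, high=10_000):
    """Fraction of the low-table-count throughput still available at `high`."""
    return series[high] / series[low]


for name, series in WRITES_PER_SEC.items():
    print(f"{name}: keeps {retained(series):.0%} of its 10-table throughput")
```

On these numbers Aurora keeps about nine tenths of its throughput, while the MySQL configurations keep roughly a fifth to a third.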
Scaling dataset size
SYSBENCH WRITE-ONLY
DB Size   Amazon Aurora   RDS MySQL 30 K IOPS (single AZ)
1GB       107,000         8,400
10GB      107,000         2,400
100GB     101,000         1,500
1TB       26,000          1,200
UP TO 67x FASTER
CLOUDHARMONY TPC-C
DB Size   Amazon Aurora   RDS MySQL 30 K IOPS (single AZ)
80GB      12,582          585
800GB     9,406           69
UP TO 136x FASTER
Running with read replicas
SysBench write-only workload, 250 tables
Updates per second   Amazon Aurora   RDS MySQL 30 K IOPS (single AZ)
1,000                2.62 ms         0 s
2,000                3.42 ms         1 s
5,000                3.94 ms         60 s
10,000               5.38 ms         300 s
UP TO 500x LOWER LAG
How do we achieve these results?
DO LESS WORK
• Do fewer IOs
• Minimize network packets
• Cache prior results
• Offload the database engine
BE MORE EFFICIENT
• Process asynchronously
• Reduce latency path
• Use lock-free data structures
• Batch operations together
Aurora Requires Fewer IOs
TYPE OF WRITES: Binlog, Data, Double-write buffer, Log records, FRM files and metadata
[Diagram, MYSQL WITH STANDBY: primary instance in AZ 1 and standby instance in AZ 2, each on Amazon Elastic Block Store (EBS) with an EBS mirror, plus Amazon S3; sequential writes]
[Diagram, AMAZON AURORA: primary instance and replica instance across AZ 1, AZ 2, and AZ 3, plus Amazon S3; async 4/6 quorum, distributed writes]
Highly available storage
Six copies of data; quorum system for read/write; latency tolerant
Background scrubbing; CRC on the wire and on disk
Peer-to-peer gossip replication for catchup and recovery
Continuous backup to S3 as a quorum set member
10GB segments as unit of repair or hotspot rebalance
[Diagram: storage copies spread across AZ 1, AZ 2, and AZ 3, with Amazon S3]
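The quorum arithmetic behind "six copies" and "4/6 quorum" can be sketched in a few lines. This is a toy model, not Aurora's implementation: the read quorum of 3 and the placement of two copies per Availability Zone are assumptions that the slide does not state, and all names are invented for the example.

```python
from itertools import combinations

COPIES = 6          # "six copies of data" (from the slide)
WRITE_QUORUM = 4    # "4/6 quorum" (from the slide)
READ_QUORUM = 3     # assumption: the slide does not give the read quorum

# Assumed placement: two copies in each of the three Availability Zones.
COPY_TO_AZ = {0: "AZ 1", 1: "AZ 1", 2: "AZ 2", 3: "AZ 2", 4: "AZ 3", 5: "AZ 3"}


def can_write(failed):
    """A write succeeds while at least WRITE_QUORUM copies are reachable."""
    return COPIES - len(failed) >= WRITE_QUORUM


def can_read(failed):
    """A read succeeds while at least READ_QUORUM copies are reachable."""
    return COPIES - len(failed) >= READ_QUORUM


def every_read_overlaps_every_write():
    """Check by enumeration that any read quorum shares a copy with any write quorum."""
    copies = range(COPIES)
    return all(
        set(r) & set(w)
        for r in combinations(copies, READ_QUORUM)
        for w in combinations(copies, WRITE_QUORUM)
    )


az1_down = {c for c, az in COPY_TO_AZ.items() if az == "AZ 1"}
print("write with AZ 1 down:", can_write(az1_down))
print("write with AZ 1 plus one more copy down:", can_write(az1_down | {2}))
print("read with AZ 1 plus one more copy down:", can_read(az1_down | {2}))
print("read and write quorums always overlap:", every_read_overlaps_every_write())
```

Under these assumptions, losing a whole AZ still leaves four copies, so writes continue; losing one further copy stops writes but still allows reads.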
Survivable caches
We moved the cache out of the database process
Cache remains warm in the event of a database restart
Lets you resume fully loaded operations much faster
Instant crash recovery + survivable cache = quick and easy recovery from DB failures
[Diagram: the caching process is outside the DB process and remains warm across a database restart]
Instant crash recovery
Traditional databases
• Have to replay logs since the last checkpoint
• Single-threaded in MySQL; requires a large number of disk accesses
• Crash at T0 requires a re-application of the SQL in the redo log since the last checkpoint
Amazon Aurora
• Underlying storage replays redo records on demand as part of a disk read
• Parallel, distributed, asynchronous
• Crash at T0 will result in redo logs being applied to each segment on demand, in parallel, asynchronously
[Diagram: checkpointed data and redo log, with a crash at T0]
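The difference between the two recovery styles can be illustrated with a toy model. This is a sketch under simplifying assumptions (pages hold integers, redo records are additive deltas, everything is single-threaded); it is not how MySQL or Aurora store data, and `Storage`, `recover_eagerly`, and `read_lazily` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Storage:
    pages: dict       # page_id -> value as of the last checkpoint
    redo: list        # (page_id, delta) records written after the checkpoint
    applied: int = 0  # redo records applied so far, tracked for illustration


def recover_eagerly(store):
    """Traditional style: replay the whole redo log before serving any read."""
    for page_id, delta in store.redo:
        store.pages[page_id] += delta
        store.applied += 1
    store.redo = []


def read_lazily(store, page_id):
    """On-demand style: apply only this page's pending records as part of the read."""
    for pid, delta in store.redo:
        if pid == page_id:
            store.pages[page_id] += delta
            store.applied += 1
    store.redo = [r for r in store.redo if r[0] != page_id]
    return store.pages[page_id]


log = [("a", 1), ("b", 5), ("a", 2)]
eager = Storage({"a": 1, "b": 10}, list(log))
recover_eagerly(eager)          # all three records replayed up front

lazy = Storage({"a": 1, "b": 10}, list(log))
value_a = read_lazily(lazy, "a")  # only the two records for page "a" are applied
print(eager.pages, eager.applied)
print(value_a, lazy.applied, lazy.redo)
```

In the lazy version the first read is served after applying two records instead of three, and the record for page "b" waits until that page is actually read.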
Faster, more predictable failover
[Timeline, MYSQL: app running, DB failure, failure detection, DNS propagation, recovery, recovery, app running; 15-30 sec]
[Timeline, AURORA WITH MARIADB DRIVER: app running, DB failure, failure detection, DNS propagation, recovery, app running; 5-20 sec]
Amazon Aurora is easy to use
“Amazon Aurora’s new user-friendly monitoring interface made it easy to diagnose and address issues. Its performance, reliability and monitoring really shows Amazon Aurora is an enterprise-grade AWS database.” – Mohamad Reza, Information Systems Officer at United Nations
Simplify storage management
• Continuous, incremental backups to Amazon S3
• Instantly create user snapshots with no performance impact
• Automatic storage scaling up to 64 TB with no performance impact
• Automatic restriping, mirror repair, hot spot management, encryption
Up to 64TB of storage – auto-incremented in 10GB units
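As a rough illustration of "auto-incremented in 10GB units", the sketch below rounds a used size up to the next 10 GB step and caps it at 64 TB. It assumes 1 TB = 1024 GB and a minimum of one 10 GB unit, neither of which the slide states; `allocated_gb` is a made-up helper, not an AWS API.

```python
import math

UNIT_GB = 10          # growth step from the slide
MAX_GB = 64 * 1024    # 64 TB cap, assuming 1 TB = 1024 GB


def allocated_gb(used_gb):
    """Round `used_gb` up to the next 10 GB unit, capped at the 64 TB limit."""
    if used_gb > MAX_GB:
        raise ValueError("volume cannot exceed 64 TB")
    units = max(1, math.ceil(used_gb / UNIT_GB))  # assumed minimum of one unit
    return min(MAX_GB, units * UNIT_GB)


for used in (1, 10, 10.5, 2_500):
    print(f"{used} GB used -> {allocated_gb(used)} GB allocated")
```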
Simplify monitoring with AWS console
CloudWatch RDS Metrics
• CPU utilization
• Storage
• Memory
• Swap usage
• DB connections
• I/O (read and write)
• Latency (read and write)
• Throughput (read and write)
• Replica lag
• 50+ system/OS metrics
• 1-60s granularity
CloudWatch Alarms
• Similar to on-premises custom monitoring tools
Simplify data security
• Encryption to secure data at rest
  • AES-256; hardware accelerated
  • All blocks on disk and in Amazon S3 are encrypted
  • Key management via AWS KMS
• SSL to secure data in transit
• Network isolation via Amazon VPC by default
• No direct access to nodes
• Supports industry standard security and data protection certifications
[Diagram: Application, SQL / Transactions / Caching, Storage, Amazon S3]
If you host your databases on-premises
Power, HVAC, net
Rack and stack
Server maintenance
OS patches
DB s/w patches
Database backups
Scaling
High availability
DB s/w installs
OS installation
you
App optimization
If you host your databases in Amazon EC2
Power, HVAC, net
Rack and stack
Server maintenance
OS patches
DB s/w patches
Database backups
Scaling
High availability
DB s/w installs
OS installation
you
App optimization
If you choose a managed DB service
Power, HVAC, net
Rack and stack
Server maintenance
OS patches
DB s/w patches
Database backups
App optimization
High availability
DB s/w installs
OS installation
you
Scaling
Simple pricing
• No licenses
• No lock-in
• Pay only for what you use
Discounts
• 44% with a 1-year RI
• 63% with a 3-year RI
Instance        vCPU   Mem (GiB)   Hourly Price
db.r3.large     2      15.25       $0.29
db.r3.xlarge    4      30.5        $0.58
db.r3.2xlarge   8      61          $1.16
db.r3.4xlarge   16     122         $2.32
db.r3.8xlarge   32     244         $4.64
• Storage consumed, up to 64 TB, is $0.10/GB-month
• IOs consumed are billed at $0.20 per million I/O
• Prices are for Virginia
Enterprise grade, open source pricing
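To see how the three price components combine, here is a minimal estimator built only from the figures on the pricing slide. The 730-hour month is an assumption, the workload numbers in the usage line are invented, and `monthly_cost` is a hypothetical helper rather than an AWS calculator.

```python
# On-demand hourly prices from the slide (Virginia).
HOURLY_PRICE = {
    "db.r3.large": 0.29,
    "db.r3.xlarge": 0.58,
    "db.r3.2xlarge": 1.16,
    "db.r3.4xlarge": 2.32,
    "db.r3.8xlarge": 4.64,
}
STORAGE_PER_GB_MONTH = 0.10   # from the slide
IO_PER_MILLION = 0.20         # from the slide
HOURS_PER_MONTH = 730         # assumption: average hours in a month


def monthly_cost(instance, count, storage_gb, io_requests):
    """Estimate a monthly bill in dollars: instances + storage + I/O."""
    instances = HOURLY_PRICE[instance] * count * HOURS_PER_MONTH
    storage = storage_gb * STORAGE_PER_GB_MONTH
    io = io_requests / 1_000_000 * IO_PER_MILLION
    return round(instances + storage + io, 2)


# Invented workload: one db.r3.large, 100 GB stored, 50 million I/Os in the month.
estimate = monthly_cost("db.r3.large", 1, 100, 50_000_000)
print(f"${estimate:.2f} per month")
```

The reserved-instance discounts listed on the slide would apply to the instance component only; they are left out here to keep the sketch short.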
Cost of ownership: Aurora vs. MySQL
MySQL configuration hourly cost
Instances: Primary r3.8XL, Standby r3.8XL, Replica r3.8XL, Replica r3.8XL (each labeled $3.78/hr)
Storage: 6TB / 10K PIOP, 6TB / 10K PIOP, 6TB / 5K PIOP, 6TB / 5K PIOP (each labeled $2.42/hr)
Instance cost: $15.12/hr
Storage cost: $8.30/hr
Total cost: $23.42/hr
Cost of ownership: Aurora vs. MySQL
Aurora configuration hourly cost
Instances: Primary r3.8XL, Replica r3.8XL, Replica r3.8XL ($4.64/hr each)
Storage: 6TB ($4.43/hr)
Instance cost: $13.92/hr
Storage cost: $4.43/hr
Total cost: $18.35/hr
21.6% savings
• No idle standby instance
• Single shared storage volume
• No PIOPs; pay for the IO you use
• Reduction in overall IOPs
*At a macro level Aurora saves over 50% in storage cost compared to RDS MySQL.
Cost of ownership: Aurora vs. MySQL
Further opportunity for saving
Instances: Primary, Replica, Replica, with r3.4XL in place of r3.8XL ($2.32/hr each)
Storage: 6TB ($4.43/hr)
Instance cost: $6.96/hr
Storage cost: $4.43/hr
Total cost: $11.39/hr
51.3% savings
• Use smaller instance size
• Pay-as-you-go storage
Storage IOPs assumptions:
1. Average IOPs is 50% of Max IOPs
2. 50% savings from shipping logs vs. full pages
Start your first migration in 10 minutes or less
Keep your apps running during the migration
Replicate within, to or from Amazon EC2 or RDS
Move data to the same or different database engine
AWS Database Migration Service
[Diagram: application users and customer premises connected to AWS over the Internet or VPN]
Start a replication instance
Connect to source and target databases
Select tables, schemas, or databases
Let AWS Database Migration Service create tables, load data, and keep them in sync
Switch applications over to the target at your convenience
Keep your apps running during the migration
AWS Database Migration Service
Migrate off Oracle and SQL Server
Move your tables, views, stored procedures and DML to MySQL, MariaDB, and Amazon Aurora
Know exactly where manual edits are needed
Download at aws.amazon.com/dms
AWS Schema Conversion Tool
Simulate failures using SQL
To cause the failure of a component at the database node:
ALTER SYSTEM CRASH [INSTANCE | DISPATCHER | NODE]
To simulate the failure of disks:
ALTER SYSTEM SIMULATE percent_failure DISK failure_type IN [DISK index | NODE index] FOR INTERVAL interval
To simulate the failure of networking:
ALTER SYSTEM SIMULATE percent_failure NETWORK failure_type [TO ALL | read_replica | availability_zone] FOR INTERVAL interval