AWS as a Data Platform - AWS Symposium 2014 - Washington D.C.
AWS Government, Education, and Nonprofits Symposium Washington, DC | June 24, 2014 - June 26, 2014
AWS as a Data Platform
Chris Keyser
Why AWS?
Ease of use
Lower costs
Lower costs
no capital investment
pay as you go
no subscriptions
only pay for what you use
Ease of use
programmable
zero admin
easy to configure
integrate with existing tools
One tool to rule them all
II
Use the right tools
Movement and Coordination
Data Pipeline
Direct Connect
Storage Gateway
Import / Export
Storage and Analysis Services
Instance Storage: EC2, EBS
SQL Stores: Redshift, RDS
Hadoop: EMR
NoSQL: DynamoDB
Stream: Kinesis
Search: CloudSearch
Storage Services: S3, CloudFront, Glacier
Movement and Coordination
Movement and Coordination - Plumbing
Direct Connect: Dedicated network pipes
Storage Gateway: Storage backup & archiving
Import / Export: Ship us your disks
Movement and Coordination - Orchestration
AWS Data Pipeline
Resource management
Scheduling, execution, and retry
Dependency tracking
Failure notification
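The orchestration ideas on this slide (execution in dependency order, retry, and failure notification) can be sketched in plain Python. This is not the AWS Data Pipeline API: `run_pipeline`, `max_attempts`, `notify`, and the task names are hypothetical and exist only to illustrate the concepts.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_pipeline(tasks, deps, max_attempts=3, notify=print):
    """Run tasks in dependency order, retrying failures.

    tasks: dict mapping task name -> zero-argument callable
    deps:  dict mapping task name -> set of task names that must finish first
    Returns the names of the tasks that completed, in run order.
    """
    done = []
    failed = set()
    graph = {name: deps.get(name, set()) for name in tasks}
    for name in TopologicalSorter(graph).static_order():
        # Dependency tracking: do not run anything downstream of a failure.
        if deps.get(name, set()) & failed:
            failed.add(name)
            notify(f"SKIPPED {name}: upstream failure")
            continue
        for attempt in range(1, max_attempts + 1):
            try:
                tasks[name]()
                done.append(name)
                break
            except Exception as exc:
                # Retry until the attempt budget is spent, then notify.
                if attempt == max_attempts:
                    failed.add(name)
                    notify(f"FAILED {name} after {attempt} attempts: {exc}")
    return done
```

A task that fails once and then succeeds is retried without further notice, while a task that exhausts its attempts triggers a notification and causes its dependents to be skipped.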
Data Storage and Analysis
Storage Services – Object Store
Amazon S3
> 1.5 million peak requests/sec
Designed for 99.999999999% durability
Trillions of objects
Stores anything
Lifecycle and Versioning
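As a worked example of the durability figure above: 99.999999999% durability corresponds to an expected annual loss rate of 1 object in 10^11. The helper name is hypothetical, and treating losses as uniform and independent across objects is a simplifying assumption.

```python
from fractions import Fraction

# 99.999999999% expressed exactly, to avoid floating-point rounding.
DURABILITY = Fraction(99_999_999_999, 100_000_000_000)


def expected_annual_losses(object_count):
    """Expected number of objects lost per year at the stated durability."""
    return object_count * (1 - DURABILITY)


losses = expected_annual_losses(10_000_000)
years_per_lost_object = 1 / losses
print(losses, years_per_lost_object)
```

For 10 million stored objects this works out to an expected loss of one object roughly every 10,000 years.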
Storage Services - Archive Storage
Amazon Glacier
Low cost, durable archiving
“Cold Storage”
Infrequently accessed data
Integrated S3 lifecycle policies
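The lifecycle integration can be illustrated with a rule that moves objects from S3 to Glacier after a set number of days. The sketch below only builds the rule as a Python dictionary in the general shape of an S3 lifecycle configuration and does not call AWS. The rule ID, the `logs/` prefix, and the 30- and 365-day values are made up for illustration, and the field names should be checked against the current S3 documentation before use.

```python
import json


def glacier_lifecycle_rule(rule_id, prefix, archive_after_days, expire_after_days=None):
    """Build one lifecycle rule that transitions objects under `prefix` to Glacier."""
    rule = {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [{"Days": archive_after_days, "StorageClass": "GLACIER"}],
    }
    if expire_after_days is not None:
        # Optional: delete the archived objects after a retention period.
        rule["Expiration"] = {"Days": expire_after_days}
    return rule


# Hypothetical policy: archive logs after 30 days, delete them after a year.
config = {"Rules": [glacier_lifecycle_rule("archive-logs", "logs/", 30, 365)]}
print(json.dumps(config, indent=2))
```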
Storage Services – Edge Caching
Amazon CloudFront
Simple to use with global footprint
Streaming support
Large file distribution
Private content
S3, EC2 and ELB integration
Geo restrictions
Instance Storage - Options
Amazon EC2
Ephemeral Storage (“local”)
You manage backup/restore
High Storage instances available
i2.8xlarge – 6.4 TB SSD (350K IOPS)
hs1.8xlarge – 48 TB Disk Storage
Elastic Block Store (EBS)
“Network Attached Storage”
Snapshot, Encryption
Provisioned throughput (IOPS)
Instance Storage - Build Your Own
Amazon EC2
NFS
MongoDB
Cassandra
GraphLab
Titan
Kafka
Lustre
Gluster
Flume
Scribe
Presto
…and more
SQL Stores - Managed Relational DB
Amazon RDS
MySQL, Oracle, SQL Server, PostgreSQL
Backup/Restore, High Availability
Push Button Scalability
Up to 3 TB and 30K IOPS
SQL Stores - Petabyte Data Warehouse
Amazon Redshift
Relational data warehouse
Massively parallel
Petabyte scale
Fully managed
$1,000/TB/Year
SQL Stores - Amazon Redshift Architecture
• Leader Node
– SQL endpoint
– Stores metadata
– Coordinates query execution
• Compute Nodes
– Local, columnar storage
– Execute queries in parallel
– Backup and restore via S3
– Parallel load from S3, EMR, or DynamoDB
• HW optimized for data processing
– DW1: 2TB – 1.6PB Magnetic
– DW2: 160GB – 256TB SSD
[Diagram: Redshift cluster. Labels: SQL Clients/BI Tools; JDBC/ODBC; Leader Node; three Compute Nodes; 128GB RAM, 16TB disk, 16 cores per node; 10 GigE (HPC); Ingestion, Backup, Restore; Amazon S3 / DynamoDB / SSH]
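The capacity figures on this slide fit together arithmetically: at 16 TB of disk per compute node, the 1.6 PB upper bound for DW1 corresponds to 100 nodes. A minimal sketch follows, assuming 1 PB = 1000 TB and ignoring compression, replication, and working space. The function name is hypothetical.

```python
import math

NODE_DISK_TB = 16    # disk per compute node, as shown on the slide
DW1_MAX_TB = 1600    # 1.6 PB upper bound, taking 1 PB = 1000 TB


def compute_nodes_needed(data_tb, node_disk_tb=NODE_DISK_TB):
    """Smallest number of compute nodes whose combined disk holds data_tb."""
    if data_tb <= 0:
        raise ValueError("data_tb must be positive")
    return math.ceil(data_tb / node_disk_tb)


print(compute_nodes_needed(DW1_MAX_TB))  # 100
print(compute_nodes_needed(50))          # 4
```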
NoSQL – Dial Up Capacity
Amazon DynamoDB
NoSQL Database
Seamless scalability
Zero admin
Single digit millisecond latency
NoSQL - Durable Low Latency at Scale
WRITES
Continuously replicated to 3 AZs
Quorum acknowledgment
Persisted to disk (custom SSD)
READS
Strongly or eventually consistent
No trade-off in latency
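The idea of quorum acknowledgment across three replicas can be sketched as follows. This is a toy in-memory model, not DynamoDB's implementation: the `Replica` class, its `persist` method, and the choice of 2 out of 3 as the quorum are assumptions made for illustration.

```python
class Replica:
    """Toy in-memory stand-in for the storage node in one availability zone."""

    def __init__(self, available=True):
        self.available = available
        self.data = {}

    def persist(self, key, value):
        if not self.available:
            raise ConnectionError("replica unreachable")
        self.data[key] = value
        return True


def quorum_write(replicas, key, value, quorum=2):
    """Send the write to every replica; succeed once `quorum` of them persist it."""
    acks = 0
    for replica in replicas:
        try:
            if replica.persist(key, value):
                acks += 1
        except ConnectionError:
            continue  # an unreachable replica simply does not acknowledge
    return acks >= quorum


zones = [Replica(), Replica(available=False), Replica()]
print(quorum_write(zones, "user#1", {"name": "example"}))  # True: 2 of 3 acknowledged
```

With a majority quorum, the write still succeeds when one of the three zones is unavailable, and fails when two are.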
Hadoop – On Demand
Amazon Elastic MapReduce
Hive, Impala, Spark, Pig, MapReduce
Easy to use; fully managed
On-demand and spot pricing
Persistent and transient clusters
Deep integration with S3
Hadoop – Tuned for AWS
[Diagram labels: Master instance group; Core instance group; Task instance group; HDFS; Amazon S3; Amazon Redshift; Amazon DynamoDB]
Streaming - at Scale
Amazon Kinesis
Real-time data collection
Seamlessly scale to gigabytes/s
Low cost managed service
EMR integration
Streaming - Amazon Kinesis Architecture
Amazon Web Services
Millions of sources producing 100s of terabytes per hour
Front End: Authentication, Authorization
Durable, highly consistent storage replicates data across three data centers (availability zones)
Ordered stream of events supports multiple readers
Real-time dashboards and alarms
Machine learning algorithms or sliding window analytics
Aggregate and archive to S3
Aggregate analysis in Hadoop or data warehouse
Inexpensive: $0.028 per million puts
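The "ordered stream of events supports multiple readers" point can be sketched with an append-only log in which each reader tracks its own position. This is a toy in-memory model and not the Kinesis API: `OrderedStream`, `put`, `read`, and the reader names are hypothetical, and sharding, durability, and retention are left out.

```python
class OrderedStream:
    """Append-only in-memory log; each reader keeps its own position."""

    def __init__(self):
        self._records = []
        self._cursors = {}

    def put(self, record):
        """Append a record and return its sequence number."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, reader_id, limit=10):
        """Return up to `limit` unread records for this reader, in put order."""
        start = self._cursors.get(reader_id, 0)
        batch = self._records[start:start + limit]
        self._cursors[reader_id] = start + len(batch)
        return batch


stream = OrderedStream()
for event in ["click", "view", "click"]:
    stream.put(event)

# Two independent consumers see the same events in the same order.
print(stream.read("dashboard"))
print(stream.read("archiver"))
```

Because each reader has its own cursor, a dashboard, an archiver, and an analytics job can consume the same stream at different speeds without interfering with one another.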
Search – Made Simple
Amazon CloudSearch
Fully managed search engine
Simple to operate
Highly available
User configurable scaling
Advanced feature support
34 languages
Algorithmic stemming
Geospatial search
Faceted search
Suggestions
Highlighting
Field weighting
…
The right tool. At the right time. At the right scale.
Thank You
Chris Keyser