Page 1: The Third Why at Gx24

The Third “Why”

Julio Faerman, 2014-09-29, GX24

http://jfaerman.com.br/gx24

Page 2: The Third Why at Gx24

https://startwithwhy.com

Page 3: The Third Why at Gx24

Security

Availability

Compliance

Fault Tolerance

Throughput

Latency

Page 4: The Third Why at Gx24

…but what is the difference?

Page 5: The Third Why at Gx24

16 years

2000+ employees

40 million users

http://aws.amazon.com/solutions/case-studies/netflix/
http://www.enotechconsulting.com/2013/04/aws-s3-behind-netflix-success/
http://variety.com/2014/digital/news/netflix-youtube-bandwidth-usage-1201179643/

Amazon Web Services for 100% of streaming

34.2% of all downstream traffic during primetime

Page 6: The Third Why at Gx24

Amazon Simple Storage Service

• Durable, scalable and fast storage (99.999999999% durability)
• 2+ trillion (10^12) objects
• 1.1+ million requests per second (RPS)
• Native HTTP/S
• Full featured: permissions, static hosting, logging, versioning, archival and expiration lifecycle, torrent, tags, redundancy, requester pays, encryption, reduced redundancy and more

DEMO
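To put the eleven-nines figure in perspective, here is a back-of-the-envelope sketch. It assumes the percentage is an annual, per-object durability and that losses are independent; both are assumptions made for illustration, and the function name is hypothetical.

```python
from fractions import Fraction

# Assumption: 99.999999999% is an annual, per-object durability figure.
DURABILITY = Fraction(99_999_999_999, 100_000_000_000)


def expected_annual_losses(object_count: int,
                           durability: Fraction = DURABILITY) -> Fraction:
    """Expected number of objects lost per year under the assumption above."""
    return object_count * (1 - durability)


if __name__ == "__main__":
    # A hypothetical bucket holding 10 million objects.
    losses = expected_annual_losses(10_000_000)
    print(f"expected losses per year: {float(losses)}")
    print(f"years per expected loss:  {float(1 / losses)}")
```

Exact fractions are used instead of floats because subtracting a number this close to 1 in floating point introduces visible rounding error.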

Page 7: The Third Why at Gx24

1. “Low, pay-as-you-go pricing with no up-front expenses or long-term commitments.”

2. “Instantly deploy new applications, scale up as your workload grows, and scale down based on demand.”

http://aws.amazon.com/about-aws/

Page 8: The Third Why at Gx24

3…

Page 9: The Third Why at Gx24

“We will make electricity so cheap that only the rich will burn candles.”

Thomas Edison

The Big Switch: http://amzn.com/039334522X

Page 10: The Third Why at Gx24

Day 1

Page 11: The Third Why at Gx24

http://aws.amazon.com/solutions/case-studies/

Page 12: The Third Why at Gx24

Fear, Uncertainty and Doubt

Topsy Elephant: https://www.youtube.com/watch?v=eh_mJfWKNTI

Page 13: The Third Why at Gx24

http://youtu.be/GRVPGC1haTM

Page 14: The Third Why at Gx24

Security
Compliance
Capacity
Fault Tolerance
Cost
Complexity
Billing
Scalability
Availability
Latency
Throughput

Page 15: The Third Why at Gx24

Proof of Concept

• Quantitative > Qualitative
• Iterative
• Incremental

Page 16: The Third Why at Gx24

http://www.infoq.com/presentations/JPL-cloud

Page 17: The Third Why at Gx24

JPL Missions

Page 18: The Third Why at Gx24

“Internet of Things”?

Page 19: The Third Why at Gx24

“Batch” Big Data

Page 20: The Third Why at Gx24

“Streaming” Big Data

Page 21: The Third Why at Gx24

How unique are data systems?

http://nathanmarz.com/blog

Page 22: The Third Why at Gx24

3 Interfaces to Amazon Web Services

Console, CLI, SDK

Page 23: The Third Why at Gx24

Amazon Kinesis

• Real-time processing of streaming data
• High throughput and elastic
• Integrates with Amazon S3, Amazon Redshift, and Amazon DynamoDB
• Locking, sharding, rollback and more with the Kinesis Client Library

Dashboard

CEP

Storage
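The sharding mentioned above can be illustrated without touching AWS. Kinesis maps each record's partition key to a 128-bit hash key with MD5, and each shard owns a range of hash keys. The sketch below simulates that routing locally; the even split of the key space and the function names are assumptions for illustration, not the service's API.

```python
import hashlib

HASH_SPACE = 2**128  # MD5 digests are 128 bits wide


def shard_ranges(shard_count: int) -> list[tuple[int, int]]:
    """Split the hash-key space into contiguous, inclusive, near-equal ranges."""
    step = HASH_SPACE // shard_count
    ranges = []
    for i in range(shard_count):
        start = i * step
        # The last shard absorbs any remainder so the whole space is covered.
        end = HASH_SPACE - 1 if i == shard_count - 1 else (i + 1) * step - 1
        ranges.append((start, end))
    return ranges


def shard_for(partition_key: str, ranges: list[tuple[int, int]]) -> int:
    """Return the index of the shard whose range contains the key's MD5 hash."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    for index, (start, end) in enumerate(ranges):
        if start <= hash_key <= end:
            return index
    raise ValueError("hash key outside every shard range")


if __name__ == "__main__":
    ranges = shard_ranges(4)
    for key in ("sensor-1", "sensor-2", "sensor-3"):  # hypothetical keys
        print(key, "-> shard", shard_for(key, ranges))
```

Records that share a partition key always land on the same shard, which is what preserves per-key ordering as the stream scales out.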

Page 24: The Third Why at Gx24

Amazon Elastic MapReduce

• Distributed processing with Apache Hadoop
• Near-linear scalability
• Resizable and disposable clusters
• Apache Hadoop ecosystem: Hive, Pig, Impala, Spark, ..., …, …
• Instant automatic provisioning
• Simplified administration
• 5.5M+ clusters
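For readers new to the programming model behind Hadoop, here is a minimal single-process sketch of the map, shuffle and reduce phases, using the classic word count. It only illustrates the idea; on a real cluster each phase runs in parallel across many machines, and the function names here are hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line: str) -> list[tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs) -> dict[str, list[int]]:
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: list[str]) -> dict[str, int]:
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    print(word_count(["big data big clusters", "Data everywhere"]))
```

Because each map call sees one line and each reduce call sees one key, the work can be spread over as many nodes as there are lines and keys.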

Page 25: The Third Why at Gx24

Amazon Redshift

• Petabyte-scale data warehousing
• Massively parallel online analytic processing (OLAP)
• Resizable without downtime
• Managed provisioning and administration
• Compatible with PostgreSQL

Page 26: The Third Why at Gx24

Amazon Redshift Architecture

Leader Node

• SQL endpoint

• Stores metadata

• Coordinates query execution

Compute Nodes

• Local, columnar storage

• Execute queries in parallel

• Load, backup, restore via Amazon S3; load from Amazon DynamoDB or SSH

Two hardware platforms

• Optimized for data processing
• DW1: HDD; scale from 2TB to 1.6PB
• DW2: SSD; scale from 160GB to 256TB

[Diagram: SQL clients/BI tools connect to the leader node over JDBC/ODBC. The leader node and the compute nodes (each 128GB RAM, 16TB disk, 16 cores) are linked by 10 GigE (HPC). Ingestion, backup and restore go through Amazon S3 / DynamoDB / SSH.]
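The division of labor between the leader node and the compute nodes can be sketched as a scatter-gather over columnar slices. This is a toy model under simplifying assumptions, not Redshift's implementation: the class names are hypothetical, threads stand in for separate machines, and only a SUM aggregate is supported.

```python
from concurrent.futures import ThreadPoolExecutor


class ComputeNode:
    """Holds one slice of a table, stored column by column."""

    def __init__(self, columns: dict[str, list[int]]):
        self.columns = columns

    def partial_sum(self, column: str) -> int:
        # Columnar layout: only the requested column is scanned.
        return sum(self.columns[column])


class LeaderNode:
    """Receives the query, fans it out, and combines the partial results."""

    def __init__(self, nodes: list[ComputeNode]):
        self.nodes = nodes

    def total(self, column: str) -> int:
        with ThreadPoolExecutor(max_workers=len(self.nodes)) as pool:
            partials = list(pool.map(lambda node: node.partial_sum(column),
                                     self.nodes))
        return sum(partials)


if __name__ == "__main__":
    cluster = LeaderNode([
        ComputeNode({"price": [1, 2], "qty": [10, 20]}),
        ComputeNode({"price": [3], "qty": [30]}),
        ComputeNode({"price": [4, 5], "qty": [40, 50]}),
    ])
    print(cluster.total("price"))
```

Adding compute nodes shrinks each slice, which is why this style of aggregate query tends to speed up as the cluster grows.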

Page 27: The Third Why at Gx24

ETL from EMR/Hive to Amazon Redshift through Amazon S3

EMR (unstructured, unclean) -> Extract & Transform -> S3 (structured, clean) -> Load -> Redshift (columnar, compressed)
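The Load step of this pipeline is typically a Redshift COPY statement that reads the files EMR wrote to S3. The helper below only builds the SQL text and does not connect to anything. The table name, bucket path and role ARN are hypothetical placeholders, and authorizing with IAM_ROLE is an assumption about the setup, since the slides do not say how the load was authorized.

```python
def build_copy_statement(table: str,
                         s3_prefix: str,
                         iam_role_arn: str,
                         delimiter: str = "|",
                         gzip: bool = True) -> str:
    """Build a Redshift COPY statement that loads delimited files from S3.

    Illustration only: values are interpolated without escaping, so do not
    pass untrusted input.
    """
    parts = [
        f"COPY {table}",
        f"FROM '{s3_prefix}'",
        f"IAM_ROLE '{iam_role_arn}'",
        f"DELIMITER '{delimiter}'",
    ]
    if gzip:
        parts.append("GZIP")  # input files are gzip-compressed
    return "\n".join(parts) + ";"


if __name__ == "__main__":
    # All three arguments are made-up example values.
    print(build_copy_statement(
        table="page_views",
        s3_prefix="s3://example-bucket/clean/page_views/",
        iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftLoad",
    ))
```

The resulting statement would be run through any PostgreSQL-compatible client connected to the cluster's SQL endpoint.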

Page 28: The Third Why at Gx24
Page 29: The Third Why at Gx24

7+ Billion

Page 30: The Third Why at Gx24

~50 to ~3500 Instances in 3 days

Page 31: The Third Why at Gx24

Amazon Auto Scaling

• Adjust capacity to demand
• Automated and customizable provisioning
• Integrated monitoring and load balancing
• Maintain fleet size across availability zones
• On-demand or scheduled actions

DEMO
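The idea of adjusting capacity to demand can be sketched as a simple threshold rule: add instances when average CPU is high, remove them when it is low, and always stay within the group's minimum and maximum size. The thresholds, step size and names below are assumptions chosen for illustration, not the service's defaults or its actual algorithm.

```python
from dataclasses import dataclass


@dataclass
class ScalingGroup:
    min_size: int
    max_size: int
    desired: int


def adjust(group: ScalingGroup,
           cpu_percent: float,
           scale_out_above: float = 70.0,
           scale_in_below: float = 30.0,
           step: int = 1) -> int:
    """Apply one scaling decision and return the new desired capacity."""
    if cpu_percent > scale_out_above:
        target = group.desired + step
    elif cpu_percent < scale_in_below:
        target = group.desired - step
    else:
        target = group.desired
    # Never leave the configured bounds, whatever the metric says.
    group.desired = max(group.min_size, min(group.max_size, target))
    return group.desired


if __name__ == "__main__":
    group = ScalingGroup(min_size=2, max_size=5, desired=2)
    for cpu in (85.0, 90.0, 50.0, 10.0):  # hypothetical CPU readings
        print(f"cpu={cpu:>5}% -> desired={adjust(group, cpu)}")
```

Keeping a gap between the two thresholds avoids flapping, where the group grows and shrinks on every small change in load.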

Page 32: The Third Why at Gx24

280+ Releases in 2014

http://aws.amazon.com/new
http://aws.amazon.com/blogs/aws

Page 33: The Third Why at Gx24

Where to begin?

Page 34: The Third Why at Gx24

http://aws.amazon.com/training/intro_series/

Page 35: The Third Why at Gx24
Page 36: The Third Why at Gx24

http://aws.amazon.com/training/

Page 37: The Third Why at Gx24

Julio Faerman

http://jfaerman.com.br/gx24

Thank you! Questions?