Amazon Web Services - Computer Engineering Group ·  · 2014-02-03Amazon Web Services ... elastic...


Amazon Web Services

Università degli Studi di Roma “Tor Vergata” Dipartimento di Ingegneria Civile e Ingegneria Informatica

Corso di Sistemi Distribuiti e Cloud Computing A.A. 2013/14

Valeria Cardellini

Amazon Web Services

•  A collection of Web services that provides developers with compute, storage, and more advanced services, such as managing structured and semi-structured data, messaging

•  Mostly popular for IaaS services and primarily for its elastic compute service EC2

Valeria Cardellini - SDCC 2013/14


Which services

•  Compute & Networking
    –  Elastic Compute Cloud (EC2)
    –  Auto Scaling
    –  Elastic Load Balancing
    –  Route 53
    –  Virtual Private Cloud (VPC)
    –  Amazon WorkSpaces
    –  Amazon Marketplace
•  Storage & CDN
    –  Simple Storage Service (S3)
    –  Glacier
    –  Elastic Block Store (EBS)
    –  CloudFront
•  Analytics
    –  Elastic MapReduce (EMR)
    –  Data Pipeline
    –  Kinesis
•  Database
    –  DynamoDB
    –  Relational Database Service (RDS)
    –  ElastiCache
    –  Redshift
•  Application services (some)
    –  Simple Queue Service (SQS)
    –  Simple Notification Service (SNS)
    –  Elastic Transcoder
•  Deployment & Management
    –  CloudWatch
    –  Elastic Beanstalk
    –  Identity and Access Management (IAM)


AWS global infrastructure

•  AWS Regions http://aws.amazon.com/about-aws/globalinfrastructure/
    –  Geographically dispersed
    –  Current regions: US East (Northern Virginia), US West (Oregon), US West (Northern California), EU (Ireland), Asia Pacific (Singapore), Asia Pacific (Tokyo), Asia Pacific (Sydney), Asia Pacific (Beijing), South America (Sao Paulo)
    –  Special AWS GovCloud region for US government agencies and customers (security issues!)


Elastic Compute Cloud (EC2)

•  Provides elastic compute capacity in the cloud
    –  Capacity can be increased or decreased (instances started or stopped) within minutes
    –  From one to thousands of server instances can be launched simultaneously
•  Instances can be configured for memory, number of processors, and storage by choosing “T-shirt sizes”
•  Pay-per-use on an hourly basis
•  Guarantees complete control over instances
    –  Root SSH access, GUI, command line tools, APIs
•  Offers advanced services
    –  Elastic Block Store
    –  Elastic Load Balancer
    –  CloudWatch + AutoScaling
    –  Elastic IP
    –  Amazon Elastic Beanstalk


EC2 instances

•  Processing power of EC2 instances is expressed as number of cores and EC2 Compute Units (ECU)
    –  1 ECU = equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor
    –  See http://aws.amazon.com/ec2/pricing/ for pricing
•  EC2 instance types http://aws.amazon.com/ec2/instance-types/
    –  General-purpose
        •  Various “T-shirt sizes”: small, medium, large, xlarge, 2xlarge
    –  Compute optimized
        •  medium, large, xlarge, 2xlarge, 4xlarge, 8xlarge
    –  Memory optimized
        •  xlarge, 2xlarge, 4xlarge, 8xlarge
    –  Storage optimized
        •  xlarge, 2xlarge, 4xlarge, 8xlarge
    –  GPU instances
        •  2xlarge, 4xlarge
    –  Micro instances
        •  Very low-cost instance option, providing a small amount of CPU resources (good choice for SDCC project development!)


EC2 instances (2)

•  On-demand instances
    –  Billed per hour with no long-term commitments
•  Reserved instances
    –  One-time payment to reserve an instance for 1 or 3 years
    –  Significant discount on the hourly usage charge
•  Spot instances
    –  Enable users to bid for unused EC2 capacity
    –  Run whenever the user's bid exceeds the current spot price
    –  Spot price fluctuates periodically depending on supply and demand
•  Interesting related issue we are investigating: how to determine the optimal (and long-term) IaaS provider pricing strategy and IaaS user bidding strategy
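The spot rule stated on this slide (an instance runs whenever the user's bid exceeds the current spot price) can be sketched as a tiny simulation. This is only an illustration, not Amazon's pricing mechanism: the price trace, the bid, and the function name `running_hours` are hypothetical, and the sketch assumes that each running hour is charged at that hour's spot price.

```python
def running_hours(spot_prices, bid):
    """Return (hours_run, total_cost) for an hourly spot-price trace.

    The instance runs in every hour whose spot price is strictly below
    the bid. Assumption: each such hour is charged at the spot price.
    """
    charged = [price for price in spot_prices if bid > price]
    return len(charged), round(sum(charged), 4)


# Hypothetical hourly spot prices in USD.
trace = [0.010, 0.012, 0.030, 0.011, 0.050, 0.009]
hours, cost = running_hours(trace, bid=0.02)
print(hours, cost)
```

Raising the bid increases the number of hours the instance runs, at the price of paying for the more expensive hours as well; this trade-off is what a bidding strategy has to balance.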

Amazon Machine Images

•  Amazon Machine Images (AMI) are templates from which virtual machines can be created
    –  Contain a software configuration, including an operating system
•  Created from scratch or bundled from existing EC2 instances
•  Once created, stored in S3 and identified by a unique identifier
    –  Personal use
    –  Available to other users


EC2 locations

•  Amazon EC2 is hosted in multiple locations world-wide (AWS Regions)
•  Each region is completely independent
•  Each region consists of one or more availability zones
•  Availability Zones
    –  Distinct locations in the same region engineered to be insulated from failures in other availability zones
    –  Availability Zones in a region are connected through low-latency links
    –  Used to protect applications from failure of a single location


EC2 interface

•  AWS Management Console
•  Command Line Interface (CLI)
•  Web services (WSDL file)
•  Query APIs
•  Third-party libraries
•  Third-party management frameworks
    –  E.g., Cloud42


Elastic Block Store (EBS)

•  Provides block-level storage volumes for EC2 instances
    –  Off-instance storage that persists independently from the life of an EC2 instance
    –  EBS volumes from 1 GB to 1 TB
•  EBS volumes can be used as an instance's boot partition or attached to running instances as standard block devices
    –  A volume can be attached to only one instance at a time, but many volumes can be attached to a single instance
    –  EBS volumes can be attached only to instances in the same availability zone
•  EBS volumes are automatically replicated within the same availability zone to avoid data loss
•  EBS provides the ability to create point-in-time snapshots of volumes that can be stored using S3


Elastic IP & Virtual Private Cloud

•  Elastic IP
    –  The IP address is not associated with a particular instance but with a user account
    –  The user controls an elastic IP address until they explicitly release it
    –  Instance or availability zone failures can be masked by quickly remapping the elastic IP address to another instance or load balancer
•  Virtual Private Cloud
    –  Enables enterprises to connect their existing infrastructure to a set of isolated AWS compute resources via a Virtual Private Network (VPN) connection


CloudWatch

•  Provides monitoring for AWS resources (e.g., EC2 instances) and for user applications running on AWS
•  CloudWatch is a metric repository
    –  AWS services put metrics in the repository
    –  Users retrieve statistics based on those metrics


CloudWatch concepts

•  Metric
    –  Time-ordered set of data points
    –  PutMetricData API allows users to create custom metrics
•  Statistics
    –  Metric data aggregated over specified periods of time
    –  Available statistics: min, max, sum, average, sample count
    –  Retrieved by GetMetricStatistics API
•  Period
    –  Length of time associated with a specific CloudWatch statistic
    –  Measured in seconds, ranges from 60 (one minute) to 1209600 (two weeks)
•  Alarm
    –  Watches a single metric over a specified time period
    –  Performs one or more actions based on the value of the metric relative to a given threshold over a number of time periods
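To make the relation between metric, period, and statistics concrete, the following sketch aggregates raw data points into per-period statistics locally. It does not call the GetMetricStatistics API; the function name `statistics`, the sample data, and the choice to key each result by the start of its period are assumptions made for illustration.

```python
from collections import defaultdict


def statistics(datapoints, period):
    """Aggregate (timestamp_seconds, value) pairs into per-period statistics."""
    buckets = defaultdict(list)
    for timestamp, value in datapoints:
        # Group each data point under the start time of its period.
        buckets[timestamp - timestamp % period].append(value)
    result = {}
    for start in sorted(buckets):
        values = buckets[start]
        result[start] = {
            "Minimum": min(values),
            "Maximum": max(values),
            "Sum": sum(values),
            "Average": sum(values) / len(values),
            "SampleCount": len(values),
        }
    return result


# Hypothetical CPU utilization samples: (seconds since start, percent).
samples = [(0, 10.0), (20, 30.0), (40, 20.0), (60, 50.0), (90, 70.0)]
stats = statistics(samples, period=60)
print(stats[0]["Average"], stats[60]["SampleCount"])
```

With a 60-second period the five samples fall into two periods, so a longer period yields fewer, smoother data points from the same metric.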


CloudWatch metrics

•  EC2 metrics
    –  CPUUtilization
    –  DiskReadOps/DiskWriteOps
    –  DiskReadBytes/DiskWriteBytes
    –  NetworkIn/NetworkOut
•  Elastic Load Balancing metrics
    –  Latency
    –  RequestCount
    –  HealthyHostCount/UnHealthyHostCount
    –  Count of HTTP response codes (2xx, 3xx, 4xx, 5xx) generated by Load Balancer or back-end instances


[Figure: screenshot of the CloudWatch section of the AWS Management Console (Oregon region), graphing the per-instance EC2 metrics CPUUtilization (Percent, left axis) and DiskReadBytes (Bytes, right axis) with the Average statistic over 5-minute periods]

CloudWatch interface

•  Command line tools
•  Libraries
    –  Java, .NET, PHP, Ruby
•  Query API
    –  HTTP/HTTPS GET or POST requests
•  AWS Management Console


CloudWatch Alarms & Auto Scaling

•  An alarm watches a single metric over a time period and performs one or more actions based on the value of the metric relative to a given threshold over a number of time periods
•  Possible states: OK, ALARM, INSUFFICIENT_DATA
•  When an alarm changes its state, an action is invoked
    –  Notification through Amazon SNS
    –  Auto Scaling policy
•  Example: threshold = 3, minimum breach = 3 periods
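The example (threshold 3, minimum breach of 3 periods) can be sketched as a small state function. This is a simplified model rather than the CloudWatch implementation: the function name `alarm_state` is hypothetical, a breach is assumed to mean "value greater than the threshold", and having fewer data points than the required number of periods is assumed to yield INSUFFICIENT_DATA.

```python
def alarm_state(values, threshold=3, periods=3):
    """Return the alarm state for a list of per-period statistic values.

    Assumptions: a breach means value > threshold, and fewer than
    `periods` data points yields INSUFFICIENT_DATA.
    """
    if len(values) < periods:
        return "INSUFFICIENT_DATA"
    recent = values[-periods:]
    # ALARM only if every one of the most recent periods breaches.
    return "ALARM" if all(v > threshold for v in recent) else "OK"


print(alarm_state([1, 2]))
print(alarm_state([1, 4, 5, 2]))
print(alarm_state([2, 4, 5, 6]))
```

Requiring several consecutive breaching periods keeps a single short spike from triggering a notification or a scaling action.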


Auto Scaling

•  Auto Scaling automatically scales EC2 capacity up or down based on user-defined policies, health status checks, and schedules, e.g.,
    –  Dynamically, based on conditions specified by the user (e.g., increasing CPU utilization of the Amazon EC2 instance)
    –  Predictably, according to a schedule defined by the user (e.g., every Friday at 13:00:00)
•  Enabled by Amazon CloudWatch
    –  Uses CloudWatch alarms


Auto Scaling policies

•  An Auto Scaling policy defines the action to take when an alarm state changes
•  For every monitored event, 2 policies are to be defined:
    –  Scale-in policy
    –  Scale-out policy
•  Example:
    –  Scale-out: add new Amazon EC2 instances in increments of 3 instances to the Auto Scaling Group when the average CPU utilization of the Amazon EC2 fleet goes above 70%
    –  Scale-in: remove Amazon EC2 instances in the same increments when CPU utilization falls below 10%
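The example pair of policies can be written down as a single decision function. This is a minimal sketch, not the Auto Scaling service logic: the function name `scaling_decision` and the `min_size` floor are assumptions added for illustration, while the 70% and 10% thresholds and the increment of 3 come from the example.

```python
def scaling_decision(avg_cpu, current, step=3, high=70.0, low=10.0, min_size=1):
    """Return the new group size for the example scale-out/scale-in policies."""
    if avg_cpu > high:
        return current + step                   # scale-out policy
    if avg_cpu < low:
        return max(min_size, current - step)    # scale-in policy, never below min_size
    return current                              # between thresholds: no change


print(scaling_decision(85.0, 4))
print(scaling_decision(5.0, 7))
print(scaling_decision(40.0, 4))
```

The wide gap between the two thresholds acts as a dead band, so the group does not oscillate between adding and removing instances when utilization hovers around one value.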


Auto Scaling policies (2)

•  A policy can be created using the PutScalingPolicy API with the following parameters:
    –  AdjustmentType: possible values are ChangeInCapacity, ExactCapacity, PercentChangeInCapacity
    –  Cooldown: amount of time after a scaling activity completes before any further trigger-related scaling activities can start
    –  PolicyName
    –  ScalingAdjustment: number of instances (or percentage, depending on AdjustmentType) by which to scale (positive or negative)
•  Sample request:
    https://autoscaling.amazonaws.com/?AutoScalingGroupName=my-test-asg
    &ScalingAdjustment=30
    &AdjustmentType=PercentChangeInCapacity
    &PolicyName=my-scaleout-policy
    &Version=2011-01-01
    &Action=PutScalingPolicy
    &AUTHPARAMS
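As a rough illustration of the parameters above, the sketch below assembles the sample query string and shows how the three AdjustmentType values could change a group's capacity. Both helper names (`put_scaling_policy_url`, `apply_adjustment`) are hypothetical, the authentication parameters (AUTHPARAMS) are deliberately left out, and truncating the percentage change toward zero is a simplifying assumption rather than the documented rounding rule.

```python
from urllib.parse import urlencode


def put_scaling_policy_url(group, policy, adjustment, adjustment_type,
                           endpoint="https://autoscaling.amazonaws.com/"):
    """Build the unsigned query URL of the sample request (no AUTHPARAMS)."""
    params = {
        "AutoScalingGroupName": group,
        "ScalingAdjustment": adjustment,
        "AdjustmentType": adjustment_type,
        "PolicyName": policy,
        "Version": "2011-01-01",
        "Action": "PutScalingPolicy",
    }
    return endpoint + "?" + urlencode(params)


def apply_adjustment(current, adjustment, adjustment_type):
    """Return the capacity after applying one scaling adjustment."""
    if adjustment_type == "ChangeInCapacity":
        return current + adjustment
    if adjustment_type == "ExactCapacity":
        return adjustment
    if adjustment_type == "PercentChangeInCapacity":
        # Assumption: the fractional part of the change is truncated.
        return current + int(current * adjustment / 100)
    raise ValueError("unknown adjustment type: " + adjustment_type)


url = put_scaling_policy_url("my-test-asg", "my-scaleout-policy",
                             30, "PercentChangeInCapacity")
print(url)
print(apply_adjustment(10, 30, "PercentChangeInCapacity"))
```

Under these assumptions, the sample policy applied to a group of 10 instances would grow it to 13.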


Elastic Load Balancing (ELB)

•  Automatically distributes incoming traffic across multiple EC2 instances


ELB features

•  Detects unhealthy instances within a pool and automatically reroutes traffic to healthy instances
•  Enabled across multiple availability zones within a region
    –  Not between availability zones in different regions!
•  Uses a least-loaded balancing policy
•  Supports sticky sessions
    –  Load balancer-generated HTTP cookies
    –  Application-generated HTTP cookies
•  Supports HTTPS
•  Enables the client to define an application health check for the instances through the following parameters:
    –  HealthyThreshold, Interval, Target, Timeout, UnhealthyThreshold
•  Provides APIs to add/remove instances
    –  RegisterInstancesWithLoadBalancer
    –  DeregisterInstancesWithLoadBalancer
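The role of the threshold parameters in the health check can be illustrated with a small state tracker. It is a sketch under stated assumptions, not the ELB implementation: the function name `track_health` and its arguments are hypothetical, and it assumes an instance changes state only after the configured number of consecutive check results that contradict its current state.

```python
def track_health(checks, healthy_threshold=2, unhealthy_threshold=2, healthy=True):
    """Replay check results (True = success) and return the final health state."""
    streak = 0
    for ok in checks:
        if ok == healthy:
            streak = 0          # result agrees with the current state
            continue
        streak += 1             # one more consecutive contradicting result
        needed = unhealthy_threshold if healthy else healthy_threshold
        if streak >= needed:
            healthy = not healthy
            streak = 0
    return healthy


print(track_health([False, True, False]))
print(track_health([False, False]))
print(track_health([False, False, True, True]))
```

An isolated failed check therefore does not take an instance out of rotation; only a run of failures as long as the unhealthy threshold does.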


EC2 and related services problems

•  General problems
    –  No real-time billing
    –  The performance level of a single VM may vary, and there are very few guarantees regarding the performance of seemingly identical virtual machine instances; e.g., see “Resource Provisioning of Web Applications in Heterogeneous Clouds”
    –  Load balancing policy cannot be customized
•  ELB bugs
    –  Problems with start/stop of instances; prefer launch/terminate
    –  If an instance crashes, it remains forever in “unhealthy” status
    –  Unhealthy instances are not automatically replaced
•  CloudWatch problems
    –  Metric variation over a time interval is missing
    –  Request count considers only the requests processed by the load balancer (system throughput behind the ELB)


Simple Storage Service (S3)

•  Distributed, eventually consistent data store
•  Main features
    –  Insert, write, read, delete on objects up to 5 TB each
        •  Objects cannot be manipulated (renamed, modified, relocated)
    –  Each object is stored in a bucket and retrieved via a unique key
        •  Buckets cannot be further partitioned: no physical grouping of objects inside a bucket (e.g., directories)
        •  Buckets cannot be nested
    –  Accessible through REST and SOAP interfaces
    –  Access control policies, as well as encryption for both secure transit and secure storage on disk
    –  Objects are redundantly stored on multiple devices across multiple facilities in an Amazon S3 Region
        •  PUT and COPY operations synchronously store data across multiple facilities before returning SUCCESS
    –  Data integrity regularly verified using checksums
    –  Reduced Redundancy Storage (RRS): storage option to reduce costs by storing data at lower levels of redundancy


S3 pricing and SLA

•  Pricing http://aws.amazon.com/s3/pricing/
    –  May depend on the AWS region
    –  Different pricing for storage, requests, and data transfer
    –  Storage pricing (US East)
        •  $0.085 per GB for the first TB, decreasing price for more TB
    –  Request pricing (US East)
        •  $0.005 per 1000 PUT, COPY, POST, LIST requests; $0.004 per 10000 GET requests
    –  Data transfer pricing (US East)
        •  Depends on the location of source and destination (same or another AWS region, Internet): IN is free, OUT ranges from free (to same region) to $0.002 per GB (to another AWS region) to $0.120 per GB (to Internet)
•  SLA http://aws.amazon.com/s3/sla/
    –  AWS will use commercially reasonable efforts to make Amazon S3 available with a Monthly Uptime Percentage (defined below) of at least 99.9% during any monthly billing cycle…
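As a worked example of how the three pricing components add up, the sketch below computes a monthly bill from the US East figures quoted on this slide. The function name `s3_monthly_cost` and the usage numbers are hypothetical, only the first-TB storage tier is modelled, and free tiers or volume discounts are ignored.

```python
def s3_monthly_cost(storage_gb, put_requests, get_requests, gb_out_internet):
    """Estimate a monthly S3 bill in USD from the prices quoted on the slide."""
    if storage_gb > 1024:
        raise ValueError("only the first-TB storage tier is modelled")
    storage = storage_gb * 0.085                      # $0.085 per GB
    requests = (put_requests / 1000 * 0.005           # PUT, COPY, POST, LIST
                + get_requests / 10000 * 0.004)       # GET
    transfer = gb_out_internet * 0.120                # data transfer OUT to Internet
    return round(storage + requests + transfer, 2)


# Hypothetical usage: 100 GB stored, 20,000 PUTs, 500,000 GETs, 10 GB out.
cost = s3_monthly_cost(100, 20000, 500000, 10)
print(cost)
```

For this usage the storage term (8.50) dominates, followed by outbound transfer (1.20) and requests (0.30), for a total of 10.00 USD.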


Elastic MapReduce

•  Enables processing of vast amounts of data
    –  Applications: web indexing, data mining, logfile analysis, data warehousing, financial analysis, scientific simulation, …
•  Utilizes a hosted Hadoop framework running on the web-scale infrastructure of Amazon EC2 and S3
    –  Apache Hadoop is an open source Java software framework that supports data-intensive distributed applications running on large clusters of commodity hardware
•  Hadoop implementation of the MapReduce framework
    –  Data in a job flow is subdivided into smaller chunks so that they can be processed in parallel (map function)
    –  Processed data are recombined into the final solution (reduce function)
•  Data processing applications can be implemented in many languages, including Java, Perl, Ruby, Python, PHP, R, or C++
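The map and reduce steps described above can be illustrated with a word count, the usual introductory example. The sketch runs in a single process and does not use Hadoop or Elastic MapReduce; the function names `map_chunk` and `reduce_pairs` and the input chunks are hypothetical.

```python
from collections import defaultdict


def map_chunk(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]


def reduce_pairs(pairs):
    """Reduce: recombine the intermediate pairs into per-word totals."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


# Each chunk could be mapped on a different node; here they run sequentially.
chunks = ["the cloud is elastic", "the cloud scales"]
pairs = [pair for chunk in chunks for pair in map_chunk(chunk)]
counts = reduce_pairs(pairs)
print(counts)
```

Because `map_chunk` looks at one chunk only, the map calls are independent of each other, which is what allows a framework such as Hadoop to distribute them across a cluster.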


Databases

•  DynamoDB
    –  Fully managed NoSQL database service that provides fast and predictable performance with seamless scalability
    –  Consistency model
        •  Eventually consistent reads (default)
        •  Strongly consistent reads
    –  Automatically spreads the table data and traffic over a sufficient number of servers to handle the request capacity specified by the customer and the amount of data stored, while maintaining consistent, fast performance
    –  Data items stored on SSDs and automatically replicated across multiple availability zones in a region
    –  Integration with Elastic MapReduce
•  SimpleDB, another AWS NoSQL database, is no longer listed on the AWS product page
    –  Can be a sign that Amazon is phasing it out in favour of its new counterpart DynamoDB

Databases (2)

•  RDS
    –  Relational DB (MySQL or Oracle DB engine)
    –  Automatic configuration, management (software patching, backup) and monitoring
    –  For MySQL, possible to associate one or more read replicas


Simple Queue Service (SQS)

•  Message queuing service that enables asynchronous message-based communication between distributed components of an application
•  When a message is received, it becomes “locked” while being processed
    –  If the message processing fails, the lock will expire and the message will be available again
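The locking behaviour can be modelled with a small in-memory queue. This sketch is not the SQS API: the class `LockingQueue`, its methods, and the 30-second lock duration are hypothetical, and the current time is passed in explicitly so that the example is deterministic.

```python
class LockingQueue:
    """In-memory sketch of receive-with-lock semantics (not the SQS API)."""

    def __init__(self, lock_seconds=30):
        self.lock_seconds = lock_seconds
        self.messages = {}      # message id -> (body, locked_until)
        self.next_id = 0

    def send(self, body):
        self.messages[self.next_id] = (body, 0.0)
        self.next_id += 1

    def receive(self, now):
        """Return (id, body) of the first unlocked message and lock it, else None."""
        for msg_id, (body, locked_until) in self.messages.items():
            if now >= locked_until:
                self.messages[msg_id] = (body, now + self.lock_seconds)
                return msg_id, body
        return None

    def delete(self, msg_id):
        """Called by a consumer after it has processed the message successfully."""
        self.messages.pop(msg_id, None)


queue = LockingQueue(lock_seconds=30)
queue.send("resize image 42")
first = queue.receive(now=0)       # a consumer takes the message and locks it
blocked = queue.receive(now=10)    # still locked: nothing is returned
retry = queue.receive(now=31)      # the lock expired: the message is delivered again
queue.delete(retry[0])             # successful processing removes the message
```

A consumer that crashes simply never calls `delete`, so the message reappears once the lock expires and another consumer can process it.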


Route 53

•  Highly available and scalable DNS web service
    –  Routing queries to infrastructure running in AWS and outside AWS
•  Authoritative DNS server that answers DNS queries with low latency by using a global network of DNS servers
    –  Queries are routed to the nearest DNS server through IP anycast
    –  Designed to automatically scale to handle very large query volumes without any human intervention
•  Routing to AWS resources
    –  Latency-based routing: to the lowest-latency AWS endpoint available
    –  Weighted Round Robin: probabilistic selection based on statically assigned weight
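Weighted Round Robin, as described above, amounts to choosing an endpoint with probability proportional to its static weight. The sketch below shows that selection rule only; it is not how Route 53 is implemented, and the function name `pick_endpoint`, the endpoint names, and the weights are hypothetical.

```python
import random


def pick_endpoint(weights, rng=random):
    """Select one endpoint with probability proportional to its static weight."""
    endpoints = list(weights)
    return rng.choices(endpoints, weights=[weights[e] for e in endpoints], k=1)[0]


# Hypothetical endpoints with a 3:1 weight ratio.
weights = {"eu-endpoint.example.com": 3, "us-endpoint.example.com": 1}
rng = random.Random(42)            # fixed seed so the run is repeatable
picks = [pick_endpoint(weights, rng) for _ in range(10000)]
share = picks.count("eu-endpoint.example.com") / len(picks)
print(round(share, 2))
```

Over many queries the first endpoint should receive roughly three quarters of the answers, which makes the weights a simple tool for gradually shifting traffic between deployments.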


CloudFront

•  Content delivery network on top of the Amazon distributed storage infrastructure
•  Delivers static, dynamic and streaming content using a global network of edge replica servers
•  Requests are routed to the closest edge replica through DNS redirection
•  Objects organized in distributions
    –  A distribution identifies the origin server
    –  A distribution has a unique CloudFront.net domain name (e.g. mydistribution.cloudfront.net)
•  An origin server is the location of the definitive version of an object
    –  This could be another AWS service (S3 bucket, Amazon EC2 instance) or an external origin server
•  In contrast to other CDNs, pay-per-use pricing


Elastic Beanstalk

•  Deploys and manages applications in the AWS cloud, leveraging AWS services such as EC2, S3, SNS, Elastic Load Balancing, and Auto Scaling
•  Deployment for Java
    –  Users upload a WAR file containing a Java Web application
    –  Elastic Beanstalk handles the provisioning of a load balancer and the deployment of the WAR file to one or more EC2 instances running the Apache Tomcat application server
•  Features
    –  Easy to deploy new application versions to running environments
    –  E-mail notifications through SNS when application health changes or application servers are added or removed
•  Auto scaling and load balancing parameters fully customizable through the AWS Management Console
•  Pricing: no additional charge for Elastic Beanstalk; the user pays only for the underlying AWS resources consumed by the application
