Predictor-Directed Stream Buffers Timothy Sherwood Suleyman Sair Brad Calder.
Brad Calder Director/Architect Microsoft Corporation ES04.
Transcript of Brad Calder Director/Architect Microsoft Corporation ES04.
Windows Azure Storage – Essential Cloud Storage Services
Brad CalderDirector/ArchitectMicrosoft Corporation
ES04
Windows Azure
Windows Azure is the foundation of Microsoft’s Cloud Platform
It is an “Operating System for the Cloud” and provides Essential Services for the Cloud Virtualized Computation Scalable Storage Automatic Management Developer SDK
Windows Azure Storage
The goal is to allow users and applications to Access their data efficiently
from anywhere at anytime Store data for any length of time Scale to store any amount of data Be confident that the data will not be lost Pay for only what they use/store
Windows Azure Storage
Storage that is Durable Scalable (capacity and throughput) Highly Available Security Performance Efficient
Rich Data Abstractions Service communication: queues, locks, … Large user data items: blobs, blocks, … Service state: tables, caches, …
Simple and Familiar Programming Interfaces REST Accessible and ADO.NET
Windows Azure Storage Account
User creates a globally unique storage account name Receive a 256 bit secret
key when creating account Provides security for accessing the store
Use secret key to create a HMAC SHA256 signature for each request
Use signature to authenticate request at server
Account
Blob Table Queue
Fundamental Data Abstractions
Blobs – Provide a simple interface for storing named files along with metadata for the file
Tables – Provide structured storage.
A Table is a set of entities, which contain a set of properties
Queues – Provide reliable storage and delivery of messages for an application
Blob Storage Concepts
BlobContainerAccount
sally
pictures
IMG001.JPG
IMG002.JPG
movies MOV1.AVI
Storage Account And Blob Containers
Storage Account An account can have many Blob Containers
Container A container is a set of blobs Sharing policies are set at the container level
Public READ or Private Associate Metadata
with Container Metadata is
<name, value> pairs Up to 8KB per container
List the blobs in a container
BlobContainerAccount
sally
pictures
IMG001.JPG
IMG002.JPG
movies MOV1.AVI
Blob Namespace
Blob URL
http://<Account>.blob.core.windows.net/<Container>/<BlobName>
Example: Account – sally Container – music BlobName – rock/rush/xanadu.mp3 URL: http://sally.blob.core.windows.net/music/rock/rush/xanadu.mp3
BlobContainerAccount
sally
pictures
IMG001.JPG
IMG002.JPG
movies MOV1.AVI
Blob Features And Functions
Store Large Objects (up to 50 GB each) Standard REST PUT/GET Interfacehttp://<Account>.blob.core.windows.net/<Container>/<BlobName>
PutBlob Inserts a new blob or overwrites the existing blob
GetBlob Get whole blob or by starting offset, length
DeleteBlob Support for Continuation on Upload
Associate Metadata with Blob Metadata is <name, value> pairs Set/Get with or separate from blob data bits Up to 8KB per blob
Continuation On Upload Scenario
Want to upload a large 10 GB file into the cloud
If upload fails in the middle, need an efficient way to resume the upload from where it failed (rather than starting over)
Uploading A Blob Via Blocks
Uploading a Large Blob
10 GB Movie
Windows Azure Storage
Bloc
k Id
1Bl
ock
Id 2
Bloc
k Id
3
Bloc
k Id
N
blobName = “TheBlob.wmv”;PutBlock(blobName, blockId1, block1Bits);PutBlock(blobName, blockId2, block2Bits);…………PutBlock(blobName, blockIdN, blockNBits);PutBlockList(blobName,
blockId1,…,blockIdN);
TheBlob.wmvTheBlob.wmv
Benefit: • Efficient continuation
and retry • Parallel and out of order
upload of blocks
PutBlockList Example
Bloc
k Id
1
Bloc
k Id
3
Bloc
k Id
2
BlobName = ExampleBlob.wmv
Bloc
k Id
4Bl
ock
Id 2
Bloc
k Id
3Bl
ock
Id 4
Bloc
k Id
4
Committed and readable version of blob
Example Uploading Blocks Out of Order Same Block IDs Unused Blocks
Sequence of Operations PutBlock(BlockId1) PutBlock(BlockId3) PutBlock(BlockId4) PutBlock(BlockId2) PutBlock(BlockId4)
PutBlockList(BlockId2, BlockId3, BlockId4)
Blob Storage Concepts – Adding Blocks
BlockBlobContainerAccount
sally
pictures
IMG001.JPG
IMG002.JPG
movies MOV1.AVI
Block 1
Block 2
Block 3
Blob As A List Of Blocks
Blob Consists of a List of Blocks
Properties of Blocks Each Block defined by a Block ID
Up to 64 Bytes, scoped by Blob Name Blocks are immutable A block is up to 4MB
Do not have to be same size
BlockList Operations
PutBlockList for a Blob Provide the list of blocks to comprise the
readable version of the blob If multiple blocks are uploaded with same Block
ID – Last committed block wins Blocks not used will be garbage collected
GetBlockList for a Blob Returns the list of blocks that represent the
readable (committed) version of the blob Block ID and Size of Block is returned for each block
REST PutBlock
PUThttp://dvd.blob.core.windows.net/movies/TheBlob.wmv?comp=block &blockid=BlockId1 &timeout=60 HTTP/1.1 Content-Length: 4194304Content-MD5: HUXZLQLMuI/KZ5KDcJPcOA==Authorization: SharedKey dvd: F5a+dUDvef+PfMb4T8Rc2jHcwfK58KecSZY+l2naIao=x-ms-date: Mon, 27 Oct 2008 17:00:25 GMT
……… Block Data Contents ………
Account Container Blob Name
REST PutBlockList
PUThttp://dvd.blob.core.windows.net/movies/TheBlob.wmv?comp=blocklist &timeout=120 HTTP/1.1 Content-Length: 161213Authorization: SharedKey dvd: QrmowAF72IsFEs0GaNCtRU143JpkflIgRTcOdKZaYxw=x-ms-date: Mon, 27 Oct 2008 17:00:25 GMT
<?xml version=“1.0” encoding=“utf-8”?><BlockList> <Block>BlockId1</Block> <Block>BlockId2</Block> ………………</BlockList>
Account Container Blob Name
REST GetBlob
GEThttp://dvd.blob.core.windows.net/movies/TheBlob.wmvHTTP/1.1Authorization: SharedKey dvd: RGllHMtzKMi4y/nedSk5Vn74IU6/fRMwiPsL+uYSDjY=X-ms-date: Mon, 27 Oct 2008 17:00:25 GMT
• Get whole blob
GEThttp://dvd.blob.core.windows.net/movies/TheBlob.wmv?timeout=60 HTTP/1.1 Range: bytes=1024000-2048000
• Get a range of the blob
Blob Enumeration – Delimiter
Enumerates the list of blobs in a container Hierarchical listing: prefix + delimiter Continuation: NextMarker + Marker
Container “movies” has:
Action/Rocky.aviAction/Rocky2.aviDrama/Crime/GodFather.aviDrama/Crime/GodFather2.aviDrama/LordOfRings.aviThriller/TheBlob.wmv
List top level “directories”
REST Request:GET http://dvd.blob.windows.net/movies?comp=list &delimiter=/Results:<BlobPrefix>Action</BlobPrefix><BlobPrefix>Drama</BlobPrefix><BlobPrefix>Horror</BlobPrefix>
Container “movies” has:
Action/Rocky1.wmvAction/Rocky2.wmvAction/Rocky3.wmvAction/Rocky4.wmvAction/Rocky5.wmvDrama/Crime/GodFather1.wmvDrama/Crime/GodFather2.wmvDrama/Memento.wmvHorror/TheBlob.wmv
Blob Enumeration – Prefix
Enumerates the list of blobs in a container Hierarchical listing: prefix + delimiter Continuation: NextMarker + Marker
Container “movies” has:
Action/Rocky1.aviAction/Rocky2.avi………Action/Rocky5.aviDrama/Crime/GodFather1.aviDrama/Crime/GodFather2.aviDrama/Gladiator.aviHorror/TheBlob.wmv
List directory “Drama”:
REST Request:GET http://dvd.blob.windows.net/movies?comp=list &prefix=Drama/ &delimiter=/
Results:<BlobPrefix>Drama/Crime</BlobPrefix><Blob>Drama/Memento.wmv</Blob>
Container “movies” has:
Action/Rocky1.wmvAction/Rocky2.wmvAction/Rocky3.wmvAction/Rocky4.wmvAction/Rocky5.wmvDrama/Crime/GodFather1.wmvDrama/Crime/GodFather2.wmvDrama/Memento.wmvHorror/TheBlob.wmv
Container “movies” has:
Action/Rocky1.wmvAction/Rocky2.wmvAction/Rocky3.wmvAction/Rocky4.wmvAction/Rocky5.wmvDrama/Crime/GodFather1.wmvDrama/Crime/GodFather2.wmvDrama/Memento.wmvHorror/TheBlob.wmv
Blob Enumeration – MaxResults
Enumerates the list of blobs in a container Hierarchical listing: prefix + delimiter Continuation: NextMarker + Marker
Max Results and Next Marker
REST Request:GET http://dvd.blob.windows.net/movies?comp=list &prefix=Action &maxresults=3
Results:<Blob>Action/Rocky1.wmv</Blob><Blob>Action/Rocky2.wmv</Blob><Blob>Action/Rocky3.wmv</Blob><NextMarker>OpaqueMarker1</NextMarker>
Blob Enumeration – Continuation
Enumerates the list of blobs in a container Hierarchical listing: prefix + delimiter Continuation: NextMarker + Marker
Using Continuation Marker
REST Request:GET http://dvd.blob.windows.net/movies?comp=list &prefix=Action &maxresults=3&marker=OpaqueMarker1Results:<Blob>Action/Rocky4.wmv</Blob><Blob>Action/Rocky5.wmv</Blob><NextMarker></NextMarker>
Container “movies” has:
Action/Rocky1.wmvAction/Rocky2.wmvAction/Rocky3.wmvAction/Rocky4.wmvAction/Rocky5.wmvDrama/Crime/GodFather1.wmvDrama/Crime/GodFather2.wmvDrama/Memento.wmvHorror/TheBlob.wmv
Summary Of Windows Azure Blobs
Easy to use REST Put/Get/Delete interface Can read from any Offset, Length of Blob Conditional Put and Get Blob Max Blob size
50 GB using PutBlock and PutBlockList 64 MB using PutBlob
Blocks provide continuation for blob upload Put Blob/BlockList == Replace Blob for CTP
Can replace an existing blob with new blob/blocks
Future Windows Azure Blob Support
Update Blob Ability to replace, add, or remove blocks from
a blob
Append Blob Ability to append a block to a blob
Copy Blob Ability to copy an existing blob to a new
blob name
Fundamental Data Abstractions
Blobs – Provide a simple interface for storing named files along with metadata for the file
Tables – Provide structured storage. A Table is a set of entities, which contain a set of properties
Queues – Provide reliable storage and delivery of messages for an application
Windows Azure Tables
Provides Structured Storage Massively Scalable Tables
Billions of entities (rows) and TBs of data Automatically scales to thousands of servers as traffic grows
Highly Available Can always access your data
Durable Data is replicated at least 3 times
Familiar and Easy to use Programming Interfaces ADO.NET Data Services – .NET 3.5 SP1
.NET classes and LINQ REST - with any platform or language
Table Data Model
Table A Storage Account can create many tables Table name is scoped by Account
Data is stored in Tables A Table is a set of Entities (rows) An Entity is a set of Properties (columns)
Entity Two “key” properties that together are
the unique ID of the entity in the Table PartitionKey – enables scalability RowKey – uniquely identifies the entity within the partition
Partition Key And Partitions
Every Table has a Partition Key It is the first property (column) of your Table Used to group entities in the Table into partitions
A Table Partition All entities in a Table with the same partition key value
Partition Key is exposed in the programming model Allows application to control the granularity
of the partitions and enable scalability
Partition KeyDocument Name
Row KeyVersion
Property 3Modification Time
….. Property NDescription
Examples Doc V1.0 8/2/2007 ….. Committed version
Examples Doc V2.0.1 9/28/2007 Alice’s working version
FAQ Doc V1.0 5/2/2007 Committed version
FAQ Doc V1.0.1 7/6/2007 Alice’s working version
FAQ Doc V1.0.2 8/1/2007 Sally’s working version
Partition Example
Partition 1
Partition 2
Table Partition - all entities in table with same partition key value
Application controls granularity of partition
Purpose Of The Partition
Performance and Entity Locality Entities in the same partition
will be stored together Efficient querying and cache locality
Table Scalability We monitor the usage patterns of partitions Automatically load balance partitions
Each partition can be served by a different storage node
Scale to meet the traffic needs of your table
Performance And Scalability
Efficient retrieval of all of the versions of FAQ Doc Since we are accessing a single partition
The two partitions can be served from different servers to scale out access
Partition KeyDocument Name
Row KeyVersion
Property 3Modification Time
….. Property NDescription
Examples Doc V1.0 8/2/2007 ….. Committed version
Examples Doc V2.0.1 9/28/2007 Alice’s working version
FAQ Doc V1.0 5/2/2007 Committed version
FAQ Doc V1.0.1 7/6/2007 Alice’s working version
FAQ Doc V1.0.2 8/1/2007 Sally’s working version
Partition 1
Partition 2
Choosing A Partition Key
Use a PartitionKey that is common in your queries If Partition Key is part of Query
Fast access to retrieve entities within a single partition
If Partition Key is not specified in a Query Then every partition has to be scanned
Partition KeyDocument Name
Row KeyVersion
Property 3Modification Time
….. Property NDescription
Examples Doc V1.0 8/2/2007 ….. Committed version
Examples Doc V2.0.1 9/28/2007 Alice’s working version
FAQ Doc V1.0 5/2/2007 Committed version
FAQ Doc V1.0.1 7/6/2007 Alice’s working version
FAQ Doc V1.0.2 8/1/2007 Sally’s working version
Query With And Without PartionKey
Getting all entities with (PartitionKey ==“FAQ Doc”) is fast Access single partition
Get all docs with (ModifiedTime < 6/01/2007) is more expensive Have to traverse all partitions
Partition 1
Partition 2
Choosing A Partition Key
Use a PartitionKey that is common in your queries If Partition Key is part of Query
Fast access to retrieve entities within a single partition If Partition Key is not specified in a Query
Then every partition has to be scanned
Spread out load across partitions Partition Key allows you to control what goes into your
Table partitions More partitions – makes it easier to automatically
balance load At the tradeoff of Entity Locality
Primary Key For Entity
Primary Key for the Entity is the composite of PartitionKey
Unique ID of the partition the entity belongs to within the Table RowKey
Unique ID for the entity within the partition
Partition KeyDocument Name
Row KeyVersion
Property 3Modification Time
….. Property NDescription
Examples Doc V1.0 8/2/2007 ….. Committed version
Examples Doc V2.0.1 9/28/2007 Alice’s working version
FAQ Doc V1.0 5/2/2007 Committed version
FAQ Doc V1.0.1 7/6/2007 Alice’s working version
FAQ Doc V1.0.2 8/1/2007 Sally’s working version
Partition 1
Partition 2
Entities And Properties
Each Entity can have up to 255 properties Mandatory Properties for every Entity in Table
Partition Key Row Key
All entities have a system maintained version
No fixed schema for rest of properties Each property is stored as a <name, typed value> pair
No schema stored for a table 2 entities within the same table
can have different properties
Property Types Supported
Partition Key and Row Key String (up to 64KB)
Property Types String (up to 64KB) Binary (up to 64KB) Bool DateTime GUID Int Int64 Double
Table Programming Model
Provide familiar and easy to use interfaces Leverage your .NET expertise
Table Entities are accessed as objects via ADO.NET Data Services – .NET 3.5 SP1 LINQ – Language Integrated Query RESTful access to table and entities
Insert/Update/Delete Entities over the Table
Query over Tables Get back a list of structured Entities
Example using ADO.NET Data Services Table Entities are represented as Class Objects
Example Table Definition
[DataServiceKey("PartitionKey", "RowKey")]public class Customer{ // Partition key – Customer Last name public string PartitionKey { get; set; }
// Row Key – Customer First name public string RowKey { get; set; }
// User defined properties here public DateTime CustomerSince { get; set; }
public double Rating { get; set; }
public string Occupation { get; set; }}
Create Customers Table
Every Account has a master table called “Tables” It is used to keep track of the tables in your account To use a table it has to be inserted into “Tables”
[DataServiceKey("TableName")]public class TableStorageTable{ public string TableName { get; set; }}
TableStorageTable table = new TableStorageTable("Customers");
context.AddObject("Tables", table);DataServiceResponse response = context.SaveChanges();
// serviceUri is “http://<Account>.table.core.windows.net/”DataServiceContext context = new DataServiceContext(serviceUri);
Create And Insert Entity
Create a new Customer and Insert into TableCustomer cust = new Customer( “Lee”, // Partition Key = Last Name “Geddy”, // Row Key = First Name DateTime.UtcNow, // Customer Since 2.0, // Rating “Engineer” // Occupation);
context.AddObject(“Customers”, cust);DataServiceResponse response = context.SaveChanges();
// Service Uri is “http://<Account>.table.core.windows.net/”DataServiceContext context = new DataServiceContext(serviceUri);
Query A Table
LINQDataServiceContext context = new DataServiceContext(“http://myaccount.table.core.windows.net”);
var customers = from o in context.CreateQuery<Customer>(“Customers”)
where o.PartitionKey == “Lee”select o;
foreach (Customer customer in customers) { }
GET http://myaccount.table.core.windows.net/Customers? $filter= PartitionKey eq ‘Lee’
REST
Update And Delete Entity
context.DeleteObject(cust); DataServiceResponse response = context.SaveChanges();
cust.Occupation = “Musician”;context.UpdateObject(cust);DataServiceResponse response = context.SaveChanges();
Customer cust = ( from c in context.CreateQuery<Customer> (“Customers”) where c.PartitionKey == “Lee” // Partition Key = Last Name && c.RowKey == “Geddy” // Row Key = First Name select c) .FirstOrDefault();
Summary Of Windows Azure Tables
Built to provide Massively Scalable, Highly Available
and Durable Structured Storage
Automatic Load Balancing and Scaling of Tables Partition Key is exposed to the application
Familiar and Easy to use LINQ and REST programming interfaces ADO.Net Data Services
Not a “relational database” No joins, no maintenance of foreign keys, etc
Future Windows Azure Table Support At CTP
Single Index Query and retrieve
results sorted by PartitionKey and RowKey
Single Entity Transactions Atomically Insert,
Update, or Delete a Single Entity
Future Support for Secondary Indexes
Query and retrieve results sorted by other properties
Entity Groups Atomic
transactions across multiple entities within same partition
Fundamental Data Abstractions
Blobs – Provide a simple interface for storing named files along with metadata for the file
Tables – Provide structured storage.
A Table is a set of entities, which contain a set of properties
Queues – Provide reliable storage and delivery of messages for an application
Web + Worker Queue Example
Cloud Storage (blob, table, queue)
Web RoleLB
n
Worker Role
m
Windows Azure Queues
Provide reliable message delivery Simple, asynchronous work dispatch Programming semantics ensure that a
message can be processed at least once
Queues are Highly Available, Durable and Performance Efficient
Access is provided via REST
Account, Queues And Messages
An Account can create many Queues Queue Name is scoped by the Account
A Queue contains Messages No limit on number of messages stored in a Queue
But a Message is stored for at most a week
http://<Account>.queue.core.windows.net/<QueueName>
Messages Message Size <= 8 KB To store larger data, store data in blob/entity storage,
and the blob/entity name in the message
Queue Programming API
Queues Create/Clear/Delete Queues Inspect Queue Length
Messages Enqueue (QueueName, Message) Dequeue (QueueName, Invisibility Time T)
Returns the Message with a MessageID Makes the Message Invisible for Time T
Delete(QueueName, MessageID)
2 1
C1
C2
Dequeue And Delete Messages
1234
Producers Consumers
P2
P1
3
2. Dequeue(Q, 30 sec) msg 2
1. Dequeue(Q, 30 sec) msg 1
12
C1
C2
Dequeue And Delete Messages
34
Producers Consumers
P2
P1
1
2
2. Dequeue(Q, 30 sec) msg 23. C2 consumed msg 24. Delete(Q, msg 2)7. Dequeue(Q, 30 sec) msg 1
1. Dequeue(Q, 30 sec) msg 15. C1 crashed
12 1 6. msg1 visible 30 seconds after Dequeue
Benefit: • Insures that every
message can be processed at least once
3
Summary Of Windows Azure Queues
Provide reliable message delivery Allows Messages to be retrieved
and processed at least once
No limit on number of messages stored in a Queue
Message size is <=8KB
Durability, Availability And Scalability Of Windows Azure Storage
Storage Durability
All data is replicated at least 3 times Replicas are spread out over different fault
and upgrade domains in same data center Future support for geo-distribution and geo-replication
All of Storage (Blobs, Tables and Queues) is built on this replication layer
Dynamic replication to maintain a healthy number of replicas Recover from a lost/unresponsive Drive or Node Recover from data bit rot
Data continuously scanned against bit rot
Availability And Scalability
Efficient Failover Data served immediately elsewhere
within data center from available replicas
Automatic Load Balancing of Hot Data Monitor the usage patterns and load balance access to
Blob Containers, Table Partitions and Queues Distribute access to the hot data over the
data center according to traffic
Caching of Hot Blobs, Entities and Queues Hot Blobs are cached to scale out access to them Hot Entity and Queue data pages are
cached and served from memory
Windows Azure Storage Summary
Essential Storage for the Cloud Durable, Scalable, Highly Available,
Security, Performance Efficient
Familiar and Easy to Use Programming Interfaces REST Accessible, LINQ and ADO.NET
Rich Data Abstractions Service communication: queues, locks, … Large user data items: blobs, blocks, … Service state: tables, caches, …
Account
Container Blobs
Table Entities
Queue Messages
Windows Azure Data Storage Concepts
http://<account>.blob.core.windows.net/<container>
http://<account>.table.core.windows.net/<table>
http://<account>.queue.core.windows.net/<queue>
More Information
Deep Dive on Windows Azure Tables ES07 – Programming Cloud Table Storage
Niranjan Nilakantan and Pablo Castro Wednesday - 10/29/08 –
1:15pm to 2:30pm – Room 403AB
Links Windows Azure
http://www.azure.com/windows ADO.NET Data Services
http://blogs.msdn.com/astoriateam
Evals & Recordings
Please fill
out your
evaluation for
this session at:
This session will be available as a recording at:
www.microsoftpdc.com
Please use the microphones provided
Q&A
© 2008 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market
conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.