SQLBits X Scaling out with SQL Azure Federations
© 2012 Microsoft
SCALING OUT YOUR CLOUD DATABASE WITH SQL AZURE FEDERATIONS
Michael Rys, Microsoft Corp. (@SQLServerMike)
SQLBits X, March 2012
AGENDA
• Scaling out your business is important!
• NoSQL and Scale-Out Paradigms
• Introduction of SQL Azure Federations
• SQL Azure Federation Application Patterns
  • Multi-Tenancy
  • Map-Reduce/Fan-Out queries
THE “WEB 2.0” BUSINESS ARCHITECTURE
Online Business Application
• Attract Individual Consumers:
  - Provide interesting service
  - Provide mobility
  - Provide social
• Monetize Individual:
  - Upsell service
  - VIP
  - Speed
  - Extra Capabilities
• Monetize the Social:
  - Improve individual experience
  - Re-sell Aggregate Data (e.g., Advertisers)
SOCIAL GAMING: THE BUSINESS PROBLEM
• 10s of millions of users
  • Millions of users concurrently
• 100s of millions of interactions per day
  • Terabytes of data
  • 90% reads, 10% writes
• Required (eventual) data consistency across users
  • E.g., show your updated high score to your friends
SCALING DATABASE APPLICATIONS
• Scale up
  • Buy large-enough server for the job
    o But big servers are expensive!
  • Try to load it as much as you can
    o But what if the load changes?
    o Provisioning for peaks is expensive!
• Scale-out
  • Partition data and load across many servers
    o Small servers are cheap! Scale linearly
  • Bring computational resources of many to bear
    o Cluster of 100s of little servers is very fast
  • Load spikes not as problematic
    o Load balancing across the entire cluster
SOLUTION
• Shard/Partition user data across hundreds of SQL Databases
• Propagate data changes from one DB to other DBs using async Fan-Out
  • Global Transactions would hinder scale and availability
  • Able to handle failure with Quorum
• Provide HA
  • Replicas for DBs
  • Retry Logic
SHARDING PATTERN
• Linear scaling through database independence
  • No need for distributed tx in common OLTP cases
• Application-influenced partitioning
  • Rather than complete transparency
• Local access for most
  • Connection routing
  • Query, transaction scoping
• Distributed access for some
  • Fan-out expensive computation
[Diagram: Clients, App Server(s), Data Servers. Users are partitioned across the data servers by range: 1-1000, 1001-2000, 2001-3000. Example request: read/update item 2342]
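The connection-routing idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three ranges come from the diagram, while the server names, the `route` function, and the choice to treat the diagram's "item 2342" as a key in the same range space are hypothetical.

```python
from bisect import bisect_right

# Inclusive lower bound of each key range, taken from the diagram (1-1000, 1001-2000, 2001-3000).
RANGE_STARTS = [1, 1001, 2001]
# Hypothetical server names, one per range; not from the slides.
SERVERS = ["data-server-1", "data-server-2", "data-server-3"]
MAX_KEY = 3000


def route(key: int) -> str:
    """Return the server that owns key, or raise KeyError if no range covers it."""
    if not 1 <= key <= MAX_KEY:
        raise KeyError(f"no shard covers key {key}")
    # bisect_right counts the range starts that are <= key; the owner is the last of those.
    return SERVERS[bisect_right(RANGE_STARTS, key) - 1]


print(route(2342))  # the diagram's example request lands on the third range
```

Because the application server resolves the range itself, each request touches exactly one database, which is what makes the "local access for most" case cheap.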
EXAMPLE ARCHITECTURE
[Diagram; labels recovered from the slide]
• Front Door Router, STS
• Services (250 instances): Social Services, Gamer Services, Game Ingestion, Game Catalog
• Data stores:
  o DBUser: partitioned over 100 SQL Azure DBs
  o DBLeaderboard: partitioned over 298 SQL Azure DBs
• Operations shown:
  o Find Friends’ Profiles; Get my Profile; Publish feed, read feed
  o Last Played; Favorites; Game Preferences; Social Leaderboards
  o Disable/Enable Games from accessing services
  o Game binaries; Game metadata
  o Get Friends’ highscores
  o Write user-specific game infos
MANY LARGE SCALE CUSTOMERS USING SIMILAR PATTERNS
• Patterns
  • Sharding and fan-out query layer
  • Sharding and reliable messaging
  • Caching layer
  • Replica sets
• Customer Examples
  • MSN Casual Gaming
  • Social Networking: Facebook, MySpace, etc.
  • Online electronic stores (cannot give names)
  • Travel reservation systems (e.g. Choice International)
  • etc.
LESSONS LEARNED FROM THESE SCENARIOS
• Require high availability
• Be able to scale out:
  • Functional and Data Partitioning Architecture
  • Provide scale-out processing:
    o Function shipping
    o Fan-out and Map/Reduce processing
  • Be able to deal with failures:
    o Quorum
    o Retries
    o Eventual Consistency (similar to Read-consistent Snapshot Isolation)
• Be able to quickly grow and change:
  • Elastic scale
  • Flexible, open schema
  • Multi-version schema support
Move better support for these patterns into the Data Platform!
INTRODUCING: SQL AZURE FEDERATIONS
• Scenarios
  • Applications that need Elastic Scale on Demand
  • Grow beyond a single SQL Azure Database in Size (> 150GB)
  • Multi-tenant Applications
• Capabilities:
  • Provides Data Partitioning/Sharding at the Data Platform
  • Enables applications to build elastic scale-out applications
  • Provides non-blocking SPLIT/DROP for shards (MERGE to come later)
  • Auto-connect to right shard based on sharding key value
  • Provides SPLIT resilient query mode
SQL AZURE FEDERATION CONCEPTS
• Federation
  - Represents the data being sharded
• Federation Root
  - Database that logically houses federations, contains federation metadata
• Federation Key
  - Value that determines the routing of a piece of data (defines a Federation Distribution)
• Atomic Unit
  - All rows with the same federation key value: always together!
• Federation Member (aka Shard)
  - A physical container for a set of federated tables for a specific key range and reference tables
• Federated Table
  - Table that contains only atomic units for the member’s key range
• Reference Table
  - Non-sharded table
[Diagram: a Sharded Application connects through the Connection Gateway to an Azure DB with the Federation Root (Federation Directories, Federation Users, Federation Distributions, …). Federation “Games_Fed” (Federation Key: userID) has three members: PK [min, 100), PK [100, 488), PK [488, max). Atomic units shown: PK=5, 25, 35, 105, 235, 365, 555, 2545, 3565]
DEMO: SQL AZURE FEDERATIONS
• Shard Social Gaming App using SQL Azure Federations
CREATING A FEDERATION
• Create a root database
      CREATE DATABASE GamesDB
  • Location of partition map
  • Houses centralized data
• Create the federation inside the root DB
      CREATE FEDERATION Games_Fed (userID BIGINT RANGE)
  • Specify name, federation key type
  • Creates the first member, covering the entire range
[Diagram: root database GamesDB with federation “Games_Fed” (Federation Key: userID) and a single member, PK [min, max)]
CREATING THE SCHEMA ON THE MEMBER
• Federated tables
      CREATE TABLE GameInfo(…) FEDERATE ON (userID=Id)
  • Federation key must be in all unique indices
    o Part of the primary key
  • Range of the federation member constrains the value of the federation key column (Id in this example)
• Reference tables
      CREATE TABLE FriendId(…)
  • Absence of FEDERATE ON indicates a reference table
• Centralized tables
  • Create in root database
[Diagram: root database GamesDB with federation “Games_Fed” (Federation Key: userID); the member PK [min, max) holds the federated table GameInfo and the reference table FriendId]
FEDERATION DETAILS
• Supported federation keys: single column of type BIGINT, INT, UNIQUEIDENTIFIER or VARBINARY(900)
• Partitioning style: RANGE
• Schema requirements:
  • Federation key must be part of unique index
  • Foreign key constraints only allowed between federated tables and from federated table to reference table
  • Indexed views not supported
  • Data types not supported in members: rowversion (aka timestamp)
  • Properties not supported in members: identity, sequence
• Schemas are allowed to diverge between members
  • Schema rollout uses a fan-out approach
SPLITTING AND MERGING
• Splitting a member
  • When too big or too hot…
      ALTER FEDERATION Games_Fed SPLIT AT (userID=100)
  • Creates two new members
    o Splits (filtered copy) federated data
    o Copies reference data to both
  • Online!
• Dropping a member
  • When data is not needed anymore…
      ALTER FEDERATION Games_Fed DROP AT (LOW|HIGH userID=100)
  • Drops member below or above split value
  • Reassigns range to sibling
• Merging members (not yet implemented)
  • When too small…
      ALTER FEDERATION Games_Fed MERGE AT (userID=200)
  • Creates new member, drops old ones
[Diagram: in root database GamesDB, federation “Games_Fed” (Federation Key: userID) starts with one member, PK [min, max), holding GamesInfo and FriendsId. After the split there are two members, PK [min, 100) and PK [100, max), each holding GamesInfo and FriendsId]
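What SPLIT AT does to the data can be modelled in memory. The sketch below is not how SQL Azure implements the operation; `Member` and `split_at` are hypothetical names, and the row contents are invented. It only shows the two behaviours the slide names: federated data is copied with a filter on the key, and reference data is copied to both new members.

```python
from dataclasses import dataclass, field
from typing import Tuple


@dataclass
class Member:
    low: float   # inclusive lower bound of the key range
    high: float  # exclusive upper bound of the key range
    federated: dict = field(default_factory=dict)  # userID -> rows (one atomic unit per key)
    reference: list = field(default_factory=list)  # non-sharded rows, present in every member


def split_at(member: Member, boundary: int) -> Tuple[Member, Member]:
    """Model SPLIT AT (userID=boundary): return the two members that replace the original."""
    if not member.low < boundary < member.high:
        raise ValueError("split point must fall strictly inside the member's range")
    # Reference data is copied to both new members.
    low = Member(member.low, boundary, reference=list(member.reference))
    high = Member(boundary, member.high, reference=list(member.reference))
    # Federated data is a filtered copy: each atomic unit goes to exactly one side.
    for key, rows in member.federated.items():
        target = low if key < boundary else high
        target.federated[key] = rows
    return low, high


original = Member(float("-inf"), float("inf"),
                  federated={5: ["g1"], 25: ["g2"], 100: ["g3"], 105: ["g4"]},
                  reference=["friend-row"])
low, high = split_at(original, 100)
print(sorted(low.federated), sorted(high.federated))
```

Ranges are half-open, so the key equal to the split value lands in the upper member, matching the [min, 100) and [100, max) members in the diagram.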
CONNECTION MODES
• Connection string always points to root.
  • Prevents connection pool fragmentation.
• Filtered Connection
      USE FEDERATION Games_Fed (userid=0) WITH FILTERING=ON, RESET
  • Scoped to Atomic Unit
  • Masks dangers of repartitioning from the app
• Unfiltered Connection
      USE FEDERATION Games_Fed (userid=0) WITH FILTERING=OFF, RESET
  • Scoped to a Federation Member
  • Management Connection
[Diagram: the App connects through root database GamesDB to federation “Games_Fed” (Federation Key: userID). Member PK [min, 100) holds atomic units PK=5, 25, 56, 75, 85, 96 and the reference table FriendsId]
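An application that switches between the two modes has to compose the USE FEDERATION statement for each request. The helper below is a minimal sketch that reproduces the statement text shown on this slide; the function name and its validation rules are assumptions, and it presumes the statement is sent as literal text, which is why the inputs are checked before they are formatted in.

```python
def use_federation(federation: str, key_name: str, key_value: int, filtering: bool) -> str:
    """Build a USE FEDERATION statement in the form shown on the slide."""
    # Only plain identifiers are accepted, so nothing unexpected ends up in the statement text.
    if not (federation.isidentifier() and key_name.isidentifier()):
        raise ValueError("federation and key names must be plain identifiers")
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(key_value, bool) or not isinstance(key_value, int):
        raise TypeError("this sketch only handles integer federation keys")
    mode = "ON" if filtering else "OFF"
    return f"USE FEDERATION {federation} ({key_name}={key_value}) WITH FILTERING={mode}, RESET"


print(use_federation("Games_Fed", "userid", 0, filtering=True))
print(use_federation("Games_Fed", "userid", 0, filtering=False))
```

The connection itself still targets the root database, as the first bullet says; only the statement issued after connecting changes.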
FILTERED CONNECTIONS
• Why use a filtered connection?
  • Aid in multi-tenant database development.
  • Safe model for programming against federation repartitioning.
• How does it work?
  • Filter injected dynamically at runtime for all federated tables.
  • Comes with a warning label:
    o Safe coding requires checking the filtering state of the connection in code
      IF (SELECT federation_filtering_state FROM sys.dm_exec_sessions WHERE session_id = @@SPID) = 1
          PRINT 'connection is filtering'      -- connection is filtering
      ELSE
          PRINT 'connection isn''t filtering'  -- connection isn't filtering
UNFILTERED CONNECTION
• Required for Member Scoped operations such as
  • Schema changes or DDL
  • DML on reference tables
• Best Performance for querying across atomic units
  • Iterating over many atomic units one at a time is too expensive for
    o Fan-out queries
    o Bulk operations such as data inserts, bulk updates, data pruning, etc.
FEDERATION MANAGEMENT - SYSTEM METADATA
• Root has the metadata about federation
• Federation Member has metadata about itself
      select * from sys.federations;
      select * from sys.federation_distributions;
      select * from sys.federation_members;
      select * from sys.federation_member_distributions;
• Watch progress on repartitioning operations
      SELECT percent_complete FROM sys.dm_federation_operations
      WHERE federation_operation_id=?
MAP-REDUCE ON FEDERATIONS
[Diagram: one Map Job runs on each of Fed Member 1, Fed Member 2, … Fed Member N. Their output goes through a Shuffle to Reduce Jobs on Reducer 1, Reducer 2, Reducer 3, … Reducer M, followed by Collection into the Result]
• 1 T-SQL Map Job per Federation Member
• Fixed upper number for T-SQL Reducers
• 1 Database for M Reducer tables
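The map, shuffle, reduce, and collection stages in the diagram can be followed end to end with an in-memory model. This sketch uses plain Python in place of the T-SQL map and reduce jobs, and everything in it is an assumption made for illustration: the row shape `(user_id, game, score)`, the sum aggregate, the value of `NUM_REDUCERS`, and the CRC32-based shuffle.

```python
from collections import defaultdict
from zlib import crc32

NUM_REDUCERS = 3  # fixed upper bound on reducers (M); the value is arbitrary here


def map_job(member_rows):
    """Map step, run once per federation member: emit (game, partial_sum) pairs."""
    partial = defaultdict(int)
    for _user_id, game, score in member_rows:
        partial[game] += score
    return list(partial.items())


def shuffle(mapped_outputs):
    """Send every emitted pair to a reducer bucket chosen by a stable hash of its key."""
    buckets = [defaultdict(list) for _ in range(NUM_REDUCERS)]
    for output in mapped_outputs:
        for game, value in output:
            buckets[crc32(game.encode()) % NUM_REDUCERS][game].append(value)
    return buckets


def reduce_job(bucket):
    """Reduce step, run once per reducer: combine the partial sums for each key."""
    return {game: sum(values) for game, values in bucket.items()}


def run(members):
    """Run one map job per member, shuffle, reduce, and collect a single result."""
    result = {}
    for bucket in shuffle(map_job(rows) for rows in members):
        result.update(reduce_job(bucket))  # keys are disjoint across reducers
    return result


members = [
    [(5, "chess", 10), (25, "go", 7)],     # member 1
    [(105, "chess", 3)],                   # member 2
    [(555, "go", 1), (555, "chess", 2)],   # member N
]
print(run(members))
```

Pre-aggregating inside each map job keeps the shuffle small: every member emits at most one pair per key regardless of how many rows it scanned.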
DEMO: MAP-REDUCE SCALE-OUT OVER SQL AZURE FEDERATIONS
• Sharded GamesInfo table using SQL Azure Federations
• Use a C# library that implements a Map/Reduce processor on top of SQL Azure Federations
• Mapper and Reducer are specified using SQL
MAP-REDUCE ON FEDERATIONS: REPARTITION RESILIENCE
• Support for hot splits and merge/drops of Federation members
• Hot Split Resilience:
  • First in Mapper: Check if partition range is still the same
  • If not: Add new Mapper Jobs for missing ranges
• Hot Merge Resilience:
  • Add partition range to the predicate
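Both resilience rules can be expressed as one range intersection. The sketch below is an interpretation of the two bullets, not the demo library's code: `plan_jobs` is a hypothetical function that takes the range a map job was scheduled for plus the member ranges that exist when the job starts, and returns the jobs to run together with the key-range predicate each one should apply.

```python
def plan_jobs(planned, current_members):
    """Re-plan one map job against the member ranges that exist right now.

    planned is the half-open (low, high) range the job was scheduled for.
    current_members is a list of half-open (low, high) member ranges.
    Each returned job is (member_range, predicate_range).
    """
    p_low, p_high = planned
    jobs = []
    for m_low, m_high in sorted(current_members):
        lo, hi = max(p_low, m_low), min(p_high, m_high)
        if lo < hi:  # this member holds part of the planned range
            jobs.append(((m_low, m_high), (lo, hi)))
    return jobs


# No repartitioning: the job runs as planned.
print(plan_jobs((0, 100), [(0, 100), (100, 200)]))
# Hot split of [0, 100) at 50: a second job covers the missing range.
print(plan_jobs((0, 100), [(0, 50), (50, 100), (100, 200)]))
# Hot merge into [0, 200): the predicate keeps the job to its planned range.
print(plan_jobs((0, 100), [(0, 200)]))
```

In the merge case the job planned for the neighbouring range gets the complementary predicate on the same member, so no row is counted twice.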
MAP-REDUCE ON FEDERATIONS: TOOLS
• Other Fan-Out and Map-Reduce Online Sample at:
  • http://federationsutility-weu.cloudapp.net/
• This library will be made available as a code sample (hopefully) soon
EXAMPLE: SCALING OUT MULTI-TENANT APPLICATION
1) Put everything into one DB? Too big…
2) Create a database per tenant? Not bad, but what if millions of tenants?
3) Sharding Pattern: better, app is already prepared for it!
[Diagram: tenants T1 to T20, shown twice; in the second layout they are grouped in rows of five (T1-T5, T6-T10, T11-T15, T16-T20). Caption: “All my data is handled by one DB on one server”]
MULTI-TENANT APPLICATION WITH FEDERATIONS
• Use SQL Azure Federations:
  • Federation Key = Tenant ID
  • USE FEDERATION … WITH FILTERING=ON
• But what if:
  • Some tenants are too big?
  • We may not know which ones are too big, and they may grow and shrink
• Solution:
  • Multi-column Federation Key to split very large tenants
  • But currently only one key column allowed
• Needs:
  • Hierarchical Federation Key
  • Fan-out/MapReduce Queries
HIERARCHICAL FEDERATION KEY
• Use varbinary(900) as Federation key Type
• Use HierarchyID as the actual key values
  • Provides depth-first byte ordering
• Split at appropriate Subtree node
[Tree diagram:]
\
  \1
    \1\1
    \1\2
    \1\3
  \2
  \3
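Why depth-first byte ordering makes a subtree a contiguous key range can be shown with a toy encoding. The `encode` function below is a stand-in invented for this sketch (two big-endian bytes per tree level); it is not the real HierarchyID binary format, but it has the same property the slide relies on: sorting the encoded bytes visits the tree depth-first.

```python
import struct


def encode(path):
    # Encode a tree path such as [1, 3] (the node \1\3 in the diagram)
    # as 2 big-endian bytes per level. Stand-in only, not the HierarchyID format.
    return b"".join(struct.pack(">H", part) for part in path)


# The seven nodes of the tree diagram, in arbitrary order; [] is the root.
nodes = [[3], [1, 2], [], [2], [1], [1, 3], [1, 1]]

# Byte-wise ordering of the encoded keys is a depth-first walk of the tree.
ordered = sorted(nodes, key=encode)
print(ordered)

# Splitting at the subtree node \2 keeps the whole \1 subtree in the lower member.
boundary = encode([2])
lower = [n for n in ordered if encode(n) < boundary]
upper = [n for n in ordered if encode(n) >= boundary]
print(lower, upper)
```

Since a parent's bytes are a prefix of every descendant's bytes, a parent sorts immediately before its subtree, and a range split at a node boundary never separates a subtree from the rows beneath it.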
DEMO: HIERARCHYID AS FEDERATION KEY
SQL AZURE FEDERATIONS ROADMAP
• Merge operation for federation members
• Fan-Out queries
  • E.g., allow single query that can process results across large number of federation members
• Schema management
  • Multi-version schema deployment & management across federation members
• Policy-based Auto Repartitioning
  • SQL Azure manages the federated databases through splits/merges based on policy (e.g., query response time, db size, etc.)
• Multi-column federation keys
  • E.g., federate on enterprise_customer_id + account_id
• Wider support for multi-tenancy (e.g. backup/restore atomic unit)
• Fill out survey: http://connect.microsoft.com/BusinessPlatform/Survey/Survey.aspx?SurveyID=13625
SCALE-OUT DATA PLATFORM ARCHITECTURE
[Diagram: Federations consisting of three Primary Shards, each with two Replicas]
• Highly Available, High Scale, High Flexibility
• OLTP Workloads
  • Mostly touching 1 to low number of shards
• Dynamic OLAP Workloads
  • Scale-out queries, often using Map-Reduce or Fan-Out Paradigms
SUMMARY
• Scaling out your business is important!
• SQL Azure Federations provides
  • Data Platform Support for Elastic Data Scale-Out
• SQL Azure Federation Application Patterns
  • Multi-Tenancy
  • Map-Reduce/Fan-Out queries
RELATED RESOURCES
• Scale-Out with SQL Databases
  • Windows Gaming Experience Case Study: http://www.microsoft.com/casestudies/Case_Study_Detail.aspx?CaseStudyID=4000008310
  • Scalable SQL: http://cacm.acm.org/magazines/2011/6/108663-scalable-sql
  • http://www.slideshare.net/MichaelRys/scaling-with-sql-server-and-sql-azure-federations
• SQL Federations
  • http://blogs.msdn.com/b/cbiyikoglu/
  • http://blogs.msdn.com/b/cbiyikoglu/archive/2011/03/03/nosql-genes-in-sql-azure-federations.aspx
  • http://blogs.msdn.com/b/cbiyikoglu/archive/2011/12/29/introduction-to-fan-out-queries-querying-multiple-federation-members-with-federations-in-sql-azure.aspx
  • http://blogs.msdn.com/b/cbiyikoglu/archive/2012/01/19/fan-out-querying-in-federations-part-ii-summary-queries-fanout-queries-with-top-ordering-and-aggregates.aspx
  • http://federationsutility-weu.cloudapp.net/
• Contact me
  • @SQLServerMike
  • http://sqlblog.com/blogs/michael_rys/default.aspx