04-DocumentDB-SQL

Cloud Application Development: STORE DATA IN AZURE, s.l. dr. ing. Daniel Iercan


Page 1: 04-DocumentDB-SQL

Cloud Application Development

STORE DATA IN AZURE

s.l. dr. ing. Daniel Iercan

Page 2: 04-DocumentDB-SQL

DocumentDB

Page 3: 04-DocumentDB-SQL

What is DocumentDB?

schema free

+

non-trivial queries

+

transactional processing

Page 4: 04-DocumentDB-SQL

What is DocumentDB?

•NoSQL document database

•JavaScript and JSON

•Rich (SQL-like) queries and transactions over schema-free JSON data
  • JSON documents are indexed automatically
  • can be queried

Page 5: 04-DocumentDB-SQL

What is DocumentDB?

•Reliable and configurable performance
  • SSD storage
  • Tune and trade off consistency
    • strong
    • bounded staleness
    • session
    • eventual

•Data automatically replicated

•RESTful API

Page 6: 04-DocumentDB-SQL

What is DocumentDB?

•Fully managed by Azure

•Elastically scale throughput and storage (database units)

•Open by design (JavaScript, JSON, RESTful HTTP)

Page 7: 04-DocumentDB-SQL

Demo

configure DocumentDB in Azure portal

Page 8: 04-DocumentDB-SQL

DocumentDB resources

Page 9: 04-DocumentDB-SQL

Database account and administrative quota

•100 databases

•500,000 users

•2,000,000 permissions

Page 10: 04-DocumentDB-SQL

Elastic collections

•One database can contain any number of collections (limited by the number of capacity units bought)

•Transaction domain for documents

•Scope for document storage and query execution

Page 11: 04-DocumentDB-SQL

What is a Capacity Unit (CU)

•a way to buy resources (CPU, RAM, IO, storage)

•1 CU = 3 elastic collections, 10 GB of SSD storage, 2000 request units per second
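As a rough illustration of the sizing arithmetic, the sketch below computes how many capacity units would cover a given need, using only the per-CU figures from this slide. The function name and the sample workload are hypothetical, and actual billing rules may differ.

```python
import math

# Per-CU allowances quoted on the slide.
COLLECTIONS_PER_CU = 3
STORAGE_GB_PER_CU = 10
REQUEST_UNITS_PER_CU = 2000


def capacity_units_needed(collections, storage_gb, request_units):
    """Return the smallest number of CUs that covers all three requirements."""
    return max(
        math.ceil(collections / COLLECTIONS_PER_CU),
        math.ceil(storage_gb / STORAGE_GB_PER_CU),
        math.ceil(request_units / REQUEST_UNITS_PER_CU),
        1,  # at least one CU is always needed
    )


# Hypothetical workload: 4 collections, 25 GB of data, 3500 request units.
print(capacity_units_needed(collections=4, storage_gb=25, request_units=3500))
```

In this example storage is the limiting resource: 25 GB needs three CUs even though the collections and the throughput would fit in two.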

Page 12: 04-DocumentDB-SQL

What is a request unit (RU)

• request unit = CPU + memory + IO

•measured as rate per second

Page 13: 04-DocumentDB-SQL

What is a request unit (RU)

Conditions: document size 1 KB, 10 properties, session consistency, all documents indexed

DATABASE OPERATION: NUMBER OF OPERATIONS PER SECOND PER CU

• Reading a single document: 2000

• Inserting/replacing/deleting a single document: 500

• Querying a collection with a simple predicate and returning a single document: 1000

• Stored procedure with 50 document inserts: 20
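The per-CU rates above can be turned into a back-of-the-envelope estimate for a mixed workload. The sketch below assumes that costs add up linearly (an operation that runs 500 times per second per CU uses 1/500 of a CU-second), which is a simplification and not a statement about how the service meters requests. The operation labels and the sample mix are hypothetical.

```python
# Operations per second that one CU sustains, taken from the slide.
OPS_PER_SECOND_PER_CU = {
    "read": 2000,
    "write": 500,            # insert, replace or delete
    "query": 1000,           # simple predicate, single document returned
    "sproc_50_inserts": 20,
}


def cu_fraction(workload):
    """Estimate the CUs consumed, assuming costs add up linearly."""
    return sum(rate / OPS_PER_SECOND_PER_CU[op] for op, rate in workload.items())


# Hypothetical mix: 1000 reads/s, 100 writes/s and 200 queries/s.
mix = {"read": 1000, "write": 100, "query": 200}
print(round(cu_fraction(mix), 2))
```

Under that assumption the sample mix uses roughly 0.9 of one CU, so it would fit in a single capacity unit with little headroom.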

Page 14: 04-DocumentDB-SQL

Developing Against Azure DocumentDB

•RESTful API

•.NET (LINQ provider)

•Node.js

•JavaScript

•Python

•Support for CRUD operations and SQL syntax

Page 15: 04-DocumentDB-SQL

DEMO

Overview of the .NET client

Page 16: 04-DocumentDB-SQL

DocumentDB resource model

Page 17: 04-DocumentDB-SQL
Page 18: 04-DocumentDB-SQL
Page 19: 04-DocumentDB-SQL

Querying

SQL-like syntax (subset of ANSI SQL)

User Defined Functions (JavaScript) can be used in queries

Page 20: 04-DocumentDB-SQL

Transactions and JavaScript executions

•Stored procedures (SPs)

•Triggers

•User-defined functions (UDFs)

• JavaScript replaces T-SQL

• JavaScript logic executes in ACID transactions (snapshot isolation)

•The entire transaction is aborted if a JavaScript exception is thrown
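The all-or-nothing behaviour described here can be illustrated with a small in-memory simulation. This is a Python sketch of the idea only, not DocumentDB's server-side JavaScript API; `run_in_transaction` and `transfer` are hypothetical names.

```python
import copy


def run_in_transaction(collection, procedure):
    """Run `procedure` against a private copy; commit only if it does not raise."""
    working_copy = copy.deepcopy(collection)
    procedure(working_copy)          # an exception propagates and skips the commit
    collection.clear()
    collection.update(working_copy)  # commit: publish all changes together


def transfer(docs):
    """Move 50 units from document 'a' to document 'b'."""
    docs["a"]["balance"] -= 50
    if docs["a"]["balance"] < 0:
        raise ValueError("insufficient funds")
    docs["b"]["balance"] += 50


accounts = {"a": {"balance": 30}, "b": {"balance": 0}}
try:
    run_in_transaction(accounts, transfer)
except ValueError:
    pass  # the transaction was aborted, nothing was written

print(accounts)
```

Because the procedure raised after its first modification, the original collection is left untouched; a run that completes without an exception would publish both changes at once.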

Page 21: 04-DocumentDB-SQL

Demo

- create SP

- create trigger

- create UDF

http://azure.microsoft.com/en-gb/documentation/articles/documentdb-resources/

Page 22: 04-DocumentDB-SQL

Documents

• JSON objects

• Free schema

• Stored in collections

• Can be inserted, replaced, deleted, read, enumerated and queried

Page 23: 04-DocumentDB-SQL

Attachments and Media

• binary blobs/media

• can be stored in DocumentDB or externally:
  • an attachment is a special JSON document that captures the metadata of media stored in a remote media store
  • attachments stored in DocumentDB have a _media property that points to the resource URI
  • attachments stored in DocumentDB are garbage-collected automatically
  • media stored externally has to be managed by the developer

Page 24: 04-DocumentDB-SQL

Users

• Logical names for grouping permissions

• Implement multi-tenancy (one user for each actual application user)

• Shard data:
  • each user maps to a database
  • each user maps to a collection
  • all documents for a user are stored in one collection
  • documents from different users are stored in various collections

Page 25: 04-DocumentDB-SQL

Permissions

• administrative resources vs. application resources

• master key vs. resource key
  • master key – access to everything
  • resource key – granular access to specific resources

Page 26: 04-DocumentDB-SQL

Optimistic concurrency

ETag and If-Match header attributes
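The ETag / If-Match pattern can be sketched with a toy in-memory store. This is an illustration of the technique only; `DocumentStore` and `PreconditionFailed` are hypothetical names and not part of any DocumentDB client library.

```python
import uuid


class PreconditionFailed(Exception):
    """Raised when the supplied ETag no longer matches (HTTP 412 in REST terms)."""


class DocumentStore:
    def __init__(self):
        self._docs = {}  # id -> (etag, body)

    def read(self, doc_id):
        """Return the (etag, body) pair currently stored for `doc_id`."""
        return self._docs[doc_id]

    def replace(self, doc_id, body, if_match=None):
        """Write `body`; reject the write if `if_match` is stale."""
        current = self._docs.get(doc_id)
        if current is not None and if_match is not None and if_match != current[0]:
            raise PreconditionFailed(doc_id)
        etag = uuid.uuid4().hex  # every successful write gets a fresh ETag
        self._docs[doc_id] = (etag, body)
        return etag


store = DocumentStore()
first_etag = store.replace("doc1", {"count": 1})
store.replace("doc1", {"count": 2}, if_match=first_etag)      # succeeds
try:
    store.replace("doc1", {"count": 3}, if_match=first_etag)  # stale ETag
except PreconditionFailed:
    print("conflict detected; re-read and retry")
```

The second writer still holds the ETag from before the first update, so its write is rejected instead of silently overwriting the newer version.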

Page 27: 04-DocumentDB-SQL

DocumentDB tuning performance

Page 28: 04-DocumentDB-SQL

Configuring Indexing Policy of a Collection

• Choose whether the collection automatically indexes all of the documents or not

• Choose whether to include or exclude specific paths or patterns in your documents from the index

• Choose between synchronous (consistent) and asynchronous (lazy) index updates
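The three choices above map naturally onto a small policy document. The sketch below builds one as a Python dictionary; the field names (`automatic`, `indexingMode`, `includedPaths`, `excludedPaths`) and the path syntax are assumptions about the policy's JSON shape and should be checked against the service documentation for the version in use. The helper name and the excluded path are hypothetical.

```python
import json


def make_indexing_policy(automatic=True, lazy=False, excluded_paths=()):
    """Build a policy dict covering the three choices listed above."""
    return {
        "automatic": automatic,                            # index every document?
        "indexingMode": "lazy" if lazy else "consistent",  # async vs. sync updates
        "includedPaths": [{"path": "/*"}],                 # assumed path syntax
        "excludedPaths": [{"path": p} for p in excluded_paths],
    }


policy = make_indexing_policy(lazy=True, excluded_paths=["/rawPayload/*"])
print(json.dumps(policy, indent=2))
```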

Page 29: 04-DocumentDB-SQL

Configure consistency

trade-offs between consistency, availability and latency

•Strong

•Bounded staleness – total ordering of writes and maximum staleness

•Session – read your own writes

•Eventual
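One way to get a feel for the four levels is a toy model in which every write increments a version number and a replica may lag behind the latest version. The sketch below decides whether a replica may answer a read under each level. It is a simplified mental model under that assumption, not a description of how the service implements consistency; all names and parameters are hypothetical.

```python
def can_serve_read(level, replica_version, latest_version,
                   session_version=0, max_lag=0):
    """Decide whether a replica at `replica_version` may answer a read."""
    if level == "strong":
        return replica_version == latest_version
    if level == "bounded_staleness":
        return latest_version - replica_version <= max_lag
    if level == "session":
        return replica_version >= session_version  # read your own writes
    if level == "eventual":
        return True
    raise ValueError(f"unknown consistency level: {level}")


# Replica has applied 7 of 10 writes; this client's last write was version 6.
for level in ("strong", "bounded_staleness", "session", "eventual"):
    allowed = can_serve_read(level, replica_version=7, latest_version=10,
                             session_version=6, max_lag=5)
    print(level, allowed)
```

In this scenario only the strong level refuses the lagging replica, which is the latency and availability price paid for the strongest guarantee.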

Page 30: 04-DocumentDB-SQL

References

http://azure.microsoft.com/en-us/services/documentdb/

http://azure.microsoft.com/en-us/documentation/services/documentdb/

http://azure.microsoft.com/en-us/documentation/articles/documentdb-introduction/

http://azure.microsoft.com/en-gb/documentation/articles/documentdb-resources/

http://azure.microsoft.com/en-gb/documentation/articles/documentdb-interactions-with-resources/

Page 31: 04-DocumentDB-SQL

SQL in the Cloud

Page 32: 04-DocumentDB-SQL

Need for scalability

As data grows, performance degrades

Page 33: 04-DocumentDB-SQL

SQL Partitioning (horizontal scaling)

• Master/Slave: one (master) SQL server for write operations (create, update, delete), and one or more SQL servers for read operations
  • master can be a bottleneck
  • replication is near-real-time
  • master is a single point of failure

• Cluster Computing: multiple servers that act as nodes and use a centralized shared disk facility
  • all nodes can be used for reads, but only one for writes; if the node that does the writes fails, another one takes its place (the shared disk can be a bottleneck, and writes do not scale)
  • advanced clustering uses real-time memory replication so that all nodes can do writes (network traffic between nodes can be a bottleneck, in addition to the shared disk)
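The master/slave split can be sketched as a tiny statement router that sends writes to the master and spreads reads across the replicas. This is a minimal illustration under the assumption that any statement starting with SELECT is a read; the class and server names are hypothetical.

```python
import itertools


class ReplicatedRouter:
    """Send writes to the master and spread reads across the replicas."""

    def __init__(self, master, replicas):
        self.master = master
        self._replicas = itertools.cycle(replicas)  # round-robin over replicas

    def route(self, statement):
        """Return the server that should execute `statement`."""
        verb = statement.lstrip().split(None, 1)[0].upper()
        if verb == "SELECT":
            return next(self._replicas)
        return self.master  # INSERT, UPDATE, DELETE, DDL, ...


router = ReplicatedRouter("master", ["replica-1", "replica-2"])
for sql in ["SELECT * FROM t", "UPDATE t SET x = 1", "SELECT 1"]:
    print(router.route(sql), "<-", sql)
```

Every write still lands on the single master, which is why the master remains both the bottleneck and the single point of failure in this design.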

Page 34: 04-DocumentDB-SQL

SQL Partitioning (horizontal scaling)

• Table Partitioning
  • data in a single large table can be split across multiple disks to improve I/O; partitioning can be done both horizontally (by rows) and vertically (by columns); join operations become problematic

• Federated Tables
  • tables can be accessed across multiple servers (complex to administer; good for reporting but not for general read/write transactions); the choice of federation key is very important

Page 35: 04-DocumentDB-SQL

SQL Partitioning (horizontal scaling)

• Sharding (Shared-Nothing)
  • Independent servers (CPU, memory and disk)
  • Smaller databases are easier to manage and maintain, faster, and reduce costs
  • Challenges: reliability, distributed queries, avoidance of cross-shard joins, auto-increment key management, support for multiple sharding schemes (session-based, transaction-based, statement-based), determining the optimum method for sharding data (by primary key, by modulus of a key, or by maintaining a master shard index table)
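Two of the sharding methods named above, modulus of a key and a master shard index table, can be sketched in a few lines. The shard names, tenant names and function names are hypothetical.

```python
SHARDS = ["shard-0", "shard-1", "shard-2"]  # hypothetical server names


def shard_by_modulus(customer_id):
    """Pick a shard from the numeric key modulo the shard count."""
    return SHARDS[customer_id % len(SHARDS)]


# A master index table maps each key explicitly, so keys can be moved later.
SHARD_INDEX = {"acme": "shard-2", "globex": "shard-0"}


def shard_by_index(tenant):
    """Look the shard up in the master index table."""
    return SHARD_INDEX[tenant]


print(shard_by_modulus(10))
print(shard_by_index("acme"))
```

The modulus method needs no lookup but reshuffles most keys when the number of shards changes; the index table costs an extra lookup but lets individual keys be relocated.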

Page 36: 04-DocumentDB-SQL

Ways to use SQL in the Cloud

• SQL as a service

• dedicated SQL VM

Page 37: 04-DocumentDB-SQL

SQL Azure

• Old – federation

• New - sharding

Page 38: 04-DocumentDB-SQL

Multi-tenancy

• a single instance of the software runs on a server, serving multiple tenants

• has to ensure data separation

Page 39: 04-DocumentDB-SQL

SQL Azure Database is

Get started quickly

Ready to get started?

Page 40: 04-DocumentDB-SQL

Provision Your Server

Server defined

Service head that contains databases

Connect via automatically generated FQDN (xxx.database.windows.net)

Initially contains only a master database

Provision servers interactively

Log on to Windows Azure Management Portal

Create a SQL Azure server

Specify admin login credentials

Add firewall rules and enable service access

Automate server provisioning

Use Windows Azure Platform PowerShell cmdlets (or use REST API directly)

wappowershell.codeplex.com

Page 41: 04-DocumentDB-SQL

Build Your Database

Use familiar technologies

Supports Transact-SQL

Supports popular languages

.NET Framework (C#, Visual Basic, F#) via ADO.NET

C / C++ via ODBC

Java via Microsoft JDBC provider

PHP via Microsoft PHP provider

Supports popular frameworks

OData (REST data access)

Entity Framework

WCF Data Services

NHibernate

Supports popular tools

SQL Server Management Studio (2008 R2 and later)

SQL Server command-line utilities (SQLCMD, BCP)

CA Erwin® Data Modeler

Embarcadero Technologies DBArtisan®

Differences in comparison to SQL Server

Focus on logical vs. physical administration

Database and log files automatically placed

Three high-availability replicas maintained for every database

Databases are fully contained

Tables require a clustered index

Maximum database size is 50 GB

Unsupported SQL Server features

BACKUP / RESTORE

USE command, linked servers, distributed transactions, distributed views, distributed queries, four-part names

Service Broker

Common Language Runtime (CLR)

SQL Agent

Page 42: 04-DocumentDB-SQL

Database

Thin client database development

Rich client database development

Page 43: 04-DocumentDB-SQL

Database

Data-tier Application Framework (DAC Fx)

How to get the latest DAC Fx

Page 44: 04-DocumentDB-SQL

Database

Interactive approach for dacpac v1 and v2

Interactive approach for bacpac v2

Upgrading a dacpac or bacpac

Page 45: 04-DocumentDB-SQL

Secure Your Database

Server identity and access control

SQL authentication supported

Integrated authentication not supported

Connect to master to administer logins and create / drop databases

The admin login (configured during service provisioning) is like sa

The admin login has full rights on the server (and all databases) and should only be used for administration

Manage logins with CREATE / ALTER / DROP LOGIN commands

Membership in the loginmanager server role grants CREATE / ALTER / DROP LOGIN privileges

Membership in the dbmanager server role grants CREATE / DROP DATABASE privileges

Database identity and access control

Logins must have an associated user account to connect to a database

The admin login is automatically associated with a special user known as dbo (database owner)

The dbo has full rights in the database and should only be used for administration

Manage users with CREATE / ALTER / DROP USER commands

Add users to system or user-defined database roles to grant privileges via sp_addrolemember

Organize database objects into schema containers based upon common access control requirements

Grant privileges to schema containers instead of individual objects for better productivity

Page 46: 04-DocumentDB-SQL

Connect Your Application

Connecting to SQL Azure

TDS (Tabular Data Stream) protocol over TCP/IP supported

SSL required

Use firewall rules to connect from outside Microsoft data center

ASP.NET example:

<connectionStrings>
  <add name="AdventureWorks"
       connectionString="Data Source=[server].database.windows.net;Integrated Security=False;Initial Catalog=ProductsDb;User Id=[login];Password=[password];Encrypt=true;"
       providerName="System.Data.SqlClient" />
</connectionStrings>

Special considerations

Legacy tools and providers may require special format for login: [login]@[server]

Idle connections terminated after 30 minutes

Long running transactions terminated after 24 hours

DoS guard terminates suspect connections with no error message

Failover events terminate connections

Throttling may cause errors

Use connection pooling and implement retry logic to handle transient failures

Latency introduced for updates due to HA replicas

No cross-database dependencies; result sets from different databases must be combined in the application tier
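The advice to implement retry logic for transient failures can be sketched as a small helper with exponential backoff. This is a generic illustration, not tied to any particular database driver; `TransientError`, `with_retries` and `flaky_query` are hypothetical names, and a real application would catch the specific exceptions its driver raises for dropped connections and throttling.

```python
import random
import time


class TransientError(Exception):
    """Stand-in for a dropped connection or a throttling error."""


def with_retries(operation, attempts=5, base_delay=0.01):
    """Call `operation`, retrying transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise  # give up after the last attempt
            # Wait 1x, 2x, 4x ... the base delay, plus a little random jitter.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))


calls = {"count": 0}


def flaky_query():
    """Fail twice, then succeed, to simulate a transient outage."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("connection terminated")
    return "42 rows"


print(with_retries(flaky_query))
```

The jitter keeps many clients from retrying in lockstep after a failover, and the cap on attempts ensures a persistent outage still surfaces as an error.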

Page 47: 04-DocumentDB-SQL

Demo

SQL db from laboratory enrol.

Page 48: 04-DocumentDB-SQL

NoSQL vs. SQL

Page 49: 04-DocumentDB-SQL

Performance

certain types of queries can be slow

for business applications, SQL is most likely the better choice

for fetching small pieces of information under high traffic and concurrency, NoSQL is better

Page 50: 04-DocumentDB-SQL

Business Intelligence

works best with SQL

NoSQL (wide-column stores) works well with “BIG DATA”

Page 51: 04-DocumentDB-SQL

References

http://codefutures.com/database-sharding/

http://azure.microsoft.com/en-us/documentation/articles/sql-database-elastic-scale-documentation-map/