All Up Data Warehouse: From SMP to Parallel Data Warehouse
DBI332
Dandy Weyn, Sr. Technical Product Manager
Brian Mitchell, Sr. Premier Field Engineer
ilikesql
brianwmitchelll
UNSTRUCTURED
UNBALANCED
UNPREDICTABLE
Take 1 big SAN
Add a little Server
Add a bigger Server
Add more networking
POTENTIAL PERFORMANCE BOTTLENECKS
[Diagram: the I/O path between storage and the server. Components shown: DISK, LUN, STORAGE CONTROLLER (A/B), CACHE, FC SWITCH, FC HBA (ports A/B), SERVER, SQL SERVER, WINDOWS, CPU CORES.]
Rates labeled along the path: CPU Feed Rate, HBA Port Rate, Switch Port Rate, SP Port Rate, SQL Server Read Ahead Rate, LUN Read Rate, Disk Feed Rate
It’s all about …. SIZING
One SHOE does not FIT ALL
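The rates named in the bottleneck diagram form a chain, and the sustained throughput of the whole path can be no higher than its slowest stage. A minimal sketch of that sizing check is below. The stage names come from the slide; every number is a hypothetical placeholder, not a measurement from the session.

```python
# Hypothetical sustained rates in MB/s for each stage of the data path.
# Stage names are from the slide; the values are made up for illustration.
stage_rates_mb_s = {
    "Disk Feed Rate": 2400,
    "LUN Read Rate": 2000,
    "SP Port Rate": 1600,
    "Switch Port Rate": 1600,
    "HBA Port Rate": 800,
    "SQL Server Read Ahead Rate": 1800,
    "CPU Feed Rate": 2400,
}


def find_bottleneck(rates):
    """Return (stage, rate) for the slowest stage in the chain."""
    stage = min(rates, key=rates.get)
    return stage, rates[stage]


stage, rate = find_bottleneck(stage_rates_mb_s)
print(f"Bottleneck: {stage} at {rate} MB/s")
```

With these placeholder values the HBA ports cap the system, so adding disks or cores would not help until that stage is widened.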
Transaction processing simplifies and accelerates data capture for accurate business decisions
Data warehousing enables a common data model for a single version of the truth
Analysis leads to optimized business processes and improved performance
Data Warehouse Scope
[Diagram: three groups of systems (Supporting Systems, BI Data Storage Systems, Presentation Layer Systems) connected by arrows labeled Data Path and Presentation Data. A dashed outline marks the Data Warehouse Scope.]
Systems shown: Integration Services ETL; Data Staging, Bulk Loading; Data Warehouse; Dedicated SAN, Storage Array; Analysis Services Cubes; PerformancePoint; Reporting Services; Web Analytic Tools; SharePoint Services; Microsoft Office SharePoint; Excel Services
Data Warehouse Scenarios
No longer exclusive to large enterprises and specialist analysts
Growth of affordable self-service BI tools such as PowerPivot and Reporting Services has created a DW requirement for smaller businesses and individual departments
Microsoft Data Warehousing Offerings
Scalable and reliable SMP platform for data warehousing on any hardware
Scalable and reliable platform for data warehousing on any hardware
Reference architectures offering best price performance for data warehousing
Appliance for high end MPP Data Warehousing delivering highest scalability and performance
Ideal for data marts or small to mid-sized enterprise data warehouses (EDWs)
Ideal for large data marts or mid-sized EDWs
Ideal for data marts or small to mid-sized data warehouses with scan-centric workloads
Ideal for high scale or high performance data marts and EDWs
Software only | Integrated Appliance (Software and Hardware) | Reference Architectures (Software and Hardware) | DW Appliance (Fully integrated Software and Hardware)
Scale-Up DW | Scale-Up DW | Scale-Up DW | Scale-Out DW with MPP
10s of terabytes | <5 terabytes | 5–80 terabytes | 10s–100s of terabytes
Software Assurance; Premier Mission Critical Support | 3-Year Support Plus 24 | Software Assurance; Premier Mission Critical Support | Mission Critical Advantage Program
Enterprise | Fast Track Data Warehouse RA | BDW Appliance | Parallel Data Warehouse
Microsoft Data Warehouse Offerings
Effort to Build: Very High | Very Low | Moderate | Moderate | Moderate | Moderate | Very Low
Capacity: Variable | 5 TB | 14 TB | 20 TB | 40 TB | 40 TB | 500 TB
Concurrency: Variable | Light | Light | Medium | Medium | High | Very High
Query Complexity: Variable | Medium | Medium | Medium | Medium | High | Very High
Business Data Warehouse Appliance
Agile
• Deploy in hours/days, not in months
• Easy to use through built-in dedicated tools to load and manage your data warehouse
• Designed for up to 5 TB data warehouses
• Fast Track 3.0 compliant, license path to Fast Track
Complete
• Hardware + Software + Services
• Pre-tuned, pre-configured, pre-installed. Turn on and go!
• Single point of contact for support
Optimized
• Specifically for small to medium data warehouse workloads
• Designed for performance, energy efficiency, and value by HP and Microsoft’s best engineers
• Security and reliability built in
Scenarios
Small/Departmental Data Warehouse
Spoke in EDW Hub and Spoke Architecture
What’s in the Box
Server
HP ProLiant DL370 G6 x 5670
4U rack-mount unit or pedestal
2x Westmere processors (12 cores)
96 GB of RAM
Storage
24 x internal SFF SAS disks
2 x Smart Array controllers and SAS expander
2 TB physical user storage
2.7 – 5 TB compressed data
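The two storage figures on this slide imply a compression ratio, assuming "2.7 – 5 TB compressed data" means the amount of user data that fits in the 2 TB of physical user storage once compressed. That reading is an assumption, and the arithmetic below is only a sketch of it.

```python
PHYSICAL_USER_STORAGE_TB = 2.0          # from the slide
COMPRESSED_DATA_RANGE_TB = (2.7, 5.0)   # from the slide


def implied_compression_ratio(data_tb, physical_tb=PHYSICAL_USER_STORAGE_TB):
    """Ratio of user data held to the physical space it occupies."""
    return data_tb / physical_tb


low, high = (implied_compression_ratio(tb) for tb in COMPRESSED_DATA_RANGE_TB)
print(f"Implied compression ratio: {low:.2f}:1 to {high:.2f}:1")
```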
DDL and Data Load Wizard
Appliance Quick Deployment Wizard
• Name the appliance
• Join a domain
• Specify appliance administrators
• Create staging and production databases
• Load data
• Check data fragmentation
HP Appliance-Specific Software
What’s in the Box: optimized software
Reference Architectures
Fast Track Data Warehouse Components
Software:
• SQL Server 2008 R2 Enterprise
• Windows Server 2008 R2
Configuration guidelines:
• Physical table structures
• Indexes
• Compression
• SQL Server settings
• Windows Server settings
• Loading
Hardware:
• Tight specifications for servers, storage and networking
• ‘Per core’ building block
Processing
Networking
Server
Storage
Fast Track Solution Example
Intel Based Reference Architectures
2 Processor Configuration
• Server: HP ProLiant DL380 G7 with 2x 6-core Intel Xeon® 5680 Series CPUs
• Storage server: MSA Storage
• Scalability: 4 – 20 TB
4 Processor Configuration
• Server: HP ProLiant DL580 G7 with 4x 8-core Intel Xeon® 7560 Series CPUs
• Storage server: MSA Storage
• Scalability: 20 – 40 TB
8 Processor Configuration
• Server: HP ProLiant DL980 G7 with 8x 8-core Intel Xeon® 7560 Series CPUs
• Storage server: MSA Storage
• Scalability: 40 – 80 TB
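The 'per core' building block suggests sizing storage throughput in proportion to the number of cores. The sketch below applies that idea to the three configurations above. The socket and core counts are from the slide; the per-core scan rate is a hypothetical placeholder that would need to be measured on the target hardware.

```python
# Socket and core counts are taken from the three configurations on the slide.
configurations = {
    "2 Processor": {"sockets": 2, "cores_per_socket": 6},
    "4 Processor": {"sockets": 4, "cores_per_socket": 8},
    "8 Processor": {"sockets": 8, "cores_per_socket": 8},
}

# ASSUMPTION: an illustrative per-core scan rate in MB/s. It is not a figure
# from the slides; measure it on the target hardware before sizing.
ASSUMED_MB_S_PER_CORE = 200


def required_io_mb_s(sockets, cores_per_socket, mb_s_per_core=ASSUMED_MB_S_PER_CORE):
    """I/O throughput the storage must sustain to keep every core busy."""
    return sockets * cores_per_socket * mb_s_per_core


for name, cfg in configurations.items():
    cores = cfg["sockets"] * cfg["cores_per_socket"]
    print(f"{name}: {cores} cores -> {required_io_mb_s(**cfg)} MB/s")
```

Scaling the storage with the core count is what keeps a configuration balanced: doubling the cores without doubling the I/O path only moves the bottleneck.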
Fast Track Hardware Partners
Solution Partners
SQL Server Parallel Data Warehouse
Tier-1 Enterprise Data Warehouse Appliance Offering
High scalability from tens to hundreds of terabytes
High performance through the MPP system
Flexibility and Choice
Choice of deployment options through distributed architecture
Most Comprehensive Solution
Complete data warehouse solution spanning desktop, enterprise data warehouse, and data marts
[Diagram: the Parallel Data Warehouse appliance, made up of a CONTROL RACK and a DATA RACK]
CONTROL RACK
Client connections always go through the control node
Contains no persistent user data
Parallel Data Warehouse advantages:
o Processes SQL requests
o Prepares execution plan
o Orchestrates distributed execution
Local SQL Server processes final query plan and aggregates results
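The division of labor described here, where distributed execution happens on the compute nodes and the control node aggregates the results, resembles a scatter-gather pattern. The sketch below illustrates that pattern for an average per group. It is not the appliance's actual query engine; the data, function names, and node count are hypothetical.

```python
# Hypothetical rows held by three compute nodes: (region, sales_amount).
compute_node_rows = [
    [("east", 10.0), ("west", 5.0)],
    [("east", 7.5), ("north", 2.5)],
    [("west", 4.0), ("north", 1.0)],
]


def partial_aggregate(rows):
    """Work done on one compute node: SUM and COUNT per group over local rows."""
    partials = {}
    for key, amount in rows:
        total, count = partials.get(key, (0.0, 0))
        partials[key] = (total + amount, count + 1)
    return partials


def final_aggregate(all_partials):
    """Work done on the control node: merge partial results into averages."""
    merged = {}
    for partials in all_partials:
        for key, (total, count) in partials.items():
            m_total, m_count = merged.get(key, (0.0, 0))
            merged[key] = (m_total + total, m_count + count)
    return {key: total / count for key, (total, count) in merged.items()}


averages = final_aggregate(partial_aggregate(rows) for rows in compute_node_rows)
print(averages)
```

Each node ships one (sum, count) pair per group rather than its raw rows, which is why the final step on the control node stays small.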
Provided by DataDirect
o Open Database Connectivity (ODBC), Object Linking and Embedding Database (OLE DB), Java Database Connectivity (JDBC), and ActiveX® Data Objects (ADO.NET) client drivers
o Wire protocol (SequeLink)
o Drivers are available in 32-bit and 64-bit versions
CONTROL NODE
CONTROL RACK
Provides Support and Patching for the Appliance
Holds image for re-deployment of compute node
Holds Active Directory
MANAGEMENT NODE
CONTROL RACK
Provides high-capacity storage for data files from ETL processes
Is available as a sandbox for other applications and scripts that run on the internal network
Provides SQL Server Integration Services
LANDING ZONE
[Diagram: load path. Source -> Landing Zone Files -> Data Loader (DWLoader or SQL Server Integration Services) -> Compute Nodes]
CONTROL RACK
Provides Integrated Backup Solution
Integrates with 3rd party backup option
Orderable in different sizes
BACKUP NODE
DATA RACK
• Data Rack Servers: 10 active + 1 passive
• HP ProLiant DL360 G7 compute nodes
• InfiniBand, FC and Ethernet switching, 42U rack
• Expansion: Grow from 1–4 data racks, storage options, test/dev system
• Storage: 10x HP StorageWorks MSA P2000 G3
• Consists of COMPUTE NODES and STORAGE NODES
COMPUTE NODE
Each MPP node is a highly tuned symmetric multi-processing (SMP) node with standard interfaces
Provides dedicated hardware, database, and storage
Runs SQL Server
Spare Node provides failover in case of node failure
Drives are configured as RAID 1
PDW – Client Connectivity
[Diagram: CONTROL RACK and DATA RACK. Connection points labeled on the diagram: Client Drivers, ETL Load Interface, Support/Patching, Corporate Backup Solution]
PDW – Query Processing
[Diagram: a QUERY arriving at the CONTROL RACK and being processed across the DATA RACK]
Data Layout Approaches
Replicated
A table structure exists as a full copy within each discrete Parallel Data Warehouse node.
Distributed
A table structure is hashed on a single column and uniformly distributed across all nodes on the appliance. Each distribution is a separate physical table in the database management system (DBMS).
Ultra Shared-Nothing
Provides the ability to design a schema of both distributed and replicated tables to minimize data movement between nodes. Small sets of data can be more efficiently stored in full (replicated). Certain set operations (such as single-node operations) are more efficient against full sets of data.
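The two layouts can be sketched together: a large table is hashed on one column into distributions, a small table is copied in full to each, and a join between them then needs only local data. This is an illustration under stated assumptions, not the appliance's implementation. The hash function (`zlib.crc32`), the distribution count, and the sample tables are all hypothetical.

```python
import zlib

NUM_DISTRIBUTIONS = 8  # hypothetical; the real count depends on the appliance


def distribution_for(key):
    """Map a distribution-column value to a distribution with a stable hash."""
    return zlib.crc32(str(key).encode("utf-8")) % NUM_DISTRIBUTIONS


# Large fact table as (customer_id, amount), hashed on customer_id so that
# each row lives in exactly one distribution.
sales = [(101, 20.0), (102, 35.0), (101, 5.0), (103, 12.5)]
distributed_sales = {d: [] for d in range(NUM_DISTRIBUTIONS)}
for row in sales:
    distributed_sales[distribution_for(row[0])].append(row)

# Small dimension table: replicated, so every distribution holds a full copy.
customers = {101: "Contoso", 102: "Fabrikam", 103: "Northwind"}
replicated_customers = {d: dict(customers) for d in range(NUM_DISTRIBUTIONS)}


def local_join(d):
    """Join using only data stored with distribution d; no rows move."""
    return [
        (replicated_customers[d][customer_id], amount)
        for customer_id, amount in distributed_sales[d]
    ]


joined = [row for d in range(NUM_DISTRIBUTIONS) for row in local_join(d)]
print(sorted(joined))
```

Replicating the small table costs storage on every node, but it removes the need to ship dimension rows at query time.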
Ultra Shared-Nothing Architecture
Extends Traditional Shared-Nothing Design
Pushes shared-nothing architecture into the SMP node: there is IO and CPU affinity within SMP nodes
o Eliminates contention for user queries
o Uses full resources for each user query
Provides multiple physical instances of tables
o Distributes large tables
o Replicates small tables
Redistributes rows as needed
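Redistribution becomes necessary when a query joins or groups on a column other than the one a table is distributed on: rows with equal values must first be brought to the same node. The sketch below rehashes rows on a new column to show the idea. The hash function, node count, and sample data are hypothetical, and the real appliance's data movement is not described in the slides.

```python
import zlib

NUM_NODES = 4  # hypothetical node count


def node_for(key):
    """Stable hash placement of a key on a node."""
    return zlib.crc32(str(key).encode("utf-8")) % NUM_NODES


def redistribute(nodes, key_index):
    """Rehash every row on the column at key_index and move it to its new node."""
    shuffled = [[] for _ in range(NUM_NODES)]
    for rows in nodes:
        for row in rows:
            shuffled[node_for(row[key_index])].append(row)
    return shuffled


# Orders as (order_id, product_id), initially distributed on order_id.
orders = [(1, "A"), (2, "B"), (3, "A"), (4, "C"), (5, "B")]
by_order = [[] for _ in range(NUM_NODES)]
for row in orders:
    by_order[node_for(row[0])].append(row)

# A join or GROUP BY on product_id needs equal product_id values on one node.
by_product = redistribute(by_order, key_index=1)
for node, rows in enumerate(by_product):
    print(node, rows)
```

After the shuffle, every row for a given product sits on a single node, so each node can finish its share of the work independently.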
Provides Fault Tolerance
o All hardware components have redundancy (including CPUs, disks, networks, power, and storage processors)
o Control and compute nodes use failover clustering
o Management nodes have active and standby states
Administrative Console
https://controlnodeipaddress
• Dashboard
• Query activity
• Load activity
• Backup and restore
• Active locks
• Active sessions
• Alerts
• Appliance state
Parallel Data Warehouse Configuration Manager
• Appliance topology
• Services status
• Network configuration
• Privileges
Flexible Business Alignment
A distributed architecture gives you the flexibility to add or change diverse workloads or user groups while maintaining data consistency across the enterprise
Parallel database copy technology enables rapid data movement and consistency between EDW and data marts
Create SQL Server 2008 R2, Fast Track Data Warehouse, and SQL Server Analysis Services Data Marts
Supports user groups with very different service-level agreements (SLAs):
• Performance
• Capacity
• Loading
• Concurrency
Distributed Data Warehouse Architectures
[Diagram: hub-and-spoke architecture around a Central EDW Hub. Labels: Landing Zone, ETL Tools, Departmental Reporting, Regional Reporting, High-Performance Reporting, Regional Reporting with Business Decision Appliance, Third-Party RDBMS, Third-Party Data Integration, Mobile Applications]
Determining the Right Solution
What is the workload?
• Number of concurrent users
• Query complexity
• Query mix
• Load processing
• Performance requirements
What is the customer looking for in a solution?
• Simplicity in the appliance
• 100 percent compatibility with SQL Server 2008 R2
• Enterprise scalability
• Economical hardware
• Incremental expansion and high availability by default
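Capacity is one input to this decision that can be checked mechanically. The sketch below filters the offerings by data size. The Business Data Warehouse Appliance and Fast Track ranges come from earlier slides (up to 5 TB, and 4 to 80 TB across the reference configurations); the Parallel Data Warehouse bounds are an assumed reading of "tens to hundreds of terabytes". Concurrency, query complexity, and the other questions above are deliberately left out.

```python
# Capacity ranges in TB. The first two come from the slides; the Parallel Data
# Warehouse bounds are an ASSUMED reading of "tens to hundreds of terabytes".
CAPACITY_RANGES_TB = {
    "Business Data Warehouse Appliance": (0, 5),
    "Fast Track Data Warehouse": (4, 80),
    "Parallel Data Warehouse": (10, 500),
}


def candidate_offerings(data_size_tb):
    """Return the offerings whose capacity range covers the given data size."""
    return [
        name
        for name, (low, high) in CAPACITY_RANGES_TB.items()
        if low <= data_size_tb <= high
    ]


for size in (3, 30, 200):
    print(size, "TB ->", candidate_offerings(size))
```

Where the ranges overlap, the function returns more than one candidate, and the workload questions decide between them.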
Parallel Data Warehouse
Value to Customer
• Enterprise-class scalability to hundreds of terabytes
• High performance
• Interoperability with leading BI products
• Mission critical support and maintenance
• Mature SQL Server platform with high security and robust engineering process
• Strong data warehouse vision and roadmap that includes industry-leading technologies
Supporting Features
• MPP with ultra shared-nothing architecture
• Distributed query optimization
• Balanced hardware with pre-tested and pre-tuned appliances optimized for data warehousing
• Third-party product integration (for example, MicroStrategy, Business Objects, and Informatica)
• Mission critical support and maintenance
• Road map includes column store, petabyte scalability, real-time data warehousing, MDM, and data quality
Related Content
Breakout Sessions
DBI312 – ColumnStore Indexes Unveiled
DBI332 – All Up Data Warehouse
DBI320 – Upsizing and Modernizing with the Microsoft BI Stack and FT Data Warehouse
DBI308 – Fast Track Data Warehouse 3.0 New Features and Best Practices
DBI301 – Microsoft SQL Server Reference Architecture and Appliances
Resources
Sessions On-Demand & Community: www.microsoft.com/teched
Microsoft Certification & Training Resources: www.microsoft.com/learning
Resources for IT Professionals: http://microsoft.com/technet
Resources for Developers: http://microsoft.com/msdn
Connect. Share. Discuss.: http://northamerica.msteched.com
© 2011 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.