Copyright © 2004, SAS Institute Inc. All rights reserved.
Building and Implementing Integrated Data Models
Nancy Wills, Director, Access, Query and Data Mgmt
Ralph Hollinshead, Manager, Solutions Data Integration
Overview
Part One: Building an Integrated Data Model
Part Two: Deploying and Scaling the Data Architecture
SAS® Banking Intelligence Solutions Framework
Customer Retention
X Sell / Up Sell
Marketing Automation
Credit Scoring
Credit Risk
New Solutions
Banking Intelligence Architecture
Strategic Performance Management
INTEGRATED EXTENDABLE ARCHITECTURE
FOCUSED ON BUSINESS ISSUES
BASED ON EXPERIENCE
Independent Solutions
SAS® Cross-Sell and Up-Sell for Banking
SAS® Customer Retention for Banking
SAS® Credit Scoring for Banking
SAS® Credit Risk Management
Solution Data Marts
Extract and Cleanse Files
Enterprise Source Systems
Solutions
Integrated Data Model: Not All Customers are the Same
Customer A: No Data Warehouse
• Interested in Multiple SAS Solutions
Customer B: With Data Warehouse
• Averse to Data Replication
Customer C: With Data Warehouse
• No Data Marts Allowed: Active Data Warehousing Approach
Customer A: Full SAS Data Architecture
Solution Data Marts
Extract and Cleanse Files
Enterprise Source Systems
SAS Banking Detail Data Store
Solutions
SAS® Cross-Sell and Up-Sell for Banking
SAS® Customer Retention for Banking
SAS® Credit Scoring for Banking
SAS® Credit Risk Management
Flexible Options to Meet Customer Needs!
Customer B: Partial SAS Data Architecture
Solution Data Marts
Extract and Cleanse Files
Enterprise Source Systems
Customer Enterprise Data Warehouse
Solutions
SAS® Cross-Sell and Up-Sell for Banking
SAS® Customer Retention for Banking
SAS® Credit Scoring for Banking
SAS® Credit Risk Management
Flexible Options to Meet Customer Needs!
Customer C: Customer Data Architecture
Information Maps
Extract and Cleanse Files
Enterprise Source Systems
Solutions
SAS® Marketing Automation
Customer Enterprise Data Warehouse
Scorecard for Data Architecture Approach

| Data Management Issue | Score |
|---|---|
| Sensitivity to Data Replication | -(0-5) |
| Sensitivity to H/W processor and storage budget | -(0-5) |
| Existing warehouse quality | -(0-5) |
| Implementation time constraints | -(0-5) |
| Intentions to implement >1 SAS solution | +(0-5) |
| Historical data requirements | +(0-5) |

| Score | Decision |
|---|---|
| -25 | No DDS. Marts only if absolutely necessary. Information maps may be appropriate. |
| 0 | Use DDS to persist current extract from source systems. Marts hold multiple extracts up to full history. |
| +25 | Implement full warehouse, persist history in DDS and as much as wanted in the marts. |
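The scorecard arithmetic can be sketched as below. This is only an illustration: the issue keys are hypothetical names for the six rows, and because the slide gives three anchor scores rather than cut-offs, the `leaning` function interprets nothing but the sign of the total, which is an assumption.

```python
# Sign per issue, following the scorecard: "-" issues argue against
# replicating data into a DDS and marts, "+" issues argue for it.
# The dictionary keys are hypothetical labels, not names from the slides.
ISSUE_SIGNS = {
    "sensitivity_to_data_replication": -1,
    "sensitivity_to_hw_budget": -1,
    "existing_warehouse_quality": -1,
    "implementation_time_constraints": -1,
    "more_than_one_sas_solution": +1,
    "historical_data_requirements": +1,
}


def architecture_score(ratings):
    """Sum the signed 0-5 ratings; issues that are not rated count as 0."""
    total = 0
    for issue, sign in ISSUE_SIGNS.items():
        rating = ratings.get(issue, 0)
        if not 0 <= rating <= 5:
            raise ValueError(f"{issue}: rating must be 0-5, got {rating}")
        total += sign * rating
    return total


def leaning(score):
    """Assumed reading of the total: only its sign is interpreted."""
    if score < 0:
        return "toward no DDS, marts only if necessary"
    if score > 0:
        return "toward full warehouse with history in the DDS"
    return "DDS persists current extract, marts hold history"
```

For example, a site rating replication sensitivity at 5 and multi-solution intent at 2 totals -3 and leans away from a DDS.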
Techniques for Data Model Integration
Detail Data Store
• Varying Industries
• General Standards
• Warehousing Techniques
Data Marts
• Approach Compared to DDS
Integrating Models at the Industry Level
Banking
- Accounts
- Account Transactions, etc.
Telco
- Subscriptions
- Equipment
- Networks
- Calls, etc.
Insurance
- Premiums
- Claims
- Benefits, etc.
Customer, Supplier, Employee, GL, Account, Product, etc.
Detail Data Store Standards Needed for Integration
Data Types / Lengths / Classifier Codes
Naming Conventions
Standards for Data Structures
• Hierarchies
• Subtypes
• Reference Data
Data Administration Standards

| Domain | Data Type | Width | Applicable Class Codes | Comment/Example |
|---|---|---|---|---|
| Identifier | Varchar | 32 | ID | Typically the identifier from the source system. |
| Small Code | Varchar | 3 | CD | Short length codes such as ADDRESS_TYPE_CD |
| Medium Code | Varchar | 10 | CD | Medium length codes such as EXCHANGE_SYMBOL_CD |
| Large Code | Varchar | 20 | CD | Long length codes such as POSTAL_CD |
| Standard Count Code | Numeric | 6 | CNT | Standard counts such as AUTHORIZED_USERS_CNT |
| Name | Varchar | 40 | NM | Proper name. For example, LAST_NM, FIRST_NM, etc. |
| Short Length Text | Varchar | 20 | TXT | Short freeform text. |
| Medium Length Text | Varchar | 100 | TXT, DESC | Longer freeform text and descriptions associated with code tables. |
| Indicator Field | Character | 1 | FLG | Binary indicator flag (Y or N). |
| Surrogate Key | Numeric | 10 | RK, SK | Generated surrogate keys. |
| Currency Amount | Numeric | 18,5 | AMT | Standard currency amount. |
| Rates and Percentages | Numeric | 9,4 | PCT, RT | For example, exchange rates. |
| DateTime | Date | | DT, DTTM | Accommodate dates as well as date/time. |
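Standards like these can be checked mechanically. The sketch below is a hypothetical checker, not a SAS tool: it reads the class code from the column-name suffix and compares the declared type and width with the table. The function name `check_column` and the use of an empty width for the Date row (where the table gives none) are assumptions.

```python
# Class-code suffix -> (data type, allowed widths), taken from the table.
# Widths are strings because some are "precision,scale" pairs; the empty
# string stands in for the Date row, which lists no width (assumption).
CLASS_CODES = {
    "ID":   ("Varchar",   {"32"}),
    "CD":   ("Varchar",   {"3", "10", "20"}),
    "CNT":  ("Numeric",   {"6"}),
    "NM":   ("Varchar",   {"40"}),
    "TXT":  ("Varchar",   {"20", "100"}),
    "DESC": ("Varchar",   {"100"}),
    "FLG":  ("Character", {"1"}),
    "RK":   ("Numeric",   {"10"}),
    "SK":   ("Numeric",   {"10"}),
    "AMT":  ("Numeric",   {"18,5"}),
    "PCT":  ("Numeric",   {"9,4"}),
    "RT":   ("Numeric",   {"9,4"}),
    "DT":   ("Date",      {""}),
    "DTTM": ("Date",      {""}),
}


def check_column(name, data_type, width=""):
    """Return a list of problems; an empty list means the column conforms."""
    suffix = name.rsplit("_", 1)[-1].upper()
    if suffix not in CLASS_CODES:
        return [f"{name}: unknown class code '{suffix}'"]
    expected_type, widths = CLASS_CODES[suffix]
    problems = []
    if data_type != expected_type:
        problems.append(f"{name}: expected type {expected_type}, got {data_type}")
    if str(width) not in widths:
        problems.append(f"{name}: width {width!r} not in {sorted(widths)}")
    return problems
```

Calling `check_column("ADDRESS_TYPE_CD", "Varchar", 3)` returns an empty list, while a column such as `BALANCE_AMT` declared as Varchar is reported.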
Detail Data Store: Data Warehousing Standards
Surrogate Keys, Point-in-Time, and Rapidly Changing Data

CUSTOMER

| CUSTOMER_RK | VALID_FROM_DT | VALID_TO_DT | ACCOUNT_RK | MARITAL_STATUS_CD | FIRST_NM | LAST_NM |
|---|---|---|---|---|---|---|
| 100 | 01JAN1999 | 29FEB2000 | 201 | S | John | Smith |
| 100 | 01MAR2000 | 31DEC4747 | 201 | M | John | Smith |

FINANCIAL_ACCOUNT

| ACCOUNT_RK | VALID_FROM_DT | VALID_TO_DT | CUSTOMER_RK | FINANCIAL_ACCOUNT_TYPE_CD | OPEN_DT |
|---|---|---|---|---|---|
| 201 | 01JAN1999 | 31DEC4747 | 100 | SAVINGS | 01JAN2000 |

FINANCIAL_ACCOUNT_CHNG

| ACCOUNT_RK | VALID_FROM_DT | VALID_TO_DT | BALANCE_AMT | CURRENCY_CD |
|---|---|---|---|---|
| 201 | 01JAN1999 | 31JAN1999 | 2500.75 | USD |
| 201 | 01FEB1999 | 28FEB1999 | 4300.25 | USD |
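A point-in-time query against such tables selects the version whose validity window contains the date of interest. The following is a minimal sketch using the CUSTOMER rows shown on the slide; the helper names `parse_date9` and `as_of` are hypothetical, and the rows are held as plain Python dictionaries rather than SAS tables.

```python
from datetime import date

MONTHS = {name: number for number, name in enumerate(
    ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
     "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"], start=1)}


def parse_date9(text):
    """Parse a ddMONyyyy string such as '01MAR2000' into a date."""
    day, month, year = int(text[:-7]), MONTHS[text[-7:-4].upper()], int(text[-4:])
    return date(year, month, day)


# The two CUSTOMER versions from the slide (only the columns needed here).
CUSTOMER = [
    {"CUSTOMER_RK": 100, "VALID_FROM_DT": "01JAN1999",
     "VALID_TO_DT": "29FEB2000", "MARITAL_STATUS_CD": "S"},
    {"CUSTOMER_RK": 100, "VALID_FROM_DT": "01MAR2000",
     "VALID_TO_DT": "31DEC4747", "MARITAL_STATUS_CD": "M"},
]


def as_of(rows, key_column, key, on_date):
    """Return the version of `key` valid on `on_date`, or None if none is."""
    for row in rows:
        valid_from = parse_date9(row["VALID_FROM_DT"])
        valid_to = parse_date9(row["VALID_TO_DT"])
        if row[key_column] == key and valid_from <= on_date <= valid_to:
            return row
    return None
```

Because the surrogate key stays the same across versions, `as_of(CUSTOMER, "CUSTOMER_RK", 100, date(2000, 2, 15))` picks the row with marital status S, and any date from 01MAR2000 onward picks the M row.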
Conformed Dimensions
Tools: Extending Models
CUSTOMER
EXTERNAL_ORG
SUPPLIER
INTERNAL_ORG
INTERNAL_ORG_ASSOC
INTERNAL_ORG_ASSOC_TYPE
COMPETITORS
Change Analysis Tool
Deploying the Integrated Data Architecture
Option A: Full SAS Data Architecture
Solution Data Marts
Extract and Cleanse Files
Enterprise Source Systems
SAS Banking Detail Data Store
Solutions
SAS® Cross-Sell and Up-Sell for Banking
SAS® Customer Retention for Banking
SAS® Credit Scoring for Banking
SAS® Credit Risk Management
Flexible Options to Meet Customer Needs!
Populate DDS and Data Mart
Source Data: Excel, SAS, SAP, Oracle, PeopleSoft
Step 1 - Extract, cleanse and transform from source data into flat file
Flat File
Step 2 - ETL processing to load data warehouse
• data validation
• key creation
• slowly changing dimensions
Data Warehouse / DDS
Step 3 - Transform into data mart model
Banking Data Mart
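The Step 2 activities (validation, key creation, slowly changing dimensions) can be sketched as a single load routine. This is an illustrative Python version, not the SAS ETL Studio process used in the project: the function `load_scd2`, the `RK` column, and the in-memory list of dictionaries are assumptions, while the 31DEC4747 open-ended date and the retained surrogate key follow the Detail Data Store example tables.

```python
from datetime import date, timedelta

HIGH_DATE = date(4747, 12, 31)  # open-ended VALID_TO_DT, as in 31DEC4747


def load_scd2(dimension, incoming, business_key, tracked, load_date):
    """Apply one extract to a type 2 dimension held as a list of dicts."""
    next_rk = max((row["RK"] for row in dimension), default=0) + 1
    # Index the currently valid version of each business key.
    current = {row[business_key]: row for row in dimension
               if row["VALID_TO_DT"] == HIGH_DATE}
    for record in incoming:
        # Data validation: skip records that have no business key.
        if not record.get(business_key):
            continue
        existing = current.get(record[business_key])
        if existing and all(existing[c] == record.get(c) for c in tracked):
            continue  # no tracked attribute changed
        if existing:
            # Slowly changing dimension: close the old version the day
            # before the new one starts and keep the same surrogate key.
            existing["VALID_TO_DT"] = load_date - timedelta(days=1)
            rk = existing["RK"]
        else:
            # Key creation: a new business key gets the next surrogate key.
            rk = next_rk
            next_rk += 1
        new_row = {"RK": rk, business_key: record[business_key],
                   "VALID_FROM_DT": load_date, "VALID_TO_DT": HIGH_DATE}
        new_row.update({c: record.get(c) for c in tracked})
        dimension.append(new_row)
        current[record[business_key]] = new_row
    return dimension
```

Loading the same customer first as single and later as married yields two rows with one surrogate key, the first closed on the day before the second begins.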
Deployment Focus
Scalability and Performance
ETL flows
Physical data model
Deployment: What Did We Do?
Create and Generate Data
Deploy Hardware and Software
Populate DDS
Populate Data Mart
Analyze ETL Flows
Analyze DDS Model
Change Management
It All Starts with Data
Bought and Built Data Generators
Built Simulated Data
Applied Business Rules
Scaled: 5 GB -> 50 GB -> 500 GB -> 1 TB
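A simulated-data generator of the kind described can be sketched in a few lines. Everything specific here is hypothetical: the account types other than SAVINGS, the balance range, and the rule that loans carry negative balances are illustrative assumptions, not the business rules used in the project.

```python
import random


def generate_accounts(n_accounts, seed=0):
    """Yield simulated account rows; the same seed gives the same data."""
    rng = random.Random(seed)
    for account_rk in range(1, n_accounts + 1):
        account_type = rng.choice(["SAVINGS", "CHECKING", "LOAN"])
        balance = round(rng.uniform(0, 10_000), 2)
        # Hypothetical business rule: loans are stored as negative balances.
        if account_type == "LOAN":
            balance = -balance
        yield {"ACCOUNT_RK": account_rk,
               "FINANCIAL_ACCOUNT_TYPE_CD": account_type,
               "BALANCE_AMT": balance,
               "CURRENCY_CD": "USD"}
```

Writing it as a generator keeps memory use flat, so scaling up is a matter of raising `n_accounts` and streaming the rows to a file.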
Deploy Hardware and Software
Choose Software Components
• SAS for the DDS or Data Warehouse
• Databases for the DDS or Data Warehouse
• SAS for the Data Marts
Install and Configure SAS Software
Configure Hardware
Design for Progressively Larger Deployments
Windows Server
Dell PowerEdge 1600SC
Windows 2003
Dual Hyper-Threaded 2.8 GHz processors
4 GB RAM
4 internal IDE drives: 60 GB C drive, 275 GB D drive
Single I/O channel
5 GB -> 50 GB of Data
AIX UNIX Servers
IBM P630 eServer
AIX 5.3
4 processors
4 I/O channels
8 GB RAM
4 x 72 GB disks
14-drive SCSI storage array
IBM P670 eServer
AIX 5.3
16 processors
8 x 1-gig fiber I/O channels
Dynamic logical partitioning
2 TB disks
50 GB -> 500 GB of Data
500 GB -> 1 TB of Data
Populate DDS and Data Mart
Ran ETL Flows
• Registered in SAS Metadata Repository
• Loaded Data into Tables
• Used Slowly Changing Dimension Load Process
Analyze ETL Flows
Example of SAS ETL Studio Flow Analysis
Change Management
Loaded New Release of DDS in TST Repository
Compared PRD Repository to TST Repository
Ran Batch Reports to Examine Differences.
Ran Impact Analysis on Column and Table
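The repository comparison can be pictured as a diff over table and column metadata. The sketch below is a hypothetical stand-in that assumes each repository has been exported to a `{table: {column: type}}` mapping; it does not use the SAS metadata API, and `compare_repositories` is an illustrative name.

```python
def compare_repositories(prd, tst):
    """Diff two {table: {column: type}} mappings (PRD versus TST)."""
    report = {
        "added_tables": sorted(tst.keys() - prd.keys()),
        "dropped_tables": sorted(prd.keys() - tst.keys()),
        "changed_columns": {},
    }
    for table in sorted(prd.keys() & tst.keys()):
        old, new = prd[table], tst[table]
        changes = {
            "added": sorted(new.keys() - old.keys()),
            "dropped": sorted(old.keys() - new.keys()),
            "retyped": sorted(c for c in old.keys() & new.keys()
                              if old[c] != new[c]),
        }
        # Only report tables where something actually differs.
        if any(changes.values()):
            report["changed_columns"][table] = changes
    return report
```

The dropped and retyped entries are the ones that matter for impact analysis, since existing ETL flows and marts may still depend on them.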
What Did We Find?
Specific Techniques that Work Best
Recommendations
Tremendous Performance Gains!
Specific Techniques Examples
ETL Flows
Parallel ETL flows
SAS coding techniques to use
Use a hash table instead of a lookup
Make sure the I/O buffer size is tuned
Drop constraints
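The hash-table technique replaces a sort-and-merge or repeated table lookup with an in-memory keyed structure that is built once and probed per row. In SAS this would typically be the DATA step hash object; the sketch below uses a Python dictionary as an analogue, and the function and column names are hypothetical.

```python
def build_lookup(reference_rows, key_column, value_column):
    """Load the (small) reference table into memory once, keyed for O(1) access."""
    return {row[key_column]: row[value_column] for row in reference_rows}


def assign_keys(transactions, lookup, key_column, target_column):
    """Stream the (large) table once, probing the hash for each row."""
    for row in transactions:
        # Unmatched keys get None so they can be reported or rejected later.
        row[target_column] = lookup.get(row[key_column])
    return transactions
```

The large table is read once and never sorted, which is where the gain over a sorted merge usually comes from, provided the reference table fits in memory.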
Specific Techniques Examples
DDS Model
Indexes – when and when not to add
Denormalized some tables
Separate tables for data with high volume changes
Partition data by usage (date ranges)
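Partitioning by date range can be sketched as routing each row to a bucket derived from one of its dates. Monthly buckets and the name `partition_by_month` are assumptions for illustration; the slides do not say which ranges were used.

```python
from collections import defaultdict
from datetime import date


def partition_by_month(rows, date_column):
    """Group rows into (year, month) partitions based on `date_column`."""
    partitions = defaultdict(list)
    for row in rows:
        day = row[date_column]
        partitions[(day.year, day.month)].append(row)
    return dict(partitions)
```

A job that only needs recent history can then read the few partitions it cares about instead of scanning the full table.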
Recommendations
Debugging techniques
Sorting and memory usage
Joins
Understand disk requirements
I/O optimization
Compression and performance
Above All
Write ETL
Test, Tune
Test, Tune
Test, Tune!!!!
Summary and Conclusions
Data integration is key
Different approaches for customers
Change management is vital
Performance tuning is vital
Technology evolving
Questions?