The Importance of the ETL Process
p. 1
Chapter: SQL Server 2012 Integration Services
Course: SQL Server 2012 - A Comprehensive Introduction
Course ID: 170
Instructor: Scott Whigham
Chapter 16: Video # 2
The Importance of the ETL Process
p. 2
SQL Server 2012 Integration Services (SSIS) is Microsoft’s ETL tool
– Extract, Transform, and Load
p. 3
Most businesses have data in more than one format
– How does one business happen to use so many different databases?
p. 4
Let’s walk through a likely scenario and see how this happens:
– 2001: The “AdventureWorks” company launches a web store to complement its brick-and-mortar stores
• ASP-based website
• SQL Server 2000 backend
• Customers are encouraged to phone questions in or to send an email
p. 5
Things change...
– 2001: Launch with SQL 2000
– 2003: AdventureWorks buys a competitor
• Competitor used a PHP/MySQL ticketing system
• AdventureWorks management chooses to adopt this system for customer ticketing rather than build or buy an alternative
p. 6
AdventureWorks timeline:
Year  Usage                      Data Source
2001  Website                    MS SQL Server 2000
2003  Customer Ticket System     MySQL 3.23
p. 7
Needs change...
– 2001: Launch with SQL 2000
– 2003: PHP/MySQL 3.23 ticketing system
– 2004: The company is growing – time for more “stuff”:
• A PHP/MySQL project management system is installed
• A marketing mailer application with contact management is purchased
p. 8
AdventureWorks timeline:
Year  Usage                      Data Source
2001  Website                    MS SQL Server 2000
2003  Customer Ticket System     MySQL 3.23
2004  Project Management         MySQL 4.0
2004  Marketing mailer           MS Access
p. 9
Markets change...
– 2001: Launch with SQL 2000
– 2003: PHP/MySQL 3.23 ticketing system
– 2004: PHP/MySQL 4.0 project management
– 2005: A new ASP.NET website is rolled out with a SQL Server 2005 backend
• Major upgrade from SQL Server 2000 to SQL Server 2005
p. 10
AdventureWorks timeline:
Year  Usage                      Data Source
2001  Website                    MS SQL Server 2000
2003  Customer Ticket System     MySQL 3.23
2004  Project Management         MySQL 4.0
2004  Marketing mailer           MS Access
2005  Website upgrade            MS SQL Server 2005
p. 11
Trends change...
– 2001: Launch with SQL 2000
– 2003: PHP/MySQL 3.23 ticketing system
– 2004: PHP/MySQL 4.0 project management
– 2005: Upgraded website to SQL 2005
– 2008: Website sales popularity causes “growing pains”
• A new supply chain management app is purchased
• A new employee management/HR/payroll package is purchased
p. 12
AdventureWorks timeline:
Year  Usage                      Data Source
2001  Website                    MS SQL Server 2000
2003  Customer Ticket System     MySQL 3.23
2004  Project Management         MySQL 4.0
2004  Marketing mailer           MS Access
2005  Website upgrade            MS SQL Server 2005
2008  Supply chain mgmt          MS SQL Server 2008
2008  Employee/HR/Payroll        DB2
p. 13
The world grows smaller...
– 2001: Launch with SQL 2000
– 2003: PHP/MySQL 3.23 ticketing system
– 2004: PHP/MySQL 4.0 project management
– 2005: Upgraded website to SQL 2005
– 2008: Added supply chain mgmt and HR/payroll packages
– 2010: Website sales continue to gain popularity, particularly overseas
• A new shipping database is purchased
• Employee expenses are now tracked in custom MS Excel spreadsheets
p. 14
AdventureWorks timeline:
Year  Usage                      Data Source
2001  Website                    MS SQL Server 2000
2003  Customer Ticket System     MySQL 3.23
2004  Project Management         MySQL 4.0
2004  Marketing mailer           MS Access
2005  Website upgrade            MS SQL Server 2005
2008  Supply chain mgmt          MS SQL Server 2008
2008  Employee/HR/Payroll        DB2
2010  Shipping                   *.csv file downloaded monthly
2010  Employee expense tracking  MS Excel
p. 15
It’s 2012, and company executives and management have been playing a game lately...
– You know this one, don’t you?
p. 16
p. 17
The world grows smaller...
– 2001: Launch with SQL 2000
– 2003: PHP/MySQL 3.23 ticketing system
– 2004: PHP/MySQL 4.0 project management
– 2005: Upgraded website to SQL 2005
– 2008: Added supply chain mgmt and HR/payroll packages
– 2010: New shipping database, employee expense tracking
– 2012: Executives want a B.I. solution
• You name it, they want it
• But... there’s no budget for software purchases...
p. 18
No budget for new software = more opportunities for you!
– You decide:
• ... to create a relational OLAP data warehouse to store all the company’s historic data in a unified way
• ... to create a multidimensional database with multiple cubes (to facilitate fast browsing of analytics)
• ... to install Excel 2013 on all CxO and management machines, and to teach them how to build pivot tables and pivot charts
• ... to investigate Reporting Services as a way to build internal web dashboards and subscription-based reporting
– On-the-job experience, here we come!
p. 19
The company data is all “loosely connected”
– A customer makes a small order via the website
– The same customer submits a “Help!” ticket
– Customer rep. has to make an order for a replacement part
– Sales person takes customer to an entertainment event
– Customer now makes a large order
– Key question: how did we acquire this customer?
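To make the "loosely connected" idea concrete, here is a minimal Python sketch of stitching one customer's history together from three of the systems in the timeline. Everything in it is an assumption for illustration: the record shapes, the field names, the sample data, and the choice of a lower-cased email address as the shared customer key. None of it comes from the course.

```python
from datetime import date

# Hypothetical extracts. Each system has its own shape and its own customer field.
web_orders = [  # website (SQL Server)
    {"email": "pat@example.com", "placed": date(2011, 3, 1), "total": 40.0},
    {"email": "pat@example.com", "placed": date(2011, 9, 15), "total": 5200.0},
]
tickets = [  # ticketing system (MySQL)
    {"customer_email": "PAT@example.com", "opened": date(2011, 3, 20)},
]
expenses = [  # employee expense tracking (Excel)
    {"client": "pat@example.com", "when": date(2011, 8, 2), "item": "event tickets"},
]


def customer_timeline(email):
    """Return (date, system, detail) tuples for one customer, oldest first."""
    key = email.lower()  # assumed shared key: the lower-cased email address
    events = []
    for o in web_orders:
        if o["email"].lower() == key:
            events.append((o["placed"], "website", f"order ${o['total']:.2f}"))
    for t in tickets:
        if t["customer_email"].lower() == key:
            events.append((t["opened"], "ticketing", "help ticket"))
    for e in expenses:
        if e["client"].lower() == key:
            events.append((e["when"], "expenses", e["item"]))
    return sorted(events)


def acquisition_channel(email):
    """Answer the key question: which system saw this customer first?"""
    timeline = customer_timeline(email)
    return timeline[0][1] if timeline else None
```

The answer only exists once all three sources are matched on the same key, which is the job the ETL process takes on at scale.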
p. 20
Integration Services is your ETL tool
1. You Extract the data from the source to a staging area
• The staging area is optional, but it is typically an MS SQL Server relational database
2. You make any changes to the data (a.k.a. a Transformation)
• Either in motion or in the staging area
3. You Load the data into the relational data warehouse
4. You process the cube(s)
– SSIS is your “one stop shop” for all of this!
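The first three steps can be sketched outside of SSIS to show the shape of the process. The following is a rough Python analogy, not SSIS itself (SSIS packages are built in a designer, not written like this). It uses an in-memory SQLite database as a stand-in for the staging area and the warehouse, and the CSV layout, table names, and cleanup rules are all hypothetical. Step 4, processing the cube, has no equivalent here.

```python
import csv
import io
import sqlite3

# Hypothetical monthly shipping export; the column names are assumptions.
SHIPPING_CSV = """order_id,customer_email,ship_country,ship_cost
1001, Ann@Example.com ,us,12.50
1002,bob@example.com,DE,30.00
1003,,US,5.00
"""


def extract(conn, csv_text):
    """Step 1: copy the raw rows, untouched, into a staging table."""
    conn.execute(
        "CREATE TABLE staging_shipping "
        "(order_id TEXT, customer_email TEXT, ship_country TEXT, ship_cost TEXT)"
    )
    rows = [
        (r["order_id"], r["customer_email"], r["ship_country"], r["ship_cost"])
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    conn.executemany("INSERT INTO staging_shipping VALUES (?, ?, ?, ?)", rows)
    return len(rows)


def transform(conn):
    """Step 2: clean the data while it sits in the staging area."""
    conn.execute(
        "UPDATE staging_shipping SET "
        "customer_email = lower(trim(customer_email)), "
        "ship_country = upper(trim(ship_country))"
    )
    # Drop rows that cannot be tied to a customer.
    conn.execute("DELETE FROM staging_shipping WHERE customer_email = ''")


def load(conn):
    """Step 3: move the cleaned rows into a typed warehouse table."""
    conn.execute(
        "CREATE TABLE fact_shipping "
        "(order_id INTEGER PRIMARY KEY, customer_email TEXT, "
        "ship_country TEXT, ship_cost REAL)"
    )
    conn.execute(
        "INSERT INTO fact_shipping "
        "SELECT CAST(order_id AS INTEGER), customer_email, ship_country, "
        "CAST(ship_cost AS REAL) FROM staging_shipping"
    )
    return conn.execute("SELECT * FROM fact_shipping ORDER BY order_id").fetchall()


def run_etl(csv_text=SHIPPING_CSV):
    conn = sqlite3.connect(":memory:")
    extract(conn, csv_text)
    transform(conn)
    return load(conn)
```

Here the transformation happens in the staging area. The "in motion" alternative would clean each row inside extract() before it is written anywhere.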
p. 21
Your final step is to build a dashboard
– Reporting Services or PowerPivot?
– Power View or Excel?
– SharePoint or email?
– On-demand or subscription-based?
p. 22
Your dashboard is a hit!
p. 23
In the next video…
– How to Install and Configure SSIS 2012
“A painter paints pictures on canvas. But musicians paint their pictures on silence.”
- Leopold Stokowski