STRATEGIC INFORMATION SYSTEMS IV STV401T / B BTIP05...
Corporate Affairs and Marketing (CA&M) Brand and Event Management
STRATEGIC INFORMATION SYSTEMS IV
STV401T / B
BTIP05 / BTIX05 - BTECH
DEPARTMENT OF INFORMATICS
LECTURE: 05 (A)
DATA WAREHOUSING (DW)
By: Dr. Tendani J. Lavhengwa
Inspirational Quotes
• My personal quote:
“Always be a thought ahead. Do not fear the blank page, everything started somewhere”
• Quotes to consider as inspiration:
"Errors using inadequate data are much less than those using no data at all" ~ Charles Babbage
“One is too small a number to achieve greatness” ~ John C. Maxwell
• Your quotes?
???
LECTURE: 05 - DATA WAREHOUSING (DW)
1. Literature in context and evolution
2. Data Warehousing (DW) - multiple definitions
3. Data Warehousing concepts
4. Data Warehousing fundamental characteristics
5. Key points from business on Data warehouses (IBM, 2018)
6. Traditional integration, Data Warehouse vs. Operational DBMS
7. OLTP systems vs. Data Warehouse
8. Application-Orientation vs. Subject-Orientation
9. Data Warehouse Models
10. Modelling of Data Warehouse – dimensions and measures
11. Organising data for Data Warehouses
12. Data Mart Centric
13. Data Warehouse Architecture - Base
14. Extract, Transform and Load (ETL) process
#. Start-up Items to discuss
1. Literature in context and evolution
Data warehouse architecture
• Born in the 1980s as an architectural model designed to support the flow of data from operational systems to decision support systems.
1988...
• Data Warehousing emerged in the late 1980s. A 1988 IBM Systems Journal article, “An architecture for a business information system”, coined the term “business data warehouse”, although Bill Inmon, who later became a leading figure in the practice, had used a similar term in the 1970s.
Later in the 1990s
• Inmon developed the concept of the Corporate Information Factory, an enterprise level view of an organization’s data of which Data Warehousing plays one part.
Russom (2015)
• Data warehouse architecture is being influenced by business practices and goals that continue to evolve. The reason: a well-aligned data warehouse reflects the business it serves.
2. Data Warehousing (DW) - multiple definitions
Turban et al. (2011)
• -a pool of data produced to support decision making
• -a repository of current and historical data of potential interest to managers throughout the organisation
Bocij et al. (2015)
• Large database systems containing current and historical data that can be analysed to produce information to support organisational decision making
IBM.com (2018)
• Data warehouse databases provide a decision support system (DSS) environment in which you can evaluate the performance of an entire enterprise over time
William H. Inmon...
• -a subject-oriented, integrated, time-variant and nonvolatile collection of data that supports management's decision-making process.
Others...
• -a collection of corporate information and data derived from operational systems and external data sources.
• -designed to support business decisions by allowing data consolidation, analysis and reporting at different aggregate levels.
• -a federated repository for all the data that an enterprise's various business systems collect.
• -the Data Warehouse repository may be physical or logical.
3. Data Warehousing concepts
Data Warehouses are aimed at decision making
Data is populated into the DW through the processes of extraction, transformation and loading.
Data warehouse databases are optimized for data retrieval.
extraction, transformation and loading (ETL) - add figure
4. Data Warehousing fundamental characteristics (1 of 2)
Subject orientated
• Data is organised by detailed subject and contains only what is relevant for decision support
• e.g. sales, products, customer
Integrated
• Closely related to subject orientation
• The DW must place data from different sources into a consistent format
• Presumed to be totally integrated
Time variant
• Maintains historical data
• The data does not necessarily provide current status (except for real-time systems)
• Historical data is used to detect trends, deviations and long-term relationships for forecasting and comparisons, leading to decision making
Non-volatile
• Once data is in the DW, users cannot change or update the data
• Obsolete data are discarded and changes are recorded as new data
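The time-variant and non-volatile characteristics can be illustrated with a small sketch. It assumes a hypothetical `customer_history` table in an in-memory SQLite database (the table, columns and sample cities are invented for illustration, not taken from the lecture): a change is appended as a new row rather than applied as an update, so earlier states remain available for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer_history (
        customer_id INTEGER NOT NULL,
        city        TEXT    NOT NULL,
        valid_from  TEXT    NOT NULL   -- ISO date on which this version took effect
    )
    """
)


def record_change(customer_id, city, valid_from):
    """Non-volatile: a change is appended as a new row, never an UPDATE."""
    conn.execute(
        "INSERT INTO customer_history VALUES (?, ?, ?)",
        (customer_id, city, valid_from),
    )


def city_as_of(customer_id, as_of):
    """Time-variant: return the city that was in effect on a given date."""
    row = conn.execute(
        """
        SELECT city FROM customer_history
        WHERE customer_id = ? AND valid_from <= ?
        ORDER BY valid_from DESC
        LIMIT 1
        """,
        (customer_id, as_of),
    ).fetchone()
    return row[0] if row else None


record_change(1, "Pretoria", "2016-01-01")
record_change(1, "Polokwane", "2017-06-01")  # the move is stored as new data

print(city_as_of(1, "2016-12-31"))  # Pretoria
print(city_as_of(1, "2018-02-03"))  # Polokwane
```

Because ISO dates sort correctly as text, a plain string comparison is enough to pick the version in effect on a given day.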
4. Data Warehousing fundamental characteristics (2 of 2)
Additional characteristics
-Web based
-Relational / Multidimensional
-client / server
-Real-time
-Include metadata
5. Key points from business on Data warehouses (IBM, 2018)
-A database that is optimized for data retrieval to facilitate reporting and analysis.
-A data warehouse incorporates information about many subject areas, often the entire enterprise.
-Typically you use a dimensional data model to design a data warehouse.
-The data is organized into dimension tables and fact tables using star and snowflake schemas.
-The data is denormalized to improve query performance.
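As a rough illustration of the dimensional model described in these points, the sketch below builds a tiny star schema in an in-memory SQLite database. The table names (`fact_sales`, `dim_product`, `dim_date`), columns and figures are hypothetical. Note that `category` is stored directly in the product dimension (denormalised), so a report by category needs only one join per dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: descriptive attributes used to slice the measures
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, day TEXT, month TEXT);

    -- Fact table: one row per sale, holding keys to the dimensions and a measure
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        date_id    INTEGER REFERENCES dim_date(date_id),
        amount     REAL
    );

    INSERT INTO dim_product VALUES (1, 'Laptop', 'Hardware'), (2, 'Antivirus', 'Software');
    INSERT INTO dim_date    VALUES (1, '2018-01-30', '2018-01'), (2, '2018-02-02', '2018-02');
    INSERT INTO fact_sales  VALUES (1, 1, 900.0), (2, 1, 50.0), (1, 2, 950.0);
    """
)

# A typical reporting query: total sales per category per month
rows = conn.execute(
    """
    SELECT p.category, d.month, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_date    AS d ON d.date_id = f.date_id
    GROUP BY p.category, d.month
    ORDER BY p.category, d.month
    """
).fetchall()

for category, month, total in rows:
    print(category, month, total)
```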
6. Traditional integration, Data Warehouse vs. Operational DBMS
Data Warehouse vs. Operational DBMS
Traditional heterogeneous DB integration: a query-driven approach
Data Warehouse: update-driven, high performance
7. OLTP systems vs. Data Warehouse
8. Application-Orientation vs. Subject-Orientation
9. Data Warehouse Models
Enterprise Warehouse
• Collects all the information about subjects in the entire organisation/enterprise
Data Mart
• A subset of corporate-wide data that is of value to a specific group of users
• Example: Marketing, Sales, Finance
Virtual Warehouse
• A set of views over operational databases
• Only some of the possible summary views may be materialised
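The virtual warehouse idea can be sketched with an ordinary SQL view. The example assumes a hypothetical operational `orders` table in SQLite. Since SQLite has no built-in materialised views, the "materialised" summary is imitated here by copying the view's result into a table, which is an assumption made purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical operational table
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, department TEXT, amount REAL);
    INSERT INTO orders VALUES (1, 'Marketing', 120.0), (2, 'Sales', 300.0), (3, 'Sales', 80.0);

    -- Virtual warehouse: a summary view computed from operational data at query time
    CREATE VIEW v_sales_by_department AS
        SELECT department, SUM(amount) AS total
        FROM orders
        GROUP BY department;

    -- Imitated materialised view: a stored snapshot of the view's current result
    CREATE TABLE m_sales_by_department AS
        SELECT * FROM v_sales_by_department;
    """
)


def totals(source):
    """Read department totals from the view or from the stored snapshot."""
    if source not in {"v_sales_by_department", "m_sales_by_department"}:
        raise ValueError(f"unknown source: {source}")
    return dict(conn.execute(f"SELECT department, total FROM {source}"))


# A new operational transaction arrives after the snapshot was taken
conn.execute("INSERT INTO orders VALUES (4, 'Marketing', 30.0)")

live = totals("v_sales_by_department")      # reflects the new order
snapshot = totals("m_sales_by_department")  # still shows the stored result
print(live["Marketing"], snapshot["Marketing"])
```

The view always reflects the operational data but is recomputed on every query; the stored snapshot is cheaper to read but goes stale until it is refreshed.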
10. Modelling of Data Warehouse – dimensions and measures
Star schema
• A fact table in the middle connected to a set of dimension tables
Snowflake schema
• A refinement of the star schema where some dimensional hierarchy is normalised into a set of smaller dimension tables, forming a shape similar to a snowflake
Fact constellations
• Multiple fact tables share dimension tables, viewed as a collection of stars, therefore called galaxy schema or fact constellation
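A minimal sketch of the snowflake and fact-constellation ideas, again using SQLite with hypothetical table names and sample values. The category level of the product hierarchy is moved into its own table (snowflake), and two fact tables share the same product dimension (constellation).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Snowflake: the category level of the product hierarchy is normalised
    -- out of dim_product into its own smaller dimension table.
    CREATE TABLE dim_category (category_id INTEGER PRIMARY KEY, category_name TEXT);
    CREATE TABLE dim_product (
        product_id  INTEGER PRIMARY KEY,
        name        TEXT,
        category_id INTEGER REFERENCES dim_category(category_id)
    );

    -- Fact constellation: two fact tables share the same product dimension.
    CREATE TABLE fact_sales   (product_id INTEGER REFERENCES dim_product(product_id), amount REAL);
    CREATE TABLE fact_returns (product_id INTEGER REFERENCES dim_product(product_id), amount REAL);

    INSERT INTO dim_category VALUES (1, 'Hardware'), (2, 'Software');
    INSERT INTO dim_product  VALUES (1, 'Laptop', 1), (2, 'Antivirus', 2);
    INSERT INTO fact_sales   VALUES (1, 900.0), (1, 950.0), (2, 50.0);
    INSERT INTO fact_returns VALUES (1, 900.0);
    """
)


def total_by_category(fact_table):
    """Sum a fact table's amounts per category (two joins in a snowflake)."""
    if fact_table not in {"fact_sales", "fact_returns"}:
        raise ValueError(f"unknown fact table: {fact_table}")
    query = f"""
        SELECT c.category_name, SUM(f.amount)
        FROM {fact_table} AS f
        JOIN dim_product  AS p ON p.product_id = f.product_id
        JOIN dim_category AS c ON c.category_id = p.category_id
        GROUP BY c.category_name
    """
    return dict(conn.execute(query))


sales = total_by_category("fact_sales")
returns = total_by_category("fact_returns")
print(sales, returns)
```

Compared with a star schema, the category report needs one extra join; that is the usual trade-off for the reduced redundancy in the dimension tables.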
11. Organising data for Data Warehouses
Figure: sample snowflake schema with the DAILY_SALES table as the fact table
Figure: data mart with the DAILY_SALES fact table
The data is organized into:
-dimension tables
-fact tables
(using star and snowflake schemas)
12. Data Mart Centric
13. Data Warehouse Architecture - Base
Two-Tier Data Warehouse Architecture
Web-based Data Warehouse Architecture
Three-Tier Data Warehouse Architecture
14. Extract, Transform and Load (ETL) process
a process in database usage and especially in data warehousing
Extracts
• data from homogeneous or heterogeneous data sources
• Extracting the data from different sources – the data sources can be files (like CSV, JSON, XML) or RDBMS etc
Transforms
• the data for storing it in proper format or structure for querying and analysis purpose
• Transforming the data – this may involve cleaning, filtering, validating and applying business rules
Loads
• it into the final target (database, more specifically, operational data store, data mart, or data warehouse)
• Loading – data is loaded into a data warehouse or any other database or application that houses data
Some activities carried out at the "Transforming" stage:
• Cleaning (e.g. “Male” to “M” and “Female” to “F” etc.)
• Filtering (e.g. selecting only certain columns to load)
• Enriching (e.g. Full name to First Name, Middle Name, Last Name)
• Splitting a column into multiple columns and vice versa
• Joining together data from multiple sources
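The three ETL steps and the transformation activities listed above can be sketched end to end with the Python standard library. Everything specific here is an assumption made for illustration: the sample CSV, the `REGIONS` lookup standing in for a second source, the `dim_customer` target table, and the simplification that every full name has exactly three parts.

```python
import csv
import io
import sqlite3

# Hypothetical source 1: a CSV export from an operational system
CUSTOMERS_CSV = """id,full_name,gender,internal_note
1,Thabo Jacob Mokoena,Male,vip
2,Lerato Anne Dlamini,Female,
"""

# Hypothetical source 2: records from another system, keyed by customer id
REGIONS = {"1": "Gauteng", "2": "Limpopo"}

GENDER_CODES = {"Male": "M", "Female": "F"}


def extract(text):
    """Extract: read rows from a CSV source into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows, regions):
    """Transform: clean, filter, split and join the extracted rows."""
    out = []
    for row in rows:
        # Splitting one column into several (assumes exactly three name parts)
        first, middle, last = row["full_name"].split(" ", 2)
        out.append({
            "id": int(row["id"]),
            "first_name": first,
            "middle_name": middle,
            "last_name": last,
            "gender": GENDER_CODES[row["gender"]],  # cleaning: "Male" -> "M"
            "region": regions[row["id"]],           # joining data from a second source
            # filtering: internal_note is deliberately not carried over
        })
    return out


def load(conn, rows):
    """Load: write the transformed rows into the target table."""
    conn.execute(
        "CREATE TABLE dim_customer (id INTEGER PRIMARY KEY, first_name TEXT, "
        "middle_name TEXT, last_name TEXT, gender TEXT, region TEXT)"
    )
    conn.executemany(
        "INSERT INTO dim_customer VALUES "
        "(:id, :first_name, :middle_name, :last_name, :gender, :region)",
        rows,
    )


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(CUSTOMERS_CSV), REGIONS))
loaded = conn.execute(
    "SELECT first_name, last_name, gender, region FROM dim_customer ORDER BY id"
).fetchall()
print(loaded)
```

A production pipeline would also validate the input (unknown gender values, missing regions, names with more or fewer parts) instead of letting the lookups fail.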
QUESTIONS & ENQUIRIES