Data Warehouse Project
DATA WAREHOUSING
CHSI PROJECT MODULE
Team 1:
Abhinav Garg (11761380)
Tanu Srivastav (11772446)
Tejbeer Chhabra (11756746)
Table of Contents
Executive Summary
MDX Queries and their output
DMX Queries: Mining Model
Appendix
Executive Summary
With the increasing amount of data, the need to store data and extract information from it
becomes crucial. With On-Line Analytical Processing (OLAP) technology, we can not only
store large amounts of temporal data but also perform several business intelligence operations.
The CHSI data provided has more than five and a half million records. The aim of the project is
to find meaningful insights that could bring innovation to rural health systems.
1. OLAP Cube and Queries
By executing MDX queries on the OLAP cube, we found several interesting figures and facts
of business importance. We discovered how formulary type affects charge quantity, which
year made the most profit on medication, which care settings are profitable, what the
infusion time is for various IV-route medications, and what the most frequently occurring
discontinue reasons are. These findings could be of managerial importance to the hospital
administration. As one of the objectives of our project was to learn MDX, we used several
different MDX functions and keywords, such as TOPCOUNT, CROSSJOIN, NON EMPTY, HEAD,
FILTER, and SUBSET, in our queries.
2. Data Mining
We have built mining structures using DMX queries as well as using Analysis Services in
Visual Studio. Data mining models can be built on these structures, which allow us to
predict the discontinue reason for a medication. This could give us information about the
effectiveness of the medication. We have implemented three data mining algorithms,
namely Decision Tree, Neural Network, and Regression.
MDX Queries and their output
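As a rough illustration of the set operations named in the Executive Summary (TOPCOUNT, CROSSJOIN, NON EMPTY, HEAD, FILTER, SUBSET), the sketch below mimics each one in plain Python over a small made-up table. It is an analogy only: it is not MDX, it does not reproduce the project's queries, and the care settings, years, and profit values are invented for the example.

```python
from itertools import product

# Made-up cells standing in for cube data: (care_setting, year) -> net profit.
# None marks an empty cell. These values are illustrative, not CHSI data.
cells = {
    ("Inpatient", 2005): 120.0,
    ("Inpatient", 2006): 90.0,
    ("Outpatient", 2005): None,
    ("Outpatient", 2006): 45.0,
    ("Emergency", 2005): 300.0,
    ("Emergency", 2006): None,
}
care_settings = ["Inpatient", "Outpatient", "Emergency"]
years = [2005, 2006]


def crossjoin(set_a, set_b):
    """CROSSJOIN analogue: every combination of members from two sets."""
    return list(product(set_a, set_b))


def non_empty(tuples, measure):
    """NON EMPTY analogue: drop tuples whose cell has no value."""
    return [t for t in tuples if measure.get(t) is not None]


def filter_set(tuples, predicate):
    """FILTER analogue: keep only tuples for which the condition holds."""
    return [t for t in tuples if predicate(t)]


def topcount(tuples, count, measure):
    """TOPCOUNT analogue: sort descending by the measure, keep the first `count`."""
    ranked = sorted(tuples, key=lambda t: measure.get(t) or 0.0, reverse=True)
    return ranked[:count]


def head(tuples, count=1):
    """HEAD analogue: the first `count` tuples in their existing order."""
    return tuples[:count]


def subset(tuples, start, count=None):
    """SUBSET analogue: `count` tuples starting at zero-based position `start`."""
    end = None if count is None else start + count
    return tuples[start:end]


all_cells = crossjoin(care_settings, years)
filled = non_empty(all_cells, cells)
profitable = filter_set(filled, lambda t: cells[t] > 50)
top_two = topcount(filled, 2, cells)
```

Here `top_two` holds the two (care setting, year) pairs with the highest made-up profit, which is the kind of question ("what are the profitable care settings") the summary describes.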
DMX Queries: Mining Model:
1. Creating Mining Structure
2. Creating Mining Model
3. Variable Importance:
The most important variable was found to be Infusion Time. Therefore, we can say that the
discontinuation of a medication could be predicted from its infusion time.
4. Lift Chart Model comparison:
From the above table, we can say that the models fit the data set well and are of reasonably
similar strength.
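For readers unfamiliar with lift charts, the following is a minimal sketch of the underlying calculation, assuming a binary target such as Target_Discontinue. Cases are ranked by predicted probability, and for each share of the population the chart shows the share of actual positive cases captured. The function name `cumulative_gains` and the sample scores are hypothetical and are not taken from the project's models.

```python
def cumulative_gains(scores, targets, steps=10):
    """Return (fraction_of_population, fraction_of_positives_captured) points.

    scores  -- predicted probability that the target equals 1
    targets -- actual 0/1 outcomes, in the same order as scores
    """
    # Rank cases from the highest predicted probability to the lowest.
    ranked = sorted(zip(scores, targets), key=lambda pair: pair[0], reverse=True)
    total_positives = sum(target for _, target in ranked)
    n = len(ranked)
    points = []
    for step in range(1, steps + 1):
        cutoff = round(n * step / steps)
        captured = sum(target for _, target in ranked[:cutoff])
        share = captured / total_positives if total_positives else 0.0
        points.append((step / steps, share))
    return points


# Invented scores and outcomes, for illustration only.
example_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
example_targets = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
gains = cumulative_gains(example_scores, example_targets, steps=5)

# Lift at a given depth = share of positives captured / share of population.
lift_at_first_step = gains[0][1] / gains[0][0]
```

Two models of similar strength produce curves that stay close to each other across the whole population range, which is the sense in which the comparison above is made.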
Appendix
1. Partitions
The data is partitioned on Date ID. We have made 4 partitions for the entire dataset.
2. Aggregations
3. Calculated Measures:
1. Since we have Unit Cost and Unit Price, we have made a calculated measure “NET_PROFIT”,
which is Unit Price - Unit Cost.
2. For data modeling, we have created a calculated measure called “Target_Discontinue”. It is
coded in binary format as 0 or 1. When the discontinue reason reflects a positive change in the
patient’s health, it is coded as 1; otherwise it is coded as 0.
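The two calculated measures can be sketched in a few lines of Python. This is only an illustration of the arithmetic and the binary coding described above, not the cube definition itself. The set `POSITIVE_REASONS` and its entries are hypothetical, since the list of reasons treated as a positive change is not given here.

```python
# Hypothetical reasons counted as a positive change in the patient's health.
# The project's actual list is not stated, so these labels are placeholders.
POSITIVE_REASONS = {"Patient improved", "Therapy completed"}


def net_profit(unit_price, unit_cost):
    """NET_PROFIT: Unit Price minus Unit Cost."""
    return unit_price - unit_cost


def target_discontinue(reason):
    """Target_Discontinue: 1 for a positive-change reason, otherwise 0."""
    return 1 if reason in POSITIVE_REASONS else 0


# Example rows with invented values.
profit = net_profit(unit_price=12.5, unit_cost=8.0)
flags = [target_discontinue(r) for r in ["Patient improved", "Adverse reaction"]]
```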