Betsy Blythe and Lore Balkan Virginia Tech

Post on 10-Jan-2016

36 views 0 download

Tags:

description

Enterprise Data in Jail. A Problem with a Solution. Betsy Blythe and Lore Balkan Virginia Tech. 0010111001. The Problem. Data in “ERP Jail”. Data structures difficult to understand and inefficient to access for analysis and reports Data values change so point-in-time data lost - PowerPoint PPT Presentation

Transcript of Betsy Blythe and Lore Balkan Virginia Tech

Betsy Blythe and Lore BalkanVirginia Tech

Enterprise Data in Jail

A Problem with a Solution

The Problem

0010111001

– Data structures difficult to understand and inefficient to access for analysis and reports

– Data values change so point-in-time data lost

– Growing backlog of report requests

• Data in “ERP Jail”

The Solution

• Initial charge – Build a data warehouse

• Initial vision – Create business view of administrative data for Virginia Tech

Transactional ERP System

The Solution

Data Warehouse

User

A Data Access Architecture

The Solution

Laying the Foundation

• Staffing

– DBA– Data Administrator– Data Warehouse Architects– Training Coordinators– Web Application Developers

• Other Resources

– Hardware– Software

Laying the Foundation

• Planning

– Surveyed other institutions

– Did site visits and interviews

– Established scope

– Identified first subject area

– Drafted project plan

– Delivered management briefings

Laying the Foundation

• Staff Education and Training

– Data Warehouse Institute

– Ralph Kimball

Getting Started

• Focused on Finance

• Delivered Finance Reports for ERP

• Learned Finance data

• Built relationships and trust

• Evolved a shared vision for warehouse

Building the Data Warehouse

• Strategy

– Build by subject area

– Develop iteratively

– Design for enterprise

Building the Data Warehouse

• Design

– Star Schema

– Time Dimension

– Transaction Detail

– Surrogate Keys

– Conformed Dimensions

– Slowly Changing Dimensions

FactsDimensions

Org Fiscal Qtr Account Amount Encumbered BalanceDept Y 1 Travel $1,000 $0 $60,000Dept Y 1 Supplies $500 $0 $59,500Dept Y 2 Software $10,000 $10,000 $49,500Dept Y 2 Phones $250 $0 $49,250Dept Z 1 Travel $8,000 $0 $100,000Dept Z 1 Supplies $100 $0 $99,900Dept Z 2 Software $80,000 $80,000 $19,900Dept Z 2 Phones $500 $0 $19,400

The Design: Multidimensional

The Design:Dimensions

The Design:Facts

The Design:Star Schema

Conformed Dimensions

PayrollFacts

FinanceFacts

TimeDimension

OrgDimension

EarningsType

Dimension

EmployeeDimension

AccountDimension

FundDimension

3 Techniques

• overwrite changed attribute

• add new dimension record

• use field for ‘old’ value

The Design

Managing Change … Slowly Changing Dimensions

The Design

• Standards

– Names meaningful andstandardized

– Indicators simplify queries

– Code descriptions storedwith codes

– Business descriptions available with data

The Design

• Special Features

– External data may be included

– Derivations and calculations included

– Summary and aggregations may be included

– History is built by design

• Project Agreement

– Signed “blueprint” for the data mart

– Explains sponsorship and roles

– Details data requirements

– Identifies development team

– Identifies pilot users

– Lists key tasks and dependencies

Building the Data Warehouse

Building the Data Warehouse

• Data Mart Development Team *

– 2 IWA developers

– Functional area technical expert

– Functional area business/data expert

– Functional area key user

* Meets Weekly– Meeting Minutes document the process

• Development Process

– Data model design (ERwin)

– Source-to-target mapping

– Business definitions

– ETL development / testing (DataStage)

Building the Data Warehouse

• Development Process

– Data verification

– Process control checks

– Pilot user training

Building the Data Warehouse

• Data Access Strategy

– Stewardship same as ERP

– ERP security definitions leveraged

– Warehouse security built as part of ETL

– Training precedes access

Building the Data Warehouse

The Result

ERPData

Other DataSources

DataWarehouse

ReadyFor

Access &Query

ExtractTransform

Load

RunsEveryNight

ProcessChecks

The Result41 ERP Tables 1 Warehouse Table

EMPLOYEE_STATUS_DIMENSION

Provost’s Request:

Report showing employee id, name, current hire date, gender, ethnicity, rank and tenure for all full-time minority faculty

The Result:Query Example

select spriden_id, concat(spriden_last_name,concat(', ',concat(spriden_first_name,concat(' ', spriden_mi)))), to_char(pebempl_current_hire_date,'DD-MON-YYYY'), decode(spbpers_sex,'M','Male','F','Female'), stvethn_desc, ptrrank_desc, ptrtenr_desc from spriden, spbpers, pebempl, stvethn, perrank a, ptrrank, perappt c, ptrtenrwhere pebempl_empl_status = 'A' and pebempl_ecls_code in ('2A','2B','2C','2F','2G','2H','2K','2L', '3A','3B','3C','3D','3H','3I','3J','3M') and pebempl_pidm = spbpers_pidm and (spbpers_sex = 'F' or spbpers_ethn_code != '1') and pebempl_pidm = spriden_pidm and spriden_change_ind is null and spbpers_ethn_code = stvethn_code and pebempl_pidm = a.perrank_pidm and a.perrank_action_date = (select MAX(perrank_action_date) from perrank b where b.perrank_pidm = a.perrank_pidm) and a.perrank_rank_code = ptrrank_code and pebempl_pidm = c.perappt_pidm and c.perappt_action_date = (select max(perappt_action_date) from perappt d where c.perappt_pidm = d.perappt_pidm) and perappt_tenure_code = ptrtenr_code

The ERP Query

select ssn_fin_num, current_full_name, salary_hire_date, gender_desc,

ethnicity_desc, rank_desc, tenure_descfrom employee

where current_record_ind = 'Y' and active_employee_ind = 'Y' and faculty_ind = 'Y' and full_time_ind = 'Y' and (gender_code = 'F' or ethnicity_code != '1')

The Warehouse Query

The Result

Human ResourceHuman Resource

General PersonGeneral Person

FinanceFinance

StudentStudent AlumniDevelopment

AlumniDevelopment

Futuredata mart

Futuredata mart

Finance– Operating Ledger– General Ledger– Foundation– Accts Receivable

Human Resource– Employee– Job– Job Funding– Position– Position Allocation– Payroll

Alumni– Alumni Giving

The Result

• Metadata System– Business definitions maintained by Data Experts

– Business definitions stored with the data

– Data models and business definitions on the Web

The Result

• The VT Data Warehouse Users

– 900 Finance

– 400 VT Foundation

– 67 HR

– 10 Alumni

* See fact sheet handout

The Result

• A Data Architecture

– Structured for query

– Access by any ODBC or Oracle client

– Designed to include history

– Focus on the user

– Provides a stable business view of the data

The Result

• Query and Reporting Tools

– Web Enabled / Client Server

– Metadata stored with the data

– Appropriate to the user skill set

– Appropriate to the user need

Lessons Learned

• Functional area sponsorship is critical

• Analysis paralysis can be a problem

• The devil is in the details

• Let the ERP settle first

• Data verification is time consuming

• Canned reports sell the warehouse

• 24/7 availability is expected

• Success breeds demand

Lessons Learned

• Don’t lose sight of the reasons for creatingthe data warehouse

– Empower users to become self-sufficient

– Prevent users from impacting the production system

– Reduce interrupt-driven informationrequests to IS

– Summarize data for trend analysisand data retention

• Hardware – SUN E4500 w/ 8 CPUs, 8 GB RAM, 480 GB Disk Space

• Software – Solaris 2.7 Oracle 8.1.7 Erwin – data models Ascential DataStage – ETL Brio – web reports, ad hoc query SQR – reporting Perl – metadata interface

Resources